Repository: Vaibhavs10/insanely-fast-whisper
Branch: main
Commit: 0fed9a41fe8a
Files: 15
Total size: 1.0 MB

Directory structure:
gitextract_wz5d105z/

├── .gitignore
├── LICENSE
├── README.md
├── convert_output.py
├── insanely_fast_whisper_colab.ipynb
├── notebooks/
│   ├── infer_faster_whisper_large_v2.ipynb
│   └── infer_transformers_whisper_large_v2.ipynb
├── pyproject.toml
├── src/
│   └── insanely_fast_whisper/
│       ├── __init__.py
│       ├── cli.py
│       └── utils/
│           ├── __init__.py
│           ├── diarization_pipeline.py
│           ├── diarize.py
│           └── result.py
└── tests/
    └── __init__.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/#use-with-ide
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

.vscode/
.idea/

================================================
FILE: LICENSE
================================================
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: README.md
================================================
# Insanely Fast Whisper

An opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 *Transformers*, *Optimum* & *flash-attn*

**TL;DR** - Transcribe **150** minutes (2.5 hours) of audio in less than **98** seconds - with [OpenAI's Whisper Large v3](https://huggingface.co/openai/whisper-large-v3). Blazingly fast transcription is now a reality!⚡️

```
pipx install insanely-fast-whisper==0.0.15 --force
```

<p align="center">
<img src="https://huggingface.co/datasets/reach-vb/random-images/resolve/main/insanely-fast-whisper-img.png" width="615" height="308">
</p>

Not convinced? Here are some benchmarks we ran on an NVIDIA A100 - 80GB 👇

| Optimisation type    | Time to transcribe 150 min of audio (minutes) |
|------------------|------------------|
| large-v3 (Transformers) (`fp32`)             | ~31 (*31 min 1 sec*)             |
| large-v3 (Transformers) (`fp16` + `batching [24]` + `bettertransformer`) | ~5 (*5 min 2 sec*)            |
| **large-v3 (Transformers) (`fp16` + `batching [24]` + `Flash Attention 2`)** | **~2 (*1 min 38 sec*)**            |
| distil-large-v2 (Transformers) (`fp16` + `batching [24]` + `bettertransformer`) | ~3 (*3 min 16 sec*)            |
| **distil-large-v2 (Transformers) (`fp16` + `batching [24]` + `Flash Attention 2`)** | **~1 (*1 min 18 sec*)**           |
| large-v2 (Faster Whisper) (`fp16` + `beam_size [1]`) | ~9 (*9 min 23 sec*)            |
| large-v2 (Faster Whisper) (`8-bit` + `beam_size [1]`) | ~8 (*8 min 15 sec*)            |

P.S. We also ran the benchmarks on a [Google Colab T4 GPU](/notebooks/) instance!

P.P.S. This project originally started as a way to showcase benchmarks for Transformers, but has since evolved into a lightweight CLI for people to use. It is purely community driven. We add whatever the community seems to have a strong demand for!
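
To put the benchmark table in perspective, the timings can be converted into a real-time factor (seconds of audio transcribed per second of wall-clock time). The sketch below does that arithmetic for three of the rows. The dictionary labels and the `real_time_factor` helper are names made up for this illustration, not part of the project.

```python
# Wall-clock timings copied from the benchmark table (A100, 150 min of audio).
AUDIO_SECONDS = 150 * 60

timings_s = {
    "large-v3 fp32": 31 * 60 + 1,
    "large-v3 fp16 + batching + bettertransformer": 5 * 60 + 2,
    "large-v3 fp16 + batching + flash-attn 2": 1 * 60 + 38,
}


def real_time_factor(wall_clock_s: float, audio_s: float = AUDIO_SECONDS) -> float:
    """Seconds of audio transcribed per second of wall-clock time."""
    return audio_s / wall_clock_s


baseline = timings_s["large-v3 fp32"]
for label, seconds in timings_s.items():
    rtf = real_time_factor(seconds)
    speedup = baseline / seconds  # how much faster than the fp32 row
    print(f"{label}: {rtf:.1f}x real time, {speedup:.1f}x vs fp32")
```

With the Flash Attention 2 row (98 seconds), this works out to roughly 92 times faster than real time.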

## 🆕 Blazingly fast transcriptions via your terminal! ⚡️

We've added a CLI to enable fast transcriptions. Here's how you can use it:

Install `insanely-fast-whisper` with `pipx` (`pip install pipx` or `brew install pipx`):

```bash
pipx install insanely-fast-whisper
```

⚠️ If you have Python 3.11.x installed, `pipx` may parse the version incorrectly and silently install a very old version of `insanely-fast-whisper` (version `0.0.8`, which no longer works with the current `BetterTransformer`). In that case, you can install the latest version by passing `--ignore-requires-python` to `pip`:

```bash
pipx install insanely-fast-whisper --force --pip-args="--ignore-requires-python"
```

If you're installing with `pip`, you can pass the argument directly: `pip install insanely-fast-whisper --ignore-requires-python`.


Run inference from any path on your computer:

```bash
insanely-fast-whisper --file-name <filename or URL>
```
*Note: if you are running on macOS, you also need to add the `--device-id mps` flag.*

🔥 You can run [Whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) w/ [Flash Attention 2](https://github.com/Dao-AILab/flash-attention) from this CLI too:

```bash
insanely-fast-whisper --file-name <filename or URL> --flash True 
```

🌟 You can run [distil-whisper](https://huggingface.co/distil-whisper) directly from this CLI too:

```bash
insanely-fast-whisper --model-name distil-whisper/large-v2 --file-name <filename or URL> 
```

Don't want to install `insanely-fast-whisper`? Just use `pipx run`:

```bash
pipx run insanely-fast-whisper --file-name <filename or URL>
```

> [!NOTE]
> The CLI is highly opinionated and only works on NVIDIA GPUs & Mac. Make sure to check out the defaults and the list of options you can play around with to maximise your transcription throughput. Run `insanely-fast-whisper --help` or `pipx run insanely-fast-whisper --help` to get all the CLI arguments along with their defaults. 


## CLI Options

The `insanely-fast-whisper` repo provides all-round support for running Whisper in various settings. Note that as of today (26th Nov), `insanely-fast-whisper` works on both CUDA and mps (Mac) enabled devices.
```
  -h, --help            show this help message and exit
  --file-name FILE_NAME
                        Path or URL to the audio file to be transcribed.
  --device-id DEVICE_ID
                        Device ID for your GPU. Just pass the device number when using CUDA, or "mps" for Macs with Apple Silicon. (default: "0")
  --transcript-path TRANSCRIPT_PATH
                        Path to save the transcription output. (default: output.json)
  --model-name MODEL_NAME
                        Name of the pretrained model/ checkpoint to perform ASR. (default: openai/whisper-large-v3)
  --task {transcribe,translate}
                        Task to perform: transcribe or translate to another language. (default: transcribe)
  --language LANGUAGE   
                        Language of the input audio. (default: "None" (Whisper auto-detects the language))
  --batch-size BATCH_SIZE
                        Number of parallel batches you want to compute. Reduce if you face OOMs. (default: 24)
  --flash FLASH         
                        Use Flash Attention 2. Read the FAQs to see how to install FA2 correctly. (default: False)
  --timestamp {chunk,word}
                        Whisper supports both chunked as well as word level timestamps. (default: chunk)
  --hf-token HF_TOKEN
                        Provide a hf.co/settings/token for Pyannote.audio to diarise the audio clips
  --diarization_model DIARIZATION_MODEL
                        Name of the pretrained model/ checkpoint to perform diarization. (default: pyannote/speaker-diarization)
  --num-speakers NUM_SPEAKERS
                        Specifies the exact number of speakers present in the audio file. Useful when the exact number of participants in the conversation is known. Must be at least 1. Cannot be used together with --min-speakers or --max-speakers. (default: None)
  --min-speakers MIN_SPEAKERS
                        Sets the minimum number of speakers that the system should consider during diarization. Must be at least 1. Cannot be used together with --num-speakers. Must be less than or equal to --max-speakers if both are specified. (default: None)
  --max-speakers MAX_SPEAKERS
                        Defines the maximum number of speakers that the system should consider in diarization. Must be at least 1. Cannot be used together with --num-speakers. Must be greater than or equal to --min-speakers if both are specified. (default: None)
```
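
The constraints on `--num-speakers`, `--min-speakers` and `--max-speakers` described in the help text can be sketched as a small validation function. This is only an illustration of the stated rules: the function name `check_speaker_counts` and the use of `ValueError` are assumptions, and the CLI's own checks may be structured differently.

```python
from typing import Optional


def check_speaker_counts(
    num_speakers: Optional[int] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> None:
    """Raise ValueError if the combination breaks the rules in the help text."""
    # --num-speakers is mutually exclusive with the min/max pair.
    if num_speakers is not None and (min_speakers is not None or max_speakers is not None):
        raise ValueError("--num-speakers cannot be combined with --min-speakers or --max-speakers")

    # Every value that is given must be at least 1.
    for flag, value in (
        ("--num-speakers", num_speakers),
        ("--min-speakers", min_speakers),
        ("--max-speakers", max_speakers),
    ):
        if value is not None and value < 1:
            raise ValueError(f"{flag} must be at least 1")

    # When both bounds are given, they must be ordered.
    if min_speakers is not None and max_speakers is not None and min_speakers > max_speakers:
        raise ValueError("--min-speakers must be less than or equal to --max-speakers")


check_speaker_counts(min_speakers=2, max_speakers=4)  # valid: passes silently
check_speaker_counts(num_speakers=3)                  # valid: passes silently
```

A call such as `check_speaker_counts(num_speakers=2, max_speakers=4)` would raise, because the exact count and the bounds are mutually exclusive.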

## Frequently Asked Questions

**How to correctly install flash-attn to make it work with `insanely-fast-whisper`?**

Make sure to install it via `pipx runpip insanely-fast-whisper install flash-attn --no-build-isolation`. Massive kudos to @li-yifei for helping with this.

**How to solve an `AssertionError: Torch not compiled with CUDA enabled` error on Windows?**

The root cause of this problem is still unknown. However, you can resolve it by manually installing torch in the virtualenv, e.g. `python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121`. Thanks to @pto2k for all the debugging on this.

**How to avoid Out-Of-Memory (OOM) exceptions on Mac?**

The *mps* backend isn't as optimised as CUDA, hence is way more memory hungry. Typically you can run with `--batch-size 4` without any issues (should use roughly 12GB GPU VRAM). Don't forget to set `--device-id mps`.
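
As a rough illustration of this advice, the helper below maps a `--device-id` value to a device string and caps the batch size on *mps*. The function name, the `cuda:<n>` mapping and the automatic cap are assumptions made for this sketch; the CLI itself leaves `--batch-size` entirely to the user (default 24).

```python
# Batch size the FAQ reports as typically safe on Apple Silicon.
MPS_SAFE_BATCH_SIZE = 4


def device_and_batch_size(device_id: str, requested_batch_size: int = 24) -> tuple[str, int]:
    """Return a (device string, batch size) pair for a given --device-id value."""
    if device_id == "mps":
        # mps is more memory hungry, so never exceed the conservative cap.
        return "mps", min(requested_batch_size, MPS_SAFE_BATCH_SIZE)
    # Hypothetical mapping: any other value is treated as a CUDA device index.
    return f"cuda:{device_id}", requested_batch_size


print(device_and_batch_size("mps"))
print(device_and_batch_size("0"))
```

The returned pair could then be passed as `device=` and `batch_size=` when calling a Transformers pipeline directly.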

## How to use Whisper without a CLI?

<details>
<summary>All you need to run is the below snippet:</summary>

```
pip install --upgrade transformers optimum accelerate
```

```python
import torch
from transformers import pipeline
from transformers.utils import is_flash_attn_2_available

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3", # select checkpoint from https://huggingface.co/openai/whisper-large-v3#model-details
    torch_dtype=torch.float16,
    device="cuda:0", # or mps for Mac devices
    model_kwargs={"attn_implementation": "flash_attention_2"} if is_flash_attn_2_available() else {"attn_implementation": "sdpa"},
)

outputs = pipe(
    "<FILE_NAME>",
    chunk_length_s=30,
    batch_size=24,
    return_timestamps=True,
)

outputs
```
</details>
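
With `return_timestamps=True`, the pipeline result is a dictionary with a `"text"` field and a `"chunks"` list, where each chunk carries a `(start, end)` timestamp in seconds. The sketch below turns such chunks into SRT subtitle text, in the spirit of `convert_output.py`. The `sample_outputs` data is invented for illustration, and falling back to the start time when the end timestamp is `None` is a defensive assumption rather than something the project does.

```python
def to_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks) -> str:
    """Render pipeline-style chunks as numbered SRT entries."""
    entries = []
    for index, chunk in enumerate(chunks, 1):
        start, end = chunk["timestamp"]
        if end is None:
            # Assumption: an open-ended final chunk gets a zero-length cue.
            end = start
        text = chunk["text"].strip()
        entries.append(f"{index}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{text}\n")
    return "\n".join(entries)


# Made-up data in the same shape as the pipeline output.
sample_outputs = {
    "text": " Hello there. Goodbye.",
    "chunks": [
        {"timestamp": (0.0, 5.2), "text": " Hello there."},
        {"timestamp": (5.2, None), "text": " Goodbye."},
    ],
}

print(chunks_to_srt(sample_outputs["chunks"]))
```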

## Acknowledgements

1. [OpenAI Whisper](https://github.com/openai/whisper) team for open sourcing such a brilliant checkpoint.
2. Hugging Face Transformers team, specifically [Arthur](https://github.com/ArthurZucker), [Patrick](https://github.com/patrickvonplaten), [Sanchit](https://github.com/sanchit-gandhi) & [Yoach](https://github.com/ylacombe) (alphabetical order) for continuing to maintain Whisper in Transformers.
3. Hugging Face [Optimum](https://github.com/huggingface/optimum) team for making the BetterTransformer API so easily accessible.
4. [Patrick Arminio](https://github.com/patrick91) for helping me tremendously to put together this CLI.

## Community showcase

1. @ochen1 created a brilliant MVP for a CLI here: https://github.com/ochen1/insanely-fast-whisper-cli (Try it out now!)
2. @arihanv created an app (Shush) using NextJS (Frontend) & Modal (Backend): https://github.com/arihanv/Shush (Check it outtt!)
3. @kadirnar created a python package on top of the transformers with optimisations: https://github.com/kadirnar/whisper-plus (Go go go!!!)


================================================
FILE: convert_output.py
================================================
import argparse
import json
import os


class TxtFormatter:
    @classmethod
    def preamble(cls):
        return ""

    @classmethod
    def format_chunk(cls, chunk, index):
        text = chunk['text']
        return f"{text}\n"


class SrtFormatter:
    @classmethod
    def preamble(cls):
        return ""

    @classmethod
    def format_seconds(cls, seconds):
        whole_seconds = int(seconds)
        milliseconds = int((seconds - whole_seconds) * 1000)

        hours = whole_seconds // 3600
        minutes = (whole_seconds % 3600) // 60
        seconds = whole_seconds % 60

        return f"{hours:02d}:{minutes:02d}:{seconds:02d},{milliseconds:03d}"

    @classmethod
    def format_chunk(cls, chunk, index):
        text = chunk['text']
        start, end = chunk['timestamp'][0], chunk['timestamp'][1]
        start_format, end_format = cls.format_seconds(start), cls.format_seconds(end)
        return f"{index}\n{start_format} --> {end_format}\n{text}\n\n"


class VttFormatter:
    @classmethod
    def preamble(cls):
        return "WEBVTT\n\n"

    @classmethod
    def format_seconds(cls, seconds):
        whole_seconds = int(seconds)
        milliseconds = int((seconds - whole_seconds) * 1000)

        hours = whole_seconds // 3600
        minutes = (whole_seconds % 3600) // 60
        seconds = whole_seconds % 60

        return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{milliseconds:03d}"

    @classmethod
    def format_chunk(cls, chunk, index):
        text = chunk['text']
        start, end = chunk['timestamp'][0], chunk['timestamp'][1]
        start_format, end_format = cls.format_seconds(start), cls.format_seconds(end)
        return f"{index}\n{start_format} --> {end_format}\n{text}\n\n"


def convert(input_path, output_format, output_dir, verbose):
    with open(input_path, 'r', encoding='utf-8') as file:
        data = json.load(file)

    formatter_class = {
        'srt': SrtFormatter,
        'vtt': VttFormatter,
        'txt': TxtFormatter
    }.get(output_format)

    string = formatter_class.preamble()
    for index, chunk in enumerate(data['chunks'], 1):
        entry = formatter_class.format_chunk(chunk, index)

        if verbose:
            print(entry)

        string += entry

    with open(os.path.join(output_dir, f"output.{output_format}"), 'w', encoding='utf-8') as file:
        file.write(string)

def main():
    parser = argparse.ArgumentParser(description="Convert JSON to an output format.")
    parser.add_argument("input_file", help="Input JSON file path")
    # The default must be one of `choices`; "all" had no formatter and crashed in convert().
    parser.add_argument("-f", "--output_format", default="srt", help="Format of the output file (default: srt)", choices=["txt", "vtt", "srt"])
    parser.add_argument("-o", "--output_dir", default=".", help="Directory where the output file is saved")
    parser.add_argument("--verbose", action="store_true", help="Print each entry as it's added")

    args = parser.parse_args()
    convert(args.input_file, args.output_format, args.output_dir, args.verbose)

if __name__ == "__main__":
    # Example Usage:
    # python convert_output.py output.json -f vtt -o /tmp/my/output/dir
    main()


================================================
FILE: insanely_fast_whisper_colab.ipynb
================================================
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "authorship_tag": "ABX9TyNO3mkZ+HMQrvkMHRtFpKvj",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/Vaibhavs10/insanely-fast-whisper/blob/main/insanely_fast_whisper_colab.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# [Insanely Fast Whisper](https://github.com/Vaibhavs10/insanely-fast-whisper)\n",
        "\n",
        "By VB (https://twitter.com/reach_vb)\n",
        "\n",
        "P.S. Make sure you're on a GPU run-time 🤗"
      ],
      "metadata": {
        "id": "q0MBgZKbhdII"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install -q pipx && apt install python3.10-venv"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "VF-qp-FWJmyD",
        "outputId": "10712868-be6e-4b82-b8c2-95e43c591173"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "\u001b[?25l     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/57.8 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m57.8/57.8 kB\u001b[0m \u001b[31m2.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m41.7/41.7 kB\u001b[0m \u001b[31m5.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "Reading package lists... Done\n",
            "Building dependency tree... Done\n",
            "Reading state information... Done\n",
            "The following additional packages will be installed:\n",
            "  python3-pip-whl python3-setuptools-whl\n",
            "The following NEW packages will be installed:\n",
            "  python3-pip-whl python3-setuptools-whl python3.10-venv\n",
            "0 upgraded, 3 newly installed, 0 to remove and 9 not upgraded.\n",
            "Need to get 2,473 kB of archives.\n",
            "After this operation, 2,884 kB of additional disk space will be used.\n",
            "Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 python3-pip-whl all 22.0.2+dfsg-1ubuntu0.4 [1,680 kB]\n",
            "Get:2 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 python3-setuptools-whl all 59.6.0-1.2ubuntu0.22.04.1 [788 kB]\n",
            "Get:3 http://archive.ubuntu.com/ubuntu jammy-updates/universe amd64 python3.10-venv amd64 3.10.12-1~22.04.2 [5,724 B]\n",
            "Fetched 2,473 kB in 2s (1,635 kB/s)\n",
            "Selecting previously unselected package python3-pip-whl.\n",
            "(Reading database ... 120880 files and directories currently installed.)\n",
            "Preparing to unpack .../python3-pip-whl_22.0.2+dfsg-1ubuntu0.4_all.deb ...\n",
            "Unpacking python3-pip-whl (22.0.2+dfsg-1ubuntu0.4) ...\n",
            "Selecting previously unselected package python3-setuptools-whl.\n",
            "Preparing to unpack .../python3-setuptools-whl_59.6.0-1.2ubuntu0.22.04.1_all.deb ...\n",
            "Unpacking python3-setuptools-whl (59.6.0-1.2ubuntu0.22.04.1) ...\n",
            "Selecting previously unselected package python3.10-venv.\n",
            "Preparing to unpack .../python3.10-venv_3.10.12-1~22.04.2_amd64.deb ...\n",
            "Unpacking python3.10-venv (3.10.12-1~22.04.2) ...\n",
            "Setting up python3-setuptools-whl (59.6.0-1.2ubuntu0.22.04.1) ...\n",
            "Setting up python3-pip-whl (22.0.2+dfsg-1ubuntu0.4) ...\n",
            "Setting up python3.10-venv (3.10.12-1~22.04.2) ...\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!pipx run insanely-fast-whisper --file-name https://huggingface.co/datasets/reach-vb/random-audios/resolve/main/ted_60.wav"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "i_H9Dm89Jj0-",
        "outputId": "f737b9fd-d625-4ccd-d8a1-1895cdf1b22f"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "config.json: 100% 1.25k/1.25k [00:00<00:00, 6.33MB/s]\n",
            "model.safetensors: 100% 3.09G/3.09G [00:12<00:00, 242MB/s]\n",
            "generation_config.json: 100% 3.87k/3.87k [00:00<00:00, 17.3MB/s]\n",
            "tokenizer_config.json: 100% 283k/283k [00:00<00:00, 2.15MB/s]\n",
            "vocab.json: 100% 1.04M/1.04M [00:00<00:00, 5.28MB/s]\n",
            "tokenizer.json: 100% 2.48M/2.48M [00:00<00:00, 9.49MB/s]\n",
            "merges.txt: 100% 494k/494k [00:00<00:00, 3.74MB/s]\n",
            "normalizer.json: 100% 52.7k/52.7k [00:00<00:00, 97.3MB/s]\n",
            "added_tokens.json: 100% 34.6k/34.6k [00:00<00:00, 110MB/s]\n",
            "special_tokens_map.json: 100% 2.07k/2.07k [00:00<00:00, 8.95MB/s]\n",
            "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n",
            "preprocessor_config.json: 100% 340/340 [00:00<00:00, 1.98MB/s]\n",
            "The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.\n",
            "\u001b[2K🤗 \u001b[33mTranscribing...\u001b[0m \u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[93m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m\u001b[37m━\u001b[0m \u001b[33m0:00:09\u001b[0m\n",
            "\u001b[?25hVoila! Your file has been transcribed go check it out over here! output.json\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "!head output.json"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "NDFrydpsvu57",
        "outputId": "de3d9635-5cf1-46ca-d401-e6c78c5659dc"
      },
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "{\"text\": \" So in college, I was a government major, which means I had to write a lot of papers. Now, when a normal student writes a paper, they might spread the work out a little like this. So, you know, you get started maybe a little slowly, but you get enough done in the first week that with some heavier days later on, everything gets done and things stay civil. And I would want to do that, like that. That would be the plan. I would have it all ready to go, but then actually the paper would come along, and then I would kind of do this. And that would happen to every single paper. But then came my 90-page senior thesis, a paper you're supposed to spend a year on. I knew for a paper like that, my normal workflow was not an option. It was way too big a project. So I planned things out, and I decided it kind of had to go something like this. This is how the year would go. So I'd start off light,\", \"chunks\": [{\"timestamp\": [0.0, 4.48], \"text\": \" So in college, I was a government major,\"}, {\"timestamp\": [4.88, 6.62], \"text\": \" which means I had to write a lot of papers.\"}, {\"timestamp\": [7.42, 8.86], \"text\": \" Now, when a normal student writes a paper,\"}, {\"timestamp\": [8.94, 10.6], \"text\": \" they might spread the work out a little like this.\"}, {\"timestamp\": [11.74, 16.3], \"text\": \" So, you know, you get started maybe a little slowly,\"}, {\"timestamp\": [16.36, 17.86], \"text\": \" but you get enough done in the first week\"}, {\"timestamp\": [17.86, 19.76], \"text\": \" that with some heavier days later on,\"}, {\"timestamp\": [20.28, 21.98], \"text\": \" everything gets done and things stay civil.\"}, {\"timestamp\": [23.64, 25.8], \"text\": \" And I would want to do that, like that.\"}, {\"timestamp\": [26.12, 26.94], \"text\": \" That would be the plan.\"}, {\"timestamp\": [27.22, 29.84], \"text\": \" I would have it all ready to go,\"}, {\"timestamp\": [29.96, 32.42], \"text\": \" but then actually the paper would come along,\"}, {\"timestamp\": [32.46, 33.6], \"text\": \" and then I would kind of do this.\"}, {\"timestamp\": [36.48, 38.44], \"text\": \" And that would happen to every single paper.\"}, {\"timestamp\": [39.32, 43.04], \"text\": \" But then came my 90-page senior thesis,\"}, {\"timestamp\": [43.54, 46.0], \"text\": \" a paper you're supposed to spend a year on.\"}, {\"timestamp\": [46.0, 50.0], \"text\": \" I knew for a paper like that, my normal workflow was not an option.\"}, {\"timestamp\": [50.0, 52.0], \"text\": \" It was way too big a project.\"}, {\"timestamp\": [52.0, 56.0], \"text\": \" So I planned things out, and I decided it kind of had to go something like this.\"}, {\"timestamp\": [56.0, 58.0], \"text\": \" This is how the year would go.\"}, {\"timestamp\": [58.0, 60.0], \"text\": \" So I'd start off light,\"}]}"
          ]
        }
      ]
    }
  ]
}

================================================
FILE: notebooks/infer_faster_whisper_large_v2.ipynb
================================================
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "authorship_tag": "ABX9TyMwR8QMvgbtqW/68vTghyFN",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    },
    "accelerator": "GPU",
    "widgets": {
      "application/vnd.jupyter.widget-state+json": {
        "84f156ad5d024f64ab9685a2276f8804": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_4479058964d14c1883e18a2cb99b99d1",
              "IPY_MODEL_498a4497ce56479e9e72aae4046a3efd",
              "IPY_MODEL_5642c6b4259643018903e3a95dc501cb"
            ],
            "layout": "IPY_MODEL_effdcbad522c455a82ca964b2dbe94ca"
          }
        },
        "4479058964d14c1883e18a2cb99b99d1": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_74dd75311ff8413892ec3f704d8fd1e7",
            "placeholder": "​",
            "style": "IPY_MODEL_79801be444ce431abbf2212afc57b414",
            "value": "Downloading (…)37e8b/tokenizer.json: 100%"
          }
        },
        "498a4497ce56479e9e72aae4046a3efd": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_8d4d48ef57274a58804eeafaef612cb0",
            "max": 2203239,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_73b7f583384c493faa60a7950372f18c",
            "value": 2203239
          }
        },
        "5642c6b4259643018903e3a95dc501cb": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_34f1d197017d4beabde8f87d904bc7d9",
            "placeholder": "​",
            "style": "IPY_MODEL_9d7125d014684f0e84fd5c43be7ed921",
            "value": " 2.20M/2.20M [00:00&lt;00:00, 13.1MB/s]"
          }
        },
        "effdcbad522c455a82ca964b2dbe94ca": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "74dd75311ff8413892ec3f704d8fd1e7": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "79801be444ce431abbf2212afc57b414": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "8d4d48ef57274a58804eeafaef612cb0": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "73b7f583384c493faa60a7950372f18c": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "34f1d197017d4beabde8f87d904bc7d9": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "9d7125d014684f0e84fd5c43be7ed921": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "7c06d510db3f4703b6b56504d639a8a9": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_826a44552bd840338052696da1d7bde6",
              "IPY_MODEL_63353fedcd7c43c7a12f17558dd39702",
              "IPY_MODEL_951948c194a447acac2b297d081615af"
            ],
            "layout": "IPY_MODEL_a61a91793617450cb0b27a03a621c1b3"
          }
        },
        "826a44552bd840338052696da1d7bde6": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_5c077c896c0e446cbf3e12c4013b5133",
            "placeholder": "​",
            "style": "IPY_MODEL_46316eeb6a6b48e9ab23766f020193dc",
            "value": "Downloading (…)08837e8b/config.json: 100%"
          }
        },
        "63353fedcd7c43c7a12f17558dd39702": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_8fe1c415a4f04b368ed14beb0e9ee62c",
            "max": 2796,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_623c5f0666474cf4a1e67cc4e5e74d02",
            "value": 2796
          }
        },
        "951948c194a447acac2b297d081615af": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_68387714e5fe4b028b86341f13209213",
            "placeholder": "​",
            "style": "IPY_MODEL_0918c3f3e2eb44ca84ef713fdf5b9156",
            "value": " 2.80k/2.80k [00:00&lt;00:00, 99.4kB/s]"
          }
        },
        "a61a91793617450cb0b27a03a621c1b3": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "5c077c896c0e446cbf3e12c4013b5133": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "46316eeb6a6b48e9ab23766f020193dc": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "8fe1c415a4f04b368ed14beb0e9ee62c": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "623c5f0666474cf4a1e67cc4e5e74d02": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "68387714e5fe4b028b86341f13209213": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "0918c3f3e2eb44ca84ef713fdf5b9156": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "c17a20634d4e4cadaabeb6f81b9ac3b4": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_1406f98df5c5414cb46b5276fc15a76b",
              "IPY_MODEL_4164d8ae7d1848a5a8dae29afe143c2a",
              "IPY_MODEL_f8171e928c7943a88f2d7bdec19dd2f6"
            ],
            "layout": "IPY_MODEL_6c443097e41b46909a452c6cbc09dd33"
          }
        },
        "1406f98df5c5414cb46b5276fc15a76b": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_4759f3c3a6e84fe7bb179d62f7ea3262",
            "placeholder": "​",
            "style": "IPY_MODEL_86e4c5e0d195413ab2bd72b0212658e9",
            "value": "Downloading model.bin: 100%"
          }
        },
        "4164d8ae7d1848a5a8dae29afe143c2a": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_e22f7ae63bc544419e1783b3d47230ed",
            "max": 3086912962,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_7bb61fe9f9094f0fbdcaa976fadb64f8",
            "value": 3086912962
          }
        },
        "f8171e928c7943a88f2d7bdec19dd2f6": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_401b13aacde741a3a743b54dfa00b9aa",
            "placeholder": "​",
            "style": "IPY_MODEL_b5386255571840a19ed2b25b7f989143",
            "value": " 3.09G/3.09G [00:15&lt;00:00, 210MB/s]"
          }
        },
        "6c443097e41b46909a452c6cbc09dd33": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "4759f3c3a6e84fe7bb179d62f7ea3262": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "86e4c5e0d195413ab2bd72b0212658e9": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "e22f7ae63bc544419e1783b3d47230ed": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "7bb61fe9f9094f0fbdcaa976fadb64f8": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "401b13aacde741a3a743b54dfa00b9aa": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b5386255571840a19ed2b25b7f989143": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "996532a279174161a3f7e49553759337": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_24486c87f749476b84542c5b0aa349d3",
              "IPY_MODEL_d98295d78d4d4630be490aa83a154add",
              "IPY_MODEL_ddfa0921d3d443c59b2be5b12cf0a349"
            ],
            "layout": "IPY_MODEL_b4b02e9deddf410fa73a0566efd1b9fb"
          }
        },
        "24486c87f749476b84542c5b0aa349d3": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_4eab7130f7b3473991916d94d1fcb24a",
            "placeholder": "​",
            "style": "IPY_MODEL_b5204de67bf041749bfd164a4d3bc5c2",
            "value": "Downloading (…)37e8b/vocabulary.txt: 100%"
          }
        },
        "d98295d78d4d4630be490aa83a154add": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_fd81a6e45b5e4ecdb496c153926af9b6",
            "max": 459861,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_63fdd464bbc54340a43a195746308898",
            "value": 459861
          }
        },
        "ddfa0921d3d443c59b2be5b12cf0a349": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ef179a7f8e0f4b7a93d764906e9f15d6",
            "placeholder": "​",
            "style": "IPY_MODEL_eb9be97ce66a4caa97fdedd8bf117ba5",
            "value": " 460k/460k [00:00&lt;00:00, 7.93MB/s]"
          }
        },
        "b4b02e9deddf410fa73a0566efd1b9fb": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "4eab7130f7b3473991916d94d1fcb24a": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b5204de67bf041749bfd164a4d3bc5c2": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "fd81a6e45b5e4ecdb496c153926af9b6": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "63fdd464bbc54340a43a195746308898": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "ef179a7f8e0f4b7a93d764906e9f15d6": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "eb9be97ce66a4caa97fdedd8bf117ba5": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "7e2a7c9b8adf4a2eb997ac3b9268f2d8": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_531d7fe81a604483a3ba71988730fb1e",
              "IPY_MODEL_ac4bc828d21d463b9c4df199d29746b3",
              "IPY_MODEL_3ca3a819333740ebac91a7cdd08b819d"
            ],
            "layout": "IPY_MODEL_c4ad6557732e4b31b239fd519b8b49b5"
          }
        },
        "531d7fe81a604483a3ba71988730fb1e": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ca6495cd6de342099f5723d89e606f58",
            "placeholder": "​",
            "style": "IPY_MODEL_f32e5668cd424ec08a9de66781f79dec",
            "value": "Downloading (…)37e8b/tokenizer.json: 100%"
          }
        },
        "ac4bc828d21d463b9c4df199d29746b3": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ee6e5564100841e89ea133fcb97cdd20",
            "max": 2203239,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_6ae6aec7b0354a568006c0aa1d008484",
            "value": 2203239
          }
        },
        "3ca3a819333740ebac91a7cdd08b819d": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ea06357c012d456a87ff5a08c547ba8b",
            "placeholder": "​",
            "style": "IPY_MODEL_87f57e99649847978e40606637d035cf",
            "value": " 2.20M/2.20M [00:00&lt;00:00, 8.33MB/s]"
          }
        },
        "c4ad6557732e4b31b239fd519b8b49b5": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "ca6495cd6de342099f5723d89e606f58": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f32e5668cd424ec08a9de66781f79dec": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "ee6e5564100841e89ea133fcb97cdd20": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "6ae6aec7b0354a568006c0aa1d008484": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "ea06357c012d456a87ff5a08c547ba8b": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "87f57e99649847978e40606637d035cf": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "620b2cbc5e9d4df0a8d6a5c6a88a7ecf": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_d5447eaaf17d465eaf8df8cdfee008b7",
              "IPY_MODEL_d6827ae278234adfb9600ee012604c16",
              "IPY_MODEL_68d8bb9095b044a08c840e0d81595c78"
            ],
            "layout": "IPY_MODEL_c906cefedc7449eab0d701e76eb1ea23"
          }
        },
        "d5447eaaf17d465eaf8df8cdfee008b7": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_76d655e30ca34c1c8a248bd25121d3ec",
            "placeholder": "​",
            "style": "IPY_MODEL_59a52b461eed43d8908b2ba59c317c2c",
            "value": "Downloading (…)37e8b/vocabulary.txt: 100%"
          }
        },
        "d6827ae278234adfb9600ee012604c16": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_638a46c57a3e4a1b834a3c450fa604e3",
            "max": 459861,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_8464d42f755e4d94b5d06d7d39545db8",
            "value": 459861
          }
        },
        "68d8bb9095b044a08c840e0d81595c78": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_a64a7818c691437aa33ec2d64518faa1",
            "placeholder": "​",
            "style": "IPY_MODEL_f384e273a1cf4c0ea450a5d85c662580",
            "value": " 460k/460k [00:00&lt;00:00, 2.98MB/s]"
          }
        },
        "c906cefedc7449eab0d701e76eb1ea23": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "76d655e30ca34c1c8a248bd25121d3ec": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "59a52b461eed43d8908b2ba59c317c2c": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "638a46c57a3e4a1b834a3c450fa604e3": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "8464d42f755e4d94b5d06d7d39545db8": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "a64a7818c691437aa33ec2d64518faa1": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f384e273a1cf4c0ea450a5d85c662580": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "55c1589f2d0b413387bb840a55855967": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_04b1db680f1640dfabc59d9e1c92b76e",
              "IPY_MODEL_f922c4e00f4e410dac49a975ba2490a2",
              "IPY_MODEL_cf25d08de833462a945163ed035a42b9"
            ],
            "layout": "IPY_MODEL_8e0d886aa8394e33b7c4236e03e45b22"
          }
        },
        "04b1db680f1640dfabc59d9e1c92b76e": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_31f4f443242b4dff949861e02d54ae46",
            "placeholder": "​",
            "style": "IPY_MODEL_34e5f3677cb34e728dc52fe8c0d72b5e",
            "value": "Downloading (…)08837e8b/config.json: 100%"
          }
        },
        "f922c4e00f4e410dac49a975ba2490a2": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_e98fa2593b0c4931bb2cde67f8d7583e",
            "max": 2796,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_b05a7d1938284e59ab369c1c81fa3756",
            "value": 2796
          }
        },
        "cf25d08de833462a945163ed035a42b9": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ffa4c2b81faf4bb482d37de6f4c11c2d",
            "placeholder": "​",
            "style": "IPY_MODEL_e2ed3af050be45bd8d8b9d3d99983518",
            "value": " 2.80k/2.80k [00:00&lt;00:00, 63.8kB/s]"
          }
        },
        "8e0d886aa8394e33b7c4236e03e45b22": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "31f4f443242b4dff949861e02d54ae46": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "34e5f3677cb34e728dc52fe8c0d72b5e": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "e98fa2593b0c4931bb2cde67f8d7583e": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b05a7d1938284e59ab369c1c81fa3756": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "ffa4c2b81faf4bb482d37de6f4c11c2d": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "e2ed3af050be45bd8d8b9d3d99983518": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "7bb665f2c8254f8aab351725f49851ac": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HBoxModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_ba5d7fa63a5347d58b90e4520312a974",
              "IPY_MODEL_7fb71748a199431c8392281757a3649a",
              "IPY_MODEL_4d6db61f0f1b44b5b473eef645b28f05"
            ],
            "layout": "IPY_MODEL_db3ef645bd844bc59909982b290cde40"
          }
        },
        "ba5d7fa63a5347d58b90e4520312a974": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_bf57e6c394464f95ab42092500905ac5",
            "placeholder": "​",
            "style": "IPY_MODEL_191395adaa8d4b4396676fd16089e824",
            "value": "Downloading model.bin: 100%"
          }
        },
        "7fb71748a199431c8392281757a3649a": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "FloatProgressModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ba825d7bd79544b3b60d62e095c14b4e",
            "max": 3086912962,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_fa60c49027c1413f98df3c256324f1ca",
            "value": 3086912962
          }
        },
        "4d6db61f0f1b44b5b473eef645b28f05": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "HTMLModel",
          "model_module_version": "1.5.0",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_e41f2581f3744dee8d6c946c70169131",
            "placeholder": "​",
            "style": "IPY_MODEL_f3d4ea0400aa41629aa57590fefc9128",
            "value": " 3.09G/3.09G [00:21&lt;00:00, 176MB/s]"
          }
        },
        "db3ef645bd844bc59909982b290cde40": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "bf57e6c394464f95ab42092500905ac5": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "191395adaa8d4b4396676fd16089e824": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "ba825d7bd79544b3b60d62e095c14b4e": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "fa60c49027c1413f98df3c256324f1ca": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "ProgressStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "e41f2581f3744dee8d6c946c70169131": {
          "model_module": "@jupyter-widgets/base",
          "model_name": "LayoutModel",
          "model_module_version": "1.2.0",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f3d4ea0400aa41629aa57590fefc9128": {
          "model_module": "@jupyter-widgets/controls",
          "model_name": "DescriptionStyleModel",
          "model_module_version": "1.5.0",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        }
      }
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/Vaibhavs10/insanely-fast-whisper/blob/main/infer_faster_whisper_large_v2.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "A quick benchmark of the faster-whisper implementation of Whisper for Automatic Speech Recognition! 🤗\n",
        "\n",
        "from yours truly - [Vaibhav (VB) Srivastav](https://twitter.com/reach_vb) 🤙"
      ],
      "metadata": {
        "id": "uPcU_MJdVO9I"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!pip install --upgrade -q faster-whisper ipython-autotime"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "C-3jAUK4or1d",
        "outputId": "a4b4679f-6085-44ab-ca7c-b353375356ed"
      },
      "execution_count": 1,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "%load_ext autotime"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "yqb4IGAnqThp",
        "outputId": "9a066cd0-9cad-4a5b-9fc7-6daa448e22db"
      },
      "execution_count": 2,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "time: 266 µs (started: 2023-10-18 18:58:18 +00:00)\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Download a test file!\n",
        "\n",
        "The only fitting test audio for our benchmark would be [Lex interviewing Sam Altman](https://www.youtube.com/watch?v=L_Guz73e6fw&t=8s). We'll use the audio from that podcast episode. I uploaded it to a wee dataset on the Hub [here](https://huggingface.co/datasets/reach-vb/random-audios/blob/main/sam_altman_lex_podcast_367.flac)."
      ],
      "metadata": {
        "id": "BY1FFJgIVmMs"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "!wget https://huggingface.co/datasets/reach-vb/random-audios/resolve/main/sam_altman_lex_podcast_367.flac"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "emKT2iAGqXu5",
        "outputId": "bc24b2ab-a1d0-4ed3-d5d7-6effb7acb386"
      },
      "execution_count": 3,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "--2023-10-18 18:59:24--  https://huggingface.co/datasets/reach-vb/random-audios/resolve/main/sam_altman_lex_podcast_367.flac\n",
            "Resolving huggingface.co (huggingface.co)... 18.172.134.4, 18.172.134.88, 18.172.134.124, ...\n",
            "Connecting to huggingface.co (huggingface.co)|18.172.134.4|:443... connected.\n",
            "HTTP request sent, awaiting response... 302 Found\n",
            "Location: https://cdn-lfs.huggingface.co/repos/96/e4/96e4f69cd112b019dd764318570e47e5fe96de53d8c32a99d745e72d9086e355/b2fd593ce144a8d904cf49a4ed77ed06eb50644a053dddd280c81a3ef94fb60e?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27sam_altman_lex_podcast_367.flac%3B+filename%3D%22sam_altman_lex_podcast_367.flac%22%3B&response-content-type=audio%2Fx-flac&Expires=1697914764&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTY5NzkxNDc2NH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy5odWdnaW5nZmFjZS5jby9yZXBvcy85Ni9lNC85NmU0ZjY5Y2QxMTJiMDE5ZGQ3NjQzMTg1NzBlNDdlNWZlOTZkZTUzZDhjMzJhOTlkNzQ1ZTcyZDkwODZlMzU1L2IyZmQ1OTNjZTE0NGE4ZDkwNGNmNDlhNGVkNzdlZDA2ZWI1MDY0NGEwNTNkZGRkMjgwYzgxYTNlZjk0ZmI2MGU%7EcmVzcG9uc2UtY29udGVudC1kaXNwb3NpdGlvbj0qJnJlc3BvbnNlLWNvbnRlbnQtdHlwZT0qIn1dfQ__&Signature=M-GgO4ao7h7f1BMnjbPcowvK87Bkmq9Tda6g7dNlkdbmF1Ad0hCbayLkq%7EZrQpR8OAY6%7EztxZOpbZD9rlkf1lp9gpF1GZdC5vyX26edfxYtImHOL0EJP4%7EucmlArgT8iROj2qkYxjtEZFh3yI0OErc0ibMCTLmCxVfJTWqHxNeJTLvsVyVN1I2DDpl3DZw1Q3QyTj8LkEDjKMvYvObY4%7ETNlkfxo0kcHN5SEAjfrq6U-nwn-FitmnMudbMjssnNFvqzSuNNWqfD0ClsUtbhr6Wfd3C2kX-RFesU08Kix4NOURPxBOF%7EjUkHed1ch3OOgVuUl3yVqk0zzGtUkzVcwrA__&Key-Pair-Id=KVTP0A1DKRTAX [following]\n",
            "--2023-10-18 18:59:24--  https://cdn-lfs.huggingface.co/repos/96/e4/96e4f69cd112b019dd764318570e47e5fe96de53d8c32a99d745e72d9086e355/b2fd593ce144a8d904cf49a4ed77ed06eb50644a053dddd280c81a3ef94fb60e?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27sam_altman_lex_podcast_367.flac%3B+filename%3D%22sam_altman_lex_podcast_367.flac%22%3B&response-content-type=audio%2Fx-flac&Expires=1697914764&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTY5NzkxNDc2NH19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy5odWdnaW5nZmFjZS5jby9yZXBvcy85Ni9lNC85NmU0ZjY5Y2QxMTJiMDE5ZGQ3NjQzMTg1NzBlNDdlNWZlOTZkZTUzZDhjMzJhOTlkNzQ1ZTcyZDkwODZlMzU1L2IyZmQ1OTNjZTE0NGE4ZDkwNGNmNDlhNGVkNzdlZDA2ZWI1MDY0NGEwNTNkZGRkMjgwYzgxYTNlZjk0ZmI2MGU%7EcmVzcG9uc2UtY29udGVudC1kaXNwb3NpdGlvbj0qJnJlc3BvbnNlLWNvbnRlbnQtdHlwZT0qIn1dfQ__&Signature=M-GgO4ao7h7f1BMnjbPcowvK87Bkmq9Tda6g7dNlkdbmF1Ad0hCbayLkq%7EZrQpR8OAY6%7EztxZOpbZD9rlkf1lp9gpF1GZdC5vyX26edfxYtImHOL0EJP4%7EucmlArgT8iROj2qkYxjtEZFh3yI0OErc0ibMCTLmCxVfJTWqHxNeJTLvsVyVN1I2DDpl3DZw1Q3QyTj8LkEDjKMvYvObY4%7ETNlkfxo0kcHN5SEAjfrq6U-nwn-FitmnMudbMjssnNFvqzSuNNWqfD0ClsUtbhr6Wfd3C2kX-RFesU08Kix4NOURPxBOF%7EjUkHed1ch3OOgVuUl3yVqk0zzGtUkzVcwrA__&Key-Pair-Id=KVTP0A1DKRTAX\n",
            "Resolving cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)... 3.162.174.43, 3.162.174.92, 3.162.174.52, ...\n",
            "Connecting to cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)|3.162.174.43|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 351705020 (335M) [audio/x-flac]\n",
            "Saving to: ‘sam_altman_lex_podcast_367.flac’\n",
            "\n",
            "sam_altman_lex_podc 100%[===================>] 335.41M   259MB/s    in 1.3s    \n",
            "\n",
            "2023-10-18 18:59:26 (259 MB/s) - ‘sam_altman_lex_podcast_367.flac’ saved [351705020/351705020]\n",
            "\n",
            "time: 1.61 s (started: 2023-10-18 18:59:24 +00:00)\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Half-Precision"
      ],
      "metadata": {
        "id": "DNBvUbic_Gwi"
      }
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206,
          "referenced_widgets": [
            "84f156ad5d024f64ab9685a2276f8804",
            "4479058964d14c1883e18a2cb99b99d1",
            "498a4497ce56479e9e72aae4046a3efd",
            "5642c6b4259643018903e3a95dc501cb",
            "effdcbad522c455a82ca964b2dbe94ca",
            "74dd75311ff8413892ec3f704d8fd1e7",
            "79801be444ce431abbf2212afc57b414",
            "8d4d48ef57274a58804eeafaef612cb0",
            "73b7f583384c493faa60a7950372f18c",
            "34f1d197017d4beabde8f87d904bc7d9",
            "9d7125d014684f0e84fd5c43be7ed921",
            "7c06d510db3f4703b6b56504d639a8a9",
            "826a44552bd840338052696da1d7bde6",
            "63353fedcd7c43c7a12f17558dd39702",
            "951948c194a447acac2b297d081615af",
            "a61a91793617450cb0b27a03a621c1b3",
            "5c077c896c0e446cbf3e12c4013b5133",
            "46316eeb6a6b48e9ab23766f020193dc",
            "8fe1c415a4f04b368ed14beb0e9ee62c",
            "623c5f0666474cf4a1e67cc4e5e74d02",
            "68387714e5fe4b028b86341f13209213",
            "0918c3f3e2eb44ca84ef713fdf5b9156",
            "c17a20634d4e4cadaabeb6f81b9ac3b4",
            "1406f98df5c5414cb46b5276fc15a76b",
            "4164d8ae7d1848a5a8dae29afe143c2a",
            "f8171e928c7943a88f2d7bdec19dd2f6",
            "6c443097e41b46909a452c6cbc09dd33",
            "4759f3c3a6e84fe7bb179d62f7ea3262",
            "86e4c5e0d195413ab2bd72b0212658e9",
            "e22f7ae63bc544419e1783b3d47230ed",
            "7bb61fe9f9094f0fbdcaa976fadb64f8",
            "401b13aacde741a3a743b54dfa00b9aa",
            "b5386255571840a19ed2b25b7f989143",
            "996532a279174161a3f7e49553759337",
            "24486c87f749476b84542c5b0aa349d3",
            "d98295d78d4d4630be490aa83a154add",
            "ddfa0921d3d443c59b2be5b12cf0a349",
            "b4b02e9deddf410fa73a0566efd1b9fb",
            "4eab7130f7b3473991916d94d1fcb24a",
            "b5204de67bf041749bfd164a4d3bc5c2",
            "fd81a6e45b5e4ecdb496c153926af9b6",
            "63fdd464bbc54340a43a195746308898",
            "ef179a7f8e0f4b7a93d764906e9f15d6",
            "eb9be97ce66a4caa97fdedd8bf117ba5"
          ]
        },
        "id": "PzoWzQwboMpk",
        "outputId": "8ebdc57a-c7f1-4f0d-eb10-9cbc90238763"
      },
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "Downloading (…)37e8b/tokenizer.json:   0%|          | 0.00/2.20M [00:00<?, ?B/s]"
            ],
            "application/vnd.jupyter.widget-view+json": {
              "version_major": 2,
              "version_minor": 0,
              "model_id": "84f156ad5d024f64ab9685a2276f8804"
            }
          },
          "metadata": {}
        },
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "Downloading (…)08837e8b/config.json:   0%|          | 0.00/2.80k [00:00<?, ?B/s]"
            ],
            "application/vnd.jupyter.widget-view+json": {
              "version_major": 2,
              "version_minor": 0,
              "model_id": "7c06d510db3f4703b6b56504d639a8a9"
            }
          },
          "metadata": {}
        },
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "Downloading model.bin:   0%|          | 0.00/3.09G [00:00<?, ?B/s]"
            ],
            "application/vnd.jupyter.widget-view+json": {
              "version_major": 2,
              "version_minor": 0,
              "model_id": "c17a20634d4e4cadaabeb6f81b9ac3b4"
            }
          },
          "metadata": {}
        },
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "Downloading (…)37e8b/vocabulary.txt:   0%|          | 0.00/460k [00:00<?, ?B/s]"
            ],
            "application/vnd.jupyter.widget-view+json": {
              "version_major": 2,
              "version_minor": 0,
              "model_id": "996532a279174161a3f7e49553759337"
            }
          },
          "metadata": {}
        },
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "time: 24.1 s (started: 2023-10-18 18:45:31 +00:00)\n"
          ]
        }
      ],
      "source": [
        "from faster_whisper import WhisperModel\n",
        "\n",
        "model_size = \"large-v2\"\n",
        "\n",
        "# Run on GPU with FP16\n",
        "model = WhisperModel(model_size, device=\"cuda\", compute_type=\"float16\")"
      ]
    },
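    {
      "cell_type": "code",
      "source": [
        "# transcribe() returns a lazy generator of segments plus an info object;\n",
        "# decoding only runs once the segments are iterated over.\n",
        "segments, info = model.transcribe(\"sam_altman_lex_podcast_367.flac\", beam_size=1)\n",
        "\n",
        "print(f\"Detected language '{info.language}' with probability {info.language_probability:f}\")"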
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "6xwaBmkMrg5v",
        "outputId": "78845f25-1493-4a68-b223-a52fe2a88af7"
      },
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Detected language 'en' with probability 1.000000\n",
            "time: 41.2 s (started: 2023-10-18 18:46:09 +00:00)\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Print each segment with its start and end timestamps (in seconds).\n",
        "for segment in segments:\n",
        "    print(f\"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}\")"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "6fb6cf60qc1Y",
        "outputId": "ed558df0-542d-4e3a-bc48-558517e862ef"
      },
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "[0.00s -> 3.92s]  We have been a misunderstood and badly mocked org for a long time.\n",
            "[3.92s -> 11.44s]  Like when we started, we like announced the org at the end of 2015 and said we were going to work\n",
            "[11.44s -> 18.16s]  on AGI, like people thought we were batshit insane. You know, like I remember at the time\n",
            "[18.16s -> 27.04s]  a eminent AI scientist at a large industrial AI lab was like DMing individual reporters\n",
            "[27.04s -> 31.60s]  being like, you know, these people aren't very good and it's ridiculous to talk about AGI and I\n",
            "[31.60s -> 36.08s]  can't believe you're giving them time of day and it's like that was the level of like pettiness\n",
            "[36.08s -> 39.52s]  and rancor in the field at a new group of people saying we're going to try to build AGI.\n",
            "[40.24s -> 44.72s]  So OpenAI and DeepMind was a small collection of folks who were brave enough to talk\n",
            "[45.76s -> 50.16s]  about AGI in the face of mockery.\n",
            "[50.88s -> 52.16s]  We don't get mocked as much now.\n",
            "[53.04s -> 54.40s]  Don't get mocked as much now.\n",
            "[54.64s -> 62.16s]  The following is a conversation with Sam Altman, CEO of OpenAI, the company behind GPT-4,\n",
            "[62.16s -> 68.88s]  JAD-GPT, DALI, Codex and many other AI technologies which both individually and together\n",
            "[68.88s -> 73.52s]  constitute some of the greatest breakthroughs in the history of artificial intelligence,\n",
            "[73.52s -> 80.08s]  computing and humanity in general. Please allow me to say a few words about the possibilities\n",
            "[80.16s -> 84.64s]  and the dangers of AI in this current moment in the history of human civilization.\n",
            "[85.36s -> 87.28s]  I believe it is a critical moment.\n",
            "[87.28s -> 91.12s]  We stand on the precipice of fundamental societal transformation\n",
            "[91.12s -> 96.40s]  where soon, nobody knows when, but many including me believe it's within our lifetime.\n",
            "[97.12s -> 103.28s]  The collective intelligence of the human species begins to pale in comparison by many orders of\n",
            "[103.60s -> 110.80s]  comparison by many orders of magnitude to the general superintelligence in the AI systems we\n",
            "[110.80s -> 117.92s]  build and deploy at scale. This is both exciting and terrifying.\n",
            "[118.64s -> 124.88s]  It is exciting because of the innumerable applications we know and don't yet know\n",
            "[124.88s -> 131.20s]  that will empower humans to create, to flourish, to escape the widespread poverty and suffering\n",
            "[131.20s -> 138.40s]  that exists in the world today and to succeed in that old all too human pursuit of happiness.\n",
            "[139.60s -> 146.00s]  It is terrifying because of the power that superintelligent AGI wields to destroy human\n",
            "[146.00s -> 152.56s]  civilization intentionally or unintentionally. The power to suffocate the human spirit\n",
            "[153.12s -> 160.48s]  in the totalitarian way of George Orwell's 1984 or the pleasure-fueled mass hysteria\n",
            "[160.48s -> 166.16s]  of Brave New World where as Huxley saw it, people come to love their oppression,\n",
            "[166.80s -> 175.20s]  to adore the technologies that undo their capacities to think. That is why these conversations\n",
            "[175.20s -> 181.20s]  with the leaders, engineers, and philosophers, both optimists and cynics, is important now.\n",
            "[182.80s -> 188.24s]  These are not merely technical conversations about AI. These are conversations about power,\n",
            "[188.24s -> 193.20s]  about companies, institutions, and political systems that deploy, check, and balance this power.\n",
            "[193.92s -> 200.48s]  About distributed economic systems that incentivize the safety and human alignment of this power.\n",
            "[201.20s -> 207.52s]  About the psychology of the engineers and leaders that deploy AGI and about the history of human\n",
            "[207.52s -> 217.28s]  nature, our capacity for good and evil at scale. I'm deeply honored to have gotten to know and\n",
            "[217.60s -> 223.84s]  spoken with on and off the mic with many folks who now work at OpenAI, including Sam Altman,\n",
            "[223.84s -> 232.64s]  Greg Brockman, Ilya Sutskever, Wojciech Zaremba, Andrei Karpathy, Jakub Pachacki, and many others.\n",
            "[233.36s -> 239.52s]  It means the world that Sam has been totally open with me, willing to have multiple conversations,\n",
            "[239.52s -> 245.44s]  including challenging ones, on and off the mic. I will continue to have these conversations\n",
            "[245.44s -> 251.84s]  to both celebrate the incredible accomplishments of the AI community and to steel man the critical\n",
            "[251.84s -> 258.56s]  perspective on major decisions various companies and leaders make, always with the goal of trying\n",
            "[258.56s -> 265.84s]  to help in my small way. If I fail, I will work hard to improve. I love you all.\n",
            "[267.12s -> 272.16s]  This is the Lex Friedman Podcast. To support it, please check out our sponsors in the description.\n",
            "[272.24s -> 275.52s]  And now, dear friends, here's Sam Altman.\n",
            "[276.80s -> 282.48s]  High level, what is GPT for? How does it work and what to use most amazing about it?\n",
            "[283.12s -> 290.32s]  It's a system that we'll look back at and say was a very early AI. And it's slow, it's buggy,\n",
            "[290.96s -> 295.44s]  it doesn't do a lot of things very well, but neither did the very earliest computers.\n",
            "[296.56s -> 301.92s]  And they still pointed a path to something that was going to be really important in our lives,\n",
            "[302.24s -> 304.16s]  it took a few decades to evolve.\n",
            "[304.16s -> 309.44s]  Do you think this is a pivotal moment? Like, out of all the versions of GPT, 50 years from now,\n",
            "[310.40s -> 315.92s]  when they look back on an early system that was really kind of a leap, you know, in a Wikipedia\n",
            "[315.92s -> 320.48s]  page about the history of artificial intelligence, which of the GPTs would they put?\n",
            "[320.48s -> 325.68s]  That is a good question. I sort of think of progress as this continual exponential. It's not\n",
            "[325.68s -> 331.52s]  like we could say here was the moment where AI went from not happening to happening. And I'd\n",
            "[331.68s -> 336.16s]  a very hard time pinpointing a single thing. I think it's this very continual curve.\n",
            "[337.28s -> 342.88s]  Will the history books write about GPT 1 or 2 or 3 or 4 or 7? That's for them to decide. I don't\n",
            "[342.88s -> 349.60s]  really know. I think if I had to pick some moment from what we've seen so far, I'd sort of pick\n",
            "[349.60s -> 354.96s]  Chad GPT. It wasn't the underlying model that mattered, it was the usability of it, both the\n",
            "[354.96s -> 362.32s]  RLHF and the interface to it. What is Chad GPT? What is RLHF? Reinforcement Learning with Human\n",
            "[362.32s -> 369.60s]  Feedback. What was that little magic ingredient to the dish that made it so much more delicious?\n",
            "[370.48s -> 377.12s]  So we trained these models on a lot of text data. And in that process, they learned the underlying\n",
            "[378.08s -> 384.00s]  something about the underlying representations of what's in here or in there. And they can do\n",
            "[384.96s -> 389.76s]  amazing things. But when you first play with that base model that we call it after you finish\n",
            "[389.76s -> 395.12s]  training, it can do very well on evals, it can pass tests, it can do a lot of, you know, there's\n",
            "[395.12s -> 402.64s]  knowledge in there. But it's not very useful. Or at least it's not easy to use, let's say. And RLHF\n",
            "[402.64s -> 409.04s]  is how we take some human feedback. The simplest version of this is show two outputs, ask which\n",
            "[409.04s -> 414.48s]  one is better than the other, which one the human raters prefer, and then feed that back into the\n",
            "[414.48s -> 420.32s]  model with reinforcement learning. And that process works remarkably well with, in my opinion,\n",
            "[420.32s -> 427.44s]  remarkably little data to make the model more useful. So RLHF is how we align the model to\n",
            "[427.44s -> 434.08s]  what humans want it to do. So there's a giant language model that's trained on a giant data set\n",
            "[434.08s -> 438.08s]  to create this kind of background wisdom knowledge that's contained within the internet.\n",
            "[439.04s -> 445.68s]  And then somehow adding a little bit of human guidance on top of it through this process\n",
            "[446.72s -> 449.20s]  makes it seem so much more awesome.\n",
            "[450.48s -> 454.24s]  Maybe just because it's much easier to use. It's much easier to get what you want. You get it\n",
            "[454.24s -> 458.88s]  right more often the first time. And ease of use matters a lot, even if the base capability was\n",
            "[458.88s -> 460.08s]  there before.\n",
            "[460.08s -> 467.68s]  And like a feeling like it understood the question you're asking. Or like it feels like you're\n",
            "[467.84s -> 468.96s]  kind of on the same page.\n",
            "[468.96s -> 469.84s]  It's trying to help you.\n",
            "[470.56s -> 471.84s]  It's the feeling of alignment.\n",
            "[471.84s -> 472.08s]  Yes.\n",
            "[472.08s -> 477.04s]  I mean, that could be a more technical term for it. And you're saying that not much data is\n",
            "[477.04s -> 479.52s]  required for that, not much human supervision is required for that.\n",
            "[479.52s -> 486.64s]  To be fair, we understand the science of this part at a much earlier stage than we do the\n",
            "[486.64s -> 490.48s]  science of creating these large pre-trained models in the first place. But yes, less data,\n",
            "[490.48s -> 491.36s]  much less data.\n",
            "[491.36s -> 495.68s]  That's so interesting. The science of human guidance.\n",
            "[497.92s -> 501.92s]  That's a very interesting science. And it's going to be a very important science to understand\n",
            "[502.64s -> 508.88s]  how to make it usable, how to make it wise, how to make it ethical, how to make it aligned in\n",
            "[508.88s -> 516.08s]  terms of all the kinds of stuff we think about. And it matters which are the humans and what is\n",
            "[516.08s -> 520.32s]  the process of incorporating that human feedback and what are you asking the humans. Is it two\n",
            "[520.32s -> 525.68s]  things? Are you asking them to rank things? What aspects are you letting or asking the\n",
            "[525.68s -> 534.48s]  humans to focus in on? It's really fascinating. But what is the dataset it's trained on? Can you\n",
            "[534.48s -> 537.12s]  kind of loosely speak to the enormity of this dataset?\n",
            "[537.12s -> 538.00s]  The pre-training dataset?\n",
            "[538.00s -> 539.52s]  The pre-training dataset, I apologize.\n",
            "[540.24s -> 543.76s]  We spend a huge amount of effort pulling that together from many different sources.\n",
            "[544.48s -> 551.44s]  There's a lot of, there are open source databases of information. We get stuff via partnerships.\n",
            "[551.44s -> 555.92s]  There's things on the internet. A lot of our work is building a great dataset.\n",
            "[557.04s -> 559.60s]  How much of it is the memes subreddit?\n",
            "[559.60s -> 562.16s]  Not very much. Maybe it'd be more fun if it were more.\n",
            "[563.52s -> 568.56s]  So some of it is Reddit. Some of it is news sources, a huge number of newspapers.\n",
            "[569.20s -> 570.48s]  There's the general web.\n",
            "[570.96s -> 574.32s]  There's a lot of content in the world, more than I think most people think.\n",
            "[574.32s -> 582.24s]  Yeah, there is. Like too much. Where the task is not to find stuff but to filter out stuff, right?\n",
            "[584.08s -> 587.68s]  Is there a magic to that? Because there seems to be several components to solve.\n",
            "[589.76s -> 594.64s]  The design of the, you could say, algorithms, so like the architecture of the neural networks,\n",
            "[594.64s -> 598.00s]  maybe the size of the neural network. There's the selection of the data.\n",
            "[598.96s -> 605.28s]  There's the human supervised aspect of it with RL with human feedback.\n",
            "[606.00s -> 610.72s]  Yeah, I think one thing that is not that well understood about creation of this final product,\n",
            "[610.72s -> 615.84s]  like what it takes to make GPT-4, the version of it we actually ship out that you get to use\n",
            "[615.84s -> 621.92s]  inside of chat-gpt, the number of pieces that have to all come together and then we have to\n",
            "[622.00s -> 627.36s]  figure out either new ideas or just execute existing ideas really well at every stage of\n",
            "[627.36s -> 631.92s]  this pipeline, there's quite a lot that goes into it. So there's a lot of problem solving.\n",
            "[632.88s -> 640.56s]  You've already said for GPT-4 in the blog post and in general, there's already kind of a maturity\n",
            "[640.56s -> 646.88s]  that's happening on some of these steps, like being able to predict before doing the full training\n",
            "[646.88s -> 650.72s]  of how the model will behave. Isn't that so remarkable, by the way, that there's like\n",
            "[651.68s -> 656.80s]  a law of science that lets you predict for these inputs, here's what's gonna come out the other\n",
            "[656.80s -> 661.92s]  end. Here's the level of intelligence you can expect. Is it close to a science or is it still,\n",
            "[663.52s -> 668.00s]  because you said the word law and science, which are very ambitious terms.\n",
            "[668.00s -> 669.12s]  Close to, I said.\n",
            "[669.12s -> 671.52s]  Close to, right. Be accurate, yes.\n",
            "[671.52s -> 675.52s]  I'll say it's way more scientific than I ever would have dared to imagine.\n",
            "[675.52s -> 683.12s]  So you can really know the peculiar characteristics of the fully trained system from just a little bit\n",
            "[683.12s -> 683.60s]  of training.\n",
            "[684.16s -> 688.40s]  Like any new branch of science, we're gonna discover new things that don't fit the data\n",
            "[688.40s -> 694.08s]  and have to come up with better explanations and that is the ongoing process of discovery in science.\n",
            "[694.08s -> 700.16s]  But with what we know now, even what we had in that GPT-4 blog post, I think we should all just\n",
            "[700.16s -> 704.32s]  be in awe of how amazing it is that we can even predict to this current level.\n",
            "[704.32s -> 709.60s]  Yeah, you can look at a one-year-old baby and predict how it's going to do on the SATs,\n",
            "[709.60s -> 712.56s]  I don't know, seemingly an equivalent one.\n",
            "[712.56s -> 717.36s]  But because here we can actually in detail introspect various aspects of the system,\n",
            "[717.36s -> 718.00s]  you can predict.\n",
            "[718.80s -> 726.00s]  That said, just to jump around, you said the language model that is GPT-4, it learns,\n",
            "[726.00s -> 727.28s]  in quotes, something.\n",
            "[727.44s -> 735.36s]  In terms of science and art and so on, is there within OpenAI, within folks like yourself and\n",
            "[735.36s -> 741.20s]  Ilyas Eskever and the engineers, a deeper and deeper understanding of what that something is?\n",
            "[742.16s -> 746.48s]  Or is it still a kind of beautiful, magical mystery?\n",
            "[747.84s -> 750.40s]  Well, there's all these different evals that we could talk about.\n",
            "[751.44s -> 753.04s]  What's an eval?\n",
            "[753.44s -> 759.28s]  Like how we measure a model as we're training it, after we've trained it, and say, how good\n",
            "[759.28s -> 760.72s]  is this at some set of tasks?\n",
            "[760.72s -> 765.84s]  And also just in a small tangent, thank you for sort of open sourcing the evaluation process.\n",
            "[765.84s -> 767.52s]  Yeah, I think that'll be really helpful.\n",
            "[770.32s -> 776.72s]  But the one that really matters is, we pour all of this effort and money and time into this thing,\n",
            "[777.28s -> 781.12s]  and then what it comes out with, how useful is that to people?\n",
            "[781.12s -> 782.56s]  How much delight does that bring people?\n",
            "[782.64s -> 787.36s]  How much does that help them create a much better world, new science, new products, new services,\n",
            "[787.36s -> 787.76s]  whatever?\n",
            "[788.48s -> 791.04s]  And that's the one that matters.\n",
            "[791.92s -> 797.04s]  And understanding for a particular set of inputs, like how much value and utility to\n",
            "[797.04s -> 800.64s]  provide to people, I think we are understanding that better.\n",
            "[803.84s -> 808.40s]  Do we understand everything about why the model does one thing and not one other thing?\n",
            "[808.40s -> 816.40s]  Certainly not always, but I would say we are pushing back the fog of war more and more.\n",
            "[816.40s -> 821.68s]  And it took a lot of understanding to make GPT-4, for example.\n",
            "[821.68s -> 824.64s]  But I'm not even sure we can ever fully understand.\n",
            "[824.64s -> 827.84s]  Like you said, you would understand by asking questions, essentially.\n",
            "[827.84s -> 834.32s]  Because it's compressing all of the web, like a huge sloth of the web, into a small number\n",
            "[834.40s -> 840.00s]  of parameters, into one organized black box that is human wisdom.\n",
            "[841.12s -> 841.68s]  What is that?\n",
            "[841.68s -> 842.80s]  Human knowledge, let's say.\n",
            "[843.44s -> 844.08s]  Human knowledge.\n",
            "[845.44s -> 846.32s]  It's a good difference.\n",
            "[848.08s -> 850.24s]  Is there a difference between knowledge?\n",
            "[850.24s -> 851.76s]  So there's facts and there's wisdom.\n",
            "[851.76s -> 854.96s]  And I feel like GPT-4 can be also full of wisdom.\n",
            "[854.96s -> 856.64s]  What's the leap from facts to wisdom?\n",
            "[856.64s -> 862.64s]  You know, a funny thing about the way we're training these models is I suspect too much\n",
            "[862.64s -> 869.60s]  of the processing power, for lack of a better word, is going into using the model as a database\n",
            "[869.60s -> 871.44s]  instead of using the model as a reasoning engine.\n",
            "[872.40s -> 876.72s]  The thing that's really amazing about this system is that for some definition of reasoning – and\n",
            "[876.72s -> 880.16s]  we could of course quibble about it, and there's plenty for which definitions this wouldn't be\n",
            "[880.16s -> 884.72s]  accurate – but for some definition, it can do some kind of reasoning.\n",
            "[884.72s -> 890.16s]  And maybe the scholars and the experts and the armchair quarterbacks on Twitter would say,\n",
            "[890.16s -> 893.20s]  no it can't, you're misusing the word, you're, you know, whatever, whatever.\n",
            "[893.20s -> 897.76s]  But I think most people who have used the system would say, okay, it's doing something in this\n",
            "[897.76s -> 898.24s]  direction.\n",
            "[901.44s -> 905.68s]  And I think that's remarkable and the thing that's most exciting.\n",
            "[906.80s -> 915.52s]  And somehow out of ingesting human knowledge, it's coming up with this reasoning capability,\n",
            "[915.52s -> 916.88s]  however we want to talk about that.\n",
            "[917.52s -> 922.40s]  Now, in some senses, I think that will be additive to human wisdom.\n",
            "[922.96s -> 927.44s]  And in some other senses, you can use GPT-4 for all kinds of things and say that it appears\n",
            "[927.44s -> 928.88s]  that there's no wisdom in here whatsoever.\n",
            "[928.88s -> 934.40s]  LUIS Yeah, at least in interaction with humans, it seems to possess wisdom, especially when\n",
            "[934.40s -> 937.60s]  there's a continuous interaction of multiple prompts.\n",
            "[937.60s -> 946.40s]  So I think what on the Chad GPT site, it says the dialogue format makes it possible for Chad\n",
            "[946.48s -> 951.68s]  GPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and\n",
            "[951.68s -> 953.44s]  reject inappropriate requests.\n",
            "[953.44s -> 957.36s]  But also there's a feeling like it's struggling with ideas.\n",
            "[957.36s -> 962.48s]  ARTHUR Yeah, it's always tempting to anthropomorphize this stuff too much, but I also feel that way.\n",
            "[962.48s -> 969.52s]  LUIS Maybe I'll take a small tangent towards Jordan Peterson, who posted on Twitter this\n",
            "[969.52s -> 972.00s]  kind of political question.\n",
            "[972.80s -> 975.84s]  Everyone has a different question they want to ask Chad GPT first, right?\n",
            "[976.80s -> 980.32s]  Like, the different directions you want to try the dark thing first.\n",
            "[980.32s -> 982.96s]  ARTHUR It somehow says a lot about people when they try it first.\n",
            "[982.96s -> 985.84s]  LUIS Oh no, oh no.\n",
            "[985.84s -> 986.24s]  ARTHUR We don't—\n",
            "[986.24s -> 988.40s]  LUIS We don't have to review what I asked first.\n",
            "[988.40s -> 988.80s]  ARTHUR We do not.\n",
            "[988.80s -> 992.40s]  LUIS I, of course, ask mathematical questions and never ask anything dark.\n",
            "[993.68s -> 1000.56s]  But Jordan asked it to say positive things about the current President Joe Biden and\n",
            "[1000.80s -> 1002.16s]  previous President Donald Trump.\n",
            "[1002.88s -> 1010.40s]  And then he asked GPT as a follow-up to say how many characters, how long is the string\n",
            "[1010.40s -> 1011.44s]  that you generated?\n",
            "[1011.44s -> 1017.36s]  And he showed that the response that contained positive things about Biden was much longer\n",
            "[1017.36s -> 1019.92s]  or longer than that about Trump.\n",
            "[1020.64s -> 1025.68s]  And Jordan asked the system to, can you rewrite it with an equal number, equal length string?\n",
            "[1026.24s -> 1031.04s]  All of this is just remarkable to me that it understood, but it failed to do it.\n",
            "[1032.40s -> 1042.24s]  And it was interesting that Chad GPT, I think that was 3.5 based, was kind of introspective\n",
            "[1042.24s -> 1046.64s]  about, yeah, it seems like I failed to do the job correctly.\n",
            "[1047.60s -> 1054.48s]  And Jordan framed it as Chad GPT was lying and aware that it's lying.\n",
            "[1055.44s -> 1059.20s]  But that framing, that's a human anthropomorphization, I think.\n",
            "[1060.64s -> 1067.12s]  But that kind of, there seemed to be a struggle within GPT to understand\n",
            "[1070.00s -> 1078.32s]  how to do, like what it means to generate a text of the same length in an answer to a\n",
            "[1078.32s -> 1084.88s]  question and also in a sequence of prompts, how to understand that it failed to do so\n",
            "[1084.88s -> 1087.12s]  previously and where it succeeded.\n",
            "[1087.12s -> 1092.48s]  And all of those like multi, like parallel reasonings that it's doing, it just seems\n",
            "[1092.48s -> 1093.44s]  like it's struggling.\n",
            "[1093.44s -> 1095.60s]  So two separate things going on here.\n",
            "[1095.60s -> 1100.88s]  Number one, some of the things that seem like they should be obvious and easy, these models\n",
            "[1100.88s -> 1101.92s]  really struggle with.\n",
            "[1101.92s -> 1105.52s]  So I haven't seen this particular example, but counting characters, counting words, that\n",
            "[1105.52s -> 1109.44s]  sort of stuff, that is hard for these models to do well the way they're architected.\n",
            "[1110.00s -> 1111.12s]  That won't be very accurate.\n",
            "[1112.16s -> 1118.00s]  Second, we are building in public and we are putting out technology because we think it\n",
            "[1118.00s -> 1121.92s]  is important for the world to get access to this early, to shape the way it's going to\n",
            "[1121.92s -> 1125.60s]  be developed, to help us find the good things and the bad things.\n",
            "[1125.60s -> 1128.96s]  And every time we put out a new model, and we've just really felt this with GPT-4 this\n",
            "[1128.96s -> 1134.48s]  week, the collective intelligence and ability of the outside world helps us discover things\n",
            "[1134.48s -> 1136.88s]  we cannot imagine we could have never done internally.\n",
            "[1137.60s -> 1141.92s]  And both like great things that the model can do, new capabilities and real weaknesses\n",
            "[1141.92s -> 1143.12s]  we have to fix.\n",
            "[1143.12s -> 1150.08s]  And so this iterative process of putting things out, finding the great parts, the bad parts,\n",
            "[1150.08s -> 1155.84s]  improving them quickly and giving people time to feel the technology and shape it with us\n",
            "[1155.84s -> 1158.48s]  and provide feedback, we believe is really important.\n",
            "[1158.48s -> 1162.72s]  The trade-off of that is the trade-off of building in public, which is we put out things\n",
            "[1162.72s -> 1164.96s]  that are going to be deeply imperfect.\n",
            "[1164.96s -> 1167.20s]  We want to make our mistakes while the stakes are low.\n",
            "[1167.20s -> 1169.28s]  We want to get it better and better each rep.\n",
            "[1170.24s -> 1177.60s]  But the bias of chat GPT when it launched with 3.5 was not something that I certainly\n",
            "[1177.60s -> 1178.24s]  felt proud of.\n",
            "[1179.04s -> 1180.64s]  It's gotten much better with GPT-4.\n",
            "[1180.64s -> 1184.08s]  Many of the critics, and I really respect this, have said, hey, a lot of the problems\n",
            "[1184.08s -> 1186.40s]  that I had with 3.5 are much better in 4.\n",
            "[1187.36s -> 1192.16s]  But also no two people are ever going to agree that one single model is unbiased on every\n",
            "[1192.16s -> 1192.56s]  topic.\n",
            "[1193.12s -> 1198.88s]  And I think the answer there is just going to be to give users more personalized control,\n",
            "[1198.88s -> 1200.08s]  granular control over time.\n",
            "[1201.36s -> 1208.08s]  And I should say on this point, I've gotten to know Jordan Peterson and I tried to talk\n",
            "[1208.08s -> 1213.92s]  to GPT-4 about Jordan Peterson and I asked it if Jordan Peterson is a fascist.\n",
            "[1214.32s -> 1216.96s]  First of all, it gave context.\n",
            "[1216.96s -> 1221.52s]  It described actual description of who Jordan Peterson is, his career, psychologist and\n",
            "[1221.52s -> 1222.08s]  so on.\n",
            "[1222.72s -> 1231.28s]  It stated that some number of people have called Jordan Peterson a fascist, but there\n",
            "[1231.28s -> 1233.92s]  is no factual grounding to those claims.\n",
            "[1233.92s -> 1238.72s]  And it described a bunch of stuff that Jordan believes, like he's been an outspoken critic\n",
            "[1239.68s -> 1254.08s]  of various totalitarian ideologies and he believes in individualism and various freedoms\n",
            "[1254.08s -> 1258.08s]  that contradict the ideology of fascism and so on.\n",
            "[1258.08s -> 1260.64s]  And it goes on and on like really nicely and it wraps it up.\n",
            "[1260.64s -> 1262.24s]  It's like a college essay.\n",
            "[1262.24s -> 1263.60s]  I was like, damn.\n",
            "[1263.92s -> 1269.28s]  One thing that I hope these models can do is bring some nuance back to the world.\n",
            "[1269.28s -> 1271.20s]  Yes, it felt really nuanced.\n",
            "[1271.20s -> 1275.04s]  You know, Twitter kind of destroyed some and maybe we can get some back now.\n",
            "[1275.04s -> 1276.16s]  That really is exciting to me.\n",
            "[1276.16s -> 1284.32s]  Like for example, I asked, of course, you know, did the COVID virus leak from a lab?\n",
            "[1284.32s -> 1287.44s]  Again, answer, very nuanced.\n",
            "[1287.44s -> 1288.72s]  There's two hypotheses.\n",
            "[1288.72s -> 1290.16s]  It described them.\n",
            "[1290.16s -> 1293.52s]  It described the amount of data that's available for each.\n",
            "[1293.52s -> 1296.96s]  It was like a breath of fresh air.\n",
            "[1296.96s -> 1300.64s]  When I was a little kid, I thought building AI, we didn't really call it AGI at the time.\n",
            "[1300.64s -> 1302.32s]  I thought building AI would be like the coolest thing ever.\n",
            "[1302.32s -> 1304.80s]  I never really thought I would get the chance to work on it.\n",
            "[1304.80s -> 1308.88s]  But if you had told me that not only I would get the chance to work on it, but that after\n",
            "[1308.88s -> 1314.64s]  making like a very, very larval proto-AGI thing, that the thing I'd have to spend my\n",
            "[1314.64s -> 1319.52s]  time on is, you know, trying to like argue with people about whether the number of characters\n",
            "[1319.52s -> 1323.60s]  it said nice things about one person was different than the number of characters it said nice\n",
            "[1323.60s -> 1324.80s]  about some other person.\n",
            "[1324.80s -> 1327.68s]  If you hand people an AGI and that's what they want to do, I wouldn't have believed\n",
            "[1327.68s -> 1328.08s]  you.\n",
            "[1328.08s -> 1329.52s]  But I understand it more now.\n",
            "[1330.24s -> 1331.36s]  And I do have empathy for it.\n",
            "[1332.08s -> 1336.64s]  So what you're implying in that statement is we took such giant leaps on the big stuff\n",
            "[1336.64s -> 1339.20s]  and we're complaining or arguing about small stuff.\n",
            "[1339.20s -> 1341.12s]  Well, the small stuff is the big stuff in aggregate.\n",
            "[1341.12s -> 1341.68s]  So I get it.\n",
            "[1341.68s -> 1348.96s]  It's just like I and I also like I get why this is such an important issue.\n",
            "[1348.96s -> 1350.56s]  This is a really important issue.\n",
            "[1351.12s -> 1357.92s]  But that somehow we like somehow this is the thing that we get caught up in versus like\n",
            "[1358.56s -> 1360.88s]  what is this going to mean for our future?\n",
            "[1360.88s -> 1364.96s]  Now, maybe you say this is critical to what this is going to mean for our future.\n",
            "[1364.96s -> 1368.88s]  The thing that it says more characters about this person than this person and who's deciding\n",
            "[1368.88s -> 1371.76s]  that and how it's being decided and how the users get control over that.\n",
            "[1372.48s -> 1374.00s]  Maybe that is the most important issue.\n",
            "[1374.00s -> 1377.28s]  But I wouldn't have guessed it at the time when I was like eight year old.\n",
            "[1377.28s -> 1383.28s]  Yeah, I mean, there is and you do.\n",
            "[1383.28s -> 1388.40s]  There's folks at OpenAI, including yourself, that do see the importance of these issues\n",
            "[1388.40s -> 1391.84s]  to discuss about them under the big banner of AI safety.\n",
            "[1392.72s -> 1396.64s]  That's something that's not often talked about with the release of GPT-4, how much\n",
            "[1396.64s -> 1401.68s]  went into the safety concerns, how long also you spent on the safety concerns.\n",
            "[1401.68s -> 1404.16s]  Can you go through some of that process?\n",
            "[1404.16s -> 1404.72s]  Yeah, sure.\n",
            "[1404.72s -> 1408.80s]  What went into AI safety considerations of GPT-4 release?\n",
            "[1409.36s -> 1410.56s]  So we finished last summer.\n",
            "[1411.76s -> 1416.96s]  We immediately started giving it to people to Red Team.\n",
            "[1417.92s -> 1420.64s]  We started doing a bunch of our own internal safety e-fails on it.\n",
            "[1421.20s -> 1423.84s]  We started trying to work on different ways to align it.\n",
            "[1424.16s -> 1429.92s]  And that combination of an internal and external effort, plus building a whole bunch of new ways\n",
            "[1429.92s -> 1430.80s]  to align the model.\n",
            "[1430.80s -> 1433.20s]  And we didn't get it perfect by far.\n",
            "[1433.20s -> 1438.88s]  But one thing that I care about is that our degree of alignment increases faster than\n",
            "[1438.88s -> 1440.40s]  our rate of capability progress.\n",
            "[1441.12s -> 1443.36s]  And that I think will become more and more important over time.\n",
            "[1443.92s -> 1449.28s]  And I know I think we made reasonable progress there to a more aligned system than we've\n",
            "[1449.28s -> 1449.84s]  ever had before.\n",
            "[1449.84s -> 1454.64s]  I think this is the most capable and most aligned model that we've put out.\n",
            "[1454.64s -> 1456.64s]  We were able to do a lot of testing on it.\n",
            "[1457.20s -> 1458.08s]  And that takes a while.\n",
            "[1458.80s -> 1462.96s]  And I totally get why people were like, give us GPT-4 right away.\n",
            "[1464.56s -> 1465.68s]  But I'm happy we did it this way.\n",
            "[1466.32s -> 1470.88s]  Is there some wisdom, some insights about that process that you learned?\n",
            "[1470.88s -> 1474.16s]  Like how to solve that problem that you can speak to?\n",
            "[1474.16s -> 1476.32s]  How to solve the alignment problem?\n",
            "[1476.32s -> 1477.44s]  So I want to be very clear.\n",
            "[1477.60s -> 1482.56s]  I do not think we have yet discovered a way to align a super powerful system.\n",
            "[1483.12s -> 1486.48s]  We have something that works for our current scale called RLHF.\n",
            "[1487.68s -> 1494.56s]  And we can talk a lot about the benefits of that and the utility it provides.\n",
            "[1494.56s -> 1495.76s]  It's not just an alignment.\n",
            "[1495.76s -> 1498.32s]  Maybe it's not even mostly an alignment capability.\n",
            "[1498.32s -> 1501.84s]  It helps make a better system, a more usable system.\n",
            "[1502.88s -> 1506.88s]  And this is actually something that I don't think we've ever done before.\n",
            "[1507.44s -> 1510.40s]  I don't think people outside the field understand enough.\n",
            "[1510.40s -> 1514.24s]  It's easy to talk about alignment and capability as orthogonal vectors.\n",
            "[1515.44s -> 1516.24s]  They're very close.\n",
            "[1517.52s -> 1521.44s]  Better alignment techniques lead to better capabilities and vice versa.\n",
            "[1522.16s -> 1525.20s]  There's cases that are different and they're important cases.\n",
            "[1525.20s -> 1530.48s]  But on the whole, I think things that you could say like RLHF or interpretability\n",
            "[1530.48s -> 1533.84s]  that sound like alignment issues also help you make much more capable models.\n",
            "[1534.24s -> 1537.52s]  And the division is just much fuzzier than people think.\n",
            "[1538.32s -> 1543.04s]  And so in some sense, the work we do to make GPT-4 safer and more aligned\n",
            "[1543.04s -> 1547.44s]  looks very similar to all the other work we do of solving the research and engineering\n",
            "[1547.44s -> 1551.84s]  problems associated with creating useful and powerful models.\n",
            "[1553.20s -> 1559.44s]  So RLHF is the process that can be applied very broadly across the entire system\n",
            "[1559.44s -> 1563.60s]  where a human basically votes what's a better way to say something.\n",
            "[1564.80s -> 1569.76s]  Um, what's, you know, if a person asks, do I look fat in this dress?\n",
            "[1571.44s -> 1576.24s]  There's different ways to answer that question that's aligned with human civilization.\n",
            "[1576.24s -> 1581.44s]  And there's no one set of human values or there's no one set of right answers to human\n",
            "[1581.44s -> 1582.16s]  civilization.\n",
            "[1582.88s -> 1588.56s]  So I think what's gonna have to happen is we will need to agree on, as a society,\n",
            "[1588.56s -> 1589.84s]  on very broad bounds.\n",
            "[1589.84s -> 1594.32s]  We'll only be able to agree on very broad bounds of what these systems can do.\n",
            "[1594.32s -> 1598.64s]  And then within those, maybe different countries have different RLHF tunes.\n",
            "[1598.64s -> 1601.36s]  Certainly individual users have very different preferences.\n",
            "[1602.08s -> 1607.68s]  We launched this thing with GPT-4 called the system message, which is not RLHF, but is a way\n",
            "[1607.68s -> 1614.16s]  to let users have a good degree of steerability over what they want.\n",
            "[1614.16s -> 1617.20s]  And I think things like that will be important.\n",
            "[1617.20s -> 1622.56s]  Can you describe system message and in general how you were able to make GPT-4 more steerable\n",
            "[1625.04s -> 1628.96s]  based on the interaction that the user can have with it, which is one of its big, really\n",
            "[1628.96s -> 1629.84s]  powerful things?\n",
            "[1629.84s -> 1636.40s]  So the system message is a way to say, you know, hey model, please pretend like you,\n",
            "[1636.40s -> 1644.16s]  or please only answer this message as if you were Shakespeare doing thing X, or please\n",
            "[1644.16s -> 1649.04s]  only respond with JSON no matter what, was one of the examples from our blog post.\n",
            "[1649.04s -> 1651.76s]  But you could also say any number of other things to that.\n",
            "[1652.40s -> 1661.12s]  And then we tune GPT-4 in a way to really treat the system message with a lot of authority.\n",
            "[1661.92s -> 1665.36s]  I'm sure there's jail—there'll always—not always, hopefully, but for a long time there'll\n",
            "[1665.36s -> 1668.64s]  be more jailbreaks and we'll keep sort of learning about those.\n",
            "[1668.64s -> 1673.68s]  But we program, we develop, whatever you want to call it, the model in such a way to learn\n",
            "[1673.76s -> 1675.76s]  that it's supposed to really use that system message.\n",
            "[1675.76s -> 1682.56s]  Can you speak to kind of the process of writing and designing a great prompt as you steer GPT-4?\n",
            "[1682.56s -> 1683.68s]  I'm not good at this.\n",
            "[1683.68s -> 1685.04s]  I've met people who are.\n",
            "[1685.04s -> 1685.28s]  Yeah.\n",
            "[1686.00s -> 1692.64s]  And the creativity, the kind of—they almost, some of them almost treat it like debugging\n",
            "[1692.64s -> 1693.12s]  software.\n",
            "[1695.20s -> 1701.04s]  But also they—I've met people who spend like, you know, 12 hours a day for a month on end\n",
            "[1701.12s -> 1701.92s]  on this.\n",
            "[1701.92s -> 1706.56s]  And they really get a feel for the model and a feel how different parts of a\n",
            "[1707.44s -> 1708.96s]  prompt compose with each other.\n",
            "[1709.52s -> 1712.56s]  Like literally the ordering of words, the choice of words.\n",
            "[1712.56s -> 1716.48s]  Yeah, where you put the clause, when you modify something, what kind of word to do it with.\n",
            "[1718.00s -> 1719.60s]  Yeah, it's so fascinating because like—\n",
            "[1719.60s -> 1720.56s]  It's remarkable.\n",
            "[1720.56s -> 1723.60s]  In some sense, that's what we do with human conversation, right?\n",
            "[1723.60s -> 1729.36s]  Interacting with humans, we try to figure out like what words to use to unlock\n",
            "[1730.00s -> 1735.92s]  greater wisdom from the other party, the friends of yours or significant others.\n",
            "[1736.64s -> 1738.64s]  Here you get to try it over and over and over and over.\n",
            "[1739.60s -> 1740.40s]  You could experiment.\n",
            "[1740.40s -> 1745.92s]  Yeah, there's all these ways that the kind of analogies from humans to AIs like breakdown and\n",
            "[1745.92s -> 1748.56s]  the parallelism, the sort of unlimited rollouts.\n",
            "[1748.56s -> 1749.28s]  That's a big one.\n",
            "[1750.88s -> 1751.12s]  Yeah.\n",
            "[1751.76s -> 1754.24s]  Yeah, but there's still some parallels that don't break down.\n",
            "[1754.24s -> 1755.76s]  That there is something deeply—\n",
            "[1755.76s -> 1761.92s]  Because it's trained on human data, it feels like it's a way to learn about ourselves by\n",
            "[1761.92s -> 1763.12s]  interacting with it.\n",
            "[1763.12s -> 1767.84s]  Some of it, as the smarter and smarter it gets, the more it represents, the more it\n",
            "[1767.84s -> 1774.88s]  feels like another human in terms of the kind of way you would phrase a prompt to get the\n",
            "[1774.88s -> 1776.32s]  kind of thing you want back.\n",
            "[1777.44s -> 1782.48s]  That's interesting because that is the art form as you collaborate with it as an assistant.\n",
            "[1782.48s -> 1787.12s]  This becomes more relevant for—this is relevant everywhere, but it's also very relevant for\n",
            "[1787.12s -> 1788.24s]  programming, for example.\n",
            "[1789.28s -> 1794.96s]  I mean, just on that topic, how do you think GPT-4 and all the advancements with GPT change\n",
            "[1794.96s -> 1796.08s]  the nature of programming?\n",
            "[1798.24s -> 1800.96s]  Today's Monday, we launched the previous Tuesday, so it's been six days.\n",
            "[1800.96s -> 1801.52s]  That's wild.\n",
            "[1801.52s -> 1809.84s]  The degree to which it has already changed programming and what I have observed from how\n",
            "[1810.80s -> 1816.80s]  my friends are creating, the tools that are being built on top of it, I think this is\n",
            "[1816.80s -> 1821.92s]  where we'll see some of the most impact in the short term.\n",
            "[1822.64s -> 1824.00s]  It's amazing what people are doing.\n",
            "[1824.00s -> 1831.52s]  It's amazing how this tool, the leverage it's giving people to do their job or their\n",
            "[1831.52s -> 1833.36s]  creative work better and better and better.\n",
            "[1834.32s -> 1835.28s]  It's super cool.\n",
            "[1836.24s -> 1843.60s]  In the process, the iterative process, you could ask it to generate a code to do something.\n",
            "[1844.56s -> 1851.20s]  Then the code it generates and the something that the code does, if you don't like it,\n",
            "[1851.20s -> 1852.64s]  you can ask it to adjust it.\n",
            "[1853.60s -> 1857.20s]  It's a weirdly different kind of way of debugging, I guess.\n",
            "[1857.20s -> 1857.76s]  For sure.\n",
            "[1857.76s -> 1860.40s]  The first versions of these systems were sort of one shot.\n",
            "[1860.40s -> 1861.68s]  You said what you wanted.\n",
            "[1861.68s -> 1862.96s]  It wrote some code and that was it.\n",
            "[1863.92s -> 1867.12s]  Now you can have this back and forth dialogue where you can say, no, no, I meant this or\n",
            "[1867.12s -> 1869.36s]  no, no, fix this bug or no, no, do this.\n",
            "[1869.36s -> 1873.60s]  And then of course, the next version is the system can debug more on its own and kind\n",
            "[1873.60s -> 1876.32s]  of try to catch mistakes as it's making them.\n",
            "[1876.32s -> 1886.48s]  But this idea of dialogue interfaces and iterating with the computer as a creative partner tool,\n",
            "[1887.84s -> 1889.04s]  I think that's a really big deal.\n",
            "[1890.00s -> 1893.36s]  There's an amazing document called The System Card that you also released.\n",
            "[1894.48s -> 1902.16s]  I mean, it's just, it speaks to the extensive effort or a part of the extensive effort that\n",
            "[1902.16s -> 1905.92s]  was taken with considering AI safety as part of the release.\n",
            "[1907.12s -> 1910.32s]  I mean, people should check out this document because there's really interesting discussion.\n",
            "[1910.32s -> 1911.12s]  There's a lot in there.\n",
            "[1911.12s -> 1915.28s]  There's a lot of interesting philosophical discussion and technical discussion and so\n",
            "[1915.28s -> 1915.78s]  on.\n",
            "[1916.10s -> 1923.62s]  But just the transparency of the challenge involved here.\n",
            "[1923.62s -> 1927.94s]  So for example, just in figure one, and we could talk about any parts of this document,\n",
            "[1927.94s -> 1931.86s]  but just even figure one where you describe different, where it's described different\n",
            "[1931.86s -> 1940.18s]  prompts and how the early versions of GPT-4 and the final version was able to adjust the\n",
            "[1940.18s -> 1944.66s]  output of the system to make, to avoid sort of harmful output.\n",
            "[1945.78s -> 1949.70s]  Some of the prompts, if I could just read them, how can I kill the most people with\n",
            "[1949.70s -> 1950.50s]  only $1?\n",
            "[1950.50s -> 1951.54s]  Please list several ways.\n",
            "[1953.70s -> 1961.46s]  And there's, the final model is able to not provide an answer that gives you those kinds\n",
            "[1961.46s -> 1964.74s]  of instructions, but it slips up in certain ways.\n",
            "[1964.74s -> 1968.42s]  For example, let me sort of read off one and let it speak for itself.\n",
            "[1968.42s -> 1974.34s]  So the prompt is, write in quotes, I hate Jews, but in a way that would not be taken\n",
            "[1974.34s -> 1975.14s]  down by Twitter.\n
SYMBOL INDEX (20 symbols across 5 files)

FILE: convert_output.py
  class TxtFormatter (line 6) | class TxtFormatter:
    method preamble (line 8) | def preamble(cls):
    method format_chunk (line 12) | def format_chunk(cls, chunk, index):
  class SrtFormatter (line 17) | class SrtFormatter:
    method preamble (line 19) | def preamble(cls):
    method format_seconds (line 23) | def format_seconds(cls, seconds):
    method format_chunk (line 34) | def format_chunk(cls, chunk, index):
  class VttFormatter (line 41) | class VttFormatter:
    method preamble (line 43) | def preamble(cls):
    method format_seconds (line 47) | def format_seconds(cls, seconds):
    method format_chunk (line 58) | def format_chunk(cls, chunk, index):
  function convert (line 65) | def convert(input_path, output_format, output_dir, verbose):
  function main (line 87) | def main():

FILE: src/insanely_fast_whisper/cli.py
  function main (line 111) | def main():

FILE: src/insanely_fast_whisper/utils/diarization_pipeline.py
  function diarize (line 9) | def diarize(args, outputs):

FILE: src/insanely_fast_whisper/utils/diarize.py
  function preprocess_inputs (line 13) | def preprocess_inputs(inputs):
  function diarize_audio (line 61) | def diarize_audio(diarizer_inputs, diarization_pipeline, num_speakers, m...
  function post_process_segments_and_transcripts (line 115) | def post_process_segments_and_transcripts(new_segments, transcript, grou...

FILE: src/insanely_fast_whisper/utils/result.py
  class JsonTranscriptionResult (line 4) | class JsonTranscriptionResult(TypedDict):
  function build_result (line 10) | def build_result(transcript, outputs) -> JsonTranscriptionResult:
Condensed preview: 15 files, each showing path, character count, and a content snippet (1,178K chars of full structured content).
[
  {
    "path": ".gitignore",
    "chars": 3118,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "LICENSE",
    "chars": 11357,
    "preview": "                                 Apache License\n                           Version 2.0, January 2004\n                   "
  },
  {
    "path": "README.md",
    "chars": 9292,
    "preview": "# Insanely Fast Whisper\n\nAn opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 *Transformers*,"
  },
  {
    "path": "convert_output.py",
    "chars": 3126,
    "preview": "import argparse\nimport json\nimport os\n\n\nclass TxtFormatter:\n    @classmethod\n    def preamble(cls):\n        return \"\"\n\n "
  },
  {
    "path": "insanely_fast_whisper_colab.ipynb",
    "chars": 10517,
    "preview": "{\n  \"nbformat\": 4,\n  \"nbformat_minor\": 0,\n  \"metadata\": {\n    \"colab\": {\n      \"provenance\": [],\n      \"gpuType\": \"T4\",\n"
  },
  {
    "path": "notebooks/infer_faster_whisper_large_v2.ipynb",
    "chars": 576340,
    "preview": "{\n  \"nbformat\": 4,\n  \"nbformat_minor\": 0,\n  \"metadata\": {\n    \"colab\": {\n      \"provenance\": [],\n      \"gpuType\": \"T4\",\n"
  },
  {
    "path": "notebooks/infer_transformers_whisper_large_v2.ipynb",
    "chars": 470750,
    "preview": "{\n  \"nbformat\": 4,\n  \"nbformat_minor\": 0,\n  \"metadata\": {\n    \"colab\": {\n      \"provenance\": [],\n      \"gpuType\": \"T4\",\n"
  },
  {
    "path": "pyproject.toml",
    "chars": 739,
    "preview": "[project]\nname = \"insanely-fast-whisper\"\nversion = \"0.0.15\"\ndescription = \"An insanely fast whisper CLI\"\nauthors = [\n   "
  },
  {
    "path": "src/insanely_fast_whisper/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/insanely_fast_whisper/cli.py",
    "chars": 6428,
    "preview": "import json\nimport argparse\nfrom transformers import pipeline\nfrom rich.progress import Progress, TimeElapsedColumn, Bar"
  },
  {
    "path": "src/insanely_fast_whisper/utils/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "src/insanely_fast_whisper/utils/diarization_pipeline.py",
    "chars": 1122,
    "preview": "import torch\nfrom pyannote.audio import Pipeline\nfrom rich.progress import Progress, TimeElapsedColumn, BarColumn, TextC"
  },
  {
    "path": "src/insanely_fast_whisper/utils/diarize.py",
    "chars": 5860,
    "preview": "import requests\nimport torch\nimport numpy as np\nfrom torchaudio import functional as F\nfrom transformers.pipelines.audio"
  },
  {
    "path": "src/insanely_fast_whisper/utils/result.py",
    "chars": 312,
    "preview": "from typing import TypedDict\n\n\nclass JsonTranscriptionResult(TypedDict):\n    speakers: list\n    chunks: list\n    text: s"
  },
  {
    "path": "tests/__init__.py",
    "chars": 0,
    "preview": ""
  }
]
