Full Code of allenai/tango for AI

Showing preview only (1,296K chars total); the file contents below are truncated.
Repository: allenai/tango
Branch: main
Commit: 6aaa8ff0f203
Files: 329
Total size: 1.2 MB

Directory structure:
gitextract_p3vof5f6/

├── .dockerignore
├── .github/
│   ├── CONTRIBUTING.md
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.yml
│   │   ├── documentation.yml
│   │   └── feature_request.yml
│   ├── dependabot.yml
│   └── workflows/
│       ├── changelog.yml
│       ├── docker.yml
│       ├── docker_testing.yml
│       ├── integration_tests.yml
│       ├── main.yml
│       └── update_dependency_pr.yml
├── .gitignore
├── .readthedocs.yaml
├── CHANGELOG.md
├── CITATION.cff
├── Dockerfile
├── Dockerfile.test
├── LICENSE
├── Makefile
├── README.md
├── RELEASE_PROCESS.md
├── docs/
│   ├── .gitignore
│   ├── Makefile
│   ├── make.bat
│   └── source/
│       ├── _static/
│       │   └── css/
│       │       └── custom.css
│       ├── api/
│       │   ├── commands.rst
│       │   ├── components/
│       │   │   ├── executor.rst
│       │   │   ├── format.rst
│       │   │   ├── index.rst
│       │   │   ├── step.rst
│       │   │   ├── step_cache.rst
│       │   │   ├── step_graph.rst
│       │   │   ├── step_info.rst
│       │   │   └── workspace.rst
│       │   ├── det_hash.rst
│       │   ├── exceptions.rst
│       │   ├── integrations/
│       │   │   ├── beaker.rst
│       │   │   ├── datasets.rst
│       │   │   ├── fairscale.rst
│       │   │   ├── flax.rst
│       │   │   ├── gs.rst
│       │   │   ├── index.rst
│       │   │   ├── torch.rst
│       │   │   ├── transformers.rst
│       │   │   └── wandb.rst
│       │   ├── logging.rst
│       │   ├── sequences.rst
│       │   ├── settings.rst
│       │   └── utilities.rst
│       ├── conf.py
│       ├── examples/
│       │   ├── euler.md
│       │   ├── eval_p3.md
│       │   ├── index.rst
│       │   └── train_lm.md
│       ├── faq.md
│       ├── first_steps.md
│       ├── index.md
│       └── installation.md
├── examples/
│   ├── euler/
│   │   ├── README.md
│   │   ├── complex_arithmetic.py
│   │   ├── euler.jsonnet
│   │   ├── euler_general.jsonnet
│   │   └── run.sh
│   ├── eval_p3/
│   │   ├── README.md
│   │   ├── config.jsonnet
│   │   └── eval.py
│   ├── finetune/
│   │   ├── __init__.py
│   │   ├── config.jsonnet
│   │   ├── snli_steps.py
│   │   └── test.py
│   ├── finetune_resnet/
│   │   ├── .gitignore
│   │   ├── config.jsonnet
│   │   └── resnet_steps.py
│   ├── flax/
│   │   ├── config.jsonnet
│   │   ├── run.sh
│   │   └── xsum.py
│   └── train_lm/
│       ├── .gitignore
│       ├── README.md
│       ├── config.jsonnet
│       ├── test.py
│       └── tokenize_step.py
├── integration_tests/
│   ├── README.md
│   └── fairscale_benchmarks/
│       ├── README.md
│       ├── config.jsonnet
│       └── run.sh
├── pyproject.toml
├── scripts/
│   ├── entrypoint.sh
│   ├── hash_extras.py
│   ├── prepare_changelog.py
│   ├── prepare_citation_cff.py
│   ├── release.sh
│   └── release_notes.py
├── tango/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── common/
│   │   ├── __init__.py
│   │   ├── aliases.py
│   │   ├── dataset_dict.py
│   │   ├── det_hash.py
│   │   ├── exceptions.py
│   │   ├── file_lock.py
│   │   ├── from_params.py
│   │   ├── lazy.py
│   │   ├── logging.py
│   │   ├── params.py
│   │   ├── registrable.py
│   │   ├── remote_utils.py
│   │   ├── sequences.py
│   │   ├── testing/
│   │   │   ├── __init__.py
│   │   │   └── steps.py
│   │   ├── tqdm.py
│   │   └── util.py
│   ├── executor.py
│   ├── executors/
│   │   ├── __init__.py
│   │   └── multicore_executor.py
│   ├── format.py
│   ├── integrations/
│   │   ├── __init__.py
│   │   ├── beaker/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── entrypoint.sh
│   │   │   ├── executor.py
│   │   │   ├── step_cache.py
│   │   │   └── workspace.py
│   │   ├── datasets/
│   │   │   └── __init__.py
│   │   ├── fairscale/
│   │   │   ├── __init__.py
│   │   │   ├── fsdp_config.py
│   │   │   ├── module_wrapper.py
│   │   │   └── training_engine.py
│   │   ├── flax/
│   │   │   ├── __init__.py
│   │   │   ├── data.py
│   │   │   ├── eval.py
│   │   │   ├── eval_callback.py
│   │   │   ├── format.py
│   │   │   ├── model.py
│   │   │   ├── optim.py
│   │   │   ├── train.py
│   │   │   ├── train_callback.py
│   │   │   ├── train_config.py
│   │   │   ├── util.py
│   │   │   └── wrapper.py
│   │   ├── gs/
│   │   │   ├── __init__.py
│   │   │   ├── common.py
│   │   │   ├── step_cache.py
│   │   │   └── workspace.py
│   │   ├── torch/
│   │   │   ├── __init__.py
│   │   │   ├── data.py
│   │   │   ├── eval.py
│   │   │   ├── eval_callback.py
│   │   │   ├── exceptions.py
│   │   │   ├── format.py
│   │   │   ├── model.py
│   │   │   ├── optim.py
│   │   │   ├── train.py
│   │   │   ├── train_callback.py
│   │   │   ├── train_config.py
│   │   │   ├── training_engine.py
│   │   │   └── util.py
│   │   ├── transformers/
│   │   │   ├── __init__.py
│   │   │   ├── config.py
│   │   │   ├── data.py
│   │   │   ├── finetune.py
│   │   │   ├── ia3.py
│   │   │   ├── model.py
│   │   │   ├── optim.py
│   │   │   ├── run_generation.py
│   │   │   ├── soft_prompt.py
│   │   │   └── tokenizer.py
│   │   └── wandb/
│   │       ├── __init__.py
│   │       ├── flax_train_callback.py
│   │       ├── step_cache.py
│   │       ├── torch_train_callback.py
│   │       ├── util.py
│   │       └── workspace.py
│   ├── py.typed
│   ├── settings.py
│   ├── step.py
│   ├── step_cache.py
│   ├── step_caches/
│   │   ├── __init__.py
│   │   ├── local_step_cache.py
│   │   ├── memory_step_cache.py
│   │   └── remote_step_cache.py
│   ├── step_graph.py
│   ├── step_info.py
│   ├── steps/
│   │   ├── __init__.py
│   │   ├── dataset_remix.py
│   │   ├── print.py
│   │   └── shell_step.py
│   ├── version.py
│   ├── workspace.py
│   └── workspaces/
│       ├── __init__.py
│       ├── local_workspace.py
│       ├── memory_workspace.py
│       └── remote_workspace.py
├── test_fixtures/
│   ├── __init__.py
│   ├── beaker/
│   │   └── nvidia_smi.yml
│   ├── common/
│   │   ├── params_example.jsonnet
│   │   └── params_example.yaml
│   ├── experiment/
│   │   ├── hello_world.jsonnet
│   │   ├── logging_check.jsonnet
│   │   ├── multiprocessing.jsonnet
│   │   ├── noisy.jsonnet
│   │   └── random.jsonnet
│   ├── integrations/
│   │   ├── __init__.py
│   │   ├── common/
│   │   │   └── __init__.py
│   │   ├── datasets/
│   │   │   └── config.json
│   │   ├── fairscale/
│   │   │   ├── __init__.py
│   │   │   ├── components.py
│   │   │   └── config.jsonnet
│   │   ├── flax/
│   │   │   ├── __init__.py
│   │   │   ├── config.jsonnet
│   │   │   └── xsum.py
│   │   └── torch/
│   │       ├── __init__.py
│   │       ├── eval.jsonnet
│   │       ├── train.jsonnet
│   │       ├── train_dist.jsonnet
│   │       └── train_streaming.jsonnet
│   └── v1_local_workspace/
│       └── cache/
│           ├── AdditionStep-34AiXoyiPKADMUnhcBzFYd6JeMcgx4DP/
│           │   ├── cache-metadata.json
│           │   ├── conda-environment.yaml
│           │   ├── executor-metadata.json
│           │   ├── lock
│           │   ├── requirements.txt
│           │   └── stepinfo.dill
│           ├── CosineStep-5aes9CUTRmkz5gJ5J6JSRbJZ4qkFu4kk/
│           │   ├── cache-metadata.json
│           │   ├── conda-environment.yaml
│           │   ├── executor-metadata.json
│           │   ├── lock
│           │   ├── requirements.txt
│           │   └── stepinfo.dill
│           ├── ExponentiateStep-Rf73w34zWJcBrQafpAkxDvXR4mq3MXC9/
│           │   ├── cache-metadata.json
│           │   ├── conda-environment.yaml
│           │   ├── executor-metadata.json
│           │   ├── lock
│           │   ├── requirements.txt
│           │   └── stepinfo.dill
│           ├── MultiplyStep-2ZG7wPj9WLn5PgpYyPVHw9Qg7VM1mhwf/
│           │   ├── cache-metadata.json
│           │   ├── conda-environment.yaml
│           │   ├── executor-metadata.json
│           │   ├── lock
│           │   ├── requirements.txt
│           │   └── stepinfo.dill
│           ├── MultiplyStep-4SRzHCCqYGs2PLeT8LeL5ukrCWGJoiae/
│           │   ├── cache-metadata.json
│           │   ├── conda-environment.yaml
│           │   ├── executor-metadata.json
│           │   ├── lock
│           │   ├── requirements.txt
│           │   └── stepinfo.dill
│           ├── SineStep-5aes9CUTRmkz5gJ5J6JSRbJZ4qkFu4kk/
│           │   ├── cache-metadata.json
│           │   ├── conda-environment.yaml
│           │   ├── executor-metadata.json
│           │   ├── lock
│           │   ├── requirements.txt
│           │   └── stepinfo.dill
│           └── SubtractionStep-YCdedqjmmd9GUFi96VzPXD5tAVho3CTz/
│               ├── cache-metadata.json
│               ├── conda-environment.yaml
│               ├── executor-metadata.json
│               ├── lock
│               ├── requirements.txt
│               └── stepinfo.dill
└── tests/
    ├── __init__.py
    ├── common/
    │   ├── __init__.py
    │   ├── dataset_dict_test.py
    │   ├── det_hash_test.py
    │   ├── from_params_pep_563_test.py
    │   ├── from_params_test.py
    │   ├── params_test.py
    │   ├── registrable_test.py
    │   ├── sequences_test.py
    │   └── util_test.py
    ├── end_to_end/
    │   ├── test_dataset_dict_from_separate_steps.py
    │   ├── test_lazy_input_with_another_step.py
    │   ├── test_multicore_cli.py
    │   ├── test_non_cacheable_into_cacheable_multiple_runs.py
    │   ├── test_registered_runs.py
    │   ├── test_run_single_step.py
    │   ├── test_step_indexing.py
    │   ├── test_steps_that_fail.py
    │   └── test_uncacheable_leaf_steps.py
    ├── executor_test.py
    ├── executors/
    │   ├── __init__.py
    │   └── multicore_executor_test.py
    ├── format_test.py
    ├── integrations/
    │   ├── __init__.py
    │   ├── beaker/
    │   │   ├── __init__.py
    │   │   ├── conftest.py
    │   │   ├── executor_test.py
    │   │   ├── step_cache_test.py
    │   │   └── workspace_test.py
    │   ├── datasets/
    │   │   ├── __init__.py
    │   │   └── dataset_test.py
    │   ├── fairscale/
    │   │   ├── __init__.py
    │   │   └── train_test.py
    │   ├── flax/
    │   │   ├── __init__.py
    │   │   ├── data_test.py
    │   │   ├── format_test.py
    │   │   ├── optim_test.py
    │   │   └── train_test.py
    │   ├── gs/
    │   │   ├── __init__.py
    │   │   ├── step_cache_test.py
    │   │   └── workspace_test.py
    │   ├── torch/
    │   │   ├── __init__.py
    │   │   ├── data_test.py
    │   │   ├── det_hash_test.py
    │   │   ├── eval_test.py
    │   │   ├── format_test.py
    │   │   ├── optim_test.py
    │   │   ├── train_callback_test.py
    │   │   ├── train_test.py
    │   │   └── training_engine_test.py
    │   ├── transformers/
    │   │   ├── data_test.py
    │   │   ├── finetune_test.py
    │   │   ├── ia3_test.py
    │   │   ├── run_generation_test.py
    │   │   └── soft_prompt_test.py
    │   └── wandb/
    │       ├── __init__.py
    │       ├── step_cache_test.py
    │       └── workspace_test.py
    ├── main_test.py
    ├── step_caches/
    │   ├── __init__.py
    │   └── local_step_cache_test.py
    ├── step_graph_test.py
    ├── step_info_test.py
    ├── step_test.py
    ├── steps/
    │   ├── __init__.py
    │   ├── dataset_remix_test.py
    │   └── shell_step_test.py
    └── workspaces/
        ├── __init__.py
        ├── local_workspace_test.py
        └── memory_workspace_test.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .dockerignore
================================================
.dockerignore
**.pyc
**/__pycache__
.gitignore
.git
.coverage
.mypy_cache
docs
examples
tests
test_fixtures
integration_tests
dist
*.egg-info


================================================
FILE: .github/CONTRIBUTING.md
================================================
# Contributing

Thanks for considering contributing! Please read this document to learn the various ways you can contribute to this project and how to go about doing it.

## Bug reports and feature requests

### Did you find a bug?

First, do [a quick search](https://github.com/allenai/tango/issues) to see whether your issue has already been reported.
If your issue has already been reported, please comment on the existing issue.

Otherwise, open [a new GitHub issue](https://github.com/allenai/tango/issues). Be sure to include a clear title
and description. The description should include as much relevant information as possible. The description should
explain how to reproduce the erroneous behavior as well as the behavior you expect to see. Ideally you would include a
code sample or an executable test case demonstrating the expected behavior.

### Do you have a suggestion for an enhancement or new feature?

We use GitHub issues to track feature requests. Before you create a feature request:

- Make sure you have a clear idea of the enhancement you would like. If you have a vague idea, consider discussing
  it first on a GitHub issue.
- Check the documentation to make sure your feature does not already exist.
- Do [a quick search](https://github.com/allenai/tango/issues) to see whether your feature has already been suggested.

When creating your request, please:

- Provide a clear title and description.
- Explain why the enhancement would be useful. It may be helpful to highlight the feature in other libraries.
- Include code examples to demonstrate how the enhancement would be used.

## Making a pull request

When you're ready to contribute code to address an open issue, please follow these guidelines to help us be able to review your pull request (PR) quickly.

1.  **Initial setup** (only do this once)

    <details><summary>Expand details 👇</summary><br/>

    If you haven't already done so, please [fork](https://help.github.com/en/enterprise/2.13/user/articles/fork-a-repo) this repository on GitHub.

    Then clone your fork locally with

        git clone https://github.com/USERNAME/tango.git

    or

        git clone git@github.com:USERNAME/tango.git

    At this point the local clone of your fork only knows that it came from _your_ repo, github.com/USERNAME/tango.git, but doesn't know anything about the _main_ repo, [https://github.com/allenai/tango.git](https://github.com/allenai/tango). You can see this by running

        git remote -v

    which will output something like this:

        origin https://github.com/USERNAME/tango.git (fetch)
        origin https://github.com/USERNAME/tango.git (push)

    This means that your local clone can only track changes from your fork, but not from the main repo, and so you won't be able to keep your fork up-to-date with the main repo over time. Therefore you'll need to add another "remote" to your clone that points to [https://github.com/allenai/tango.git](https://github.com/allenai/tango). To do this, run the following:

        git remote add upstream https://github.com/allenai/tango.git

    Now if you do `git remote -v` again, you'll see

        origin https://github.com/USERNAME/tango.git (fetch)
        origin https://github.com/USERNAME/tango.git (push)
        upstream https://github.com/allenai/tango.git (fetch)
        upstream https://github.com/allenai/tango.git (push)

    Finally, you'll need to create a Python 3 virtual environment suitable for working on this project. There are a number of tools out there that make working with virtual environments easier.
    The most direct way is with the [`venv` module](https://docs.python.org/3.8/library/venv.html) in the standard library, but if you're new to Python or you don't already have a recent Python 3 version installed on your machine,
    we recommend [Miniconda](https://docs.conda.io/en/latest/miniconda.html).

    On Mac, for example, you can install Miniconda with [Homebrew](https://brew.sh/):

        brew install miniconda

    Then you can create and activate a new Python environment by running:

        conda create -n tango python=3.9
        conda activate tango

    Once your virtual environment is activated, you can install your local clone in "editable mode" with

        pip install -U pip setuptools wheel
        pip install -e '.[dev,all]'

    The "editable mode" comes from the `-e` argument to `pip`, and essentially just creates a symbolic link from the site-packages directory of your virtual environment to the source code in your local clone. That way any changes you make will be immediately reflected in your virtual environment.

    To test your installation, just run

        tango info

    </details>

2.  **Ensure your fork is up-to-date**

    <details><summary>Expand details 👇</summary><br/>

    Once you've added an "upstream" remote pointing to [https://github.com/allenai/tango.git](https://github.com/allenai/tango), keeping your fork up-to-date is easy:

        git checkout main  # if not already on main
        git pull --rebase upstream main
        git push

    </details>

3.  **Create a new branch to work on your fix or enhancement**

    <details><summary>Expand details 👇</summary><br/>

    Committing directly to the main branch of your fork is not recommended. It will be easier to keep your fork clean if you work on a separate branch for each contribution you intend to make.

    You can create a new branch with

        # replace BRANCH with whatever name you want to give it
        git checkout -b BRANCH
        git push -u origin BRANCH

    </details>

4.  **Test your changes**

    <details><summary>Expand details 👇</summary><br/>

    Our continuous integration (CI) testing runs [a number of checks](https://github.com/allenai/tango/actions) for each pull request on [GitHub Actions](https://github.com/features/actions). You can run most of these tests locally, which is something you should do _before_ opening a PR to help speed up the review process and make it easier for us.

    First, you should run [`isort`](https://github.com/PyCQA/isort) and [`black`](https://github.com/psf/black) to make sure your code is formatted consistently.
    Many IDEs support code formatters as plugins, so you may be able to set up isort and black to run automatically every time you save.
    For example, [`black.vim`](https://github.com/psf/black/tree/master/plugin) will give you this functionality in Vim. But both `isort` and `black` are also easy to run directly from the command line.
    Just run this from the root of your clone:

        isort .
        black .

    Our CI also uses [`ruff`](https://github.com/charliermarsh/ruff) to lint the code base and [`mypy`](http://mypy-lang.org/) for type-checking. You should run both of these next with

        ruff check .

    and

        mypy .

    We also strive to maintain high test coverage, so most contributions should include additions to [the unit tests](https://github.com/allenai/tango/tree/main/tests). These tests are run with [`pytest`](https://docs.pytest.org/en/latest/), which you can use to locally run any test modules that you've added or changed.

    For example, if you've fixed a bug in `tango/a/b.py`, you can run the tests specific to that module with

        pytest -v tests/a/b_test.py

    If your contribution involves additions to any public part of the API, we require that you write docstrings
    for each function, method, class, or module that you add.
    See the [Writing docstrings](#writing-docstrings) section below for details on the syntax.
    You should test to make sure the API documentation can build without errors by running

        make docs

    If the build fails, it's most likely due to small formatting issues. If the error message isn't clear, feel free to comment on this in your pull request.

    And finally, please update the [CHANGELOG](https://github.com/allenai/tango/blob/main/CHANGELOG.md) with notes on your contribution in the "Unreleased" section at the top.

    After all of the above checks have passed, you can now open [a new GitHub pull request](https://github.com/allenai/tango/pulls).
    Make sure you have a clear description of the problem and the solution, and include a link to relevant issues.

    We look forward to reviewing your PR!

    </details>

### Writing docstrings

We use [Sphinx](https://www.sphinx-doc.org/en/master/index.html) to build our API docs, which automatically parses all docstrings
of public classes and methods. All docstrings should adhere to the [Numpy styling convention](https://www.sphinx-doc.org/en/master/usage/extensions/example_numpy.html).
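
As a minimal sketch of the convention, a Numpy-style docstring looks like the following (the `clamp` function here is hypothetical, purely for illustration, and not part of the tango API):

```python
def clamp(value: float, low: float, high: float) -> float:
    """
    Clamp ``value`` to the closed interval ``[low, high]``.

    Parameters
    ----------
    value : float
        The number to clamp.
    low : float
        The lower bound of the interval.
    high : float
        The upper bound of the interval.

    Returns
    -------
    float
        ``value`` if it lies within the interval, otherwise the nearest bound.
    """
    return max(low, min(value, high))
```

Sphinx parses the `Parameters` and `Returns` sections into formatted API docs, so keeping the section headers and underlines exact matters for the docs build.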

## Adding a new integration

In order to add a new integration, there are several additional steps and guidelines you should follow
in addition to everything listed in [Making a pull request](#making-a-pull-request).

1. First start by creating a new submodule `tango.integrations.name_of_integration` and put all of the code for your integration in there.
2. Then you must add a module docstring to the `__init__.py` file of the submodule which imports all of the public components of the integration,
   and defines the [`__all__`](https://docs.python.org/3/tutorial/modules.html#importing-from-a-package) special variable to include all of those components.
   This ensures all of the public components will show up in the documentation.
3. Next, you should add unit tests for your code to `tests/integrations/name_of_integration/`.
4. Then add a new file `docs/source/api/integrations/name_of_integration.rst`, and include the directive:

   ```
   .. automodule:: tango.integrations.name_of_integration
      :members:
   ```

   Take a look at any of the other files in that folder to see how it should look exactly.

5. And then add `name_of_integration` to the `toctree` in `docs/source/api/integrations/index.rst`.
6. After that, add any additional requirements that your integration depends on to `requirements.txt`. Be sure to put those under the "Extra dependencies for integrations" section,
   and add the special inline comment `# needed by: name_of_integration`.
7. And finally, in the `checks` job definition in `.github/workflows/main.yml`, add a new object
   to the matrix for your integration following the other examples there.
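
To illustrate step 2 above, here is a self-contained sketch of how `__all__` controls what a wildcard import (and hence the docs build) exposes. The module and component names are made up for illustration; a real integration's `__init__.py` would live under `tango/integrations/` and import from its sibling modules:

```python
import sys
import types

# Stand-in for a hypothetical tango/integrations/name_of_integration/__init__.py.
source = '''
def public_component():
    return "ok"

def _private_helper():
    return "hidden"

# Only names listed here are pulled in by ``from module import *``,
# which is what surfaces them in the generated documentation.
__all__ = ["public_component"]
'''

# Build a throwaway module so the example runs without touching the filesystem.
module = types.ModuleType("name_of_integration")
exec(source, module.__dict__)
sys.modules["name_of_integration"] = module

namespace: dict = {}
exec("from name_of_integration import *", namespace)
assert "public_component" in namespace     # exported via __all__
assert "_private_helper" not in namespace  # kept private
```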


================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.yml
================================================
name: 🐛 Bug Report
description: Create a report to help us reproduce and fix the bug
labels: 'bug'

body:
- type: markdown
  attributes:
    value: >
      #### Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/allenai/tango/issues?q=is%3Aissue+sort%3Acreated-desc+).
- type: textarea
  attributes:
    label: 🐛 Describe the bug
    description: |
      Please provide a clear and concise description of what the bug is.

      If relevant, add a minimal example so that we can reproduce the error by running the code. It is very important for the snippet to be as succinct (minimal) as possible, so please take time to trim down any irrelevant code to help us debug efficiently. We are going to copy-paste your code and we expect to get the same result as you did: avoid any external data, and include the relevant imports, etc. For example:

      ```python
      # All necessary imports at the beginning
      import tango

      # A succinct reproducing example trimmed down to the essential parts:
      assert False is True, "Oh no!"
      ```

      If the code is too long (hopefully, it isn't), feel free to put it in a public gist and link it in the issue: https://gist.github.com.

      Please also paste or describe the results you observe along with the expected results. If you observe an error, please paste the error message including the **full** traceback of the exception. It may be relevant to wrap error messages in ```` ```triple quotes blocks``` ````.
    placeholder: |
      A clear and concise description of what the bug is.
  validations:
    required: true
- type: textarea
  attributes:
    label: Versions
    description: |
      Please run the following and paste the output below.
      ```sh
      python --version && pip freeze
      ```
  validations:
    required: true
- type: markdown
  attributes:
    value: >
      Thanks for contributing 🎉!


================================================
FILE: .github/ISSUE_TEMPLATE/documentation.yml
================================================
name: 📚 Documentation
description: Report an issue related to https://ai2-tango.readthedocs.io/latest
labels: 'documentation'

body:
- type: textarea
  attributes:
    label: 📚 The doc issue
    description: >
      A clear and concise description of what content in https://ai2-tango.readthedocs.io/latest is an issue.
  validations:
    required: true
- type: textarea
  attributes:
    label: Suggest a potential alternative/fix
    description: >
      Tell us how we could improve the documentation in this regard.
- type: markdown
  attributes:
    value: >
      Thanks for contributing 🎉!


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.yml
================================================
name: 🚀 Feature request
description: Submit a proposal/request for a new feature
labels: 'feature request'

body:
- type: textarea
  attributes:
    label: 🚀 The feature, motivation and pitch
    description: >
      A clear and concise description of the feature proposal. Please outline the motivation for the proposal. Is your feature request related to a specific problem? e.g., *"I'm working on X and would like Y to be possible"*. If this is related to another GitHub issue, please link here too.
  validations:
    required: true
- type: textarea
  attributes:
    label: Alternatives
    description: >
      A description of any alternative solutions or features you've considered, if any.
- type: textarea
  attributes:
    label: Additional context
    description: >
      Add any other context or screenshots about the feature request.
- type: markdown
  attributes:
    value: >
      Thanks for contributing 🎉!


================================================
FILE: .github/dependabot.yml
================================================
version: 2
updates:
- package-ecosystem: pip
  directory: "/"
  schedule:
    interval: "daily"
  open-pull-requests-limit: 10
- package-ecosystem: "github-actions"
  directory: "/"
  schedule:
    interval: "daily"


================================================
FILE: .github/workflows/changelog.yml
================================================
name: Changelog

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

on:
  pull_request:
    branches:
      - main
    paths:
      - 'tango/**'

jobs:
  changelog:
    name: CHANGELOG
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'

    steps:
    - uses: actions/checkout@v3
      with:
        fetch-depth: 0

    - name: Check that CHANGELOG has been updated
      run: |
        # If this step fails, this means you haven't updated the CHANGELOG.md
        # file with notes on your contribution.
        git diff --name-only $(git merge-base origin/main HEAD) | grep '^CHANGELOG.md$' && echo "Thanks for helping keep our CHANGELOG up-to-date!"


================================================
FILE: .github/workflows/docker.yml
================================================
name: Docker

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

on:
  pull_request:
    branches:
      - main
    paths:
      - "Dockerfile"
      - ".dockerignore"
      - "pyproject.toml"
  push:
    tags:
      - "v*.*.*"

jobs:
  build:
    name: Build (${{ matrix.build.tag }})
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        build:
          - base_image: ghcr.io/allenai/pytorch:1.12.1-cuda11.3-python3.9
            tag: cuda11.3
    env:
      IMAGE_NAME: ghcr.io/allenai/tango
    steps:
      - uses: actions/checkout@v3

      - name: Build Docker image
        run: |
          docker build --build-arg BASE_IMAGE=${{ matrix.build.base_image }} -t "${IMAGE_NAME}:${{ matrix.build.tag }}" .

      - name: Test Docker image
        run: |
          docker run --rm "${IMAGE_NAME}:${{ matrix.build.tag }}" info

      - name: Log in to ghcr.io
        if: github.event_name != 'pull_request'
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin

      - name: Push latest to ghcr.io
        if: github.event_name != 'pull_request'
        run: |
          docker push "${IMAGE_NAME}:${{ matrix.build.tag }}"

      - name: Push release version to ghcr.io
        if: startsWith(github.ref, 'refs/tags/')
        run: |
          GITHUB_TAG=${GITHUB_REF#refs/tags/}
          docker tag "${IMAGE_NAME}:${{ matrix.build.tag }}" "${IMAGE_NAME}:${GITHUB_TAG}-${{ matrix.build.tag }}"
          docker push "${IMAGE_NAME}:${GITHUB_TAG}-${{ matrix.build.tag }}"


================================================
FILE: .github/workflows/docker_testing.yml
================================================
# This workflow is just for building our Docker image for GPU testing on Beaker,
# and pushing it to Beaker. We only run it when the relevant Dockerfile (or .dockerignore) changes.
name: Docker testing

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

on:
  pull_request:
    branches:
      - main
    paths:
      - 'Dockerfile.test'
      - '.dockerignore'
      - 'scripts/entrypoint.sh'
  push:
    branches:
      - main
    paths:
      - 'Dockerfile.test'
      - '.dockerignore'
      - 'scripts/entrypoint.sh'

jobs:
  build:
    name: Build
    runs-on: ubuntu-latest
    env:
      BEAKER_TOKEN: ${{ secrets.BEAKER_TOKEN }}
      BEAKER_WORKSPACE: ai2/tango-testing
      IMAGE_NAME: tango-testing
    steps:
      - uses: actions/checkout@v3

      - uses: allenai/setup-beaker@v2
        with:
          token: ${{ secrets.BEAKER_TOKEN }}
          workspace: ${{ env.BEAKER_WORKSPACE }}

      - name: Build Docker image
        run: |
          docker build -t "$IMAGE_NAME" -f Dockerfile.test .

      - name: Determine current commit SHA (pull request)
        if: github.event_name == 'pull_request'
        run: |
          echo "COMMIT_SHA=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV

      - name: Determine current commit SHA (push)
        if: github.event_name != 'pull_request'
        run: |
          echo "COMMIT_SHA=$GITHUB_SHA" >> $GITHUB_ENV

      - name: Test Docker image
        run: |
          docker run --rm --env COMMIT_SHA="$COMMIT_SHA" "$IMAGE_NAME" tango info

       # In order to push a new version of an image to beaker, we have to delete the old version first.
       # This doesn't actually delete the backing Docker image, so we'll still benefit from layer
       # caching when we push new versions. But we have to be careful to minimize the amount
       # of time between deletion and creation, because during that time any Beaker job trying to start
       # that depends on that image will fail. So to minimize this downtime, we first push a
       # "temp" version of the image, then delete the current one and quickly rename the "temp" one to take its place.
       # The image might not exist yet though, so it's okay if the delete fails.

      - name: Delete existing commit image
        continue-on-error: true
        run: |
          beaker image delete petew/${{ env.IMAGE_NAME }}-${{ env.COMMIT_SHA }}

      - name: Upload new commit image
        run: |
          beaker image create --workspace ${{ env.BEAKER_WORKSPACE }} --name ${{ env.IMAGE_NAME }}-${{ env.COMMIT_SHA }} ${{ env.IMAGE_NAME }}

      - name: Delete existing image
        if: github.event_name != 'pull_request'
        continue-on-error: true
        run: |
          beaker image delete petew/${{ env.IMAGE_NAME }}

      - name: Rename new commit image to final image
        if: github.event_name != 'pull_request'
        run: |
          beaker image rename petew/${{ env.IMAGE_NAME }}-${{ env.COMMIT_SHA }} ${{ env.IMAGE_NAME }}


================================================
FILE: .github/workflows/integration_tests.yml
================================================
name: Integration tests

on:
  workflow_dispatch:
    inputs:
      test:
        description: the integration test to run
        default: fairscale_benchmarks
        required: true
        type: choice
        options:
          - fairscale_benchmarks
      cluster:
        description: the beaker cluster to run the test on
        default: ai2/tango-integration-tests
        required: true
        type: choice
        options:
          - ai2/tango-integration-tests
          - ai2/allennlp-cirrascale
  # Uncomment this trigger to test changes on a pull request.
  # You also have to uncomment the lines below that mention 'for pull request checks'
  # pull_request:
  #   branches:
  #     - '*'

jobs:
  run_test:
    name: ${{ github.event.inputs.test }}
    # name: fairscale_benchmarks  # for pull request checks
    runs-on: [ubuntu-latest]
    timeout-minutes: 60
    env:
      TEST_NAME: ${{ github.event.inputs.test }}
      # TEST_NAME: fairscale_benchmarks  # for pull request checks
      BEAKER_TOKEN: ${{ secrets.BEAKER_TOKEN }}
      BEAKER_WORKSPACE: ai2/tango-integration-tests
      BEAKER_CLUSTER: ${{ github.event.inputs.cluster }}
      # BEAKER_CLUSTER: ai2/allennlp-cirrascale  # for pull request checks
      IMAGE_NAME: petew/tango-testing
    steps:
      - uses: actions/checkout@v3

      - name: Validate inputs
        run: |
          # The 'test' input should be a directory in `integration_tests/`
          test -d "integration_tests/${TEST_NAME}"

      - name: Determine current commit SHA (pull request)
        if: github.event_name == 'pull_request'
        run: |
          echo "COMMIT_SHA=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV

      - name: Determine current commit SHA (push)
        if: github.event_name != 'pull_request'
        run: |
          echo "COMMIT_SHA=$GITHUB_SHA" >> $GITHUB_ENV

      - name: Install beaker client
        shell: bash
        run: |
          mkdir -p "$HOME/bin"

          # Download and install from latest GitHub release.
          curl -s https://api.github.com/repos/allenai/beaker/releases/latest \
            | grep 'browser_download_url.*linux' \
            | cut -d '"' -f 4 \
            | wget -qi - \
          && tar -xvzf beaker_linux.tar.gz -C "$HOME/bin"

          # Add to path.
          echo "$HOME/bin" >> "$GITHUB_PATH"

      - name: Verify beaker install
        run: |
          beaker account whoami

      - name: Create beaker experiment config
        run: |
          cat >beaker_config.yml << EOL
          version: v2-alpha
          description: ${{ env.TEST_NAME }}
          tasks:
            - name: test
              image:
                beaker: ${{ env.IMAGE_NAME }}
              command: ["/entrypoint.sh", "integration_tests/${{ env.TEST_NAME }}/run.sh"]
              envVars:
                - name: COMMIT_SHA
                  value: $COMMIT_SHA
                - name: WANDB_API_KEY
                  secret: WANDB_API_KEY
                - name: FILE_FRIENDLY_LOGGING
                  value: "true"
                - name: TOKENIZERS_PARALLELISM  # set this to avoid warnings
                  value: "true"
                - name: PYTHONUNBUFFERED
                  value: "true"
              result:
                path: '/results'
              resources:
                gpuCount: 4
              context:
                cluster: ${{ env.BEAKER_CLUSTER }}
                priority: normal
          EOL
          cat beaker_config.yml

      - name: Submit beaker job
        run: |
          TIMESTAMP=$(date +%H%M%S)
          EXPERIMENT=$(beaker experiment create beaker_config.yml --workspace $BEAKER_WORKSPACE --name "${TEST_NAME}-${{ github.run_number }}-${TIMESTAMP}" | awk '{print $2}')
          if [ -z "$EXPERIMENT" ]; then
            exit 1
          else
            echo "EXPERIMENT=$EXPERIMENT" >> $GITHUB_ENV
            echo "Experiment $EXPERIMENT submitted. See progress at https://beaker.org/ex/$EXPERIMENT"
          fi

      - name: Wait for job to finish
        run: |
          beaker experiment await $EXPERIMENT test finalized --timeout 60m
          # Check the job's exit code.
          test $(beaker experiment get $EXPERIMENT --format=json | jq '.[0].jobs[0].status.exitCode') -eq 0

      - name: Get logs
        if: always()
        run: |
          # EXPERIMENT could be empty if the submission step failed.
          # We'll exit right away if that's the case.
          if [ -z "$EXPERIMENT" ]; then
            echo "No logs to show"
            exit 0
          fi

          # Download logs from beaker.
          beaker experiment results $EXPERIMENT --prefix out.log --output results

          # If the experiment failed during startup, there might not be any logs.
          if [ -f results/test/out.log ]; then
            echo ""
            echo ">>> Logs:"
            echo ""
            cat results/test/out.log
          else
            echo "No logs to show"
          fi

      - name: Stop job
        if: cancelled()
        run: |
          if [ ! -z "$EXPERIMENT" ]; then
            beaker experiment stop $EXPERIMENT
          fi


================================================
FILE: .github/workflows/main.yml
================================================
name: Main

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}

on:
  pull_request:
    branches:
      - "*"
  push:
    branches:
      - main
    tags:
      - "v*.*.*"

env:
  CACHE_PREFIX: v5 # Change this to invalidate existing cache.
  PYTHON_PATH: ./
  DEFAULT_PYTHON: 3.9
  WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
  BEAKER_TOKEN: ${{ secrets.BEAKER_TOKEN }}
  BEAKER_WORKSPACE: ai2/tango-testing
  BEAKER_IMAGE: petew/tango-testing
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

jobs:
  checks:
    name: python ${{ matrix.python }} - ${{ matrix.task.name }}
    runs-on: [ubuntu-latest]
    timeout-minutes: 30
    permissions:
      contents: "read"
      id-token: "write"
    strategy:
      fail-fast: false
      matrix:
        python: ["3.9"]
        task:
          - name: Lint
            extras: dev,all
            requires_torch: true
            run: |
              ruff check .

          - name: Type check
            extras: dev,all
            requires_torch: true
            run: |
              mypy --check-untyped-defs .

          - name: Build
            extras: dev,all
            requires_torch: true
            run: |
              tango --version
              python -m build

          - name: Style
            extras: dev
            requires_torch: false
            run: |
              isort --check .
              black --check .

          - name: Docs
            extras: dev,all
            requires_torch: true
            run: |
              cd docs && make html SPHINXOPTS="-W --keep-going"

          - name: Test
            extras: dev
            requires_torch: false
            run: |
              pytest -v --durations=10 --color=yes --doctest-modules --ignore=tests/integrations --ignore=tango/integrations tests/ tango/

          - name: Datasets integration
            extras: dev,datasets
            requires_torch: false
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/datasets tests/integrations/datasets

          - name: PyTorch integration
            extras: dev,torch
            requires_torch: true
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/torch tests/integrations/torch

          - name: Transformers integration
            extras: dev,flax,transformers
            requires_torch: true
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/transformers tests/integrations/transformers

          - name: FairScale integration
            extras: dev,fairscale
            requires_torch: true
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/fairscale tests/integrations/fairscale

          - name: W&B integration
            extras: dev,torch,flax,wandb
            requires_torch: true
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/wandb tests/integrations/wandb

          - name: Beaker integration
            extras: dev,beaker
            requires_torch: false
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/beaker tests/integrations/beaker

          - name: Flax integration
            extras: dev,flax,transformers
            requires_torch: false
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/flax tests/integrations/flax

          - name: GS integration
            extras: dev,gs
            requires_torch: false
            run: |
              pytest -v --color=yes --doctest-modules tango/integrations/gs tests/integrations/gs

          - name: Example - train_lm
            extras: dev,all
            requires_torch: true
            run: |
              cd examples/train_lm
              pytest -v --color=yes test.py

        include:
          # Run the core tests on other Python versions as well.
          - task:
              name: Test
              extras: dev
              requires_torch: false
              run: |
                pytest -v --durations=10 --color=yes --doctest-modules --ignore=tests/integrations --ignore=tango/integrations tests/ tango/
            python: "3.8"

          - task:
              name: Test
              extras: dev
              requires_torch: false
              run: |
                pytest -v --durations=10 --color=yes --doctest-modules --ignore=tests/integrations --ignore=tango/integrations tests/ tango/
            python: "3.10"

    steps:
      - name: Checkout
        if: github.event_name != 'pull_request'
        uses: actions/checkout@v3

      # For pull requests we need to checkout the HEAD commit instead of the merge
      # commit since some tests depend on having an existing commit.
      - name: Checkout (pull request)
        if: github.event_name == 'pull_request'
        uses: actions/checkout@v3
        with:
          ref: ${{ github.event.pull_request.head.sha }}

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python }}

      - name: Install prerequisites
        run: |
          pip install --upgrade pip setuptools wheel virtualenv

      - name: Set build variables
        shell: bash
        run: |
          set -e
          # Get the exact Python version to use in the cache key.
          echo "PYTHON_VERSION=$(python --version)" >> $GITHUB_ENV
          echo "RUNNER_ARCH=$(uname -m)" >> $GITHUB_ENV
          # Use week number in cache key so we can refresh the cache weekly.
          echo "WEEK_NUMBER=$(date +%V)" >> $GITHUB_ENV
          echo "EXTRAS_HASH=$(python scripts/hash_extras.py ${{ matrix.task.extras }})" >> $GITHUB_ENV

      - uses: actions/cache@v3
        id: virtualenv-cache
        with:
          path: .venv
          key: ${{ env.CACHE_PREFIX }}-${{ env.WEEK_NUMBER }}-${{ runner.os }}-${{ env.RUNNER_ARCH }}-${{ env.PYTHON_VERSION }}-${{ env.EXTRAS_HASH }}-${{ hashFiles('pyproject.toml') }}

      - name: Setup virtual environment (no cache hit)
        if: steps.virtualenv-cache.outputs.cache-hit != 'true'
        run: |
          test -d .venv || virtualenv -p $(which python) --copies --reset-app-data .venv

      # Reference: https://github.com/marketplace/actions/authenticate-to-google-cloud#setup
      - name: Authenticate to Google Cloud
        if: matrix.task.name == 'GS integration'
        uses: "google-github-actions/auth@v1"
        with:
          workload_identity_provider: "projects/10554368204/locations/global/workloadIdentityPools/tango-ci-pool/providers/tango-ci-provider"
          service_account: "tango-service@ai2-allennlp.iam.gserviceaccount.com"

      - name: Pre-install torch
        if: steps.virtualenv-cache.outputs.cache-hit != 'true' && (contains(matrix.task.extras, 'torch') || contains(matrix.task.extras, 'all') || matrix.task.requires_torch)
        run: |
          . .venv/bin/activate
          pip install torch==2.0.0 --extra-index-url https://download.pytorch.org/whl/cpu

      - name: Pre-install flax
        if: steps.virtualenv-cache.outputs.cache-hit != 'true' && (contains(matrix.task.extras, 'flax') || contains(matrix.task.extras, 'all'))
        run: |
          . .venv/bin/activate
          pip install flax jax jaxlib "tensorflow-cpu>=2.9.1" optax

      - name: Install editable (no cache hit)
        if: steps.virtualenv-cache.outputs.cache-hit != 'true'
        run: |
          . .venv/bin/activate
          pip install -e .[${{ matrix.task.extras }}]

      - name: Install editable (cache hit)
        if: steps.virtualenv-cache.outputs.cache-hit == 'true'
        run: |
          . .venv/bin/activate
          pip install --no-deps -e .[${{ matrix.task.extras }}]

      - name: Show environment info
        run: |
          . .venv/bin/activate
          echo "========= Python location ==========="
          which python
          echo "========= Python version ============"
          python --version
          echo "========= Python packages ==========="
          pip freeze
          echo "========= Tango installation ========"
          tango info

      - name: ${{ matrix.task.name }}
        run: |
          . .venv/bin/activate
          ${{ matrix.task.run }}

      - name: Upload package distribution files
        if: matrix.task.name == 'Build' && matrix.python == env.DEFAULT_PYTHON
        uses: actions/upload-artifact@v3
        with:
          name: package
          path: dist

      - name: Upload docs build
        if: matrix.task.name == 'Docs' && matrix.python == env.DEFAULT_PYTHON
        uses: actions/upload-artifact@v3
        with:
          name: docs
          path: docs/build

      - name: Clean up
        if: always()
        run: |
          . .venv/bin/activate
          pip uninstall -y ai2-tango

  gpu_tests:
    name: GPU Tests
    runs-on: ubuntu-latest
    steps:
      - name: Determine current commit SHA (pull request)
        if: github.event_name == 'pull_request'
        run: |
          echo "COMMIT_SHA=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV

      - name: Determine current commit SHA (push)
        if: github.event_name != 'pull_request'
        run: |
          echo "COMMIT_SHA=$GITHUB_SHA" >> $GITHUB_ENV

      - name: GPU Tests
        uses: allenai/beaker-run-action@v1.2
        with:
          spec: |
            version: v2
            description: GPU Tests
            budget: ai2/oe-training
            tasks:
              - name: tests
                image:
                  beaker: ${{ env.BEAKER_IMAGE }}
                context:
                  preemptible: true 
                resources:
                  gpuCount: 2
                envVars:
                  - name: COMMIT_SHA
                    value: ${{ env.COMMIT_SHA }}
                command: ["/entrypoint.sh", "pytest", "-v", "-m", "gpu", "tests/"]
                result:
                  path: /unused
          token: ${{ secrets.BEAKER_TOKEN }}
          workspace: ${{ env.BEAKER_WORKSPACE }}
          clusters: ai2/general-cirrascale,ai2/allennlp-cirrascale,ai2/aristo-cirrascale,ai2/mosaic-cirrascale,ai2/s2-cirrascale

  release:
    name: Release
    runs-on: ubuntu-latest
    needs: [gpu_tests, checks]
    if: startsWith(github.ref, 'refs/tags/')
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ env.DEFAULT_PYTHON }}

      - name: Install requirements
        run: |
          pip install -e .[dev]

      - name: Prepare environment
        run: |
          echo "RELEASE_VERSION=${GITHUB_REF#refs/tags/v}" >> $GITHUB_ENV
          echo "TAG=${GITHUB_REF#refs/tags/}" >> $GITHUB_ENV

      - name: Download package distribution files
        uses: actions/download-artifact@v3
        with:
          name: package
          path: dist

      - name: Generate release notes
        run: |
          python scripts/release_notes.py > ${{ github.workspace }}-RELEASE_NOTES.md

      - name: Publish package to PyPI
        run: |
          twine upload -u __token__ -p ${{ secrets.PYPI_PASSWORD }} dist/*

      - name: Publish GitHub release
        uses: softprops/action-gh-release@v1
        with:
          body_path: ${{ github.workspace }}-RELEASE_NOTES.md
          prerelease: ${{ contains(env.TAG, 'rc') }}
          files: |
            dist/*


================================================
FILE: .github/workflows/update_dependency_pr.yml
================================================
name: Update dependency PR

on:
  pull_request:
    types:
      - opened
    paths:
      - "pyproject.toml"

permissions:
  pull-requests: write

jobs:
  torch:
    name: torch
    runs-on: ubuntu-latest
    if: startsWith(github.head_ref, 'dependabot/pip/torch-')
    steps:
      - uses: actions/github-script@v6
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: 'Hello! This is a [PyTorch](https://pytorch.org/) upgrade, which means you will also need to update:\n- [ ] The base image in `Dockerfile`\n- [ ] The base image in `Dockerfile.test`\n- [ ] The torch version hard-coded in `.github/workflows/main.yml`'
            })


================================================
FILE: .gitignore
================================================
# build artifacts

.eggs/
.mypy_cache
ai2_tango.egg-info/
build/
dist/
pip-wheel-metadata/
runs/
workspace/

# dev tools

.envrc
.python-version
.idea
.venv/
.vscode/
/*.iml


# jupyter notebooks

.ipynb_checkpoints


# miscellaneous

.cache/
doc/_build/
*.swp
.DS_Store


# python

*.pyc
*.pyo
__pycache__


# testing and continuous integration

.coverage
.pytest_cache/
.benchmarks

# documentation build artifacts

docs/build
site/

# internal experiment configs
*-internal.jsonnet
*-internal.json
*-internal.yaml
*-internal.yml


================================================
FILE: .readthedocs.yaml
================================================
version: 2

sphinx:
  configuration: docs/source/conf.py
  fail_on_warning: true

build:
  os: ubuntu-22.04
  tools:
    python: "3.10"

python:
  install:
    - method: pip
      path: .
      extra_requirements:
        - dev
        - all


================================================
FILE: CHANGELOG.md
================================================
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased

### Fixed

- Fixed a number of dependency issues.
- Upgraded to a new version of `wandb`.

## [v1.3.2](https://github.com/allenai/tango/releases/tag/v1.3.2) - 2023-10-27

### Fixed

- Fix issues with gcloud auth in beaker executor.

## [v1.3.1](https://github.com/allenai/tango/releases/tag/v1.3.1) - 2023-10-25

### Fixed

- Fixed minor bugs in the `GSWorkspace()`.

### Changed

- Added CLI-style execution functions for experiments defined in Python.
- Added `display()` to `ExecutorOutput` for producing a table that summarizes the run.

## [v1.3.0](https://github.com/allenai/tango/releases/tag/v1.3.0) - 2023-10-13

### Added

- Added the `Workspace.remove_step()` method to safely remove steps.
- The `GSWorkspace()` can now be initialized with Google Cloud bucket subfolders.

### Changed

- The `BeakerExecutor` now uses the HEAD commit at the time the executor is instantiated to execute a step, instead of the HEAD commit at the time the step is run.

### Fixed

- Removed unnecessary code coverage dev requirements.
- Fixed issue where new version of torch caused no LR schedulers to be registered.
- Updated pinned versions of jax, jaxlib, and flax.

## [v1.2.1](https://github.com/allenai/tango/releases/tag/v1.2.1) - 2023-04-06

### Added

- Added the following workspace methods to support the Tango viz UI: `Workspace.search_registered_runs()`, `Workspace.search_step_info()`, `Workspace.num_registered_runs()`, and `Workspace.num_steps()`.

### Fixed

- Fixes a bug where `FromParams` would fail to parse when an object takes a `Step` argument directly.
- Changed a name so we don't override the built-in name `set`.
- Fixed a bug that would cause O(n^2) memory consumption in dense step graphs.


## [v1.2.0](https://github.com/allenai/tango/releases/tag/v1.2.0) - 2023-02-10

### Added

- You can now add arguments to steps without invalidating the cache. See `Step.SKIP_DEFAULT_ARGUMENTS`.
- Fixed integration status messages in the `tango info` command.
- Added abstractions for `RemoteClient`, `RemoteStepCache`, and `RemoteWorkspace`.
- Added a GS integration that comes with `GSWorkspace`, a remote `Workspace` implementation that uses google cloud storage.
- You can now bind functional steps to the underlying `Step` instance with `@step(bind=True)`, meaning the first argument to the function will be a `Step`.
- Added `ShellStep` for running arbitrary shell commands.
- Added `@make_registrable` decorator to make arbitrary functions registrable, to make it easier to refer to them in tango configurations.

### Fixed

- Jsonnet parsing is now much faster and works on Windows.
- Warnings about locks are now reliably printed every 30 seconds.
- We now make sure Beaker jobs have the latest version of beaker-py, so that we're compatible with the latest API changes.
- Stopping early now works when the metric doesn't change at all.
- Fixed bug with `FromParams` which didn't handle variable length tuples correctly.

### Changed

- The default log level for Tango is now `warning`.
- You can now specify multiple steps with `-s` in the `tango run` command.


## [v1.1.0](https://github.com/allenai/tango/releases/tag/v1.1.0) - 2022-12-01

### Added

- Added `gpu_type` field to `StepResources`. The `BeakerExecutor` can use this to determine which clusters to submit a step to.
- Added `machine` field to `StepResources`. You can set this to "local" when using the `BeakerExecutor` to force it to run the step locally.
- Added `--ext-var` argument to `tango run` for setting JSONNET external variables
  when loading the experiment config.
- Added `@step()` decorator to create `Step` classes from functions.
- Added the `transformers::with_soft_prompt` integration, to make soft-prompted prefix transformers easy.
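The new `StepResources` fields above can be set from an experiment config. A minimal sketch (the step name and type here are hypothetical; the `gpu_type` and `machine` field names come from the entries above):

```jsonnet
{
  steps: {
    train: {
      type: "torch::train",  // hypothetical step; any step type works
      step_resources: {
        gpu_count: 1,
        gpu_type: "a100",    // hint the BeakerExecutor can use when picking a cluster
        // machine: "local", // or force the BeakerExecutor to run this step locally
      },
    },
  },
}
```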

### Removed

- Removed PyTorch Lightning integration.
- Removed `tango server` command and `--serve/--no-serve` option for `tango run`.
- Removed `source_release.py`, which was checked in by accident.

### Fixed

- Fixed issue where Executor `parallelism` option in a Tango settings file would be ignored.
- Fixed a bug where the unique ID of a step that depends on a key-value of the result of another step could change if the name of the other step changes.
- Fixed a bug where importing certain libraries (like torchmetrics) would mess with our exception handling because they set `sys.excepthook` for some reason. Now we always reset `sys.excepthook` after importing.
- The type hints for the flax trainer suggested that the training split is optional when in fact it's mandatory.
- Made `BeakerWorkspace` / `BeakerStepLock` more robust when a job is preempted.
- Minor performance improvements for the Beaker executor and workspace.
- Fixed bug with `step_extra_dependencies` where uncacheable dependencies wouldn't be run.


## [v1.0.2](https://github.com/allenai/tango/releases/tag/v1.0.2) - 2022-11-14

### Changed

- `BeakerScheduler` can now return a list of clusters.

## [v1.0.1](https://github.com/allenai/tango/releases/tag/v1.0.1) - 2022-10-20

### Fixed

- `LightningTrainStep` can now take a `Lazy` model object, which results in a guaranteed deterministic hash.
- Fixed issue where remote `Workspace` implementations like `WandbWorkspace` and `BeakerWorkspace` would use the same local cache regardless of the W&B / Beaker workspace
  being used.
- Fixed bug with `TorchEvalStep` when constructing callbacks.
- Fixed some import error issues caused when an integration is not installed.
- Fixed incorrect reporting of final results in `MulticoreExecutor`.

### Changed

- The W&B step cache now retries API calls in case of a timeout.
- `beaker-py >= 1.11` required.

## [v1.0.0](https://github.com/allenai/tango/releases/tag/v1.0.0) - 2022-10-05

### Added

- Added `step_extra_dependencies` input field to `Step` class that can be used to force a dependency on another step even if the current step doesn't directly depend on the output of the other step. See [#418](https://github.com/allenai/tango/issues/418) for more context.

### Changed

- `beaker-py >= 1.10` required.

### Fixed

- Long log lines will be soft-wrapped to ensure that links are clickable.
- Fixed a bug where some workspaces could be left in a bad state if a step's `Format` failed to serialize the step's result in `Workspace.step_finished()`.
- Sometimes functions and methods end up as arguments to steps, which means we have to hash them. Instead of taking
  a hash of the function, we now take a hash of the function's module and name.
- Fixed a bug with the Beaker executor where it would hang at the end of a run if a step failed that is a dependency of another step.
- Fixed tests to work with new version of transformers.
- Fixed `Executor.execute_sub_graph_for_step()` to be able to run the step's dependencies in parallel.


## [v0.14.0](https://github.com/allenai/tango/releases/tag/v0.14.0) - 2022-09-20

### Added

- Added a function to modify a Hugging Face transformer with IA3 adaptors.
- Added a `BeakerScheduler` registrable class, specified as the argument `scheduler` to `BeakerExecutor`, which controls the resources assigned to steps run on Beaker.
  Users can implement their own `BeakerScheduler` subclasses to customize the resource assignment behavior.

### Changed

- In the `tango run` command, `--no-server` is now the default. Use `--server` to start the server.

### Fixed

- Made `BeakerExecutor` more robust to connection, timeout, SSL, and other recoverable HTTP errors.
- Made the `BeakerStepLock` more robust, and as a result `BeakerWorkspace` is more
  robust and should require less manual intervention for locks in a bad state.
- Fixed a bug with the internal scheduling logic of the `BeakerExecutor` which
  could delay submitting some steps in parallel.
- Fixed a bug where creating a `StepInfo` object from params might result in unnecessary imports.
- Fixed a bug where canceling the Beaker executor might not work properly.
- Fixed a bug where the trainer trains too much when `train_epochs` is set and you're using gradient accumulation.
- Fixed a bug where included modules might not be found when using multiprocessing when they're not on `sys.path` / `PYTHONPATH`.
- Fixed how the results of uncacheable steps are displayed by `tango run`.
- Beaker executor won't run duplicate cacheable steps at the same time.

## [v0.13.0](https://github.com/allenai/tango/releases/tag/v0.13.0) - 2022-09-07

### Added

- You can now reference into a particular index of the result of another step in a config. For example: `{type: "ref", ref: "some_previous_step", key: 0}`.
  The key field can be an integer if the result of the referenced step is a list or tuple, or a string if the result of the referenced step is a dictionary.
- Added `priority` parameter to Beaker executor for setting the default task priority for Beaker jobs.
- Added `Workspace.step_result()` method for getting a step's result from the latest
  run.
- `tango run` will now display a URL to the logs for failed steps when you use the `BeakerExecutor`.
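The new `key` field on step references described above might look like this in a config (the step names and types are hypothetical; the `{type: "ref", ...}` syntax comes from the entry above):

```jsonnet
{
  steps: {
    load_splits: {
      type: "my_load_splits",  // hypothetical step returning a dict of datasets
    },
    train: {
      type: "my_train",        // hypothetical step
      // Reference just the "train" key of load_splits's result:
      dataset: { type: "ref", ref: "load_splits", key: "train" },
    },
  },
}
```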

### Changed

- The `TorchTrainStep` now enables monitoring arbitrary model outputs during training. `TorchTrainEngine.forward_train` now returns a tuple `loss, model_outputs` for each micro batch and the list of model outputs for all micro batches in a batch is passed to the `TrainCallback.log_batch` and `TrainCallback.post_batch`.
- Tango will now automatically search Python modules in the current working directory
  for registered classes so that you don't always need to use the `--include-package` setting.
- The minimum supported Python version is now 3.8.
- Added support for PyTorch Lightning 1.7.x.
- The Beaker executor will no longer live-stream logs from Beaker jobs, but logs will be viewable on Beaker and more readable.
- Only the Beaker executor requires a clean working directory.

### Fixed

- Fixed a bug that did not allow a wandb artifact's type to be set from a step's metadata dictionary. 
- Fixed a bug with how the Beaker executor streams log lines from Beaker which sometimes resulted in messages missing some starting characters, and tqdm lines being duplicated.
- Fixed a bug in the Beaker workspace where the lock dataset wouldn't be removed if the step
  was found to be in an invalid state.
- Improved cluster choice logic in `BeakerExecutor` to ensure greater diversity of clusters when submitting many steps at once.
- Fixed bug where sub-processes of the multicore executor would use the wrong executor if `executor` was defined in a `tango.yml` file.
- Deterministic hashes for numpy and torch tensors were not deterministic. Now they are.


## [v0.12.0](https://github.com/allenai/tango/releases/tag/v0.12.0) - 2022-08-23

### Added

- **Step resources:**
  - Added a `step_resources` parameter to the `Step` class which should be used to describe the computational resources required to run a step.
    `Executor` implementations can use this information. For example, if your step needs 2 GPUs, you should set
    `step_resources=StepResources(gpu_count=2)` (`"step_resources": {"gpu_count": 2}` in the configuration language).
  - Added a `Step.resources()` property method. By default this returns the value specified by the `step_resources` parameter.
    If your step implementation always requires the same resources, you can just override this method so you don't have to provide
    the `step_resources` parameter.
- **Step execution:**
  - Added an `executor` field to the `tango.yml` settings. You can use this to define the executor you want to use by default.
  - Added a Beaker `Executor` to the Beaker integration, registered as an `Executor` with the name "beaker".
    To use this executor, add these lines to your `tango.yml` file:
    ```yaml
    executor:
      type: beaker
      beaker_workspace: ai2/my-workspace
      clusters:
        - ai2/general-cirrascale
    ```
    See the docs for the `BeakerExecutor` for more information on the input parameters.
- **Step class:**
  - Added a metadata field to the step class API. This can be set through the class
    variable `METADATA` or through the constructor argument `step_metadata`.
- **Weights & Biases integration:**
  - You can now change the artifact kind for step result artifacts by adding a field
    called "artifact_kind" to a step's metadata.
    For models, setting "artifact_kind" to "model" will add the corresponding artifact to W&B's new model zoo.
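Putting the metadata and W&B pieces together, a config might set the artifact kind along these lines (the step name and type are hypothetical; `step_metadata` and `"artifact_kind"` come from the entries above):

```jsonnet
{
  steps: {
    train: {
      type: "torch::train",  // hypothetical step producing a model
      step_metadata: { artifact_kind: "model" },  // adds the result to W&B's model zoo
    },
  },
}
```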

### Changed

- **CLI:**
  - The `tango run` command will throw an error if you have uncommitted changes in your repository, unless
    you use the `--allow-dirty` flag.
  - The `tango run` command will use the lightweight base executor (single process) by default.
    To use the multi-process executor, set `-j/--parallelism` to 1 or higher, or to -1 to use all available CPU cores.

### Fixed

- Fixed bug where `StepInfo` environment and platform metadata could be out-of-date if a step is run again due to failure.
- Fixed a bug where an unfortunate combination of early stopping and decreasing model performance could result in a crash in the torch trainer.

## [v0.11.0](https://github.com/allenai/tango/releases/tag/v0.11.0) - 2022-08-04

### Added

- Added a [Flax](https://flax.readthedocs.io/en/latest/) integration along with an example config.

## [v0.10.1](https://github.com/allenai/tango/releases/tag/v0.10.1) - 2022-07-26

### Fixed

- Fixed issue where the `StepInfo` config argument could be parsed into a `Step`.
- Restored capability to run tests out-of-tree.

## [v0.10.0](https://github.com/allenai/tango/releases/tag/v0.10.0) - 2022-07-07

### Changed

- Renamed `workspace` parameter of `BeakerWorkspace` class to `beaker_workspace`.
- `Executor` class is now a `Registrable` base class. `MulticoreExecutor` is registered as "multicore".

### Removed

- Removed `StepExecutionMetadata`. Its fields have been absorbed into `StepInfo`.

### Fixed

- Improved `Step.ensure_result()` such that the step's result doesn't have to be read from the cache.
- Fixed an issue with the output from `MulticoreExecutor` such that it's now consistent with the default `Executor` for steps that were found in the cache.
- One of our error messages referred to a configuration file that no longer exists.
- Improved performance of `BeakerWorkspace`.

### Added

- Added the ability to train a plain `Model` instead of just a `Lazy[Model]`.


## [v0.9.1](https://github.com/allenai/tango/releases/tag/v0.9.1) - 2022-06-24

### Fixed

- Fixed non-deterministic behavior in `TorchTrainStep`.
- Fixed bug in `BeakerWorkspace` where `.step_info(step)` would raise a `KeyError` if the step hasn't been registered as part of a run yet.
- Fixed a bug in `BeakerWorkspace` where it would send too many requests to the beaker service.
- Fixed a bug where `WandbWorkspace.step_finished()` or `.step_failed()` would crash if called
  from a different process than `.step_starting()`.
- Fixed a bug in `WandbWorkspace.step_finished()` which led to a `RuntimeError` sometimes while
  caching the result of a step.


## [v0.9.0](https://github.com/allenai/tango/releases/tag/v0.9.0) - 2022-06-01

### Added

- Added a [Beaker](https://beaker.org) integration that comes with `BeakerWorkspace`, a remote `Workspace` implementation that uses Beaker Datasets under the hood.
- Added a `datasets::dataset_remix` step that provides the split remixing functionality of `tango.steps.dataset_remix.DatasetRemixStep`, but for Huggingface `DatasetDict`s.
- Added a config and code example of Registrable to the First Step docs with edits for clarity.

### Changed

- If you try to import something from a tango integration that is not fully installed due to missing dependencies, an `IntegrationMissingError` will be raised
  instead of `ModuleNotFoundError`.
- You can now set `-j 0` in `tango run` to disable multicore execution altogether.

### Fixed

- Improved how steps and workspaces handle race conditions when different processes are competing to execute the same step. This would result in a `RuntimeError` before with most workspaces, but now it's handled gracefully.
- Fixed bug which caused `GradScaler` state to not be saved and loaded with checkpoints.

## [v0.8.0](https://github.com/allenai/tango/releases/tag/v0.8.0) - 2022-05-19

### Added

- Added a Weights & Biases remote `Workspace` implementation: `WandbWorkspace`, registered as "wandb".
  This can be instantiated from a workspace URL in the form "wandb://entity/project".
- Added a method `Workspace.step_result_for_run` which gives the result of a step given the run name and step name within that run.
- Added property `Workspace.url`, which returns a URL for the workspace that can be used to instantiate the exact same workspace using `Workspace.from_url()`. Subclasses must implement this.

### Changed

- `StepInfo` start and end times will always be in UTC now.
- `WandbTrainCallback` now logs system metrics from each worker process in distributed training.
- `StepCache.__contains__()` and `StepCache.__getitem__()` now accept either a `Step` or `StepInfo` as an argument (`Union[Step, StepInfo]`).
- Refactored `tango.step_graph.StepGraph` to allow initialization from a `Dict[str, Step]`.
- `Executor.execute_step_graph()` now attempts to execute all steps and summarizes success/failures.

### Fixed

- Fixed bug with `LocalWorkspace.from_parsed_url()` ([#278](https://github.com/allenai/tango/issues/278)).
- Deprecation warnings will now be logged from `tango` CLI.
- Fixed the text format in the case of serializing an iterator of strings.
- Added missing default value of `None` to `TangoGlobalSettings.find_or_default()`.
- Mypy has become incompatible with transformers and datasets, so we have to disable the checks in some places.
- The `VERSION` member of step arguments that were wrapped in `Lazy` were not respected. Now they are.


## [v0.7.0](https://github.com/allenai/tango/releases/tag/v0.7.0) - 2022-04-19

### Added

- Added the "-n/--name" option to `tango run`. This option allows the user to give the run an arbitrary name.
- Added a convenience property `.workspace` to `Step` class that can be called from a step's `.run()` method to get the current `Workspace` being used.
- Gave `FromParams` objects (which includes all `Registrable` objects) the ability to version themselves.
- Added CLI option to run a single step in a config using `--step-name` or `-s`.
- Added a `MultiCoreExecutor` that executes steps in parallel.
- Added an `ExecutorOutput` dataclass that is returned by `Executor.execute_step_graph()`.
- `StepGraph` now prints itself in a readable way.
- Tango now automatically detects when it's running under a debugger, and disables multicore support accordingly. Many debuggers can't properly follow sub-processes, so this is a convenience for people who love debuggers.
- Added more models to the set of classes that can be imported from the transformers library.
- Added new example for finetuning text-to-text models.

### Changed

- Renamed `click_logger` to `cli_logger`, and we now use [rich](https://github.com/Textualize/rich)'s logging `Handler` as the default handler, which means prettier output, better tracebacks, and you can use rich's markup syntax with the `cli_logger` to easily add style to text.
- Refactored `tango.step_graph.StepGraph` to allow initialization from a `Dict[str, Step]`.
- `Executor.execute_step_graph()` now attempts to execute all steps and summarizes success/failures.
- Upgraded PyTorch version in `tango` Docker image to latest `v1.11.0+cu113`.
- `RunGeneration` now allows model object as input.

### Fixed

- Fixed bug that mistakenly disallowed fully-qualified names containing `"_"` (underscores) in the config.
- Fixed bug where `TorchTrainStep` working directory would be left in an unrecoverable state if training failed after saving the final model weights.
- Fixed bug in `FromParams` where `**kwargs` might be passed down to the constructors of arguments.
- Fixed bug in the way dependencies are tracked between steps.
- Fixed bug that caused `MulticoreExecutor` to hang in case of a failing step that was required recursively (not directly) downstream.
- Restored compatibility with PyTorch Lightning 1.6.


## [v0.6.0](https://github.com/allenai/tango/releases/tag/v0.6.0) - 2022-02-25

### Added

- New example that finetunes a pre-trained ResNet model on the Cats & Dogs dataset.
- Added a `@requires_gpus` decorator for marking tests as needing GPUs. Tests marked with this will be run in the "GPU Tests" workflow
  on dual k80 GPUs via Beaker.
- Added the "-w/--workspace" option to `tango run` and `tango server` commands. This option takes a path or URL, and instantiates the workspace from the URL using the newly added `Workspace.from_url()` method.
- Added the "workspace" field to `TangoGlobalSettings`.
- Added the "environment" field to `TangoGlobalSettings` for setting environment variables each
  time `tango` is run.
- Added a utility function to get a `StepGraph` directly from a file.
- Added `tango.settings` module and `tango settings` group of commands.
- A format for storing sequences as `SqliteSparseSequence`
- A way to massage kwargs before they determine the unique ID of a `Step`

### Changed

- `local_workspace.ExecutorMetadata` renamed to `StepExecutionMetadata` and now saved as `execution-metadata.json`.
- `tango run` without the option "-w/--workspace" or "-d/--workspace-dir" will now use a `MemoryWorkspace` instead of a `LocalWorkspace` in a temp directory, unless you've specified
  a default workspace in a `TangoGlobalSettings` file.
- Moved `tango.workspace.MemoryWorkspace` and `tango.local_workspace.LocalWorkspace` to `tango.workspaces.*`.
- Moved `tango.step_cache.MemoryStepCache` and `tango.step_cache.LocalStepCache` to `tango.step_caches.*`.
- Deprecated the `-d/--workspace-dir` command-line option. Please use `-w/--workspace` instead.

### Fixed

- Fixed a small bug where `LocalWorkspace` would fail to capture the conda environment in our Docker image.
- Fixed activation of `FILE_FRIENDLY_LOGGING` when set from the corresponding environment variable.
- Fixed setting log level via the environment variable `TANGO_LOG_LEVEL`.
- Use relative paths within the `work_dir` for symbolic links to the latest and the best checkpoints in `TorchTrainStep`.
- Fixed some scenarios where Tango can hang after finishing all steps.
- `distributed_port` and `log_every` parameters won't factor into `TorchTrainStep`'s unique ID.
- `MappedSequence` now works with slicing.
- `MappedSequence` now works with Huggingface `Dataset`.
- Uncacheable steps are now visible in Tango UI.
- Fixed bug in `Registrable.list_available()` where an error might be raised if the default implementation hadn't been explicitly imported.
- Fixed issue where having a default argument to the `run()` method wasn't getting applied to the step's unique ID.
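
The `MappedSequence` slicing fix can be illustrated with a simplified stand-in (not tango's actual class, which lives in `tango.common.sequences`): slicing returns another lazily-mapped sequence instead of eagerly applying the function.

```python
# Simplified sketch of a mapped sequence: the function is applied lazily
# on element access, and slicing produces another MappedSequence.
from collections.abc import Sequence


class MappedSequence(Sequence):
    def __init__(self, fn, inner):
        self.fn, self.inner = fn, inner

    def __getitem__(self, i):
        if isinstance(i, slice):
            # Slicing stays lazy: wrap the sliced inner sequence.
            return MappedSequence(self.fn, self.inner[i])
        return self.fn(self.inner[i])

    def __len__(self):
        return len(self.inner)


doubled = MappedSequence(lambda x: 2 * x, [1, 2, 3, 4])
print(list(doubled[1:3]))  # → [4, 6]
```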


## [v0.5.0](https://github.com/allenai/tango/releases/tag/v0.5.0) - 2022-02-09

### Added

- Added `TrainingEngine` abstraction to torch integration.
- Added [FairScale](https://fairscale.readthedocs.io/en/latest/) with a `FairScaleTrainingEngine`
  that leverages FairScale's `FullyShardedDataParallel`. This is meant to be used within the `TorchTrainStep`.
- All PyTorch components (such as learning rate schedulers, optimizers, data collators, etc.) from the
  transformers library are now registered under the corresponding class in the torch integration.
  For example, the transformers `Adafactor` optimizer is registered as an `Optimizer` under the name
  "transformers::Adafactor". More details can be found in the documentation for the transformers integration.

### Changed

- Various changes to the parameters of the `TorchTrainStep` due to the introduction of the `TrainingEngine` class.
- Params logged as `DEBUG` level instead of `INFO` to reduce noise in logs.
- The waiting message for `FileLock` is now clear about which file it's waiting for.
- Added an easier way to get the default Tango global config.
- Most methods of `TorchTrainCallback` also take an `epoch` parameter now.
- `WandbTrainCallback` now logs peak GPU memory occupied by PyTorch tensors per worker. This is useful because W&B's system metrics only display the total GPU memory reserved by PyTorch, which is always higher than the actual amount of GPU memory occupied by tensors. So these new metrics give a more accurate view into how much memory your training job is actually using.
- Plain old Python functions can now be used in `Lazy` objects.
- `LocalWorkspace` now creates a symlink to the outputs of the latest run.
- Tango is now better at guessing when a step has died and should be re-run.
- Tango is now more lenient about registering the same class under the same name twice.
- When you use `dict` instead of `Dict` in your type annotations, you now get a legible error message. Same for `List`, `Tuple`, and `Set`.

### Fixed

- Fixed a bug in `Registrable` and `FromParams` where registered function constructors would not properly construct
  arguments that were classes.
- Fixed a bug in `FromParams` that would cause a crash when an argument to the constructor had the name `params`.
- Made `FromParams` more efficient by only trying to parse the params as a `Step` when it looks like it actually could be a step.
- Fixed bug where `Executor` would crash if `git` command could not be found.
- Fixed bug where validation settings were not interpreted the right way by the torch trainer.
- When you register the same name twice using `Registrable`, you get an error message. That error message now contains the correct class name.


## [v0.4.0](https://github.com/allenai/tango/releases/tag/v0.4.0) - 2022-01-27

### Changed

- Default log level is `WARNING` instead of `ERROR`.
- The web UI now renders the step graph left-to-right.
- The web UI now shows runs by date, with the most recent run at the top.
- The web UI now shows steps in a color-coded way.
- The `tango run` command now prints user-friendly paths if possible.
- The `--include-package` flag now also accepts paths instead of module names.
- `tango.common.sqlite_sparse_sequence.SqliteSparseSequence` now lives at `tango.common.sequences.SqliteSparseSequence`.

### Fixed

- Ensure tqdm log lines always make it into the log file `out.log` even when log level is `WARNING` or `ERROR`.
- Numerous parts of Tango now have documentation when they didn't before.


## [v0.4.0rc5](https://github.com/allenai/tango/releases/tag/v0.4.0rc5) - 2022-01-19

### Added

- Added `TorchEvalStep` to torch integration, registered as "torch::eval".
- A new integration, `transformers`, with two new steps for running seq2seq models.
- Added `logging_tqdm`, if you don't want a progress bar, but you still want to see progress in the logs.
- Added `threaded_generator()`, for wrapping generators so that they run in a separate thread from the generator's consumer.
- Added a new example for evaluating the T0 model on XSum, a summarization task.
- Added `MappedSequence` for functionally wrapping sequences.
- Added `TextFormat`, in case you want to store the output of your steps in raw text instead of JSON.
- Steps can now list arguments in `SKIP_ID_ARGUMENTS` to indicate that the argument should not affect a step's
  unique id. This is useful for arguments that affect the execution of a step, but not the output.
- `Step` now implements `__str__`, so steps look pretty in the debugger.
- Added `DatasetCombineStep`, a step that combines multiple datasets into one.
- Added `common.logging.initialize_worker_logging()` function for configuring logging from worker processes/threads.
- Logs from `tango run ...` will be written to a file called `out.log` in the run directory.

### Changed

- Renamed `aggregate_val_metric` to `auto_aggregate_val_metric` in `TorchTrainStep`.
- `devices` parameter to `TorchTrainStep` replaced with `device_count: int`.
- Run name printed at the end of a run so it's easier to find.
- Type information added to package data. See [PEP 561](https://www.python.org/dev/peps/pep-0561) for more information.
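
The idea behind the `threaded_generator()` helper added in this release can be sketched roughly like this (a simplified stand-in, not tango's actual implementation, which also propagates producer exceptions):

```python
# Rough sketch: the wrapped generator is consumed on a background thread
# and items are handed to the consumer through a bounded queue.
import threading
from queue import Queue

_SENTINEL = object()


def threaded_generator(inner, queue_size=16):
    q = Queue(maxsize=queue_size)

    def producer():
        for item in inner:
            q.put(item)
        q.put(_SENTINEL)  # tell the consumer we're done

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is _SENTINEL:
            return
        yield item


print(list(threaded_generator(range(5))))  # → [0, 1, 2, 3, 4]
```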

### Fixed

- Fixed torch `StopEarlyCallback` state not being recovered properly on restarts.
- Fixed file friendly logging by removing special styling characters.
- Ensured exceptions are captured in logs.
- `LocalWorkspace` now works properly with uncacheable steps.
- When a Tango run got killed hard, with `kill -9`, or because the machine lost power, `LocalWorkspace` would
  sometimes keep a step marked as "running", preventing further executions. This still happens sometimes, but it
  is now much less likely (and Tango gives you instructions for how to fix it).
- To make all this happen, `LocalWorkspace` now saves step info in a Sqlite database. Unfortunately that means that
  the workspace format changes and existing workspace directories won't work properly with it.
- Fixed premature cleanup of temporary directories when using `MemoryWorkspace`.


## [v0.4.0rc4](https://github.com/allenai/tango/releases/tag/v0.4.0rc4) - 2021-12-20

### Fixed

- Fixed a bug where `StepInfo` fails to deserialize when `error` is an exception that can't be pickled.


## [v0.4.0rc3](https://github.com/allenai/tango/releases/tag/v0.4.0rc3) - 2021-12-15

### Added

- Added `DatasetsFormat` format and `LoadStreamingDataset` step to `datasets` integration.
- `SqliteDictFormat` for datasets.
- Added `pre_epoch()` and `post_epoch()` callback methods to PyTorch `TrainCallback`.

### Changed

- `LoadDataset` step from `datasets` integration is now cacheable, using the `DatasetsFormat` format by default.
  But this only works with non-streaming datasets. For streaming datasets, you should use the `LoadStreamingDataset` step instead.

### Fixed

- Fixed bug where `KeyboardInterrupt` exceptions were not handled properly by steps and workspaces.
- `WandbTrainCallback` now will use part of the step's unique ID as the name for the W&B run by default, to make
  it easier to identify which tango step corresponds to each run in W&B.
- `WandbTrainCallback` will save the entire `TrainConfig` object to the W&B config.


## [v0.4.0rc2](https://github.com/allenai/tango/releases/tag/v0.4.0rc2) - 2021-12-13

### Added

- Sample experiment configurations that prove Euler's identity.
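
For reference, the identity those sample configurations demonstrate can be checked directly with Python's standard library:

```python
import cmath

# Euler's identity: e^(i*pi) + 1 = 0, up to floating-point error.
val = cmath.exp(1j * cmath.pi) + 1
assert abs(val) < 1e-12
```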

### Changed

- Loosened `Click` dependency to include v7.0.
- Loosened `datasets` dependency.
- Tightened `petname` dependency to exclude next major release for safety.

### Fixed

- `Workspace`, `MemoryWorkspace`, and `LocalWorkspace` can now be imported directly from the `tango`
  base module.
- Uncacheable leaf steps would never get executed. This is now fixed.
- We were treating failed steps as if they were completed by accident.
- The visualization had a problem with showing steps that never executed because a dependency failed.
- Fixed a bug where `Lazy` inputs to a `Step` would fail to resolve arguments that come from the result
  of another step.
- Fixed a bug in `TorchTrainStep` where some arguments for distributed training (`devices`, `distributed_port`) weren't being set properly.


## [v0.4.0rc1](https://github.com/allenai/tango/releases/tag/v0.4.0rc1) - 2021-11-30

### Added

- Introduced the concept of the `Workspace`, with `LocalWorkspace` and `MemoryWorkspace` as initial implementations.
- Added a stub of a webserver that will be able to visualize runs as they happen.
- Added separate classes for `LightningTrainingTypePlugin`, `LightningPrecisionPlugin`, `LightningClusterEnvironmentPlugin`, `LightningCheckpointPlugin` for compatibility with `pytorch-lightning>=1.5.0`.
- Added a visualization of workspaces that can show step graphs while they're executing.

### Removed

- Removed the old `LightningPlugin` class.
- Removed the requirement of the `overrides` package.

### Changed

- Made it possible to construct a step graph out of `Step` objects, instead of constructing it out of `StepStub` objects.
- Removed dataset fingerprinting code, since we can now use `Step` to make sure things are cached.
- Made steps deterministic by default.
- Brought back `MemoryStepCache`, so we can run steps without configuring anything.
- W&B `torch::TrainCallback` logs with `step=step+1` now so that training curves in the W&B dashboard
  match up with checkpoints saved locally and are easier to read (e.g. step 10000 instead of 9999).
- `filelock >= 3.4` is now required; the parameter `poll_intervall` of `tango.common.file_lock.FileLock.acquire` was renamed
  to `poll_interval`.

### Fixed

- Fixed bug in `FromParams` where a parameter to a `FromParams` class may not be instantiated correctly
  if it's a class with a generic type parameter.

## [v0.3.6](https://github.com/allenai/tango/releases/tag/v0.3.6) - 2021-11-12

### Added

- Added a `.log_batch()` method on `torch::TrainCallback` which is given the average loss across
  distributed workers, but only called every `log_every` steps.

### Removed

- Removed `.pre_log_batch()` method on `torch::TrainCallback`.

### Fixed

- Fixed typo in parameter name `remove_stale_checkpoints` in `TorchTrainStep` (previously was `remove_state_checkpoints`).
- Fixed bug in `FromParams` that would cause failures when `from __future__ import annotations`
  was used with Python older than 3.10. See [PEP 563](https://www.python.org/dev/peps/pep-0563/)
  for details.

## [v0.3.5](https://github.com/allenai/tango/releases/tag/v0.3.5) - 2021-11-05

### Fixed

- Fixed a bug in `FromParams` where the "type" parameter was ignored in some cases
  where the `Registrable` base class did not directly inherit from `Registrable`.

## [v0.3.4](https://github.com/allenai/tango/releases/tag/v0.3.4) - 2021-11-04

### Added

- Added `StopEarlyCallback`, a `torch::TrainCallback` for early stopping.
- Added parameter `remove_stale_checkpoints` to `TorchTrainStep`.

### Changed

- Minor changes to `torch::TrainCallback` interface.
- Weights & Biases `torch::TrainCallback` now logs best validation metric score.

## [v0.3.3](https://github.com/allenai/tango/releases/tag/v0.3.3) - 2021-11-04

### Added

- Added support for PEP 604 in `FromParams`, i.e. writing union types as "X | Y" instead of "Union[X, Y]".
- [internals] Added a spot for miscellaneous end-to-end integration tests (not to be confused with "tests of integrations") in `tests/end_to_end/`.
- [internals] Core tests now run on all officially supported Python versions.

### Fixed

- Fixed a bug in `FromParams` where non-`FromParams` class parameters were not instantiated
  properly (or at all).
- Fixed a bug in `FromParams` where kwargs were not passed on from a wrapper class to the wrapped class.
- Fixed small bug where some errors from git would be printed when executor metadata is created
  outside of a git repository.

## [v0.3.2](https://github.com/allenai/tango/releases/tag/v0.3.2) - 2021-11-01

### Fixed

- Fixed a bug with `FromParams` that caused `.from_params()` to fail when the params contained
  an object that was already instantiated.
- The `tango` command no longer installs a SIGTERM handler, which fixes some bugs with integrations that use multiprocessing.

## [v0.3.1](https://github.com/allenai/tango/releases/tag/v0.3.1) - 2021-10-29

### Changed
- Updated the `LightningTrainStep` to optionally take in a `LightningDataModule` as input.

## [v0.3.0](https://github.com/allenai/tango/releases/tag/v0.3.0) - 2021-10-28

### Added

- Added `IterableDatasetDict`, a version of `DatasetDict` for streaming-like datasets.
- Added a [PyTorch Lightning](https://www.pytorchlightning.ai) integration with `LightningTrainStep`.

### Fixed

- Fixed bug with `FromParams` and `Lazy` where extra arguments would sometimes be passed down through
  to a `Lazy` class when they shouldn't.

## [v0.2.4](https://github.com/allenai/tango/releases/tag/v0.2.4) - 2021-10-22

### Added

- Added support for [torch 1.10.0](https://github.com/pytorch/pytorch/releases).

### Changed

- `--file-friendly-logging` flag is now an option to the main `tango` command, so needs
  to be passed before `run`, e.g. `tango --file-friendly-logging run ...`.

### Fixed

- Fixed bug with `Step.from_params`.
- Ensured logging is initialized in spawned processes during distributed training with `TorchTrainStep`.

## [v0.2.3](https://github.com/allenai/tango/releases/tag/v0.2.3) - 2021-10-21

### Added

- Added support for global settings file, `tango.yml`.
- Added 'include_package' (array of string) param to config spec.
- Added a custom error `StopEarly` that a `TrainCallback` can raise within the `TorchTrainStep`
  to stop training early without crashing.
- Added step config, tango command, and tango version to executor metadata.
- Executor now also saves pip dependencies and conda environment files to the run directory
  for each step.

### Fixed

- Ensured `**kwargs` arguments are logged in `FromParams`.

## [v0.2.2](https://github.com/allenai/tango/releases/tag/v0.2.2) - 2021-10-19

### Added

- Added new steps to `datasets` integration: `ConcatenateDatasets` ("datasets::concatenate") and `InterleaveDatasets` ("datasets::interleave").
- Added `__contains__` and `__iter__` methods on `DatasetDict` so that it is now a `Mapping` class.
- Added `tango info` command that - among other things - displays which integrations are installed.
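
A simplified stand-in (not tango's actual class) showing why implementing `__getitem__`, `__iter__`, and `__len__` makes `DatasetDict` behave like a `Mapping` — `collections.abc.Mapping` then derives `__contains__`, `keys()`, and friends:

```python
# Minimal sketch of a Mapping over dataset splits.
from collections.abc import Mapping


class DatasetDict(Mapping):  # simplified stand-in for illustration
    def __init__(self, splits):
        self._splits = splits

    def __getitem__(self, key):
        return self._splits[key]

    def __iter__(self):
        return iter(self._splits)

    def __len__(self):
        return len(self._splits)


dd = DatasetDict({"train": [1, 2], "validation": [3]})
print("train" in dd, sorted(dd))  # → True ['train', 'validation']
```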

## [v0.2.1](https://github.com/allenai/tango/releases/tag/v0.2.1) - 2021-10-18

### Added

- Added `convert_to_tango_dataset_dict()` function in the `datasets` integration.
  It's important for step caching purposes to use this to convert a HF `DatasetDict`
  to a native Tango `DatasetDict` when that `DatasetDict` is part of the input to another
  step. Otherwise the HF `DatasetDict` will have to be pickled to determine its hash.

### Changed

- `Format.checksum()` is now an abstract method. Subclasses should only compute checksum
  on the serialized artifact and nothing else in the directory.
- [internals] Changed the relationship between `Executor`, `StepCache`, and `Step`.
  `Executor` now owns the `StepCache`, and `Step` never interacts with `StepCache` directly.

## [v0.2.0](https://github.com/allenai/tango/releases/tag/v0.2.0) - 2021-10-15

### Added

- Added a [Weights & Biases](https://wandb.ai) integration with a training callback ("wandb::log")
  for `TorchTrainStep` ("torch::train") that logs training and validation metrics to W&B.

### Fixed

- Fixed `Format.checksum()` when there is a symlink to a directory in the cache folder.

## [v0.1.3](https://github.com/allenai/tango/releases/tag/v0.1.3) - 2021-10-15

### Added

- Added the ability to track a metric other than "loss" for validation in `TorchTrainStep` ("torch::train").

### Fixed

- Final model returned from `TorchTrainStep` ("torch::train") will have best weights loaded.
- Checkpoints are saved from `TorchTrainStep` ("torch::train") even when there is no validation loop.
- Fixed `TorchTrainStep` ("torch::train") when `validation_split` is `None`.
- Fixed distributed training with `TorchTrainStep` ("torch::train") on GPU devices.

## [v0.1.2](https://github.com/allenai/tango/releases/tag/v0.1.2) - 2021-10-13

### Added

- Added support for YAML configuration files.

## [v0.1.1](https://github.com/allenai/tango/releases/tag/v0.1.1) - 2021-10-12

### Added

- `TorchTrainStep` now displays a progress bar while saving a checkpoint to file.
- The default executor now saves an "executor-metadata.json" file to the directory for each step.

### Changed

- Renamed `DirectoryStepCache` to `LocalStepCache` (registered as "local").
- `LocalStepCache` saves metadata to `cache-metadata.json` instead of `metadata.json`.

### Fixed

- Fixed bug with `TorchTrainStep` during distributed training.
- `FromParams` will automatically convert strings into `Path` types now when the annotation
  is `Path`.

## [v0.1.0](https://github.com/allenai/tango/releases/tag/v0.1.0) - 2021-10-11

### Added

- Added `StepGraph` and `Executor` abstractions.
- Added a basic PyTorch training step registered as `"torch::train"`, along with other registrable
  components, such as `Model`, `DataLoader`, `Sampler`, `DataCollator`, `Optimizer`, and `LRScheduler`.
- Added `DatasetRemixStep` in `tango.steps`.
- Added module `tango.common.sequences`.
- Added `DatasetDict` class in `tango.common.dataset_dict`.
- Added [🤗 Datasets](https://github.com/huggingface/datasets) integration.
- Added command-line options to set log level or disable logging completely.

### Changed

- `Step.work_dir`, `Step.unique_id`, `Step.dependencies`, and `Step.recursive_dependencies`
  are now properties instead of methods.
- `tango run` command will acquire a lock on the directory to avoid race conditions.
- Integrations can now be installed with `pip install tango[INTEGRATION_NAME]`. For example,
  `pip install tango[torch]`.
- Added method `Registrable.search_modules()` for automatically finding and importing the modules
  where a given `name` might be registered.
- `FromParams.from_params()` and `Registrable.resolve_class_name` will now call `Registrable.search_modules()` to automatically import modules where the type might be defined.
  Thus for classes that are defined and registered within any `tango.*` submodules it is not necessary to explicitly import them.

### Fixed

- `Step` implementations can now take arbitrary `**kwargs` in their `run()` methods.

## [v0.0.3](https://github.com/allenai/tango/releases/tag/v0.0.3) - 2021-09-27

### Added

- Added `tango` command.

## [v0.0.2](https://github.com/allenai/tango/releases/tag/v0.0.2) - 2021-09-27

### Added

- Ported over core tango components from AllenNLP.

## [v0.0.1](https://github.com/allenai/tango/releases/tag/v0.0.1) - 2021-09-22

### Added

- Added initial project boilerplate.


================================================
FILE: CITATION.cff
================================================
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Groeneveld"
  given-names: "Dirk"
  affiliation: "Allen Institute for Artificial Intelligence"
- family-names: "Bhagia"
  given-names: "Akshita"
  affiliation: "Allen Institute for Artificial Intelligence"
- family-names: "Walsh"
  given-names: "Pete"
  affiliation: "Allen Institute for Artificial Intelligence"
title: "AI2 Tango"
abstract: "Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project."
version: "1.3.2"
repository-code: "https://github.com/allenai/tango"
license: "Apache-2.0"
date-released: "2023-10-27"


================================================
FILE: Dockerfile
================================================
# This Dockerfile can be used to build a Docker image suitable for tango projects.

ARG BASE_IMAGE=ghcr.io/allenai/pytorch:2.0.0-cuda11.7-python3.10
FROM ${BASE_IMAGE}

WORKDIR /stage

COPY . .
RUN /opt/conda/bin/pip install --no-cache-dir .[all]

WORKDIR /workspace

RUN rm -rf /stage/

ENTRYPOINT ["/opt/conda/bin/tango"]


================================================
FILE: Dockerfile.test
================================================
# This Dockerfile is for building an image suitable for running tango's GPU tests and integration tests.
# There are no instruction lines in this Dockerfile that install tango. Instead, the entrypoint
# script handles installing tango from a particular commit at runtime, based on the environment
# variable "COMMIT_SHA". That way we don't need to rebuild and push the image each time we run
# tests, and we can be sure the dependencies are always up-to-date.

FROM ghcr.io/allenai/pytorch:2.0.0-cuda11.7-python3.10

COPY scripts/entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

WORKDIR /testing

ENTRYPOINT ["/entrypoint.sh"]


================================================
FILE: LICENSE
================================================
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: Makefile
================================================
.PHONY : docs
docs :
	rm -rf docs/build/
	sphinx-autobuild -b html --watch tango/ --watch examples/ docs/source/ docs/build/

.PHONY : run-checks
run-checks :
	isort --check .
	black --check .
	ruff check .
	mypy --check-untyped-defs .
	CUDA_VISIBLE_DEVICES='' pytest -v --color=yes --doctest-modules --ignore=tests/integrations --ignore=tango/integrations tests/ tango/
	CUDA_VISIBLE_DEVICES='' pytest -v --color=yes --doctest-modules tango/integrations/torch tests/integrations/torch
	CUDA_VISIBLE_DEVICES='' pytest -v --color=yes --doctest-modules tango/integrations/transformers tests/integrations/transformers


================================================
FILE: README.md
================================================
<div align="center">
<br>
<img src="https://raw.githubusercontent.com/allenai/tango/main/docs/source/_static/tango_final_horizontal.png" width="600"/>
<br>
<br>
<p>
<!-- start tagline -->
AI2 Tango replaces messy directories and spreadsheets full of file versions by organizing experiments into discrete steps that can be cached and reused throughout the lifetime of a research project.
<!-- end tagline -->
</p>
<hr/>
<a href="https://github.com/allenai/tango/actions">
    <img alt="CI" src="https://github.com/allenai/tango/workflows/CI/badge.svg?event=push&branch=main">
</a>
<a href="https://pypi.org/project/ai2-tango/">
    <img alt="PyPI" src="https://img.shields.io/pypi/v/ai2-tango">
</a>
<a href="https://ai2-tango.readthedocs.io/en/latest/?badge=latest">
    <img src="https://readthedocs.org/projects/ai2-tango/badge/?version=latest" alt="Documentation Status" />
</a>
<a href="https://github.com/allenai/tango/blob/main/LICENSE">
    <img alt="License" src="https://img.shields.io/github/license/allenai/tango.svg?color=blue&cachedrop">
</a>
<br/>
</div>

## Quick links

- [Documentation](https://ai2-tango.readthedocs.io/)
- [PyPI Package](https://pypi.org/project/ai2-tango/)
- [Contributing](https://github.com/allenai/tango/blob/main/.github/CONTRIBUTING.md)
- [License](https://github.com/allenai/tango/blob/main/LICENSE)

## In this README

- [Quick start](#quick-start)
- [Installation](#installation)
  - [Installing with PIP](#installing-with-pip)
  - [Installing with Conda](#installing-with-conda)
  - [Installing from source](#installing-from-source)
  - [Checking your installation](#checking-your-installation)
  - [Docker image](#docker-image)
- [FAQ](#faq)
- [Team](#team)
- [License](#license)

## Quick start

Create a Tango step:

```python
# hello.py

from tango import step

@step()
def hello(name: str) -> str:
    message = f"Hello, {name}!"
    print(message)
    return message
```

And create a corresponding experiment configuration file:

```jsonnet
// hello.jsonnet

{
  steps: {
    hello: {
      type: "hello",
      name: "World",
    }
  }
}
```

Then run the experiment using a local workspace to cache the result:

```bash
tango run hello.jsonnet -w /tmp/workspace
```

You'll see something like this in the output:

```
Starting new run expert-llama
● Starting step "hello"...
Hello, World!
✓ Finished step "hello"
✓ Finished run expert-llama
```

If you run this a second time, the output will look like this:

```
Starting new run open-crab
✓ Found output for step "hello" in cache...
✓ Finished run open-crab
```

You won't see "Hello, World!" this time because the result of the step was found in the cache, so it wasn't run again.

For a more detailed introduction check out the [First Steps](https://ai2-tango.readthedocs.io/en/latest/first_steps.html) walk-through.
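
The cache-hit behavior above can be mimicked with a tiny in-memory memoization sketch. This is purely illustrative: Tango actually persists step results to the workspace on disk, keyed by each step's `unique_id`, so cached results survive across runs.

```python
cache = {}

def run_step(step_id, fn, *args):
    # Re-use the cached result when the same step has already run.
    if step_id in cache:
        print(f'Found output for step "{step_id}" in cache')
        return cache[step_id]
    result = fn(*args)
    cache[step_id] = result
    return result

def hello(name):
    message = f"Hello, {name}!"
    print(message)
    return message

first = run_step("hello", hello, "World")   # runs, prints "Hello, World!"
second = run_step("hello", hello, "World")  # cache hit, hello() is not re-run
```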

## Installation

<!-- start install -->

**ai2-tango** requires Python 3.8 or later.

### Installing with `pip`

**ai2-tango** is available [on PyPI](https://pypi.org/project/ai2-tango/). Just run

```bash
pip install ai2-tango
```

To install with a specific integration, such as `torch`, run

```bash
pip install 'ai2-tango[torch]'
```

To install with all integrations, run

```bash
pip install 'ai2-tango[all]'
```

### Installing with `conda`

**ai2-tango** is available on conda-forge. You can install just the base package with

```bash
conda install tango -c conda-forge
```

You can pick and choose from the integrations with one of these:

```bash
conda install tango-datasets -c conda-forge
conda install tango-torch -c conda-forge
conda install tango-wandb -c conda-forge
```

You can also install everything:

```bash
conda install tango-all -c conda-forge
```

Even though **ai2-tango** itself is quite small, installing everything will pull in a lot of dependencies.
Don't be surprised if this takes a while!

### Installing from source

To install **ai2-tango** from source, first clone [the repository](https://github.com/allenai/tango):

```bash
git clone https://github.com/allenai/tango.git
cd tango
```

Then run

```bash
pip install -e '.[all]'
```

To install with only a specific integration, such as `torch`, run

```bash
pip install -e '.[torch]'
```

Or to install just the base tango library, you can run

```bash
pip install -e .
```

### Checking your installation

Run

```bash
tango info
```

to check your installation.

### Docker image

You can build a Docker image suitable for tango projects by using [the official Dockerfile](https://github.com/allenai/tango/blob/main/Dockerfile) as a starting point for your own Dockerfile, or you can simply use one of our [prebuilt images](https://github.com/allenai/tango/pkgs/container/tango) as a base image in your Dockerfile. For example:

```Dockerfile
# Start from a prebuilt tango base image.
# You can choose the right tag from the available options here:
# https://github.com/allenai/tango/pkgs/container/tango/versions
FROM ghcr.io/allenai/tango:cuda11.3

# Install your project's additional requirements.
COPY requirements.txt .
RUN /opt/conda/bin/pip install --no-cache-dir -r requirements.txt

# Install source code.
# This instruction copies EVERYTHING in the current directory (build context),
# which may not be what you want. Consider using a ".dockerignore" file to
# exclude files and directories that you don't want on the image.
COPY . .
```

Make sure to choose the right base image for your use case depending on the version of tango you're using and the CUDA version that your host machine supports.
You can see a list of all available image tags [on GitHub](https://github.com/allenai/tango/pkgs/container/tango/versions).

<!-- end install -->

## FAQ

<!-- start faq -->

### Why is the library named Tango?

The motivation behind this library is that we can make research easier by composing it into well-defined steps.  What happens when you choreograph a number of steps together?  Well, you get a dance.  And since our [team's leader](https://nasmith.github.io/) is part of a tango band, "AI2 Tango" was an obvious choice!

### How can I debug my steps through the Tango CLI?

You can run the `tango` command through [pdb](https://docs.python.org/3/library/pdb.html). For example:

```bash
python -m pdb -m tango run config.jsonnet
```

### How is Tango different from [Metaflow](https://metaflow.org), [Airflow](https://airflow.apache.org), or [redun](https://github.com/insitro/redun)?

We've found that DAG execution engines like these are great for production workflows but not as well suited to messy, collaborative research projects
where the code is constantly changing. AI2 Tango was built *specifically* for these kinds of research projects.

### How does Tango's caching mechanism work?

AI2 Tango caches the results of steps based on the `unique_id` of the step. The `unique_id` is essentially a hash of all of the inputs to the step along with:

1. the step class's fully qualified name, and
2. the step class's `VERSION` class variable (an arbitrary string).

Unlike other workflow engines such as [redun](https://github.com/insitro/redun), Tango does *not* take the source code of the class itself into account (other than its fully qualified name), because we've found that hashing the source code bytes is far too sensitive and makes caching behavior less transparent to users.
When you change the source code of your step in a meaningful way, just bump the `VERSION` class variable to indicate to Tango
that the step has been updated.
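
The scheme can be sketched in plain Python. This is illustrative only: Tango's real `det_hash` machinery handles arbitrary Python objects, and the payload layout here is a made-up stand-in.

```python
import hashlib
import json

def unique_id(step_class_name: str, version: str, inputs: dict) -> str:
    # Hash the fully qualified class name, the VERSION string, and the
    # step's inputs into a single deterministic digest.
    payload = json.dumps(
        {"class": step_class_name, "version": version, "inputs": inputs},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

# Same inputs and VERSION -> same id, so the cached result is reused.
a = unique_id("hello.Hello", "001", {"name": "World"})
b = unique_id("hello.Hello", "001", {"name": "World"})

# Bumping VERSION changes the id, forcing the step to re-run.
c = unique_id("hello.Hello", "002", {"name": "World"})
```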

<!-- end faq -->

## Team

<!-- start team -->

**ai2-tango** is developed and maintained by the AllenNLP team, backed by [the Allen Institute for Artificial Intelligence (AI2)](https://allenai.org/).
AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.
To learn more about who specifically contributed to this codebase, see [our contributors](https://github.com/allenai/tango/graphs/contributors) page.

<!-- end team -->

## License

<!-- start license -->

**ai2-tango** is licensed under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
A full copy of the license can be found [on GitHub](https://github.com/allenai/tango/blob/main/LICENSE).

<!-- end license -->


================================================
FILE: RELEASE_PROCESS.md
================================================
# GitHub Release Process

## Steps

1. Update the version in `tango/version.py`.

2. Run the release script:

    ```bash
    ./scripts/release.sh
    ```

    This will automatically update the CHANGELOG, commit the changes to the CHANGELOG and `version.py` (and any other files you might have changed),
    and then create a new tag in git which will trigger a workflow on GitHub Actions that handles the rest.

## Fixing a failed release

If the GitHub Actions release workflow fails with an error that needs to be fixed, you'll have to delete both the tag and the corresponding release from GitHub. After you've pushed a fix, delete the tag from your local clone with

```bash
git tag -l | xargs git tag -d && git fetch -t
```

Then repeat the steps above.


================================================
FILE: docs/.gitignore
================================================
build


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)


================================================
FILE: docs/make.bat
================================================
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd


================================================
FILE: docs/source/_static/css/custom.css
================================================


================================================
FILE: docs/source/api/commands.rst
================================================
Commands
========

.. automodule:: tango.__main__


================================================
FILE: docs/source/api/components/executor.rst
================================================
Executor
========

Base class
----------

.. autoclass:: tango.executor.Executor
   :members: 

.. autoclass:: tango.executor.ExecutorOutput
   :members: 

.. autoclass:: tango.executor.ExecutionMetadata
   :members: 


================================================
FILE: docs/source/api/components/format.rst
================================================
Format
======

Base class
----------

.. autoclass:: tango.format.Format
   :members: 
   :private-members:

Implementations
---------------

.. automodule:: tango.format
   :members:
   :exclude-members: Format,read,write,checksum


================================================
FILE: docs/source/api/components/index.rst
================================================
Components
==========

The core components of **AI2 Tango**.

.. toctree::
   :maxdepth: 2
   :caption: Components

   step
   step_info
   step_graph
   workspace
   step_cache
   format
   executor


================================================
FILE: docs/source/api/components/step.rst
================================================
Step
====

Base class
----------

.. autoclass:: tango.step.Step
   :members: 
   :special-members:
   :exclude-members: from_params

.. autofunction:: tango.step.step

.. autoclass:: tango.step.WithUnresolvedSteps
   :members:

.. autoclass:: tango.step.StepResources
   :members:

Implementations
---------------

.. automodule:: tango.steps
   :members:


================================================
FILE: docs/source/api/components/step_cache.rst
================================================
StepCache
=========

Base class
----------

.. autoclass:: tango.step_cache.StepCache
   :members: 
   :special-members:

Implementations
---------------

.. autoclass:: tango.step_caches.LocalStepCache
   :members:

.. autoclass:: tango.step_caches.MemoryStepCache

Metadata
--------

.. autoclass:: tango.step_cache.CacheMetadata
   :members:


================================================
FILE: docs/source/api/components/step_graph.rst
================================================
StepGraph
=========

.. autoclass:: tango.step_graph.StepGraph
   :members: 


================================================
FILE: docs/source/api/components/step_info.rst
================================================
StepInfo
========

.. autoclass:: tango.step_info.StepInfo
   :member-order: bysource
   :members:

.. autoclass:: tango.step_info.StepState
   :member-order: bysource
   :members:

.. autoclass:: tango.step_info.PlatformMetadata
   :member-order: bysource
   :members:

.. autoclass:: tango.step_info.EnvironmentMetadata
   :member-order: bysource
   :members:

.. autoclass:: tango.step_info.GitMetadata
   :member-order: bysource
   :members:

.. autoclass:: tango.step_info.TangoMetadata
   :member-order: bysource
   :members:


================================================
FILE: docs/source/api/components/workspace.rst
================================================
Workspace
=========

Base class
----------

.. autoclass:: tango.workspace.Workspace
   :members:

Implementations
---------------

.. autoclass:: tango.workspaces.LocalWorkspace

.. autoclass:: tango.workspaces.MemoryWorkspace

Metadata
--------

.. autoclass:: tango.workspace.Run
   :members:

.. autoclass:: tango.workspace.RunInfo
   :members:

Miscellaneous
-------------

.. autoclass:: tango.workspace.RunSort
   :members:

.. autoclass:: tango.workspace.StepInfoSort
   :members:


================================================
FILE: docs/source/api/det_hash.rst
================================================
Deterministic Hashing
=====================

In order to detect whether a :class:`~tango.step.Step` has to be re-run or not, Tango relies on some tools to compute
deterministic hashes from the inputs to the :class:`~tango.step.Step`.

The centerpiece of this module is the :func:`~tango.common.det_hash.det_hash` function, which computes a deterministic hash of an
arbitrary Python object. The other utilities in this module influence how that hash is computed in various ways.

.. automodule:: tango.common.det_hash
   :members:


================================================
FILE: docs/source/api/exceptions.rst
================================================
Exceptions
==========

.. autoexception:: tango.common.exceptions.TangoError
   :members:

.. automodule:: tango.common.exceptions
   :members:
   :exclude-members: TangoError


================================================
FILE: docs/source/api/integrations/beaker.rst
================================================
🧪 Beaker
=========

.. automodule:: tango.integrations.beaker

Reference
---------

.. autoclass:: tango.integrations.beaker.BeakerWorkspace

.. autoclass:: tango.integrations.beaker.BeakerStepCache

.. autoclass:: tango.integrations.beaker.BeakerExecutor
   :members: DEFAULT_BEAKER_IMAGE

.. autoclass:: tango.integrations.beaker.BeakerScheduler
   :members:

.. autoclass:: tango.integrations.beaker.SimpleBeakerScheduler

.. autoclass:: tango.integrations.beaker.ResourceAssignment
   :members:

.. autoclass:: tango.integrations.beaker.ResourceAssignmentError


================================================
FILE: docs/source/api/integrations/datasets.rst
================================================
🤗 Datasets
===========

.. automodule:: tango.integrations.datasets

Reference
---------

.. autofunction:: tango.integrations.datasets.convert_to_tango_dataset_dict

.. autoclass:: tango.integrations.datasets.DatasetsFormat

.. autoclass:: tango.integrations.datasets.LoadDataset
   :members:

.. autoclass:: tango.integrations.datasets.LoadStreamingDataset
   :members:

.. autoclass:: tango.integrations.datasets.InterleaveDatasets
   :members:

.. autoclass:: tango.integrations.datasets.ConcatenateDatasets
   :members:

.. autoclass:: tango.integrations.datasets.DatasetRemixStep
   :members:

================================================
FILE: docs/source/api/integrations/fairscale.rst
================================================
🔥 FairScale
============

.. automodule:: tango.integrations.fairscale

Reference
---------

.. autoclass:: tango.integrations.fairscale.FairScaleTrainingEngine

.. autoclass:: tango.integrations.fairscale.FSDPConfig
    :members:

.. autofunction:: tango.integrations.fairscale.with_wrapped_modules


================================================
FILE: docs/source/api/integrations/flax.rst
================================================
Flax
=======

.. automodule:: tango.integrations.flax

Reference
---------

Train step
~~~~~~~~~~

.. autoclass:: tango.integrations.flax.FlaxTrainStep
   :members:

.. autoclass:: tango.integrations.flax.TrainConfig
   :members:

Eval step
~~~~~~~~~

.. autoclass:: tango.integrations.flax.FlaxEvalStep
   :members:

Flax format
~~~~~~~~~~~~

.. autoclass:: tango.integrations.flax.FlaxFormat

Model
~~~~~

.. autoclass:: tango.integrations.flax.Model
   :members:

Optim
~~~~~

.. autoclass:: tango.integrations.flax.Optimizer
   :members:

.. autoclass:: tango.integrations.flax.LRScheduler
   :members:

Data
~~~~

.. autoclass:: tango.integrations.flax.DataLoader
   :members:

.. autoclass:: tango.integrations.flax.FlaxDataLoader
   :members:

Callbacks
~~~~~~~~~

.. autoclass:: tango.integrations.flax.TrainCallback
   :members:
   :member-order: bysource

.. autoclass:: tango.integrations.flax.EvalCallback
   :members:
   :member-order: bysource


================================================
FILE: docs/source/api/integrations/gs.rst
================================================
☁️ Google Cloud Storage
=======================

.. automodule:: tango.integrations.gs

Reference
---------

.. autoclass:: tango.integrations.gs.GSWorkspace

.. autoclass:: tango.integrations.gs.GSStepCache


================================================
FILE: docs/source/api/integrations/index.rst
================================================
Integrations
============

.. automodule:: tango.integrations

.. toctree::
   :maxdepth: 2
   :caption: Integrations

   torch
   fairscale
   datasets
   transformers
   wandb
   beaker
   flax
   gs


================================================
FILE: docs/source/api/integrations/torch.rst
================================================
🔥 PyTorch
==========

.. automodule:: tango.integrations.torch

Reference
---------

Train step
~~~~~~~~~~

.. autoclass:: tango.integrations.torch.TorchTrainStep
   :members:

.. autoclass:: tango.integrations.torch.TrainConfig
   :members:

Eval step
~~~~~~~~~

.. autoclass:: tango.integrations.torch.TorchEvalStep
   :members:

Torch format
~~~~~~~~~~~~

.. autoclass:: tango.integrations.torch.TorchFormat

Model
~~~~~

.. autoclass:: tango.integrations.torch.Model
   :members:

TrainingEngine
~~~~~~~~~~~~~~

.. autoclass:: tango.integrations.torch.TrainingEngine
   :members:

.. autoclass:: tango.integrations.torch.TorchTrainingEngine

Optim
~~~~~

.. autoclass:: tango.integrations.torch.Optimizer
   :members:

.. autoclass:: tango.integrations.torch.LRScheduler
   :members:

Data
~~~~

.. autoclass:: tango.integrations.torch.DataLoader
   :members:

.. autoclass:: tango.integrations.torch.Sampler
   :members:

.. autoclass:: tango.integrations.torch.DataCollator
   :members:
   :special-members: __call__

.. autoclass:: tango.integrations.torch.ConcatTensorDictsCollator
   :members:

Callbacks
~~~~~~~~~

.. autoclass:: tango.integrations.torch.TrainCallback
   :members:
   :member-order: bysource

.. autoclass:: tango.integrations.torch.EvalCallback
   :members:
   :member-order: bysource

.. autoclass:: tango.integrations.torch.StopEarlyCallback

.. autoclass:: tango.integrations.torch.StopEarly
   :members:


================================================
FILE: docs/source/api/integrations/transformers.rst
================================================
🤗 Transformers
===============

.. automodule:: tango.integrations.transformers
    :members:

.. autofunction:: tango.integrations.transformers.ia3.modify_with_ia3

================================================
FILE: docs/source/api/integrations/wandb.rst
================================================
⚖️ Weights & Biases
===================
 
.. automodule:: tango.integrations.wandb

Reference
---------

.. autoclass:: tango.integrations.wandb.WandbWorkspace

.. autoclass:: tango.integrations.wandb.WandbStepCache

.. autoclass:: tango.integrations.wandb.WandbTrainCallback

.. autoclass:: tango.integrations.wandb.WandbFlaxTrainCallback


================================================
FILE: docs/source/api/logging.rst
================================================
Logging
=======

.. automodule:: tango.common.logging

Reference
---------

.. autodata:: tango.common.logging.TANGO_LOG_LEVEL

.. autodata:: tango.common.logging.FILE_FRIENDLY_LOGGING

.. autodata:: tango.common.logging.cli_logger

.. autofunction:: tango.common.logging.initialize_logging

.. autofunction:: tango.common.logging.initialize_worker_logging

.. autofunction:: tango.common.logging.initialize_prefix_logging

.. autofunction:: tango.common.logging.teardown_logging

.. autofunction:: tango.common.logging.file_handler


================================================
FILE: docs/source/api/sequences.rst
================================================
Sequences
=========

This module contains some utilities to make sequences out of other sequences. All of these are lazy, so they
take minimal time and memory when you create them. These work particularly well when used together. For example,
you can concatenate two sequences (:class:`~tango.common.sequences.ConcatenatedSequence`), and then shuffle
them (:class:`~tango.common.sequences.ShuffledSequence`).

This module is not dependent on other Tango modules and can be used in isolation.
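For example (a toy sketch of the laziness idea in plain Python, not Tango's actual implementation), a lazily concatenated sequence computes its length and looks up items only on demand:

.. code-block:: python

    # Toy sketch of a lazily concatenated sequence (not Tango's implementation).
    class LazyConcat:
        def __init__(self, *sequences):
            self.sequences = sequences  # no copying: just hold references

        def __len__(self):
            return sum(len(s) for s in self.sequences)

        def __getitem__(self, i):
            for s in self.sequences:
                if i < len(s):
                    return s[i]  # only the relevant backing sequence is touched
                i -= len(s)
            raise IndexError(i)

    combined = LazyConcat([1, 2, 3], [40, 50])
    assert len(combined) == 5 and combined[3] == 40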

.. automodule:: tango.common.sequences
   :members:


================================================
FILE: docs/source/api/settings.rst
================================================
Global settings
---------------

Some command-line options can be set globally in a ``tango.yml`` or ``tango.yaml`` settings file.
Tango will check the current directory and ``~/.config/``, in that order.

The full spec of this config is defined by the :class:`~tango.settings.TangoGlobalSettings` class.
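
For example, a ``tango.yml`` might look like the following. The field names here are illustrative; the authoritative list is on :class:`~tango.settings.TangoGlobalSettings`:

.. code-block:: yaml

    # Hypothetical tango.yml -- check TangoGlobalSettings for the exact fields.
    workspace:
      type: local
      dir: /tmp/workspace
    include_package:
      - my_package.steps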

.. autoclass:: tango.settings.TangoGlobalSettings
   :members:
   :exclude-members: path,find_or_default
   :member-order: bysource


================================================
FILE: docs/source/api/utilities.rst
================================================
Utilities
=========

.. automodule:: tango.common
   :members:
   :exclude-members: det_hash


================================================
FILE: docs/source/conf.py
================================================
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

import logging
import os
import sys
from datetime import datetime

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.

sys.path.insert(0, os.path.abspath("../../"))

from tango.version import VERSION, VERSION_SHORT  # noqa: E402

# -- Project information -----------------------------------------------------

project = "AI2 Tango"
copyright = f"{datetime.today().year}, Allen Institute for Artificial Intelligence"
author = "Allen Institute for Artificial Intelligence"
version = VERSION_SHORT
release = VERSION

# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
    "myst_parser",
    "sphinx.ext.intersphinx",
    "sphinx.ext.viewcode",
    "sphinx.ext.doctest",
    "sphinx_copybutton",
    "sphinx_autodoc_typehints",
]

suppress_warnings = ["myst.header"]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build"]

source_suffix = [".rst", ".md"]

# -- Extension configuration -------------------------------------------------

intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "rich": ("https://rich.readthedocs.io/en/latest", None),
    "torch": ("https://pytorch.org/docs/stable", None),
    "flax": ("https://flax.readthedocs.io/en/latest", None),
    "fairscale": ("https://fairscale.readthedocs.io/en/latest/", None),
    "datasets": ("https://huggingface.co/docs/datasets/master/en", None),
    "transformers": ("https://huggingface.co/docs/transformers/master/en", None),
    "beaker": ("https://beaker-py.readthedocs.io/en/latest/", None),
}

# Tell myst-parser to assign header anchors for h1-h3.
myst_heading_anchors = 3

# By default, sort documented members by type within classes and modules.
autodoc_member_order = "groupwise"

python_use_unqualified_type_names = True

# Include default values when documenting parameter types.
typehints_defaults = "comma"

# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
#
html_theme = "furo"

html_title = f"ai2-tango v{VERSION}"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]

html_css_files = ["css/custom.css"]

html_favicon = "_static/favicon.ico"

html_theme_options = {
    "light_css_variables": {
        "color-announcement-background": "#1B4596",
        "color-announcement-text": "#FFFFFF",
    },
    "dark_css_variables": {},
    "light_logo": "tango_final_squareish.png",
    "dark_logo": "tango_final_squareish.png",
    "footer_icons": [
        {
            "name": "GitHub",
            "url": "https://github.com/allenai/tango",
            "html": """
                <svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 16 16">
                    <path fill-rule="evenodd" d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0 0 16 8c0-4.42-3.58-8-8-8z"></path>
                </svg>
            """,  # noqa: E501
            "class": "",
        },
    ],
}

# -- Hack to get rid of stupid warnings from sphinx_autodoc_typehints --------


class ShutupSphinxAutodocTypehintsFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        if "Cannot resolve forward reference" in record.msg:
            return False
        return True


logging.getLogger("sphinx.sphinx_autodoc_typehints").addFilter(ShutupSphinxAutodocTypehintsFilter())


================================================
FILE: docs/source/examples/euler.md
================================================
```{include} ../../../examples/euler/README.md
```

## Running the experiment

If you haven't already, clone the [tango repository](https://github.com/allenai/tango) and then
change directories into `examples/euler`.

You can then run the experiment with:

```bash
tango run euler_general.jsonnet -i complex_arithmetic -w workspace
```

This will leave its results in a subdirectory of `workspace/runs/` corresponding to the name of the run.
The output it prints should look something like this:
```
Starting new run comic-heron
Server started at http://localhost:8080/run/comic-heron
[step i_times_pi] ● Starting step "i_times_pi"...
[step i_times_pi] ✓ Finished step "i_times_pi"
[step cos] ● Starting step "cos"...
[step cos] ✓ Finished step "cos"
[step sin] ● Starting step "sin"...
[step sin] ✓ Finished step "sin"
[step pow_e] ✓ Found output for step "i_times_pi" in cache (needed by "pow_e")...
[step pow_e] ● Starting step "pow_e"...
[step pow_e] ✓ Finished step "pow_e"
[step i_times_sin] ✓ Found output for step "sin" in cache (needed by "i_times_sin")...
[step i_times_sin] ● Starting step "i_times_sin"...
[step i_times_sin] ✓ Finished step "i_times_sin"
[step sum] ✓ Found output for step "cos" in cache (needed by "sum")...
[step sum] ✓ Found output for step "i_times_sin" in cache (needed by "sum")...
[step sum] ● Starting step "sum"...
[step sum] ✓ Finished step "sum"
[step sub] ✓ Found output for step "sum" in cache (needed by "sub")...
[step sub] ✓ Found output for step "pow_e" in cache (needed by "sub")...
[step sub] ● Starting step "sub"...
[step sub] ✓ Finished step "sub"
[step print] ✓ Found output for step "sub" in cache (needed by "print")...
[step print] ● Starting step "print"...
[step print] 0j
[step print] ✓ Finished step "print"
✓ Finished run comic-heron

 ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
 ┃ Step Name   ┃ Status      ┃ Cached Result                                                     ┃
 ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
 │ cos         │ ✓ succeeded │ workspace/cache/CosineStep-5aes9CUTRmkz5gJ5J6JSRbJZ4qkFu4kk       │
 │ i_times_pi  │ ✓ succeeded │ workspace/cache/MultiplyStep-4SRzHCCqYGs2PLeT8LeL5ukrCWGJoiae     │
 │ i_times_sin │ ✓ succeeded │ workspace/cache/MultiplyStep-2ZG7wPj9WLn5PgpYyPVHw9Qg7VM1mhwf     │
 │ pow_e       │ ✓ succeeded │ workspace/cache/ExponentiateStep-1swPpNipP6HBSP5rKdNjEqbYAWNf4CdG │
 │ print       │ ✓ succeeded │ N/A                                                               │
 │ sin         │ ✓ succeeded │ workspace/cache/SineStep-5aes9CUTRmkz5gJ5J6JSRbJZ4qkFu4kk         │
 │ sub         │ ✓ succeeded │ workspace/cache/SubtractionStep-4ygj1UyLk6TCVBxN7DWTCccbMa7M1C5v  │
 │ sum         │ ✓ succeeded │ workspace/cache/AdditionStep-34AiXoyiPKADMUnhcBzFYd6JeMcgx4DP     │
 └─────────────┴─────────────┴───────────────────────────────────────────────────────────────────┘
                                                                 ✓ 8 succeeded

Use your workspace to get the cached result of a step, e.g.

 >>> from tango import Workspace
 >>> workspace = Workspace.from_url(...)
 >>> workspace.step_result_for_run("comic-heron", "sum")
```

A few things are of note here:
 1. Tango assigns a name to your run. In this case, the name is "comic-heron".
 2. In this configuration, the "print" step prints the output ("`0j`"). Most of the time though, you will look
    for the output in the output directories that are given in the table.
 3. You might notice that the "print" step has no cached result ("N/A" in the table). That's because it is
    uncacheable, and thus writes nothing to the cache.


## Change a step

Let's make an update to a step! Open `complex_arithmetic.py` and change `AdditionStep`. The actual change you make
in the `run()` method does not matter, but the important thing is to update the `VERSION` member of the
`AdditionStep` class. `AdditionStep` does not yet have a `VERSION`, so we will give it one:
```Python
@Step.register("cadd")
class AdditionStep(Step):
    VERSION = "002"     # This is the important change.
    
    def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # type: ignore
        return make_complex(a) + make_complex(b)
```

Now run the config again with
```bash
tango run euler_general.jsonnet -i complex_arithmetic -w workspace
```

This time, the output will look like this:
```
Starting new run right-amoeba
Server started at http://localhost:8080/run/right-amoeba
[step sum] ✓ Found output for step "cos" in cache (needed by "sum")...
[step sum] ✓ Found output for step "i_times_sin" in cache (needed by "sum")...
[step sum] ● Starting step "sum"...
[step sum] ✓ Finished step "sum"
[step sub] ✓ Found output for step "sum" in cache (needed by "sub")...
[step sub] ✓ Found output for step "pow_e" in cache (needed by "sub")...
[step sub] ● Starting step "sub"...
[step sub] ✓ Finished step "sub"
[step print] ✓ Found output for step "sub" in cache (needed by "print")...
[step print] ● Starting step "print"...
[step print] 0j
[step print] ✓ Finished step "print"
✓ Finished run right-amoeba

 ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
 ┃ Step Name   ┃ Status      ┃ Cached Result                                                     ┃
 ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
 │ cos         │ - not run   │ workspace/cache/CosineStep-5aes9CUTRmkz5gJ5J6JSRbJZ4qkFu4kk       │
 │ i_times_pi  │ - not run   │ workspace/cache/MultiplyStep-4SRzHCCqYGs2PLeT8LeL5ukrCWGJoiae     │
 │ i_times_sin │ - not run   │ workspace/cache/MultiplyStep-2ZG7wPj9WLn5PgpYyPVHw9Qg7VM1mhwf     │
 │ pow_e       │ - not run   │ workspace/cache/ExponentiateStep-1swPpNipP6HBSP5rKdNjEqbYAWNf4CdG │
 │ print       │ ✓ succeeded │ N/A                                                               │
 │ sin         │ - not run   │ workspace/cache/SineStep-5aes9CUTRmkz5gJ5J6JSRbJZ4qkFu4kk         │
 │ sub         │ ✓ succeeded │ workspace/cache/SubtractionStep-42mdcQBtrNAYvxYhmzdd1vj2uCG8N5Yf  │
 │ sum         │ ✓ succeeded │ workspace/cache/AdditionStep-002-34AiXoyiPKADMUnhcBzFYd6JeMcgx4DP │
 └─────────────┴─────────────┴───────────────────────────────────────────────────────────────────┘
                                                           ✓ 3 succeeded, 5 not run

Use your workspace to get the cached result of a step, e.g.

 >>> from tango import Workspace
 >>> workspace = Workspace.from_url(...)
 >>> workspace.step_result_for_run("right-amoeba", "sum")
```

As you can see, it re-used the cached results for several of the steps, and only ran three steps anew.

```{eval-rst}
:class:`tango.step.Step.VERSION` is just one of the ways in which you can change the behavior of a step. Head over to the
documentation of the :class:`tango.step.Step` class to see the others.
```


================================================
FILE: docs/source/examples/eval_p3.md
================================================
```{include} ../../../examples/eval_p3/README.md
```

## `RougeScoreStep`

`RougeScoreStep` is defined in `eval.py`:

```{literalinclude} ../../../examples/eval_p3/eval.py
:language: py
```

## Config

The configuration file, `config.jsonnet`, uses some advanced [Jsonnet](https://jsonnet.org) concepts like `std.foldl`
to create the same configuration for all 10 prompts:

```{literalinclude} ../../../examples/eval_p3/config.jsonnet
```

## Run it

You can run the experiment with:

```bash
tango run config.jsonnet -i eval -d /tmp/workspace
```


================================================
FILE: docs/source/examples/index.rst
================================================
Examples
========

Real-world examples of using Tango.
You can find all of these `on GitHub <https://github.com/allenai/tango/tree/main/examples>`_ as well.

.. toctree::
   :maxdepth: 2
   :caption: Examples

   euler
   train_lm
   eval_p3


================================================
FILE: docs/source/examples/train_lm.md
================================================
# Fine-tuning a language model

```{include} ../../../examples/train_lm/README.md
:start-after: <!-- start overview -->
:end-before: <!-- end overview -->
```

```{tip}
You can find the full code for this example on [GitHub](https://github.com/allenai/tango/tree/main/examples/train_lm).
```

## Components

We'll need to write a step for tokenizing the data and preparing it for language model training.
All of the other steps we need are provided by Tango integrations.

So, create a file called `tokenize_step.py` with following contents:

```{literalinclude} ../../../examples/train_lm/tokenize_step.py
:language: py
```

## Configuration file

Next you'll need to create a configuration file that defines the experiment. Just copy over these contents into a file called `config.jsonnet`:


```{literalinclude} ../../../examples/train_lm/config.jsonnet
```

## Run it

Now we can run the experiment with:

```bash
tango run config.jsonnet -i tokenize_step.py -d /tmp/results
```


================================================
FILE: docs/source/faq.md
================================================
# FAQ

```{include} ../../README.md
:start-after: <!-- start faq -->
:end-before: <!-- end faq -->
```


================================================
FILE: docs/source/first_steps.md
================================================
# First Steps

## What is a Step?

Tango is a Python library for choreographing machine learning research experiments by executing
a series of steps.
A step can do anything, really, such as [prepare a dataset](tango.integrations.datasets.LoadDataset), [train a model](tango.integrations.torch.TorchTrainStep), send an email to your mother wishing her happy birthday, *etc*.

Concretely, each step is just a subclass of {class}`~tango.step.Step`, where the {meth}`~tango.step.Step.run` method in particular defines what the step actually does.
So anything that can be implemented in Python can be run as a step.

Steps can also depend on other steps: the output of one step can be part of the input to another.
The steps that make up an experiment therefore form a [directed acyclic graph](tango.step_graph.StepGraph).

The concept of the {class}`~tango.step.Step` is the bread and butter that makes Tango so general and powerful.
*So* powerful, in fact, that you might be wondering if Tango is [Turing-complete](https://en.wikipedia.org/wiki/Turing_completeness)?
Well, we don't know yet, but we can say at least that Tango is **Tango-complete** 😉

## Configuration files

Experiments themselves are defined through JSON, [Jsonnet](https://jsonnet.org/), or YAML configuration files.
At a minimum, these files must contain the "steps" field, which should be a mapping of arbitrary (yet unique) step names to the configuration of the corresponding step.

For example, let's create a config file called `config.jsonnet` with the following contents:

```json
{
  "steps": {
    "random_name": {
      "type": "random_choice",
      "choices": ["Turing", "Tango", "Larry"],
    },
    "say_hello": {
      "type": "concat_strings",
      "string1": "Hello, ",
      "string2": {
        "type": "ref",
        "ref": "random_name"
      }
    },
    "print": {
      "type": "print",
      "input": {
        "type": "ref",
        "ref": "say_hello"
      }
    }
  }
}
```

*Can you guess what this experiment does?*

There are three steps in this experiment graph: "random_name" is the name of one step, "say_hello" is the name of another, and "print" is the name of the last.
The "type" parameter within the config of each step tells Tango which {class}`~tango.step.Step` class implementation to use for that step.

So, within the "random_name" step config

```json
"random_name": {
  "type": "random_choice",
  "choices": ["Turing", "Tango", "Larry"],
}
```

the `"type": "random_choice"` part tells Tango to use the {class}`~tango.step.Step` subclass that is registered by the name "random_choice".

But wait... what do we mean by *registered*?

Tango keeps track of an internal registry for certain classes (such as the {class}`~tango.step.Step` class) that is just a mapping of arbitrary unique names to subclasses.
When you look through Tango's source code, you'll see things like:

```python
@Step.register("foo")
class Foo(Step):
    ...
```

This is how subclasses get added to the registry.
In this case the subclass `Foo` is added to the `Step` registry under the name "foo", so if you were to use `"type": "foo"` in your configuration file, Tango would understand
that you mean to use the `Foo` class for the given step.

```{tip}
Any class that inherits from {class}`~tango.common.registrable.Registrable` can have its own
registry.
```
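
To make this concrete, here is a toy sketch of the registry pattern. It is greatly simplified (Tango's real {class}`~tango.common.registrable.Registrable` also handles per-base-class registries, constructing the subclass from params, and more):

```python
# Toy sketch of the registry pattern (not Tango's actual implementation).
class Registrable:
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(subclass):
            cls._registry[name] = subclass  # record the subclass under `name`
            return subclass
        return decorator

    @classmethod
    def by_name(cls, name):
        return cls._registry[name]


class Step(Registrable):
    ...


@Step.register("foo")
class Foo(Step):
    ...


# A config's `"type": "foo"` can now be resolved back to the class:
assert Step.by_name("foo") is Foo
```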

Now back to our example.
The step classes referenced in our configuration file ("random_choice" and "concat_strings") don't actually exist in the Tango library (though the ["print" step](tango.steps.PrintStep) does),
but we can easily implement and register them on our own.

Let's put them in a file called `components.py`:

```python
# file: components.py

import random
from typing import List

from tango import Step

@Step.register("random_choice")
class RandomChoiceStep(Step):
    DETERMINISTIC = False

    def run(self, choices: List[str]) -> str:
        return random.choice(choices)

@Step.register("concat_strings")
class ConcatStringsStep(Step):
    def run(self, string1: str, string2: str) -> str:
        return string1 + string2
```

```{important}
It's important that you use type hints in your code so that Tango can properly construct Python objects from the corresponding serialized (JSON) objects
and warn you when the types don't match up.
```

So as long as Tango is able to import this module (`components.py`), these step implementations will be added to the registry,
and Tango will know how to instantiate and run them.

There's also a short-hand way of implementing steps, using the {func}`@step() <tango.step.step>` function decorator:

```python
from tango import step

@step(deterministic=False)
def random_choice(choices: List[str]) -> str:
    return random.choice(choices)

@step()
def concat_strings(string1: str, string2: str) -> str:
    return string1 + string2
```

This registers the steps under the names of the corresponding functions ("random_choice" and "concat_strings") by default, though you can override that by passing the `name` parameter to the decorator:

```python
@step(name="random-string", deterministic=False)
def random_choice(choices: List[str]) -> str:
    return random.choice(choices)
```

## Executing an experiment

At this point we've implemented our custom steps (`components.py`) and created our configuration
file `config.jsonnet`, so we're ready to actually run this experiment.

For that, just use the `tango run` command:

```
$ tango run config.jsonnet -i components
```

```{tip}
The `-i` option is short for `--include-package`, which takes the name of a Python package that Tango will try to import.
In this case our custom steps are in `components.py`, so we need Tango to import this module to find those steps.
As long as `components.py` is in the current directory or somewhere else on the `PYTHONPATH`, Tango will be able to find and import
this module when you pass `-i components` (note the lack of the `.py` at the end).
```

You should see something like this in the output:

```
Starting new run cute-kitten
● Starting step "random_name"
✓ Finished step "random_name"
● Starting step "say_hello"
✓ Finished step "say_hello"
● Starting step "print"
Hello, Tango
✓ Finished step "print"
```

## Step caching

This particular experiment didn't write any results to disk, but in many situations you'll want to save the output of at least some of your steps.

For example, if you're using the {class}`~tango.integrations.torch.TorchTrainStep` step, the output is a trained model, which is certainly a useful thing to keep around.
In other cases, you may not actually care about the direct result of a particular step, but it could still be useful to save it when possible so that Tango doesn't need to run the step
again unnecessarily.

This is where Tango's caching mechanism comes in.

To demonstrate this, let's look at another example that pretends to do some expensive computation.
Here is the `config.jsonnet` file:

```json
{
  "steps": {
    "add_numbers": {
      "type": "really_inefficient_addition",
      "num1": 34,
      "num2": 8
    }
  }
}
```

And let's implement "really_inefficient_addition":

```python
# components.py

import time

from tango import Step, JsonFormat
from tango.common import Tqdm


@Step.register("really_inefficient_addition")
class ReallyInefficientAdditionStep(Step):
    DETERMINISTIC = True
    CACHEABLE = True
    FORMAT = JsonFormat()

    def run(self, num1: int, num2: int) -> int:
        for _ in Tqdm.tqdm(range(100), desc="Computing...", total=100):
            time.sleep(0.05)
        return num1 + num2
```

There are a couple of things to note about this step, other than the obvious inefficiencies; the class variables
we've defined: {attr}`~tango.step.Step.DETERMINISTIC`, {attr}`~tango.step.Step.CACHEABLE`, and
{attr}`~tango.step.Step.FORMAT`.

`DETERMINISTIC = True` tells Tango that, given particular inputs, the output of this step will always be the same
every time it is run, which has implications for caching.
By default, Tango assumes steps are deterministic.
You can override this by saying `DETERMINISTIC = False`.
Tango will warn you when you try to cache a non-deterministic step.

`CACHEABLE = True` tells Tango that it can cache this step and `FORMAT = JsonFormat()` defines which
{class}`~tango.format.Format` Tango will use to serialize the result of the step.

This time when we run the experiment we'll designate a specific directory for Tango to use:

```bash
$ tango run config.jsonnet -i components -d workspace/
```
```
Starting new run live-tarpon
● Starting step "add_numbers"
Computing...: 100%|##########| 100/100 [00:05<00:00, 18.99it/s]
✓ Finished step "add_numbers"
✓ The output for "add_numbers" is in workspace/runs/live-tarpon/add_numbers
```

The last line in the output tells us where we can find the result of our "add_numbers" step. `live-tarpon` is
the name of the run. Run names are randomly generated and may be different on your machine. `add_numbers` is the
name of the step in your config. The whole path is a symlink to a directory, which contains (among other things)
a file `data.json`:

```bash
$ cat workspace/runs/live-tarpon/add_numbers/data.json
```
```
42
```
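
`data.json` is just the step's return value serialized with {class}`~tango.format.JsonFormat`. Here's a self-contained sketch of that round trip (using a temporary directory rather than the real workspace paths above):

```python
import json
import tempfile
from pathlib import Path

# Mimic roughly what JsonFormat does: serialize the result, then read it back.
cache_dir = Path(tempfile.mkdtemp())
(cache_dir / "data.json").write_text(json.dumps(34 + 8))

result = json.loads((cache_dir / "data.json").read_text())
print(result)  # 42
```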

Now look what happens when we run this step again:

```bash
$ tango run config.jsonnet -i components -d workspace/
```
```
Starting new run modest-shrimp
✓ Found output for "add_numbers" in cache
✓ The output for "add_numbers" is in workspace/runs/modest-shrimp/add_numbers
```

Tango didn't have to run our really inefficient addition step this time because it found the previous cached
result. It put the results in the result directory for a different run (in our case, the `modest-shrimp` run),
but once again it is a symlink that links to the same results from our first run.

If we changed the inputs to the step in `config.jsonnet`:

```diff
     "add_numbers": {
       "type": "really_inefficient_addition",
       "num1": 34,
-      "num2": 8
+      "num2": 2
     }
   }
 }
```

And ran it again:

```bash
$ tango run config.jsonnet -i components -d workspace/
```
```
Starting new run true-parrot
● Starting step "add_numbers"
Computing...: 100%|##########| 100/100 [00:05<00:00, 19.13it/s]
✓ Finished step "add_numbers"
✓ The output for "add_numbers" is in workspace/runs/true-parrot/add_numbers
```

You'd see that Tango had to run our "add_numbers" step again.

You may have noticed that `workspace/runs/true-parrot/add_numbers` is now a symlink that points to a different
place than it did for the first two runs. That's because the step produced a different result this time. All the
result symlinks point into the `workspace/cache/` directory, where the results of all steps are cached.

This means that if we ran the experiment again with the original inputs, Tango would still find the cached result
and wouldn't need to rerun the step.
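
The reason this works is that a step's cache key is, roughly, a deterministic hash of its registered type, its `VERSION`, and its inputs. The sketch below only illustrates the idea; Tango's actual hashing (the `det_hash` utilities) differs in detail:

```python
import hashlib
import json

def toy_unique_id(step_type: str, version: str, inputs: dict) -> str:
    # Deterministically hash the step's identity and inputs (toy stand-in for det_hash).
    payload = json.dumps(
        {"type": step_type, "version": version, "inputs": inputs},
        sort_keys=True,
    )
    return f"{step_type}-{version}-{hashlib.sha1(payload.encode()).hexdigest()[:12]}"

same = toy_unique_id("really_inefficient_addition", "001", {"num1": 34, "num2": 8})
assert same == toy_unique_id("really_inefficient_addition", "001", {"num1": 34, "num2": 8})

changed = toy_unique_id("really_inefficient_addition", "001", {"num1": 34, "num2": 2})
assert changed != same  # a new input (or a new VERSION) means a new cache entry
```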

## Arbitrary objects as inputs

### `FromParams`

So far the inputs to all of the steps in our examples have been built-in Python types that can be deserialized from JSON (e.g. {class}`int`, {class}`str`, etc.),
but sometimes you need the input to a step to be an instance of an arbitrary Python class.

Tango allows this as well, since it can infer from type hints what the class is and how to instantiate it.
When writing your own classes, it's recommended that you have them inherit from the {class}`~tango.common.from_params.FromParams` class, which guarantees that
Tango can instantiate them from a config file.

For example, suppose we had a step like this:

```python
from tango import Step
from tango.common import FromParams


class Bar(FromParams):
    def __init__(self, x: int) -> None:
        self.x = x


@Step.register("foo")
class FooStep(Step):
    def run(self, bar: Bar) -> int:
        return bar.x
```

```{tip}
If you've used [AllenNLP](https://github.com/allenai/allennlp) before, this will look familiar!
In fact, it's the same system under the hood.
```

Then we could create a config like this:

```json
{
  "steps": {
    "foo": {
      "type": "foo",
      "bar": {"x": 1}
    }
  }
}
```

And Tango will figure out how to deserialize `{"x": 1}` into a `Bar` instance.
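
The mechanism behind this is, roughly, inspecting the constructor's type hints and building each argument recursively. Here's a much-simplified sketch (a hypothetical helper, not Tango's actual `FromParams` code):

```python
import inspect

def toy_from_params(cls, params: dict):
    # Build cls(**params), recursively constructing nested objects from dicts.
    kwargs = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self" or name not in params:
            continue
        value = params[name]
        if inspect.isclass(param.annotation) and isinstance(value, dict):
            value = toy_from_params(param.annotation, value)  # recurse into nested objects
        kwargs[name] = value
    return cls(**kwargs)


class Bar:
    def __init__(self, x: int) -> None:
        self.x = x


bar = toy_from_params(Bar, {"x": 1})
assert bar.x == 1
```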

You can also have `FromParams` objects nested within other `FromParams` objects or standard containers
like {class}`list`:

```python
from typing import List

from tango import Step
from tango.common import FromParams


class Bar(FromParams):
    def __init__(self, x: int) -> None:
        self.x = x


class Baz(FromParams):
    def __init__(self, bar: Bar) -> None:
        self.bar = bar


@Step.register("foo")
class FooStep(Step):
    def run(self, bars: List[Bar], baz: Baz) -> int:
        return sum([bar.x for bar in bars]) + baz.bar.x
```

### `Registrable`

The {class}`~tango.common.registrable.Registrable` class is a special kind of {class}`~tango.common.from_params.FromParams` class that allows you to specify from the config which subclass of an expected class to deserialize into.

This is actually how we've been instantiating specific `Step` subclasses. Because {class}`~tango.step.Step` inherits from {class}`~tango.common.registrable.Registrable`, we can use the `"type"` fields in the config file to specify a `Step` subclass.

This is also very useful when you're writing a step that requires a certain type as input, but you want to be able to change the exact subclass of the type from your config file. For example, the {class}`~tango.integrations.torch.TorchTrainStep` takes `Registrable` inputs such as {class}`~tango.integrations.torch.Model`. Model variants can then be subclasses that are specified in the config file by their registered names. A sketch of this might look like the following: 

```python
import torch

from tango import Step
from tango.common import Registrable

class Model(torch.nn.Module, Registrable):
    ...

@Model.register("variant1")
class Variant1(Model):
    ...

@Model.register("variant2")
class Variant2(Model):
    ...

@Step.register("torch::train")
class TorchTrainStep(Step):
    def run(self, model: Model, ...) -> Model:
        ...
```

And a sketch of the config file would be something like this:

```json
{
  "steps": {
    "train": {
      "type": "torch::train",
      "model": {
        "type": "variant1",
      }
    }
  }
}
```

As in the `FromParams` example, the specifications can be nested, but now we also denote the subclass with the `"type": "..."` field. To swap models we need only change `"variant1"` to `"variant2"` in the config. The value for `"type"` can be either the name the class is registered under (e.g. `"torch::train"` for `TorchTrainStep`) or the fully qualified class name (e.g. `tango.integrations.torch.TorchTrainStep`).
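
For instance, the config above could equivalently refer to the step by its fully qualified class name:

```json
{
  "steps": {
    "train": {
      "type": "tango.integrations.torch.TorchTrainStep",
      "model": {
        "type": "variant1"
      }
    }
  }
}
```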

You'll see more examples of this in the [next section](examples/index).


================================================
FILE: docs/source/index.md
================================================
# **AI2 Tango**

```{include} ../../README.md
:start-after: <!-- start tagline -->
:end-before: <!-- end tagline -->
```

```{toctree}
:maxdepth: 2
:hidden:
:caption: Getting started

installation
first_steps
examples/index
faq
```

```{toctree}
:maxdepth: 2
:hidden:
:caption: API Reference

api/commands
api/components/index
api/integrations/index
api/settings
api/exceptions
api/logging
api/sequences
api/det_hash
api/utilities
```

```{toctree}
:hidden:
:caption: Development

CONTRIBUTING
CHANGELOG
License <https://raw.githubusercontent.com/allenai/tango/main/LICENSE>
GitHub Repository <https://github.com/allenai/tango>
```

To learn about Tango in 5 minutes, head over to the [First Steps section](first_steps).

If you'd rather learn from examples, check out the [Examples section](examples/index).

## Team

```{include} ../../README.md
:start-after: <!-- start team -->
:end-before: <!-- end team -->
```

## License

```{include} ../../README.md
:start-after: <!-- start license -->
:end-before: <!-- end license -->
```

## Indices and tables

```{eval-rst}
* :ref:`genindex`
* :ref:`modindex`
```


================================================
FILE: docs/source/installation.md
================================================
Installation
============

```{include} ../../README.md
:start-after: <!-- start install -->
:end-before: <!-- end install -->
```


================================================
FILE: examples/euler/README.md
================================================
Euler
=====

This is a toy example that proves Euler's identity using Tango. You can use this to play with the concept of a
`Step` and see how Tango runs things without getting distracted by the details of what you're running.

================================================
FILE: examples/euler/complex_arithmetic.py
================================================
import cmath
from typing import Tuple, Union

from tango import Step

ComplexOrTuple = Union[complex, Tuple[float, float]]


def make_complex(x: ComplexOrTuple) -> complex:
    if isinstance(x, complex):
        return x
    elif isinstance(x, (int, float)):
        return complex(x)
    else:
        return complex(*x)


@Step.register("cadd")
class AdditionStep(Step):
    def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # type: ignore
        return make_complex(a) + make_complex(b)


@Step.register("csub")
class SubtractionStep(Step):
    def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # type: ignore
        return make_complex(a) - make_complex(b)


@Step.register("cexp")
class ExponentiateStep(Step):
    def run(self, x: ComplexOrTuple, base: ComplexOrTuple = cmath.e) -> complex:  # type: ignore
        return make_complex(base) ** make_complex(x)


@Step.register("cmul")
class MultiplyStep(Step):
    def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # type: ignore
        return make_complex(a) * make_complex(b)


@Step.register("csin")
class SineStep(Step):
    def run(self, x: ComplexOrTuple) -> complex:  # type: ignore
        return cmath.sin(make_complex(x))


@Step.register("ccos")
class CosineStep(Step):
    def run(self, x: ComplexOrTuple) -> complex:  # type: ignore
        return cmath.cos(make_complex(x))


================================================
FILE: examples/euler/euler.jsonnet
================================================
local i = [0.0, 1.0];
local pi = [3.1415926535, 0.0];

{
    "steps": {
        "i_times_pi": {
            "type": "cmul",
            "a": i,
            "b": pi
        },
        "pow_e": {
            "type": "cexp",
            "x": { "type": "ref", "ref": "i_times_pi" }
        },
        "plus_one": {
            "type": "cadd",
            "a": { "type": "ref", "ref": "pow_e" },
            "b": [1, 0]
        },
        "print": {
            "type": "print",
            "input": { "type": "ref", "ref": "plus_one" }
        }
    }
}

================================================
FILE: examples/euler/euler_general.jsonnet
================================================
local i = [0.0, 1.0];
local pi = [3.1415926535, 0.0];

{
    "steps": {
        "cos": {
            "type": "ccos",
            "x": pi
        },
        "sin": {
            "type": "csin",
            "x": pi
        },
        "i_times_sin": {
            "type": "cmul",
            "a": i,
            "b": { "type": "ref", "ref": "sin" }
        },
        "sum": {
            "type": "cadd",
            "a": { "type": "ref", "ref": "cos" },
            "b": { "type": "ref", "ref": "i_times_sin" },
        },

        "i_times_pi": {
            "type": "cmul",
            "a": i,
            "b": pi
        },
        "pow_e": {
            "type": "cexp",
            "x": { "type": "ref", "ref": "i_times_pi" }
        },

        "sub": {
            "type": "csub",
            "a": { "type": "ref", "ref": "sum" },
            "b": { "type": "ref", "ref": "pow_e" },
        },

        "print": {
            "type": "print",
            "input": { "type": "ref", "ref": "sub" }
        }
    }
}

================================================
FILE: examples/euler/run.sh
================================================
#!/bin/bash

tango run euler_general.jsonnet -d workspace --include-package complex_arithmetic


================================================
FILE: examples/eval_p3/README.md
================================================
# Evaluating T0

This example uses the `transformers::run_generation_dataset` step to run the
[T0 model](https://api.semanticscholar.org/CorpusID:239009562) on the
[XSum summarization data](https://github.com/EdinburghNLP/XSum), prompted in 10 different ways, and computes
ROUGE scores for each variant. Finally, it computes an overall ROUGE score.

This example uses mostly built-in Tango steps. You will need the `datasets` and `transformers` integrations.
The only custom step in this example is the `RougeScoreStep`, which computes ROUGE scores from the
generated text.
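
The custom step uses `torchmetrics`' `ROUGEScore` under the hood. As a rough illustration of what a ROUGE-1 F1 score measures, here is a deliberately simplified unigram-overlap sketch (it ignores stemming and repeated tokens, unlike the real metric):

```python
def rouge1_f1(prediction: str, target: str) -> float:
    # Unigram overlap between prediction and reference, combined
    # into an F1 score. A simplified sketch, not the metric used
    # in this example.
    pred_tokens = set(prediction.lower().split())
    target_tokens = set(target.lower().split())
    overlap = len(pred_tokens & target_tokens)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(target_tokens)
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("the cat sat on the mat", "the cat sat on the mat"))  # → 1.0
```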

================================================
FILE: examples/eval_p3/config.jsonnet
================================================
local model = "bigscience/T0_3B";
local batch_size = 8;

local datasets = [
    'xsum_DOC_boils_down_to_simple_idea_that',
    'xsum_DOC_given_above_write_one_sentence',
    'xsum_DOC_how_would_you_rephrase_few_words',
    'xsum_DOC_tldr',
    'xsum_DOC_write_summary_of_above',
    'xsum_article_DOC_summary',
    'xsum_college_roommate_asked_DOC_so_I_recap',
    'xsum_read_below_DOC_write_abstract',
    'xsum_summarize_DOC',
    'xsum_summarize_this_DOC_summary'
];

# This creates three steps for each of the datasets:
# 1. Load the dataset.
# 2. Generate output based on the dataset.
# 3. Evaluate the output against the gold answers.
local dataset_steps = std.foldl(
    function(x, dataset_name) x + {
        ["dataset_" + dataset_name]: {
            "type": "datasets::load",
            "path": "bigscience/P3",
            "name": dataset_name,
        },
        ["generation_" + dataset_name]: {
            "type": "transformers::run_generation_dataset",
            "max_length": 200,
            "input": {"ref": "dataset_" + dataset_name},
            "batch_size": batch_size,
            "model": model,
            "prompt_field": "inputs_pretokenized",
            "output_field": "generation",
            "splits": ["validation"]
        },
        ["eval_" + dataset_name]: {
            "type": "rouge_score",
            "input": {"ref": "generation_" + dataset_name},
            "input_split": "validation",
            "target_field": "targets_pretokenized",
            "prediction_field": "generation"
        }
    },
    datasets,
    {}
);

# In addition to the three steps per dataset, we also combine all the generations and
# evaluate them all together.
{
    "steps": dataset_steps + {
        "all_generations": {
            "type": "dataset_combine",
            "inputs": std.map(
                function(dataset_name) {"ref": "generation_" + dataset_name},
                datasets
            )
        },
        "all_evaluations": {
            "type": "rouge_score",
            "input": {"ref": "all_generations"},
            "input_split": "validation",
            "target_field": "targets_pretokenized",
            "prediction_field": "generation"
        }
    }
}


================================================
FILE: examples/eval_p3/eval.py
================================================
import logging
from typing import Dict

from torch import Tensor
from torchmetrics.text.rouge import ROUGEScore

from tango import Format, JsonFormat, Step
from tango.common import DatasetDict
from tango.common.tqdm import Tqdm

logger = logging.getLogger(__name__)


@Step.register("rouge_score")
class RougeScoreStep(Step[Dict[str, Tensor]]):
    VERSION = "002"
    FORMAT: Format = JsonFormat()

    def run(  # type: ignore
        self,
        input: DatasetDict,
        input_split: str,
        target_field: str,
        prediction_field: str,
        use_stemmer: bool = True,
    ) -> Dict[str, Tensor]:
        metric = ROUGEScore(
            use_stemmer=use_stemmer,
            rouge_keys=("rouge1", "rouge2", "rougeL"),
            accumulate="avg",
        )

        for instance in Tqdm.tqdm(input[input_split], desc="Calculating scores"):
            target = instance[target_field]
            for prediction in instance[prediction_field]:
                metric.update(prediction, target)

        return metric.compute()


================================================
FILE: examples/finetune/__init__.py
================================================


================================================
FILE: examples/finetune/config.jsonnet
================================================
##################
# Model settings #
##################

local pretrained_model = "t5-base";
local load_with_low_cpu_mem_usage = false;

local modules_to_wrap = ["[a-zA-Z_.]+\\.[0-9]+"];  # TODO: works for t5 and gpt2. confirm with other models too.

####################
# Trainer settings #
####################

# Trainer settings, adjust to your use-case.
local training_steps = 20;  # total number of optimization steps to train for
local validate_every = 5;  # how often to validate and save checkpoints

local devices = 1;  # number of devices to train on (will use GPUs if enough are available, otherwise CPU)
local grad_accum = 1;  # number of gradient accumulation steps (changes the effective batch size)
# This is the batch size per GPU, ignoring gradient accumulation:
local batch_size = 2;
# So the effective batch size is `batch_size * grad_accum * devices`

local activation_checkpointing = false;  # use activation/gradient checkpointing (probably needed for GPT-J 6B, but not gpt2)
local amp = false;  # use PyTorch's native automatic mixed precision
local fsdp = false;  # use FairScale's FullyShardedDataParallel (probably needed for GPT-J 6B, but not gpt2)
local cpu_offloading = false;  # Can only be used with 'fsdp' - saves a lot of GPU memory by offloading params+gradients to CPU, but is very slow.

######################
# Optimizer settings #
######################

local warmup_steps = 20;
local learning_rate = 0.00005;  # you can probably use a higher LR for a small model like "gpt2"


assert fsdp == true || cpu_offloading == false : "cpu_offloading only available with fsdp";

# FullyShardedDataParallel config:
local fsdp_config = if fsdp then {
    reshard_after_forward: true,
    move_params_to_cpu: cpu_offloading,
    move_grads_to_cpu: cpu_offloading,
    mixed_precision: amp,
} else null;

local training_engine = {
    type: if fsdp then "fairscale" else "torch",
    optimizer: {
        type: "torch::AdamW",
        lr: learning_rate,
        betas: [0.9, 0.95],
        eps: 1e-6,
    },
    lr_scheduler: {
        type: "transformers::linear",
        num_warmup_steps: warmup_steps,
        num_training_steps: training_steps,
    },
    amp: amp,
    [if fsdp then "fsdp_config" else null]: fsdp_config,
};

local distributed_dataloader = {
    batch_size: batch_size,
    sampler: {
        type: "torch::DistributedSampler",
        shuffle: true,
        drop_last: true,
    },
};

local single_device_dataloader = {
    shuffle: true,
    batch_size: batch_size,
};

local dataloader = if devices > 1 then distributed_dataloader else single_device_dataloader;

{
    steps: {
        raw_data: {
            type: "datasets::load",
            path: "snli",
        },
        /*"subset_data": {
            type: "subset-data",
            data: { type: "ref", ref: "raw_data" },
            max_samples: 10,
        },*/
        processed_data: {
            type: "snli-text2text",
            data: { type: "ref", ref: "raw_data" },
        },
        trained_model: {
            type: "transformers::finetune",
            model: {
                type: "fairscale::with_wrapped_modules",
                model: {
                    type: "transformers::finetune::from_pretrained",
                    pretrained_model_name_or_path: pretrained_model,
                    low_cpu_mem_usage: load_with_low_cpu_mem_usage,
                },
                modules_to_wrap: modules_to_wrap,  # tell FairScale to wrap the transformer's blocks individually
                fsdp_config: fsdp_config,
                activation_checkpointing: activation_checkpointing,
            },
            tokenizer: {
                pretrained_model_name_or_path: pretrained_model
            },
            dataset_dict: { type: "ref", ref: "processed_data" },
            train_dataloader: dataloader,
            validation_split: "validation",
            grad_accum: grad_accum,
            train_steps: training_steps,
            validate_every: validate_every,
            checkpoint_every: validate_every,
            log_every: 1,
            device_count: devices,
            training_engine: training_engine,
        },
        generations: {
            type: "transformers::run_generation_dataset",
            max_length: 5,
            input: {"type": "ref", "ref": "processed_data"},
            batch_size: batch_size,
            model: {"type": "ref", "ref": "trained_model"},
            prompt_field: "source",
            output_field: "generation",
            splits: ["validation"]
        }
    }
}


================================================
FILE: examples/finetune/snli_steps.py
================================================
from typing import Union

import datasets as ds

from tango.integrations.datasets import DatasetsFormat
from tango.step import Step


@Step.register("subset-data")
class SubsetData(Step):
    """
    Creates a subset of the data; mostly to be used for testing/debugging.
    """

    DETERMINISTIC = True
    CACHEABLE = True
    VERSION = "001"

    FORMAT = DatasetsFormat()

    def run(  # type: ignore
        self,
        data: Union[ds.DatasetDict, ds.Dataset],
        max_samples: int = 5,
    ) -> Union[ds.DatasetDict, ds.Dataset]:
        """
        Returns a copy of the `data` with number of samples limited to `max_samples` for
        each split.

        :param data:
            The dataset or dataset dict object.
        :param max_samples:
            The maximum number of samples to return per split.
        """

        # Unlike `ds.Dataset.select`, this works on both `ds.Dataset` and `ds.DatasetDict`.
        def filter_fn(example, indices):
            return indices < max_samples

        return data.filter(filter_fn, with_indices=True)


@Step.register("snli-text2text")
class SnliText2Text(Step):
    """
    Converts the snli dataset to a text-to-text format.

    Examples
    --------

    original_instance = {
        "premise": "Two cats are sitting on a wall.",
        "hypothesis": "The cats are chasing a mouse.",
        "label": 2  # contradiction
    }

    returned_instance = {
        "source": "nli premise: Two cats are sitting on a wall. hypothesis: The cats are chasing a mouse. label: "
        "target": "contradiction"
    }

    """

    DETERMINISTIC = True
    CACHEABLE = True
    VERSION = "001"

    FORMAT = DatasetsFormat()

    def run(  # type: ignore
        self,
        data: Union[ds.DatasetDict, ds.Dataset],
        source_prefix: str = "nli",
        premise_prefix: str = "premise",
        hypothesis_prefix: str = "hypothesis",
        label_prefix: str = "label",
        num_workers: int = 1,
    ) -> Union[ds.DatasetDict, ds.Dataset]:
        """
        :param data:
            The snli `Dataset` or `DatasetDict` object.
        :param source_prefix:
            The str to add before the start of the source sequence.
        :param premise_prefix:
            The str to add before the start of the `premise` in the source sequence.
        :param hypothesis_prefix:
            The str to add before the start of the `hypothesis` in the source sequence.
        :param label_prefix:
            The str to add as the prompt for the label.
        :param num_workers:
            The number of workers to use for processing the data.
        """

        def filter_no_gold(example, indices):
            return example["label"] != -1

        data = data.filter(filter_no_gold, with_indices=True)

        label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}

        def _mapper(example):
            return {
                "source": (
                    f'{source_prefix} {premise_prefix}: {example["premise"]} '
                    f'{hypothesis_prefix}: {example["hypothesis"]} {label_prefix}: '
                ),
                "target": f'{label_map[example["label"]]}',
            }

        if isinstance(data, ds.Dataset):
            old_cols = data.column_names
        else:
            old_cols = list(data.column_names.values())[0]

        dataset = data.map(
            _mapper,
            batched=False,
            num_proc=num_workers,
            remove_columns=old_cols,  # remove all old columns
            desc="Converting data to text-to-text format",
        )

        return dataset


================================================
FILE: examples/finetune/test.py
================================================
import typing

import datasets as ds
import pytest

from tango.common import Params
from tango.common.testing import TangoTestCase, run_experiment


class TestFinetuneSNLI(TangoTestCase):
    @pytest.mark.parametrize(
        "model, model_type",
        [("patrickvonplaten/t5-tiny-random", "t5"), ("sshleifer/tiny-gpt2", "gpt2")],
    )
    @typing.no_type_check  # mypy has become incompatible with the datasets library
    def test_config(self, model: str, model_type: str):
        overrides = {
            "steps.trained_model.model.model.pretrained_model_name_or_path": model,
            "steps.trained_model.tokenizer.pretrained_model_name_or_path": model,
            "steps.subset_data": {
                "type": "subset-data",
                "data": {"type": "ref", "ref": "raw_data"},
                "max_samples": 10,
            },
            "steps.processed_data.data.ref": "subset_data",
        }
        config = Params.from_file("config.jsonnet", params_overrides=overrides)
        # Make sure we've overridden the model entirely.
        flattened = config.as_flat_dict()
        for key, value in flattened.items():
            if "model_name" in key or (isinstance(value, str) and model_type in value):
                assert value == model

        with run_experiment(config, include_package=["snli_steps.py"]) as run_dir:
            assert (run_dir / "processed_data").is_dir()
            processed = ds.load_from_disk(run_dir / "processed_data" / "data")
            assert len(processed["train"][0].keys()) == 2
            assert "source" in processed["train"][0].keys()
            assert "target" in processed["train"][0].keys()
            assert processed["train"][0]["source"].startswith("nli premise:")

            assert (run_dir / "trained_model").is_dir()


================================================
FILE: examples/finetune_resnet/.gitignore
================================================
data/
results/
extra_testing.py


================================================
FILE: examples/finetune_resnet/config.jsonnet
================================================
local input_size = 224;
local batch_size = 32;
local num_classes = 2;
local val_size = 0.05;
local model = "resnet";
local feature_extract = true;
local distributed = false;
local devices = if distributed then 2 else 1;
local pretrained_model = "resnet_ft";
local training_steps = 500;
local validate_every = 50;
local image_url = "https://tinyurl.com/2p9xjvn9";

local distributed_dataloader = {
    batch_size: batch_size,
    sampler: {
        type: "torch::DistributedSampler",
        shuffle: true,
        drop_last: true,
    },
    collate_fn: {"type": "image_collator"},
};

local single_device_dataloader = {
    shuffle: true,
    batch_size: batch_size,
    collate_fn: {"type": "image_collator"},
};

{
    steps: {
        raw_data: {
            type: "datasets::load",
            path: "nateraw/auto-cats-and-dogs",
            name: "cats_and_dogs",
        },
        transform_data: {
            type: "transform_data",
            dataset: { type: 'ref', ref: 'raw_data' },
            input_size: input_size,
            val_size: val_size,
        },
        trained_model: {
            type: "torch::train",
            model: {
                type: pretrained_model,
                num_classes: num_classes,
                feature_extract: true,
                use_pretrained: true,
            },
            training_engine: {
                optimizer: {
                    type: "torch_adam",
                    lr: 0.001,
                },
            },
            dataset_dict: {"type": "ref", "ref": "transform_data"},
            train_dataloader: single_device_dataloader,
            validation_split: "val",
            val_metric_name: "accuracy",
            train_steps: training_steps,
            validate_every: validate_every,
            checkpoint_every: validate_every,
            log_every: 1,
            device_count: devices,
            minimize_val_metric: false,
        },
        prediction: {
            type: "prediction",
            image_url: image_url,
            input_size: input_size,
            model: {"type": "ref", "ref": "trained_model"},
        },
    },
}
 


================================================
FILE: examples/finetune_resnet/resnet_steps.py
================================================
from typing import Any, Dict, List, Optional

import datasets
import torch
from cached_path import cached_path
from PIL import Image
from torch import nn
from torch.optim import Adam
from torchvision import models, transforms

from tango import Format, JsonFormat, Step
from tango.integrations.torch import DataCollator, Model, Optimizer

# Register the Adam optimizer as an `Optimizer` so we can use it in the train step.
Optimizer.register("torch_adam")(Adam)


# Wrapper class around the pre-trained ResNet-18 model that modifies the final layer.
@Model.register("resnet_ft")
class ResNetWrapper(Model):
    def __init__(self, num_classes: int, feature_extract: bool, use_pretrained: bool):
        super().__init__()
        self.model_ft = models.resnet18(pretrained=use_pretrained)
        self.set_parameter_requires_grad(self.model_ft, feature_extract)
        num_features = self.model_ft.fc.in_features
        self.model_ft.fc = nn.Linear(num_features, num_classes)
        self.loss_fn = nn.CrossEntropyLoss()

    def set_parameter_requires_grad(self, model: nn.Module, feature_extracting: bool):
        if feature_extracting:
            for param in model.parameters():
                param.requires_grad = False

    def forward(  # type: ignore
        self, image: torch.Tensor, label: Optional[torch.Tensor] = None
    ) -> Dict[str, torch.Tensor]:
        output = self.model_ft(image)
        preds = torch.argmax(output, dim=1)
        if label is None:
            return {"preds": preds}
        loss = self.loss_fn(output, label)
        accuracy = (preds == label).float().mean()
        return {"loss": loss, "accuracy": accuracy}


# Custom data collator for images, that takes in a batch of images and labels and
# reformats the data so that it is suitable for the model.
@DataCollator.register("image_collator")
class ImageCollator(DataCollator[Dict[str, Any]]):
    def __call__(self, batch: List[Dict[str, Any]]) -> Dict[str, Any]:
        return {
            "image": torch.cat([item["image"].unsqueeze(0) for item in batch], dim=0),
            "label": torch.tensor([item["labels"] for item in batch]),
        }


# Function that returns an image transformations dict with the appropriate image size.
def get_data_transforms(input_size: int):
    data_transforms = {
        "train": transforms.Compose(
            [
                transforms.RandomResizedCrop(input_size),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
            ]
        ),
        "val": transforms.Compose(
            [
                transforms.Resize(input_size),
                transforms.CenterCrop(input_size),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
            ]
        ),
    }
    return data_transforms


# loads an image and applies the appropriate transformation
def pil_loader(path: str, input_size: int, transform_type: str):
    with open(path, "rb") as f:
        image = Image.open(f)
        image = image.convert("RGB")
        transform = get_data_transforms(input_size=input_size)[transform_type]
        transformed_image = transform(image)
        return transformed_image


# calls the image loader on every image in a given batch
def image_loader(example_batch, input_size: int, transform_type: str):
    example_batch["image"] = [
        pil_loader(f, input_size, transform_type) for f in example_batch["file"]
    ]
    return example_batch


# This step takes in raw image data and transforms and tokenizes it.
@Step.register("transform_data")
class TransformData(Step):
    DETERMINISTIC = True
    CACHEABLE = False

    def run(  # type: ignore
        self, dataset: datasets.DatasetDict, val_size: float, input_size: int
    ) -> datasets.DatasetDict:
        def image_loader_wrapper(example_batch):
            return image_loader(example_batch, input_size=input_size, transform_type="train")

        dataset = dataset.with_transform(image_loader_wrapper)
        train_val = dataset["train"].train_test_split(test_size=val_size)
        train_val["val"] = train_val.pop("test")
        return train_val


# function to map integer labels to string labels
def convert_to_label(int_label: int) -> str:
    if int_label == 0:
        return "cat"
    else:
        return "dog"


@Step.register("prediction")
class Prediction(Step):
    FORMAT: Format = JsonFormat()

    def run(  # type: ignore
        self, image_url: str, input_size: int, model: Model, device: Optional[str] = "cpu"
    ) -> Dict[str, Any]:
        # download and store image
        image_path = cached_path(image_url)
        transformed_image = pil_loader(image_path, input_size, transform_type="val")

        # pass image through transform
        transformed_image = transformed_image.unsqueeze(0).to(device)

        # pass image through model and get the prediction
        prediction = model(image=transformed_image, label=None)["preds"][0].float()
        label = convert_to_label(prediction)
        return {"image_url": image_url, "local_path": image_path, "label": label}


================================================
FILE: examples/flax/config.jsonnet
================================================
{
    "steps": {
        "data": {
            "type": "datasets::load",
            "path": "xsum",
        },
        "tokenize": {
            "type": "tokenize_data",
            "dataset": {
                "type": "ref",
                "ref": "data"
            }
        },
        "train": {
            "type": "flax::train",
            "model": {
                "type": "transformers::FlaxAutoModelForSeq2SeqLM::from_pretrained",
                "pretrained_model_name_or_path": "facebook/bart-base"
            },
            "dataset": {
                "type": "ref",
                "ref": "tokenize"
            },
            "optimizer": {
                "type" : "optax::adamw",
                "learning_rate" : 2e-5
            },
            "train_dataloader": {
                "batch_size": 16,
                "drop_last": true
            },
            "wrapper": {
                "type": "xsum_wrapper"
            },
            "train_split": "train",
            "validation_split" : "validation",
            "validate_every" : 1000,
            "validation_dataloader": {
                "batch_size": 16,
                "drop_last": true
            },
            "train_epoch": 5,
            "checkpoint_every": 1000,
            "log_every": 1000,

            "callbacks" : [
                //{"type" : "wandb::log_flax"},
                {"type": "flax::generate_step"}
            ]
        },
        "eval": {
            "type": "flax::eval",
            "state": {
                "type": "ref",
                "ref": "train"
            },
            "dataset": {
                "type": "ref",
                "ref": "tokenize"
            },
            "dataloader": {
                "batch_size": 16,
                "drop_last": true
            },
            "wrapper": {
                "type" : "xsum_wrapper"
            }
        }
    }
}

================================================
FILE: examples/flax/run.sh
================================================
#!/bin/bash

tango run config.jsonnet -d workspace --include-package xsum


================================================
FILE: examples/flax/xsum.py
================================================
import logging
from typing import List, Optional

import jax
import jax.numpy as jnp
import nltk
import numpy as np
import optax
from datasets import load_metric
from flax.training.common_utils import onehot
from transformers import AutoConfig, AutoTokenizer, FlaxAutoModelForSeq2SeqLM

from tango.integrations.flax import FlaxWrapper
from tango.integrations.flax.train_callback import TrainCallback
from tango.step import Step

"""
XSum Summarization with facebook/bart-base
"""


@Step.register("tokenize_data")
class PreProcessing(Step):
    DETERMINISTIC = False

    def run(self, dataset):
        tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
        model = FlaxAutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
        model_module = __import__(model.__module__, fromlist=["shift_tokens_right"])
        shift_tokens_right_fn = getattr(model_module, "shift_tokens_right")
        config = AutoConfig.from_pretrained("facebook/bart-base")

        MAX_SOURCE_LENGTH = 512
        MAX_TGT_LENGTH = 64

        def preprocess_function(examples):
            inputs = examples["document"]
            targets = examples["summary"]
            inputs = [inp for inp in inputs]
            model_inputs = tokenizer(
                inputs,
                max_length=MAX_SOURCE_LENGTH,
                padding="max_length",
                truncation=True,
                return_tensors="np",
            )

            # Setup the tokenizer for targets
            with tokenizer.as_target_tokenizer():
                labels = tokenizer(
                    targets,
                    max_length=MAX_TGT_LENGTH,
                    padding="max_length",
                    truncation=True,
                    return_tensors="np",
                )

            model_inputs["labels"] = labels["input_ids"]
            decoder_input_ids = shift_tokens_right_fn(
                labels["input_ids"], config.pad_token_id, config.decoder_start_token_id
            )
            model_inputs["decoder_input_ids"] = np.asarray(decoder_input_ids)

            # We need decoder_attention_mask so we can ignore pad tokens from loss
            model_inputs["decoder_attention_mask"] = labels["attention_mask"]

            return model_inputs

        column_names = dataset["train"].column_names

        dataset = dataset.map(
            preprocess_function,
            batched=True,
            remove_columns=column_names,
            desc="Running tokenizer on dataset",
        )

        return dataset


@FlaxWrapper.register("xsum_wrapper")  # type: ignore
class TransformerWrapper(FlaxWrapper):
    def loss_helper(self, logits, labels, batch):
        label_smoothing_factor = 0
        padding_mask = batch["decoder_attention_mask"]
        vocab_size = logits.shape[-1]
        confidence = 1.0 - label_smoothing_factor
        low_confidence = (1.0 - confidence) / (vocab_size - 1)
        normalizing_constant = -(
            confidence * jnp.log(confidence)
            + (vocab_size - 1) * low_confidence * jnp.log(low_confidence + 1e-20)
        )
        soft_labels = onehot(labels, vocab_size, on_value=confidence, off_value=low_confidence)

        loss = optax.softmax_cross_entropy(logits, soft_labels)
        loss = loss - normalizing_constant

        # ignore padded tokens from loss
        loss = loss * padding_mask
        loss = loss.sum() / padding_mask.sum()
        return loss

    def train_loss(self, params, state, batch, dropout_rng, labels):
        logits = state.apply_fn(**batch, params=params, dropout_rng=dropout_rng, train=True)[0]
        loss = self.loss_helper(logits, labels, batch)
        return loss

    def val_metrics(self, batch, logits, labels):
        loss = self.loss_helper(logits, labels, batch)
        metrics = {"loss": loss}
        return metrics

    def eval_metrics(self, batch, logits, labels):
        loss = self.loss_helper(logits, labels, batch)
        metrics = {"loss": loss}
        return metrics


@TrainCallback.register("flax::generate_step")
class GenerateCallback(TrainCallback):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.logger = logging.getLogger(GenerateCallback.__name__)

    def generate_step(self, params, batch):
        self.model.params = params
        gen_kwargs = {"max_length": 64, "num_beams": self.model.config.num_beams}
        output_ids = self.model.generate(
            batch["input_ids"], attention_mask=batch["attention_mask"], **gen_kwargs
        )
        return output_ids.sequences

    def pre_train_loop(self) -> None:
        if len(jax.devices()) > 1:
            self.p_generate_step = jax.pmap(self.generate_step, axis_name="batch")

    def pre_val_loop(self, step: int, val_step: int, state) -> None:
        self.state = state
        self.eval_preds: List = []
        self.eval_labels: List = []

    def pre_val_batch(self, step: int, val_step: int, epoch: int, val_batch) -> None:
        labels = val_batch["labels"]
        if len(jax.devices()) > 1:
            generated_ids = self.p_generate_step(self.state.params, val_batch)
        else:
            generated_ids = self.generate_step(self.state.params, val_batch)
        self.eval_preds.extend(jax.device_get(generated_ids.reshape(-1, 64)))
        self.eval_labels.extend(jax.device_get(labels.reshape(-1, labels.shape[-1])))

    def postprocess_text(self, preds, labels):
        preds = [pred.strip() for pred in preds]
        labels = [label.strip() for label in labels]

        # rougeLSum expects newline after each sentence
        preds = ["\n".join(nltk.sent_tokenize(pred)) for pred in preds]
        labels = ["\n".join(nltk.sent_tokenize(label)) for label in labels]

        return preds, labels

    def compute_metrics(self, preds, labels):
        tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
        decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
        decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

        # Some simple post-processing
        decoded_preds, decoded_labels = self.postprocess_text(decoded_preds, decoded_labels)
        metric = load_metric("rouge")
        result = metric.compute(
            predictions=decoded_preds, references=decoded_labels, use_stemmer=True
        )
        # Extract a few results from ROUGE
        result = {key: value.mid.fmeasure * 100 for key, value in result.items()}

        prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
        result["gen_len"] = np.mean(prediction_lens)
        result = {k: round(v, 4) for k, v in result.items()}
        return result

    def post_val_loop(
        self, step: int, epoch: int, val_metric: Optional[float], best_val_metric: Optional[float]
    ) -> None:
        rouge_metrics = self.compute_metrics(self.eval_preds, self.eval_labels)
        rouge_desc = " ".join([f"Eval {key}: {value} |" for key, value in rouge_metrics.items()])
        self.logger.info(rouge_desc)


================================================
FILE: examples/train_lm/.gitignore
================================================
runs
run


================================================
FILE: examples/train_lm/README.md
================================================
# Fine-tuning a language model

<!-- start overview -->

This Tango example showcases how you could train or fine-tune a causal language model like [GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2)
or [GPT-J](https://huggingface.co/docs/transformers/model_doc/gptj) from [transformers](https://github.com/huggingface/transformers) on WikiText2 or a similar dataset.
It's best that you run this experiment on a machine with a GPU and PyTorch [properly installed](https://pytorch.org/get-started/locally/#start-locally), otherwise Tango will fall back to CPU-only and it will be extremely slow.

This example also depends on [FairScale](https://fairscale.readthedocs.io/en/latest/), which allows you to leverage [`FullyShardedDataParallel`](https://fairscale.readthedocs.io/en/latest/api/nn/fsdp.html) (FSDP) and [activation checkpointing](https://fairscale.readthedocs.io/en/latest/api/nn/checkpoint/checkpoint_activations.html) to fine-tune [GPT-J 6B](https://huggingface.co/EleutherAI/gpt-j-6B) or a similar-sized model. Just set the constants `fsdp` and `activation_checkpointing` in the config to `true`.
Without using CPU offloading you'll need at least 4 x 40GiB A100 GPUs, or a different configuration with a comparable amount of total GPU memory.

<!-- end overview -->

To get started, just run

```
tango run config.jsonnet -i tokenize_step.py
```
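
The accompanying `test.py` drives this same config programmatically by passing dotted override keys such as `steps.trained_model.train_steps` to `Params.from_file`. As an illustrative sketch only (this is *not* Tango's actual override implementation), a dotted key can be thought of as a path into the nested config:

```python
# Minimal model of how a dotted override key (as used in test.py)
# addresses a path into a nested config dict. Hypothetical helper,
# for illustration only.
def apply_override(config: dict, dotted_key: str, value) -> None:
    """Walk the nested dict along the dotted path and set the leaf."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

config = {"steps": {"trained_model": {"train_steps": 200}}}
apply_override(config, "steps.trained_model.train_steps", 4)
assert config["steps"]["trained_model"]["train_steps"] == 4
```

This is why the test can shrink the run (fewer steps, a tiny model) without editing `config.jsonnet` itself.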


================================================
FILE: examples/train_lm/config.jsonnet
================================================
##################
# Model settings #
##################

local pretrained_model = "gpt2";
# With 'fsdp' and 'activation_checkpointing' (see constants below), you should be able to train
# a 6B model on 4x ~40GB GPUs:
# local pretrained_model = "EleutherAI/gpt-j-6B";

# This doesn't seem to work with gpt2, but works fine with gpt-j.
local load_with_low_cpu_mem_usage = std.startsWith(pretrained_model, "EleutherAI/gpt-j");

####################
# Trainer settings #
####################

# Trainer settings, adjust to your use-case.
local training_steps = 200;  # total number of optimization steps to train for
local validate_every = 20;  # how often to validate and save checkpoints

local devices = 1;  # number of devices to train on (will use GPUs if enough are available, otherwise CPU)
local grad_accum = 1;  # number of gradient accumulation steps (changes the effective batch size)
# This is the batch size per GPU, ignoring gradient accumulation:
local batch_size = 8;
# So the effective batch size is `batch_size * grad_accum * devices`

local activation_checkpointing = false;  # use activation/gradient checkpointing (probably needed for GPT-J 6B, but not gpt2)
local amp = false;  # use PyTorch's native automatic mixed precision
local fsdp = false;  # Use FairScale's FullyShardedDataParallel (probably needed for GPT-J 6B, but not gpt2)
local cpu_offloading = false;  # Can only be used with 'fsdp' - saves a lot of GPU memory by offloading params+gradients to CPU, but is very slow.

######################
# Optimizer settings #
######################

local warmup_steps = 20;
local learning_rate = 0.00005;  # you can probably use a higher LR for a small model like "gpt2"


# <----- you probably don't need to edit below this line ----> #


assert fsdp == true || cpu_offloading == false : "cpu_offloading only available with fsdp";

# FullyShardedDataParallel config:
local fsdp_config = if fsdp then {
    reshard_after_forward: true,
    move_params_to_cpu: cpu_offloading,
    move_grads_to_cpu: cpu_offloading,
    mixed_precision: amp,
} else null;

local training_engine = {
    type: if fsdp then "fairscale" else "torch",
    optimizer: {
        type: "torch::AdamW",
        lr: learning_rate,
        betas: [0.9, 0.95],
        eps: 1e-6,
    },
    lr_scheduler: {
        type: "transformers::linear",
        num_warmup_steps: warmup_steps,
        num_training_steps: training_steps,
    },
    amp: amp,
    [if fsdp then "fsdp_config" else null]: fsdp_config,
};

local distributed_dataloader = {
    batch_size: batch_size,
    collate_fn: { type: "transformers::DefaultDataCollator" },
    sampler: {
        type: "torch::DistributedSampler",
        shuffle: true,
        drop_last: true,
    },
};

local single_device_dataloader = {
    shuffle: true,
    batch_size: batch_size,
    collate_fn: { type: "transformers::DefaultDataCollator" },
};

local dataloader = if devices > 1 then distributed_dataloader else single_device_dataloader;

{
    steps: {
        raw_data: {
            type: "datasets::load",
            path: "wikitext",
            name: "wikitext-2-raw-v1",
        },
        tokenized_data: {
            type: "tokenize_data",
            dataset: { type: "ref", ref: "raw_data" },
            tokenizer: { pretrained_model_name_or_path: pretrained_model }
        },
        trained_model: {
            type: "torch::train",
            model: {
                type: "fairscale::with_wrapped_modules",
                model: {
                    type: "transformers::AutoModelForCausalLM::from_pretrained",
                    pretrained_model_name_or_path: pretrained_model,
                    low_cpu_mem_usage: load_with_low_cpu_mem_usage,
                },
                modules_to_wrap: ["transformer\\.h\\.[0-9]+"],  # tell FairScale to wrap the transformer's blocks individually
                fsdp_config: fsdp_config,
                activation_checkpointing: activation_checkpointing,
            },
            dataset_dict: { type: "ref", ref: "tokenized_data" },
            train_dataloader: dataloader,
            validation_split: "validation",
            grad_accum: grad_accum,
            train_steps: training_steps,
            validate_every: validate_every,
            checkpoint_every: validate_every,
            log_every: 1,
            device_count: devices,
            training_engine: training_engine,
        },
        final_metrics: {
            type: "torch::eval",
            model: { type: "ref", ref: "trained_model" },
            dataset_dict: { type: "ref", ref: "tokenized_data" },
            dataloader: single_device_dataloader,
            test_split: "test",
        },
    }
}
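
As a quick sanity check of the batch-size comment in the config above, the effective batch size for the default constants works out as follows (a trivial worked example, using the defaults from this file):

```python
# Effective batch size per the comment in config.jsonnet:
# batch_size * grad_accum * devices (defaults in this config: 8, 1, 1).
batch_size, grad_accum, devices = 8, 1, 1
effective_batch_size = batch_size * grad_accum * devices
assert effective_batch_size == 8
```

Raising `grad_accum` or `devices` scales the effective batch size without changing the per-GPU memory footprint of a single forward/backward pass.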


================================================
FILE: examples/train_lm/test.py
================================================
from tango.common import Params
from tango.common.testing import run_experiment


def test_small_experiment():
    model = "sshleifer/tiny-gpt2"
    dataloader = {
        "batch_size": 2,
        "collate_fn": {"type": "transformers::DefaultDataCollator"},
    }
    steps = 4
    overrides = {
        "steps.tokenized_data.block_size": 64,
        # Override the model in the config with the tiny alternative so training is fast.
        "steps.tokenized_data.tokenizer.pretrained_model_name_or_path": model,
        "steps.trained_model.model.model.pretrained_model_name_or_path": model,
        # Use a small number of training/validation/eval steps.
        "steps.trained_model.training_engine.lr_scheduler.num_warmup_steps": 1,
        "steps.trained_model.training_engine.lr_scheduler.num_training_steps": steps,
        "steps.trained_model.train_steps": steps,
        "steps.trained_model.validation_steps": 2,
        "steps.trained_model.validate_every": steps,
        "steps.final_metrics.eval_steps": 2,
        "steps.trained_model.checkpoint_every": steps,
        "steps.trained_model.device_count": 1,
        # Override data loaders.
        "steps.trained_model.train_dataloader": dataloader,
        "steps.trained_model.validation_dataloader": dataloader,
        "steps.final_metrics.dataloader": dataloader,
    }

    # Load the config.
    config = Params.from_file("config.jsonnet", params_overrides=overrides)

    # Make sure we've overridden the model entirely.
    flattened = config.as_flat_dict()
    for key, value in flattened.items():
        if "model_name" in key or (isinstance(value, str) and "gpt" in value):
            assert value == model

    with run_experiment(config, include_package=["tokenize_step.py"]) as run_dir:
        assert (run_dir / "trained_model").is_dir()


================================================
FILE: examples/train_lm/tokenize_step.py
================================================
import datasets

from tango import Step
from tango.integrations.datasets import DatasetsFormat
from tango.integrations.transformers import Tokenizer


# We need a step to tokenize the raw data. The result of this step will be passed
# directly into the "torch::train" step.
@Step.register("tokenize_data")
class TokenizeData(Step):
    DETERMINISTIC = True
    CACHEABLE = True
    FORMAT = DatasetsFormat()

    def run(  # type: ignore[override]
        self,
        dataset: datasets.DatasetDict,
        tokenizer: Tokenizer,
        block_size: int = 1024,
        num_workers: int = 1,
        field_to_tokenize: str = "text",
    ) -> datasets.DatasetDict:
        def tokenize_function(example):
            return tokenizer(example[field_to_tokenize])

        dataset = dataset.map(
            tokenize_function,
            batched=True,
            num_proc=num_workers,
            remove_columns=list(dataset.column_names.values())[0],  # remove all old columns
            desc="Tokenizing dataset",
        )

        def group_texts(examples):
            # Concatenate all texts.
            concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}  # type: ignore
            total_length = len(concatenated_examples[list(examples.keys())[0]])
            # We drop the small remainder, we could add padding if the model supported
            # it instead of this drop, you can customize this part to your needs.
            if total_length >= block_size:
                total_length = (total_length // block_size) * block_size
            # Split by chunks of max_len.
            result = {
                k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
                for k, t in concatenated_examples.items()
            }
            result["labels"] = result["input_ids"].copy()
            return result

        dataset = dataset.map(
            group_texts,
            batched=True,
            num_proc=num_workers,
            desc=f"Grouping texts into chunks of {block_size}",
        )

        return dataset
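
The chunking performed by `group_texts` above can be illustrated standalone on toy token lists (a hypothetical mini-example using plain lists instead of a `datasets.DatasetDict`):

```python
# Standalone illustration of the group_texts chunking logic above,
# operating on plain Python lists rather than a datasets batch.
def group_texts_demo(examples: dict, block_size: int) -> dict:
    # Concatenate all texts.
    concatenated = {k: sum(examples[k], []) for k in examples}
    total_length = len(next(iter(concatenated.values())))
    # Drop the small remainder, just like the step does.
    if total_length >= block_size:
        total_length = (total_length // block_size) * block_size
    # Split into chunks of block_size.
    result = {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }
    result["labels"] = result["input_ids"].copy()
    return result

examples = {"input_ids": [[1, 2, 3], [4, 5, 6, 7]]}  # 7 tokens total
out = group_texts_demo(examples, block_size=3)
assert out["input_ids"] == [[1, 2, 3], [4, 5, 6]]  # trailing token 7 is dropped
assert out["labels"] == out["input_ids"]
```

Note how the labels are simply a copy of the input IDs: for causal language modeling, the model's loss function shifts them internally.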


================================================
FILE: integration_tests/README.md
================================================
# Integration tests

This is a collection of longer-running, end-to-end tests of various parts of the Tango library.

The easiest way to run any of these integration tests is by triggering the [**Integration tests**](https://github.com/allenai/tango/actions/workflows/integration_tests.yml)
workflow on GitHub Actions. Just select the "Run workflow" dropdown, then pick the test to run and the Beaker cluster to run it on,
and finally hit the "Run workflow" button.

Each test should have a `run.sh` file in its folder that will run the relevant tango command.
This is what the **Integration tests** workflow will call, and you can also use it to run the test manually.


================================================
FILE: integration_tests/fairscale_benchmarks/README.md
================================================
# FairScale Benchmarks

This integration test is for checking the performance of the `FairScaleTrainingEngine` with various configurations.

**When to run it:** It should be run every time there is a major PyTorch or FairScale upgrade.

**Where to run it:** A server with 4 A100 GPUs. Make sure you set your `WANDB_API_KEY` environment variable.

**How to run it:** From the root directory of this repository, run:
```
integration_tests/fairscale_benchmarks/run.sh
```

By default, not all configurations are run. To change which configurations are run, open `config.jsonnet`
and search for "enabled", then toggle the `enabled` field to `true` or `false` for each configuration.

**What to look for:** The training jobs shouldn't fail, for one. After `tango run` completes, check the corresponding Weights & Biases
dashboard and inspect the results. Compare the various "fsdp" training runs with the baseline to ensure you see memory savings.


SYMBOL INDEX (1387 symbols across 156 files)

FILE: docs/source/conf.py
  class ShutupSphinxAutodocTypehintsFilter (line 125) | class ShutupSphinxAutodocTypehintsFilter(logging.Filter):
    method filter (line 126) | def filter(self, record: logging.LogRecord) -> bool:

FILE: examples/euler/complex_arithmetic.py
  function make_complex (line 9) | def make_complex(x: ComplexOrTuple) -> complex:
  class AdditionStep (line 19) | class AdditionStep(Step):
    method run (line 20) | def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # typ...
  class SubtractionStep (line 25) | class SubtractionStep(Step):
    method run (line 26) | def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # typ...
  class ExponentiateStep (line 31) | class ExponentiateStep(Step):
    method run (line 32) | def run(self, x: ComplexOrTuple, base: ComplexOrTuple = cmath.e) -> co...
  class MultiplyStep (line 37) | class MultiplyStep(Step):
    method run (line 38) | def run(self, a: ComplexOrTuple, b: ComplexOrTuple) -> complex:  # typ...
  class SineStep (line 43) | class SineStep(Step):
    method run (line 44) | def run(self, x: ComplexOrTuple) -> complex:  # type: ignore
  class CosineStep (line 49) | class CosineStep(Step):
    method run (line 50) | def run(self, x: ComplexOrTuple) -> complex:  # type: ignore

FILE: examples/eval_p3/eval.py
  class RougeScoreStep (line 15) | class RougeScoreStep(Step[Dict[str, Tensor]]):
    method run (line 19) | def run(  # type: ignore

FILE: examples/finetune/snli_steps.py
  class SubsetData (line 10) | class SubsetData(Step):
    method run (line 21) | def run(  # type: ignore
  class SnliText2Text (line 44) | class SnliText2Text(Step):
    method run (line 70) | def run(  # type: ignore

FILE: examples/finetune/test.py
  class TestFinetuneSNLI (line 10) | class TestFinetuneSNLI(TangoTestCase):
    method test_config (line 16) | def test_config(self, model: str, model_type: str):

FILE: examples/finetune_resnet/resnet_steps.py
  class ResNetWrapper (line 20) | class ResNetWrapper(Model):
    method __init__ (line 21) | def __init__(self, num_classes: int, feature_extract: bool, use_pretra...
    method set_parameter_requires_grad (line 29) | def set_parameter_requires_grad(self, model: models, feature_extractin...
    method forward (line 34) | def forward(  # type: ignore
  class ImageCollator (line 49) | class ImageCollator(DataCollator[Dict[str, Any]]):
    method __call__ (line 50) | def __call__(self, batch: List[Dict[str, Any]]) -> Dict[str, Any]:
  function get_data_transforms (line 58) | def get_data_transforms(input_size: int):
  function pil_loader (line 81) | def pil_loader(path: str, input_size: int, transform_type: str):
  function image_loader (line 91) | def image_loader(example_batch, input_size: int, transform_type: str):
  class TransformData (line 100) | class TransformData(Step):
    method run (line 104) | def run(  # type: ignore
  function convert_to_label (line 117) | def convert_to_label(int_label: int) -> str:
  class Prediction (line 125) | class Prediction(Step):
    method run (line 128) | def run(  # type: ignore

FILE: examples/flax/xsum.py
  class PreProcessing (line 23) | class PreProcessing(Step):
    method run (line 26) | def run(self, dataset):
  class TransformerWrapper (line 82) | class TransformerWrapper(FlaxWrapper):
    method loss_helper (line 83) | def loss_helper(self, logits, labels, batch):
    method train_loss (line 103) | def train_loss(self, params, state, batch, dropout_rng, labels):
    method val_metrics (line 108) | def val_metrics(self, batch, logits, labels):
    method eval_metrics (line 113) | def eval_metrics(self, batch, logits, labels):
  class GenerateCallback (line 120) | class GenerateCallback(TrainCallback):
    method __init__ (line 121) | def __init__(self, *args, **kwargs) -> None:
    method generate_step (line 125) | def generate_step(self, params, batch):
    method pre_train_loop (line 133) | def pre_train_loop(self) -> None:
    method pre_val_loop (line 137) | def pre_val_loop(self, step: int, val_step: int, state) -> None:
    method pre_val_batch (line 142) | def pre_val_batch(self, step: int, val_step: int, epoch: int, val_batc...
    method postprocess_text (line 151) | def postprocess_text(self, preds, labels):
    method compute_metrics (line 161) | def compute_metrics(self, preds, labels):
    method post_val_loop (line 180) | def post_val_loop(

FILE: examples/train_lm/test.py
  function test_small_experiment (line 5) | def test_small_experiment():

FILE: examples/train_lm/tokenize_step.py
  class TokenizeData (line 11) | class TokenizeData(Step):
    method run (line 16) | def run(  # type: ignore[override]

FILE: scripts/hash_extras.py
  function main (line 8) | def main():

FILE: scripts/prepare_changelog.py
  function main (line 7) | def main():

FILE: scripts/prepare_citation_cff.py
  function main (line 7) | def main():

FILE: scripts/release_notes.py
  function get_change_log_notes (line 20) | def get_change_log_notes() -> str:
  function get_commit_history (line 46) | def get_commit_history() -> str:
  function main (line 74) | def main():

FILE: tango/__main__.py
  class SettingsObject (line 113) | class SettingsObject(NamedTuple):
  function main (line 148) | def main(
  function cleanup (line 173) | def cleanup(*args, **kwargs):
  function run (line 241) | def run(
  function beaker_executor_run (line 313) | def beaker_executor_run(
  function info (line 348) | def info(obj: SettingsObject):
  function settings (line 392) | def settings(ctx):
  function init (line 413) | def init(obj: SettingsObject, path: Optional[str] = None, force: bool = ...
  function set_setting (line 428) | def set_setting(obj: SettingsObject):
  function save_settings (line 439) | def save_settings(settings: TangoGlobalSettings):
  function workspace (line 455) | def workspace(obj: SettingsObject, workspace: str, validate: bool = True...
  function include_package (line 494) | def include_package(
  function log_level (line 527) | def log_level(obj: SettingsObject, level: str) -> TangoGlobalSettings:
  function file_friendly_logging (line 541) | def file_friendly_logging(obj: SettingsObject, value: bool) -> TangoGlob...
  function multiprocessing_start_method (line 555) | def multiprocessing_start_method(obj: SettingsObject, start_method: str)...
  function env (line 573) | def env(obj: SettingsObject, key: str, value: str) -> TangoGlobalSettings:
  function _run (line 594) | def _run(

FILE: tango/cli.py
  function load_settings (line 30) | def load_settings(settings: Union[str, Params, dict, None] = None) -> Ta...
  function tango_cli (line 41) | def tango_cli(settings: Union[TangoGlobalSettings, str, Params, dict, No...
  function initialize_cli (line 52) | def initialize_cli(
  function cleanup_cli (line 88) | def cleanup_cli():
  function prepare_workspace (line 92) | def prepare_workspace(
  function prepare_executor (line 112) | def prepare_executor(
  function execute_step_graph (line 164) | def execute_step_graph(

FILE: tango/common/aliases.py
  class EnvVarNames (line 9) | class EnvVarNames(Enum):
    method values (line 19) | def values(cls) -> Set[str]:

FILE: tango/common/dataset_dict.py
  class DatasetDictBase (line 9) | class DatasetDictBase(Generic[S], Mapping[str, S]):
    method __getitem__ (line 24) | def __getitem__(self, split: str) -> S:
    method __contains__ (line 30) | def __contains__(self, split: str) -> bool:  # type: ignore[override]
    method __iter__ (line 36) | def __iter__(self) -> Iterator[str]:
    method __len__ (line 42) | def __len__(self) -> int:
    method keys (line 48) | def keys(self):
  class DatasetDict (line 56) | class DatasetDict(DatasetDictBase[Sequence[T]], Generic[T]):
  class IterableDatasetDict (line 64) | class IterableDatasetDict(DatasetDictBase[Iterable[T]], Generic[T]):

FILE: tango/common/det_hash.py
  class CustomDetHash (line 23) | class CustomDetHash:
    method det_hash_object (line 35) | def det_hash_object(self) -> Any:
  class DetHashFromInitParams (line 42) | class DetHashFromInitParams(CustomDetHash):
    method __new__ (line 50) | def __new__(cls, *args, **kwargs):
    method det_hash_object (line 59) | def det_hash_object(self) -> Any:
  class DetHashWithVersion (line 64) | class DetHashWithVersion(CustomDetHash):
    method det_hash_object (line 87) | def det_hash_object(self) -> Any:
  class _DetHashPickler (line 100) | class _DetHashPickler(dill.Pickler):
    method __init__ (line 101) | def __init__(self, buffer: io.BytesIO):
    method save (line 115) | def save(self, obj, save_persistent_id=True):
    method persistent_id (line 120) | def persistent_id(self, obj: Any) -> Any:
  function det_hash (line 148) | def det_hash(o: Any) -> str:
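  `det_hash` gives every object a deterministic string digest, which is how Tango decides whether a step's cached result can be reused. The real implementation pickles with `dill` via `_DetHashPickler` and honors the `CustomDetHash` / `DetHashWithVersion` hooks listed above; a simplified stdlib-only analogue of the core idea (serialize, then hash the bytes) might look like:

  ```python
  import hashlib
  import io
  import pickle

  def det_hash(o) -> str:
      # Serialize the object to bytes, then hash those bytes.
      # (Tango's version uses dill, which handles more object types,
      # and lets objects override what gets hashed via det_hash_object.)
      buffer = io.BytesIO()
      pickle.Pickler(buffer).dump(o)
      return hashlib.sha256(buffer.getvalue()).hexdigest()
  ```

  Note that plain `pickle` output is only reliably stable within a single interpreter session for simple built-in types; the hook-based design in `CustomDetHash` exists precisely so classes can supply a canonical, stable representation instead.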

FILE: tango/common/exceptions.py
  class TangoError (line 8) | class TangoError(Exception):
  class ConfigurationError (line 14) | class ConfigurationError(TangoError):
    method __reduce__ (line 20) | def __reduce__(self) -> Union[str, Tuple[Any, ...]]:
    method __init__ (line 23) | def __init__(self, message: str):
    method __str__ (line 27) | def __str__(self):
  class RegistryKeyError (line 31) | class RegistryKeyError(ConfigurationError):
  class CancellationError (line 38) | class CancellationError(TangoError):
  class SigTermReceived (line 44) | class SigTermReceived(CancellationError):
  class StepCancelled (line 50) | class StepCancelled(CancellationError):
  class RunCancelled (line 54) | class RunCancelled(CancellationError):
  class CliRunError (line 58) | class CliRunError(TangoError):
  class IntegrationMissingError (line 64) | class IntegrationMissingError(TangoError):
    method __init__ (line 69) | def __init__(self, integration: str, dependencies: Optional[Set[str]] ...
  class StepStateError (line 79) | class StepStateError(TangoError):
    method __init__ (line 84) | def __init__(
  class DirtyRepoError (line 98) | class DirtyRepoError(TangoError):
  class ExecutorError (line 104) | class ExecutorError(TangoError):

FILE: tango/common/file_lock.py
  class FileLock (line 11) | class FileLock(_FileLock):  # type: ignore[valid-type,misc]
    method __init__ (line 22) | def __init__(self, lock_file: PathOrStr, timeout=-1, read_only_ok: boo...
    method acquire (line 26) | def acquire(  # type: ignore[override]
    method acquire_with_updates (line 52) | def acquire_with_updates(self, desc: Optional[str] = None) -> AcquireR...

FILE: tango/common/from_params.py
  class UnionType (line 33) | class UnionType:  # type: ignore
  function takes_arg (line 46) | def takes_arg(obj, arg: str) -> bool:
  function takes_kwargs (line 62) | def takes_kwargs(obj) -> bool:
  function is_base_registrable (line 81) | def is_base_registrable(cls) -> bool:
  function remove_optional (line 99) | def remove_optional(annotation: type):
  function infer_constructor_params (line 114) | def infer_constructor_params(
  function infer_method_params (line 125) | def infer_method_params(
  function create_kwargs (line 208) | def create_kwargs(
  function create_extras (line 279) | def create_extras(cls: Type[T], extras: Dict[str, Any]) -> Dict[str, Any]:
  function pop_and_construct_arg (line 307) | def pop_and_construct_arg(
  function _params_contain_step (line 359) | def _params_contain_step(o: Any) -> bool:
  function construct_arg (line 379) | def construct_arg(
  class FromParams (line 677) | class FromParams(DetHashWithVersion):
    method from_params (line 685) | def from_params(
    method to_params (line 832) | def to_params(self) -> Params:
    method _to_params (line 860) | def _to_params(self) -> Dict[str, Any]:

FILE: tango/common/lazy.py
  class Lazy (line 11) | class Lazy(Generic[T], CustomDetHash):
    method __init__ (line 53) | def __init__(
    method constructor (line 66) | def constructor(self) -> Callable[..., T]:
    method construct (line 81) | def construct(self, **kwargs) -> T:
    method det_hash_object (line 90) | def det_hash_object(self) -> Any:

FILE: tango/common/logging.py
  class LevelFilter (line 161) | class LevelFilter(logging.Filter):
    method __init__ (line 168) | def __init__(self, max_level: int, min_level: Optional[int] = None, na...
    method filter (line 173) | def filter(self, record):
  class CliFilter (line 180) | class CliFilter(logging.Filter):
    method __init__ (line 181) | def __init__(self, filter_out: bool):
    method filter (line 184) | def filter(self, record):
  class WorkerLogFilter (line 191) | class WorkerLogFilter(logging.Filter):
    method __init__ (line 192) | def __init__(self, rank=-1):
    method filter (line 196) | def filter(self, record):
  class PrefixLogFilter (line 202) | class PrefixLogFilter(logging.Filter):
    method __init__ (line 203) | def __init__(self, prefix):
    method filter (line 207) | def filter(self, record):
  class LogRecordStreamHandler (line 219) | class LogRecordStreamHandler(socketserver.StreamRequestHandler):
    method handle (line 229) | def handle(self):
    method unPickle (line 247) | def unPickle(self, data):
    method handleLogRecord (line 250) | def handleLogRecord(self, record):
  class LogRecordSocketReceiver (line 260) | class LogRecordSocketReceiver(socketserver.ThreadingTCPServer):
    method __init__ (line 270) | def __init__(self, host: str, port: int = 0):
    method serve_until_stopped (line 275) | def serve_until_stopped(self):
  class RichHandler (line 293) | class RichHandler(logging.Handler):
    method __init__ (line 310) | def __init__(
    method emit (line 332) | def emit(self, record: logging.LogRecord) -> None:
    method render_message (line 346) | def render_message(self, record: logging.LogRecord, message: str) -> C...
    method get_time_text (line 364) | def get_time_text(self, record: logging.LogRecord) -> Text:
    method get_level_text (line 373) | def get_level_text(self, record: logging.LogRecord) -> Text:
    method get_path_text (line 380) | def get_path_text(self, record: logging.LogRecord, length_so_far: int)...
    method render (line 395) | def render(
  function get_handler (line 418) | def get_handler(
  function excepthook (line 459) | def excepthook(exctype, value, traceback):
  function log_exception (line 466) | def log_exception(exc: Optional[BaseException] = None, logger: Optional[...
  function log_exc_info (line 474) | def log_exc_info(exctype, value, traceback, logger: Optional[logging.Log...
  function initialize_logging (line 493) | def initialize_logging(
  function initialize_worker_logging (line 537) | def initialize_worker_logging(worker_rank: Optional[int] = None):
  function initialize_prefix_logging (line 555) | def initialize_prefix_logging(
  function _initialize_logging (line 572) | def _initialize_logging(
  function teardown_logging (line 693) | def teardown_logging():
  function insert_handlers (line 714) | def insert_handlers(*handlers: logging.Handler) -> Generator[None, None,...
  function file_handler (line 743) | def file_handler(filepath: PathOrStr) -> ContextManager[None]:

FILE: tango/common/params.py
  function infer_and_cast (line 22) | def infer_and_cast(value: Any):
  function _is_encodable (line 62) | def _is_encodable(value: str) -> bool:
  function _environment_variables (line 73) | def _environment_variables() -> Dict[str, str]:
  function with_overrides (line 83) | def with_overrides(original: T, overrides_dict: Dict[str, Any], prefix: ...
  function parse_overrides (line 128) | def parse_overrides(
  function _is_dict_free (line 139) | def _is_dict_free(obj: Any) -> bool:
  function pop_choice (line 151) | def pop_choice(
  function _replace_none (line 175) | def _replace_none(params: Any) -> Any:
  function remove_keys_from_params (line 189) | def remove_keys_from_params(params: "Params", keys: List[str] = ["pretra...
  class Params (line 204) | class Params(MutableMapping):
    method __init__ (line 232) | def __init__(self, params: "MutableMapping[str, Any]", history: str = ...
    method pop (line 239) | def pop(self, key: str, default: Any = DEFAULT, keep_as_dict: bool = F...
    method pop_int (line 265) | def pop_int(self, key: str, default: Any = DEFAULT) -> Optional[int]:
    method pop_float (line 275) | def pop_float(self, key: str, default: Any = DEFAULT) -> Optional[float]:
    method pop_bool (line 285) | def pop_bool(self, key: str, default: Any = DEFAULT) -> Optional[bool]:
    method get (line 301) | def get(self, key: str, default: Any = DEFAULT):
    method pop_choice (line 310) | def pop_choice(
    method as_dict (line 362) | def as_dict(self, quiet: bool = False, infer_type_and_cast: bool = Fal...
    method as_flat_dict (line 392) | def as_flat_dict(self) -> Dict[str, Any]:
    method duplicate (line 410) | def duplicate(self) -> "Params":
    method assert_empty (line 417) | def assert_empty(self, name: str):
    method __getitem__ (line 427) | def __getitem__(self, key):
    method __setitem__ (line 433) | def __setitem__(self, key, value):
    method __delitem__ (line 436) | def __delitem__(self, key):
    method __iter__ (line 439) | def __iter__(self):
    method __len__ (line 442) | def __len__(self):
    method _check_is_dict (line 445) | def _check_is_dict(self, new_history, value):
    method from_file (line 454) | def from_file(
    method to_file (line 510) | def to_file(
    method as_ordered_dict (line 519) | def as_ordered_dict(self, preference_orders: Optional[List[List[str]]]...
    method get_hash (line 553) | def get_hash(self) -> str:
    method __str__ (line 566) | def __str__(self) -> str:

FILE: tango/common/registrable.py
  class Registrable (line 41) | class Registrable(FromParams):
    method register (line 69) | def register(
    method by_name (line 165) | def by_name(cls: Type[_RegistrableT], name: str) -> Callable[..., _Reg...
    method search_modules (line 179) | def search_modules(cls: Type[_RegistrableT], name: str):
    method resolve_class_name (line 256) | def resolve_class_name(
    method list_available (line 324) | def list_available(cls) -> List[str]:
  class RegistrableFunction (line 342) | class RegistrableFunction(Registrable):
    method __call__ (line 350) | def __call__(self, *args, **kwargs):
  function make_registrable (line 354) | def make_registrable(name: Optional[str] = None, *, exist_ok: bool = Fal...
  function _get_suggestion (line 374) | def _get_suggestion(name: str, available: List[str]) -> Optional[str]:
  function _fullname (line 383) | def _fullname(c: type) -> str:
  function _cls_is_step (line 387) | def _cls_is_step(c: type) -> bool:

FILE: tango/common/remote_utils.py
  class RemoteConstants (line 10) | class RemoteConstants:
    method step_artifact_name (line 29) | def step_artifact_name(cls, step: Union[str, StepInfo, Step]) -> str:
    method step_lock_artifact_name (line 33) | def step_lock_artifact_name(cls, step: Union[str, StepInfo, Step]) -> ...
    method run_artifact_name (line 37) | def run_artifact_name(cls, name: str) -> str:

FILE: tango/common/sequences.py
  class ShuffledSequence (line 10) | class ShuffledSequence(abc.Sequence):
    method __init__ (line 52) | def __init__(self, inner_sequence: Sequence, indices: Optional[Sequenc...
    method __len__ (line 61) | def __len__(self) -> int:
    method __getitem__ (line 64) | def __getitem__(self, i: Union[int, slice]):
    method __contains__ (line 70) | def __contains__(self, item) -> bool:
  class SlicedSequence (line 77) | class SlicedSequence(ShuffledSequence):
    method __init__ (line 110) | def __init__(self, inner_sequence: Sequence, s: slice):
  class ConcatenatedSequence (line 114) | class ConcatenatedSequence(abc.Sequence):
    method __init__ (line 149) | def __init__(self, *sequences: Sequence):
    method __len__ (line 157) | def __len__(self):
    method __getitem__ (line 160) | def __getitem__(self, i: Union[int, slice]):
    method __contains__ (line 172) | def __contains__(self, item) -> bool:
  class MappedSequence (line 176) | class MappedSequence(abc.Sequence):
    method __init__ (line 213) | def __init__(self, fn: Callable, inner_sequence: Sequence):
    method __getitem__ (line 217) | def __getitem__(self, item):
    method __len__ (line 235) | def __len__(self):
    method __contains__ (line 238) | def __contains__(self, item):
  class SqliteSparseSequence (line 242) | class SqliteSparseSequence(MutableSequence[Any]):
    method __init__ (line 282) | def __init__(self, filename: Union[str, PathLike], read_only: bool = F...
    method __del__ (line 287) | def __del__(self):
    method __getitem__ (line 292) | def __getitem__(self, i: Union[int, slice]) -> Any:
    method __setitem__ (line 309) | def __setitem__(self, i: Union[int, slice], value: Any):
    method __delitem__ (line 320) | def __delitem__(self, i: Union[int, slice]):
    method extend (line 339) | def extend(self, values: Iterable[Any]) -> None:
    method insert (line 349) | def insert(self, i: int, value: Any) -> None:
    method __len__ (line 357) | def __len__(self) -> int:
    method clear (line 363) | def clear(self) -> None:
    method close (line 370) | def close(self) -> None:
    method copy_to (line 378) | def copy_to(self, target: Union[str, PathLike]):
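  The sequence wrappers in this file all share one trick: they present a transformed view of an underlying sequence without materializing it. A minimal sketch of the idea behind `MappedSequence` (hypothetical simplification; the real class in `tango/common/sequences.py` has more docs and edge-case handling):

  ```python
  from collections.abc import Sequence

  class MappedSequence(Sequence):
      """Lazily applies fn to items of inner on access, not up front."""

      def __init__(self, fn, inner):
          self.fn = fn
          self.inner = inner

      def __getitem__(self, i):
          if isinstance(i, slice):
              # Slicing stays lazy: wrap the sliced inner sequence.
              return MappedSequence(self.fn, self.inner[i])
          return self.fn(self.inner[i])

      def __len__(self):
          return len(self.inner)
  ```

  Because `collections.abc.Sequence` supplies `__iter__` and `__contains__` mixins from `__getitem__`/`__len__`, this small class already behaves like a read-only list whose elements are computed on demand.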

FILE: tango/common/testing/__init__.py
  class TangoTestCase (line 15) | class TangoTestCase:
    method setup_method (line 60) | def setup_method(self):
    method teardown_method (line 76) | def teardown_method(self):
    method run (line 81) | def run(
  function run_experiment (line 121) | def run_experiment(
  function requires_gpus (line 156) | def requires_gpus(test_method):

FILE: tango/common/testing/steps.py
  class FloatStep (line 14) | class FloatStep(Step):
    method run (line 18) | def run(self, result: float) -> float:  # type: ignore
  class StringStep (line 23) | class StringStep(Step):
    method run (line 27) | def run(self, result: str) -> str:  # type: ignore
  class ConcatStringsStep (line 32) | class ConcatStringsStep(Step):
    method run (line 36) | def run(self, string1: str, string2: str, join_with: str = " ") -> str...
  class NoisyStep (line 41) | class NoisyStep(Step):
    method run (line 45) | def run(self, raise_error: bool = False) -> None:  # type: ignore
  class RandomStringStep (line 63) | class RandomStringStep(Step):
    method run (line 64) | def run(self, length: int = 10) -> str:  # type: ignore
  class AddNumbersStep (line 69) | class AddNumbersStep(Step):
    method run (line 73) | def run(self, a_number: int, b_number: int) -> int:  # type: ignore
  class SleepPrintMaybeFail (line 78) | class SleepPrintMaybeFail(Step):
    method run (line 82) | def run(self, string: str, seconds: int = 5, fail: bool = False) -> st...
  class LoggingStep (line 92) | class LoggingStep(Step):
    method run (line 96) | def run(self, string: str, num_log_lines: int = 50) -> str:  # type: i...
  class MakeNumber (line 104) | class MakeNumber(Step):
    method run (line 108) | def run(self, what_number: int) -> int:  # type: ignore
  class StoreNumberInFile (line 113) | class StoreNumberInFile(Step):
    method run (line 117) | def run(self, number: int, file_name: str) -> None:  # type: ignore
  class MultiprocessingStep (line 126) | class MultiprocessingStep(Step):
    method run (line 131) | def run(self, num_proc: int = 2) -> bool:  # type: ignore
  class RangeOutput (line 148) | class RangeOutput(Step):
    method run (line 149) | def run(self, start: int, end: int) -> List[int]:  # type: ignore
  function _worker_function (line 153) | def _worker_function(worker_id: int):

FILE: tango/common/tqdm.py
  function replace_cr_with_newline (line 38) | def replace_cr_with_newline(message: str) -> str:
  class TqdmToLogsWriter (line 53) | class TqdmToLogsWriter:
    method __init__ (line 54) | def __init__(self):
    method write (line 57) | def write(self, message):
    method flush (line 77) | def flush(self):
  class Tqdm (line 81) | class Tqdm:
    method tqdm (line 88) | def tqdm(*args, **kwargs):
    method wrapattr (line 94) | def wrapattr(*args, **kwargs):
    method get_updated_kwargs (line 100) | def get_updated_kwargs(**kwargs):
    method set_lock (line 110) | def set_lock(lock):
    method get_lock (line 114) | def get_lock():

FILE: tango/common/util.py
  function tango_cache_dir (line 19) | def tango_cache_dir() -> Path:
  function _handle_sigterm (line 29) | def _handle_sigterm(sig, frame):
  function install_sigterm_handler (line 33) | def install_sigterm_handler():
  function get_extra_imported_modules (line 40) | def get_extra_imported_modules() -> Set[str]:
  function import_extra_module (line 44) | def import_extra_module(package_name: str) -> None:
  function resolve_module_name (line 50) | def resolve_module_name(package_name: str) -> Tuple[str, Path]:
  function import_module_and_submodules (line 79) | def import_module_and_submodules(
  function _parse_bool (line 143) | def _parse_bool(value: Union[bool, str]) -> bool:
  function _parse_optional_int (line 151) | def _parse_optional_int(value: Optional[str]) -> Optional[int]:
  function find_submodules (line 157) | def find_submodules(
  function find_integrations (line 197) | def find_integrations() -> Iterable[str]:
  function filename_is_safe (line 207) | def filename_is_safe(filename: str) -> bool:
  function make_safe_filename (line 211) | def make_safe_filename(name: str) -> str:
  function could_be_class_name (line 222) | def could_be_class_name(name: str) -> bool:
  function _is_valid_python_name (line 229) | def _is_valid_python_name(name: str) -> bool:
  function threaded_generator (line 233) | def threaded_generator(g, queue_size: int = 16):
  function exception_to_string (line 268) | def exception_to_string(e: BaseException) -> str:
  function utc_now_datetime (line 281) | def utc_now_datetime() -> datetime:
  function local_timezone (line 285) | def local_timezone() -> Optional[tzinfo]:
  function replace_steps_with_unique_id (line 289) | def replace_steps_with_unique_id(o: Any):
  function jsonify (line 304) | def jsonify(o: Any) -> Any:
  class StrEnum (line 325) | class StrEnum(str, Enum):
    method __str__ (line 326) | def __str__(self) -> str:

FILE: tango/executor.py
  class ExecutionMetadata (line 25) | class ExecutionMetadata:
  class ExecutorOutput (line 38) | class ExecutorOutput:
    method display (line 52) | def display(self) -> None:
  class Executor (line 101) | class Executor(Registrable):
    method __init__ (line 115) | def __init__(
    method execute_step (line 125) | def execute_step(self, step: "Step") -> None:
    method execute_step_graph (line 136) | def execute_step_graph(
    method execute_sub_graph_for_steps (line 181) | def execute_sub_graph_for_steps(

FILE: tango/executors/multicore_executor.py
  class MulticoreExecutor (line 20) | class MulticoreExecutor(Executor):
    method __init__ (line 25) | def __init__(
    method execute_step_graph (line 42) | def execute_step_graph(
    method _get_state (line 310) | def _get_state(self, step: Step) -> StepState:

FILE: tango/format.py
  class Format (line 38) | class Format(Registrable, Generic[T]):
    method write (line 56) | def write(self, artifact: T, dir: PathOrStr):
    method read (line 61) | def read(self, dir: PathOrStr) -> T:
    method _to_params (line 65) | def _to_params(self) -> Dict[str, Any]:
  function _open_compressed (line 95) | def _open_compressed(filename: PathOrStr, mode: str) -> IO:
  class DillFormat (line 107) | class DillFormat(Format[T], Generic[T]):
    method __init__ (line 122) | def __init__(self, compress: Optional[str] = None):
    method write (line 127) | def write(self, artifact: T, dir: PathOrStr):
    method read (line 141) | def read(self, dir: PathOrStr) -> T:
    method _get_artifact_path (line 157) | def _get_artifact_path(self, dir: PathOrStr) -> Path:
  class DillFormatIterator (line 161) | class DillFormatIterator(Iterator[T], Generic[T]):
    method __init__ (line 166) | def __init__(self, filename: PathOrStr):
    method __iter__ (line 178) | def __iter__(self) -> Iterator[T]:
    method __next__ (line 181) | def __next__(self) -> T:
  class JsonFormat (line 193) | class JsonFormat(Format[T], Generic[T]):
    method __init__ (line 204) | def __init__(self, compress: Optional[str] = None):
    method _encoding_fallback (line 211) | def _encoding_fallback(unencodable: Any):
    method _decoding_fallback (line 238) | def _decoding_fallback(o: Dict) -> Any:
    method write (line 253) | def write(self, artifact: T, dir: PathOrStr):
    method read (line 266) | def read(self, dir: PathOrStr) -> T:
    method _get_artifact_path (line 292) | def _get_artifact_path(self, dir: PathOrStr, iterator: bool = False) -...
  class JsonFormatIterator (line 298) | class JsonFormatIterator(Iterator[T], Generic[T]):
    method __init__ (line 303) | def __init__(self, filename: PathOrStr):
    method __iter__ (line 306) | def __iter__(self) -> Iterator[T]:
    method __next__ (line 309) | def __next__(self) -> T:
  class TextFormat (line 324) | class TextFormat(Format[Union[str, Iterable[str]]]):
    method __init__ (line 341) | def __init__(self, compress: Optional[str] = None):
    method write (line 347) | def write(self, artifact: Union[str, Iterable[str]], dir: PathOrStr):
    method read (line 360) | def read(self, dir: PathOrStr) -> Union[str, Iterable[str]]:
    method _get_artifact_path (line 386) | def _get_artifact_path(self, dir: PathOrStr, iterator: bool = False) -...
  class TextFormatIterator (line 392) | class TextFormatIterator(Iterator[str]):
    method __init__ (line 397) | def __init__(self, filename: PathOrStr):
    method __iter__ (line 400) | def __iter__(self) -> Iterator[str]:
    method __next__ (line 403) | def __next__(self) -> str:
  class SqliteSequenceFormat (line 420) | class SqliteSequenceFormat(Format[Sequence[T]]):
    method write (line 425) | def write(self, artifact: Sequence[T], dir: Union[str, PathLike]):
    method read (line 437) | def read(self, dir: Union[str, PathLike]) -> Sequence[T]:
  class SqliteDictFormat (line 443) | class SqliteDictFormat(Format[DatasetDict]):
    method write (line 496) | def write(self, artifact: DatasetDict, dir: Union[str, PathLike]):
    method read (line 514) | def read(self, dir: Union[str, PathLike]) -> DatasetDict:

FILE: tango/integrations/beaker/common.py
  class Constants (line 24) | class Constants(RemoteConstants):
  function get_client (line 35) | def get_client(beaker_workspace: Optional[str] = None, **kwargs) -> Beaker:
  function dataset_url (line 48) | def dataset_url(beaker: Beaker, dataset: Optional[str] = None) -> str:
  class BeakerStepLock (line 65) | class BeakerStepLock:
    method __init__ (line 68) | def __init__(
    method metadata (line 82) | def metadata(self) -> Dict[str, Any]:
    method _last_metadata (line 89) | def _last_metadata(self) -> Optional[Dict[str, Any]]:
    method _acquiring_job_is_done (line 99) | def _acquiring_job_is_done(self) -> bool:
    method acquire (line 126) | def acquire(self, timeout=None, poll_interval: float = 2.0, log_interv...
    method release (line 176) | def release(self):
    method __del__ (line 186) | def __del__(self):

FILE: tango/integrations/beaker/executor.py
  class StepFailedError (line 52) | class StepFailedError(ExecutorError):
    method __init__ (line 53) | def __init__(self, msg: str, experiment_url: str):
  class ResourceAssignmentError (line 58) | class ResourceAssignmentError(ExecutorError):
  class UnrecoverableResourceAssignmentError (line 64) | class UnrecoverableResourceAssignmentError(ExecutorError):
  class ResourceAssignment (line 71) | class ResourceAssignment(NamedTuple):
  class BeakerScheduler (line 92) | class BeakerScheduler(Registrable):
    method __init__ (line 103) | def __init__(self):
    method beaker (line 107) | def beaker(self) -> Beaker:
    method beaker (line 113) | def beaker(self, beaker: Beaker) -> None:
    method schedule (line 117) | def schedule(self, step: Step) -> ResourceAssignment:
  class SimpleBeakerScheduler (line 128) | class SimpleBeakerScheduler(BeakerScheduler):
    method __init__ (line 134) | def __init__(self, clusters: List[str], priority: Union[str, Priority]):
    method node_resources (line 143) | def node_resources(self) -> Dict[str, List[NodeResources]]:
    method schedule (line 154) | def schedule(self, step: Step) -> ResourceAssignment:
  class BeakerExecutor (line 179) | class BeakerExecutor(Executor):
    method __init__ (line 340) | def __init__(
    method check_repo_state (line 486) | def check_repo_state(self):
    method execute_step_graph (line 512) | def execute_step_graph(
    method _emit_resource_assignment_warning (line 698) | def _emit_resource_assignment_warning(self):
    method _check_if_cancelled (line 709) | def _check_if_cancelled(self):
    method _execute_sub_graph_for_step (line 713) | def _execute_sub_graph_for_step(
    method _parse_git_remote (line 844) | def _parse_git_remote(url: str) -> Tuple[str, str]:
    method _ensure_entrypoint_dataset (line 856) | def _ensure_entrypoint_dataset(self) -> Dataset:
    method _ensure_step_graph_dataset (line 913) | def _ensure_step_graph_dataset(self, step_graph: StepGraph) -> Dataset:
    method _build_experiment_spec (line 929) | def _build_experiment_spec(

FILE: tango/integrations/beaker/step_cache.py
  class BeakerStepCache (line 22) | class BeakerStepCache(RemoteStepCache):
    method __init__ (line 39) | def __init__(self, beaker_workspace: Optional[str] = None, beaker: Opt...
    method _step_result_remote (line 56) | def _step_result_remote(self, step: Union[Step, StepInfo]) -> Optional...
    method _upload_step_remote (line 67) | def _upload_step_remote(self, step: Step, objects_dir: Path) -> Beaker...
    method _download_step_remote (line 84) | def _download_step_remote(self, step_result, target_dir: PathOrStr) ->...
    method __len__ (line 93) | def __len__(self):

FILE: tango/integrations/beaker/workspace.py
  class BeakerWorkspace (line 38) | class BeakerWorkspace(RemoteWorkspace):
    method __init__ (line 53) | def __init__(self, workspace: str, max_workers: Optional[int] = None, ...
    method cache (line 63) | def cache(self):
    method locks (line 67) | def locks(self):
    method steps_dir_name (line 71) | def steps_dir_name(self):
    method url (line 75) | def url(self) -> str:
    method _step_location (line 78) | def _step_location(self, step: Step) -> str:
    method from_parsed_url (line 82) | def from_parsed_url(cls, parsed_url: ParseResult) -> Workspace:
    method current_beaker_experiment (line 95) | def current_beaker_experiment(self) -> Optional[Experiment]:
    method _remote_lock (line 109) | def _remote_lock(self, step: Step) -> BeakerStepLock:
    method _get_object_from_cache (line 114) | def _get_object_from_cache(self, digest: Digest, o_type: Type[U]) -> O...
    method _add_object_to_cache (line 141) | def _add_object_to_cache(self, digest: Digest, o: U):
    method step_info (line 150) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method _get_step_info_from_dataset (line 161) | def _get_step_info_from_dataset(self, dataset: Dataset) -> StepInfo:
    method _save_run (line 178) | def _save_run(
    method registered_runs (line 213) | def registered_runs(self) -> Dict[str, Run]:
    method search_registered_runs (line 234) | def search_registered_runs(
    method num_registered_runs (line 272) | def num_registered_runs(self, *, match: Optional[str] = None) -> int:
    method search_step_info (line 288) | def search_step_info(
    method num_steps (line 332) | def num_steps(self, *, match: Optional[str] = None, state: Optional[St...
    method registered_run (line 353) | def registered_run(self, name: str) -> Run:
    method _save_run_log (line 370) | def _save_run_log(self, name: str, log_file: Path):
    method _get_run_from_dataset (line 375) | def _get_run_from_dataset(self, dataset: BeakerDataset) -> Optional[Run]:
    method _update_step_info (line 407) | def _update_step_info(self, step_info: StepInfo):
    method _remove_step_info (line 424) | def _remove_step_info(self, step_info: StepInfo) -> None:

FILE: tango/integrations/datasets/__init__.py
  function convert_to_tango_dataset_dict (line 53) | def convert_to_tango_dataset_dict(hf_dataset_dict: ds.DatasetDict) -> Da...
  function convert_to_tango_dataset_dict (line 58) | def convert_to_tango_dataset_dict(hf_dataset_dict: ds.IterableDatasetDic...
  function convert_to_tango_dataset_dict (line 62) | def convert_to_tango_dataset_dict(hf_dataset_dict):
  class DatasetsFormat (line 81) | class DatasetsFormat(Format[T]):
    method write (line 91) | def write(self, artifact: T, dir: PathOrStr):
    method read (line 95) | def read(self, dir: PathOrStr) -> T:
  class LoadDataset (line 101) | class LoadDataset(Step):
    method run (line 124) | def run(self, path: str, **kwargs) -> Union[ds.DatasetDict, ds.Dataset...
  class LoadStreamingDataset (line 141) | class LoadStreamingDataset(Step):
    method run (line 157) | def run(  # type: ignore
  class InterleaveDatasets (line 179) | class InterleaveDatasets(Step):
    method run (line 193) | def run(  # type: ignore[override]
  class ConcatenateDatasets (line 206) | class ConcatenateDatasets(Step):
    method run (line 220) | def run(  # type: ignore[override]
  class DatasetRemixStep (line 234) | class DatasetRemixStep(Step):
    method run (line 275) | def run(  # type: ignore

FILE: tango/integrations/fairscale/fsdp_config.py
  class FSDPConfig (line 11) | class FSDPConfig(FromParams):
    method as_kwargs (line 65) | def as_kwargs(self) -> Dict[str, Any]:
    method wrap (line 71) | def wrap(self, module: torch.nn.Module):

FILE: tango/integrations/fairscale/module_wrapper.py
  function with_wrapped_modules (line 14) | def with_wrapped_modules(

FILE: tango/integrations/fairscale/training_engine.py
  class FairScaleTrainingEngine (line 24) | class FairScaleTrainingEngine(TorchTrainingEngine):
    method __init__ (line 91) | def __init__(
    method _construct_model (line 123) | def _construct_model(self, model: Union[Model, Lazy[Model]]) -> Model:
    method clip_grad_norm (line 130) | def clip_grad_norm(self) -> None:
    method get_model_state (line 134) | def get_model_state(self) -> Dict[str, torch.Tensor]:
    method load_model_state (line 140) | def load_model_state(self, state_dict: Dict[str, torch.Tensor]) -> None:
    method save_complete_weights_from_checkpoint (line 143) | def save_complete_weights_from_checkpoint(

FILE: tango/integrations/flax/data.py
  class DataLoader (line 14) | class DataLoader(Generic[T], Registrable):
  class FlaxDataLoader (line 22) | class FlaxDataLoader(DataLoader):
    method __init__ (line 23) | def __init__(
    method __call__ (line 43) | def __call__(self, rng: jax._src.random.KeyArrayLike, do_distributed: ...

FILE: tango/integrations/flax/eval.py
  class FlaxEvalStep (line 24) | class FlaxEvalStep(Step):
    method run (line 58) | def run(  # type: ignore[override]

FILE: tango/integrations/flax/eval_callback.py
  class EvalCallback (line 11) | class EvalCallback(Registrable):
    method __init__ (line 29) | def __init__(
    method pre_eval_loop (line 43) | def pre_eval_loop(self) -> None:
    method post_eval_loop (line 49) | def post_eval_loop(self, aggregated_metrics: Dict[str, float]) -> None:
    method pre_batch (line 57) | def pre_batch(self, step: int, batch: Dict[str, Any]) -> None:
    method post_batch (line 63) | def post_batch(self, step: int, batch_outputs: Dict[str, Any]) -> None:

FILE: tango/integrations/flax/format.py
  class FlaxFormat (line 13) | class FlaxFormat(Format[T], Generic[T]):
    method write (line 24) | def write(self, artifact: T, dir: PathOrStr) -> None:
    method read (line 27) | def read(self, dir: PathOrStr) -> T:

FILE: tango/integrations/flax/model.py
  class Model (line 6) | class Model(nn.Module, Registrable):

FILE: tango/integrations/flax/optim.py
  class Optimizer (line 9) | class Optimizer(Registrable):
    method __init__ (line 39) | def __init__(self, optimizer: Callable) -> None:
    method __call__ (line 42) | def __call__(self, **kwargs) -> optax.GradientTransformation:
  class LRScheduler (line 46) | class LRScheduler(Registrable):
    method __init__ (line 76) | def __init__(self, scheduler: Callable) -> None:
    method __call__ (line 79) | def __call__(self, **kwargs):
  function optimizer_factory (line 83) | def optimizer_factory(optim_method: Callable) -> Type[Callable]:
  function scheduler_factory (line 90) | def scheduler_factory(scheduler_method: Callable) -> Type[Callable]:

FILE: tango/integrations/flax/train.py
  class FlaxTrainStep (line 34) | class FlaxTrainStep(Step):
    method run (line 74) | def run(  # type: ignore[override]
    method _train (line 194) | def _train(
    method train_helper (line 279) | def train_helper(
    method save_checkpoint (line 590) | def save_checkpoint(self, dir: Path, target: PyTree, step: int, keep_c...
    method load_checkpoint (line 595) | def load_checkpoint(self, dir: Path, target: PyTree):
    method _construct_optimizer (line 598) | def _construct_optimizer(self, optimizer):
    method _construct_lr_scheduler (line 602) | def _construct_lr_scheduler(self, scheduler):
    method _get_devices (line 606) | def _get_devices(self) -> List[Any]:

FILE: tango/integrations/flax/train_callback.py
  class TrainCallback (line 15) | class TrainCallback(Registrable):
    method __init__ (line 46) | def __init__(
    method step_id (line 66) | def step_id(self) -> str:
    method step_name (line 73) | def step_name(self) -> Optional[str]:
    method work_dir (line 80) | def work_dir(self) -> Path:
    method state_dict (line 86) | def state_dict(self) -> Dict[str, Any]:
    method load_state_dict (line 95) | def load_state_dict(self, state_dict: Dict[str, Any]):
    method pre_train_loop (line 104) | def pre_train_loop(self) -> None:
    method post_train_loop (line 110) | def post_train_loop(self, step: int, epoch: int) -> None:
    method pre_epoch (line 118) | def pre_epoch(self, step: int, epoch: int) -> None:
    method post_epoch (line 124) | def post_epoch(self, step: int, epoch: int) -> None:
    method pre_batch (line 130) | def pre_batch(self, step: int, epoch: int, batch) -> None:
    method post_batch (line 135) | def post_batch(self, step: int, epoch: int, train_metrics: Dict) -> None:
    method log_batch (line 149) | def log_batch(self, step: int, epoch: int, train_metrics: Dict) -> None:
    method pre_val_loop (line 163) | def pre_val_loop(self, step: int, val_step: int, state) -> None:
    method pre_val_batch (line 169) | def pre_val_batch(self, step: int, val_step: int, epoch: int, val_batc...
    method post_val_batch (line 175) | def post_val_batch(self, step: int, val_step: int, epoch: int, val_met...
    method post_val_loop (line 188) | def post_val_loop(

FILE: tango/integrations/flax/train_config.py
  class TrainConfig (line 7) | class TrainConfig:
    method state_path (line 101) | def state_path(self) -> Path:
    method best_state_path (line 108) | def best_state_path(self) -> Path:
    method should_log_this_step (line 115) | def should_log_this_step(self, step: int) -> bool:
    method should_checkpoint_this_step (line 119) | def should_checkpoint_this_step(self, step: int) -> bool:
    method should_log_this_val_step (line 123) | def should_log_this_val_step(self, val_step: int) -> bool:
    method as_dict (line 127) | def as_dict(self) -> Dict[str, Any]:

FILE: tango/integrations/flax/util.py
  function get_PRNGkey (line 6) | def get_PRNGkey(seed: int = 42) -> Union[Any, jax._src.random.KeyArray]:
  function get_multiple_keys (line 14) | def get_multiple_keys(key, multiple: int = 1) -> Union[Any, jax._src.ran...

FILE: tango/integrations/flax/wrapper.py
  class FlaxWrapper (line 7) | class FlaxWrapper(Registrable):
    method train_metrics (line 13) | def train_metrics(self, state, batch, labels) -> Dict:
    method train_loss (line 21) | def train_loss(self, params, state, batch, dropout_rng, labels):
    method val_metrics (line 30) | def val_metrics(self, batch, logits, labels) -> Dict:
    method eval_metrics (line 37) | def eval_metrics(self, batch, logits, labels) -> Dict:

FILE: tango/integrations/gs/common.py
  function get_bucket_and_prefix (line 30) | def get_bucket_and_prefix(folder_name: str) -> Tuple[str, str]:
  function empty_bucket_folder (line 38) | def empty_bucket_folder(folder_name: str):
  function empty_datastore (line 56) | def empty_datastore(folder_name: str):
  class GSArtifact (line 80) | class GSArtifact:
  class GSArtifactConflict (line 104) | class GSArtifactConflict(TangoError):
  class GSArtifactNotFound (line 112) | class GSArtifactNotFound(TangoError):
  class GSArtifactWriteError (line 120) | class GSArtifactWriteError(TangoError):
  function join_path (line 128) | def join_path(*args) -> str:
  class GSClient (line 135) | class GSClient:
    method __init__ (line 168) | def __init__(
    method url (line 192) | def url(self, artifact: Optional[str] = None):
    method _convert_blobs_to_artifact (line 201) | def _convert_blobs_to_artifact(self, blobs: List[storage.Blob]) -> GSA...
    method from_env (line 224) | def from_env(cls, folder_name: str):
    method get (line 230) | def get(self, artifact: Union[str, GSArtifact]) -> GSArtifact:
    method _gs_path (line 248) | def _gs_path(self, *args):
    method create (line 254) | def create(self, artifact: str):
    method delete (line 274) | def delete(self, artifact: GSArtifact):
    method upload (line 283) | def upload(self, artifact: Union[str, GSArtifact], objects_dir: Path):
    method commit (line 327) | def commit(self, artifact: Union[str, GSArtifact]):
    method download (line 343) | def download(self, artifact: GSArtifact, target_dir: PathOrStr):
    method artifacts (line 379) | def artifacts(self, prefix: str, uncommitted: bool = True) -> List[GSA...
  function get_credentials (line 399) | def get_credentials(credentials: Optional[Union[str, Credentials]] = Non...
  function get_client (line 442) | def get_client(
  class Constants (line 454) | class Constants(RemoteConstants):
  class GCSStepLock (line 458) | class GCSStepLock:
    method __init__ (line 464) | def __init__(
    method acquire (line 475) | def acquire(self, timeout=None, poll_interval: float = 2.0, log_interv...
    method release (line 508) | def release(self):
    method __del__ (line 518) | def __del__(self):

FILE: tango/integrations/gs/step_cache.py
  class GSStepCache (line 25) | class GSStepCache(RemoteStepCache):
    method __init__ (line 42) | def __init__(self, folder_name: str, client: Optional[GSClient] = None):
    method client (line 55) | def client(self):
    method _step_result_remote (line 58) | def _step_result_remote(self, step: Union[Step, StepInfo]) -> Optional...
    method _upload_step_remote (line 69) | def _upload_step_remote(self, step: Step, objects_dir: Path) -> GSArti...
    method _download_step_remote (line 86) | def _download_step_remote(self, step_result, target_dir: PathOrStr) ->...
    method __len__ (line 95) | def __len__(self):

FILE: tango/integrations/gs/workspace.py
  class GSWorkspace (line 39) | class GSWorkspace(RemoteWorkspace):
    method __init__ (line 69) | def __init__(
    method cache (line 92) | def cache(self):
    method locks (line 96) | def locks(self):
    method steps_dir_name (line 100) | def steps_dir_name(self):
    method from_parsed_url (line 104) | def from_parsed_url(cls, parsed_url: ParseResult) -> Workspace:
    method url (line 117) | def url(self) -> str:
    method _remote_lock (line 120) | def _remote_lock(self, step: Step) -> GCSStepLock:
    method _step_location (line 123) | def _step_location(self, step: Step) -> str:
    method _run_key (line 127) | def _run_key(self):
    method _stepinfo_key (line 131) | def _stepinfo_key(self):
    method _save_run (line 134) | def _save_run(
    method _get_run_from_entity (line 158) | def _get_run_from_entity(self, run_entity: datastore.Entity) -> Option...
    method registered_runs (line 182) | def registered_runs(self) -> Dict[str, Run]:
    method search_registered_runs (line 201) | def search_registered_runs(
    method num_registered_runs (line 218) | def num_registered_runs(self, *, match: Optional[str] = None) -> int:
    method _fetch_run_entities (line 224) | def _fetch_run_entities(
    method search_step_info (line 276) | def search_step_info(
    method num_steps (line 298) | def num_steps(self, *, match: Optional[str] = None, state: Optional[St...
    method _fetch_step_info_entities (line 304) | def _fetch_step_info_entities(
    method registered_run (line 366) | def registered_run(self, name: str) -> Run:
    method step_info (line 379) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method _step_info_multiple (line 395) | def _step_info_multiple(
    method _get_run_step_info (line 435) | def _get_run_step_info(self, targets: Iterable[Step]) -> Tuple[Dict, D...
    method _update_step_info (line 453) | def _update_step_info(self, step_info: StepInfo):
    method _remove_step_info (line 471) | def _remove_step_info(self, step_info: StepInfo) -> None:
    method _save_run_log (line 480) | def _save_run_log(self, name: str, log_file: Path):

FILE: tango/integrations/torch/data.py
  class DataCollator (line 11) | class DataCollator(Generic[T], Registrable):
    method __call__ (line 24) | def __call__(self, items: List[T]) -> Dict[str, Any]:
  class ConcatTensorDictsCollator (line 32) | class ConcatTensorDictsCollator(DataCollator[Dict[str, Any]]):
    method __call__ (line 42) | def __call__(self, items: List[Dict[str, Any]]) -> Dict[str, Any]:
  class Sampler (line 55) | class Sampler(torch.utils.data.Sampler, Registrable):
  class BatchSampler (line 67) | class BatchSampler(torch.utils.data.BatchSampler, Sampler):
    method __init__ (line 68) | def __init__(
  class DataLoader (line 96) | class DataLoader(torch.utils.data.DataLoader, Registrable):
    method __init__ (line 104) | def __init__(

FILE: tango/integrations/torch/eval.py
  class TorchEvalStep (line 21) | class TorchEvalStep(Step):
    method resources (line 56) | def resources(self) -> StepResources:
    method run (line 59) | def run(  # type: ignore[override]

FILE: tango/integrations/torch/eval_callback.py
  class EvalCallback (line 12) | class EvalCallback(Registrable):
    method __init__ (line 31) | def __init__(
    method pre_eval_loop (line 47) | def pre_eval_loop(self) -> None:
    method post_eval_loop (line 53) | def post_eval_loop(self, aggregated_metrics: Dict[str, float]) -> None:
    method pre_batch (line 61) | def pre_batch(self, step: int, batch: Dict[str, Any]) -> None:
    method post_batch (line 67) | def post_batch(self, step: int, batch_outputs: Dict[str, Any]) -> None:

FILE: tango/integrations/torch/exceptions.py
  class StopEarly (line 4) | class StopEarly(TangoError):

FILE: tango/integrations/torch/format.py
  class TorchFormat (line 14) | class TorchFormat(Format[T], Generic[T]):
    method write (line 28) | def write(self, artifact: T, dir: PathOrStr):
    method read (line 33) | def read(self, dir: PathOrStr) -> T:

FILE: tango/integrations/torch/model.py
  class Model (line 6) | class Model(torch.nn.Module, Registrable):

FILE: tango/integrations/torch/optim.py
  class Optimizer (line 8) | class Optimizer(torch.optim.Optimizer, Registrable):
  class LRScheduler (line 40) | class LRScheduler(torch.optim.lr_scheduler._LRScheduler, Registrable):

FILE: tango/integrations/torch/train.py
  class TorchTrainStep (line 34) | class TorchTrainStep(Step):
    method resources (line 78) | def resources(self) -> StepResources:
    method run (line 81) | def run(  # type: ignore[override]
    method _get_devices (line 211) | def _get_devices(self, device_count: int) -> List[int]:
    method _train (line 229) | def _train(
  function _train (line 353) | def _train(
  function _cycle_through_epochs (line 810) | def _cycle_through_epochs(dataloader: DataLoader, is_distributed: bool, ...

FILE: tango/integrations/torch/train_callback.py
  class TrainCallback (line 16) | class TrainCallback(Registrable):
    method __init__ (line 46) | def __init__(
    method step_id (line 64) | def step_id(self) -> str:
    method step_name (line 71) | def step_name(self) -> Optional[str]:
    method work_dir (line 78) | def work_dir(self) -> Path:
    method is_local_main_process (line 85) | def is_local_main_process(self) -> bool:
    method model (line 93) | def model(self) -> Model:
    method state_dict (line 99) | def state_dict(self) -> Dict[str, Any]:
    method load_state_dict (line 108) | def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
    method pre_train_loop (line 117) | def pre_train_loop(self) -> None:
    method post_train_loop (line 123) | def post_train_loop(self, step: int, epoch: int) -> None:
    method pre_epoch (line 131) | def pre_epoch(self, step: int, epoch: int) -> None:
    method post_epoch (line 137) | def post_epoch(self, step: int, epoch: int) -> None:
    method pre_batch (line 143) | def pre_batch(self, step: int, epoch: int, batch: List[Dict[str, Any]]...
    method post_batch (line 154) | def post_batch(
    method log_batch (line 174) | def log_batch(
    method pre_val_batch (line 193) | def pre_val_batch(
    method post_val_batch (line 201) | def post_val_batch(
    method post_val_loop (line 216) | def post_val_loop(
  class StopEarlyCallback (line 226) | class StopEarlyCallback(TrainCallback):
    method __init__ (line 236) | def __init__(self, *args, patience: int = 10000, **kwargs) -> None:
    method post_val_loop (line 242) | def post_val_loop(
    method state_dict (line 253) | def state_dict(self) -> Dict[str, Any]:
    method load_state_dict (line 263) | def load_state_dict(self, state_dict: Dict[str, Any]) -> None:

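The hook names in `TrainCallback` above (pre_train_loop, pre_epoch, pre_batch, post_batch, post_epoch, post_train_loop) imply a fixed invocation order driven by the training loop. Below is a standalone sketch of that ordering — `Callback`, `RecordingCallback`, and `run_training` are hypothetical illustration names, not the library's trainer:

```python
from typing import Any, Dict, List


class Callback:
    """Hypothetical no-op base mirroring a subset of TrainCallback's hooks."""
    def pre_train_loop(self) -> None: ...
    def pre_epoch(self, step: int, epoch: int) -> None: ...
    def pre_batch(self, step: int, epoch: int, batch: Any) -> None: ...
    def post_batch(self, step: int, epoch: int, metrics: Dict) -> None: ...
    def post_epoch(self, step: int, epoch: int) -> None: ...
    def post_train_loop(self, step: int, epoch: int) -> None: ...


class RecordingCallback(Callback):
    """Records the order in which hooks fire."""
    def __init__(self) -> None:
        self.events: List[str] = []
    def pre_train_loop(self): self.events.append("pre_train_loop")
    def pre_epoch(self, step, epoch): self.events.append(f"pre_epoch:{epoch}")
    def pre_batch(self, step, epoch, batch): self.events.append(f"pre_batch:{step}")
    def post_batch(self, step, epoch, metrics): self.events.append(f"post_batch:{step}")
    def post_epoch(self, step, epoch): self.events.append(f"post_epoch:{epoch}")
    def post_train_loop(self, step, epoch): self.events.append("post_train_loop")


def run_training(callbacks: List[Callback], epochs: int, batches_per_epoch: int) -> int:
    """Drives the hooks in the order a training loop would; returns total steps."""
    step = 0
    for cb in callbacks:
        cb.pre_train_loop()
    for epoch in range(epochs):
        for cb in callbacks:
            cb.pre_epoch(step, epoch)
        for _ in range(batches_per_epoch):
            for cb in callbacks:
                cb.pre_batch(step, epoch, batch=None)
            # ... forward/backward/optimizer step would happen here ...
            for cb in callbacks:
                cb.post_batch(step, epoch, {"loss": 0.0})
            step += 1
        for cb in callbacks:
            cb.post_epoch(step, epoch)
    for cb in callbacks:
        cb.post_train_loop(step, epochs - 1)
    return step
```

The real callbacks also carry state through `state_dict`/`load_state_dict` so they survive checkpoint-and-resume, and distributed runs gate logging on `is_local_main_process`.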
FILE: tango/integrations/torch/train_config.py
  class TrainConfig (line 9) | class TrainConfig:
    method worker_local_default_device (line 142) | def worker_local_default_device(self) -> torch.device:
    method device_type (line 163) | def device_type(self) -> str:
    method is_local_main_process (line 174) | def is_local_main_process(self) -> bool:
    method state_path (line 181) | def state_path(self) -> Path:
    method best_state_path (line 188) | def best_state_path(self) -> Path:
    method state_path_for_step (line 195) | def state_path_for_step(self, step: int) -> Path:
    method final_weights_path (line 199) | def final_weights_path(self) -> Path:
    method should_log_this_step (line 202) | def should_log_this_step(self, step: int) -> bool:
    method should_checkpoint_this_step (line 206) | def should_checkpoint_this_step(self, step: int) -> bool:
    method should_log_this_val_step (line 210) | def should_log_this_val_step(self, val_step: int) -> bool:
    method as_dict (line 214) | def as_dict(self) -> Dict[str, Any]:

FILE: tango/integrations/torch/training_engine.py
  class TrainingEngine (line 19) | class TrainingEngine(Registrable):
    method __init__ (line 35) | def __init__(
    method _construct_model (line 50) | def _construct_model(self, model: Union[Model, Lazy[Model]]) -> Model:
    method _construct_optimizer (line 55) | def _construct_optimizer(self, optimizer: Lazy[Optimizer]) -> Optimizer:
    method _construct_lr_scheduler (line 59) | def _construct_lr_scheduler(self, lr_scheduler: Lazy[LRScheduler]) -> ...
    method forward_train (line 64) | def forward_train(
    method forward_eval (line 73) | def forward_eval(self, batch: Dict[str, Any]) -> Dict[str, Any]:
    method backward (line 80) | def backward(self, loss: torch.Tensor) -> None:
    method step (line 87) | def step(self) -> None:
    method save_checkpoint (line 94) | def save_checkpoint(self, checkpoint_dir: Path, client_state: Dict[str...
    method load_checkpoint (line 102) | def load_checkpoint(self, checkpoint_dir: Path) -> Dict[str, Any]:
    method save_complete_weights_from_checkpoint (line 110) | def save_complete_weights_from_checkpoint(
  class TorchTrainingEngine (line 120) | class TorchTrainingEngine(TrainingEngine):
    method __init__ (line 143) | def __init__(
    method _construct_model (line 182) | def _construct_model(self, model: Union[Model, Lazy[Model]]) -> Model:
    method forward_train (line 191) | def forward_train(
    method forward_eval (line 206) | def forward_eval(self, batch: Dict[str, Any]) -> Dict[str, Any]:
    method backward (line 216) | def backward(self, loss: torch.Tensor) -> None:
    method clip_grad_norm (line 222) | def clip_grad_norm(self) -> None:
    method step (line 226) | def step(self) -> None:
    method get_model_state (line 245) | def get_model_state(self) -> Dict[str, torch.Tensor]:
    method load_model_state (line 251) | def load_model_state(self, state_dict: Dict[str, torch.Tensor]) -> None:
    method save_checkpoint (line 257) | def save_checkpoint(self, checkpoint_dir: Path, client_state: Dict[str...
    method load_checkpoint (line 290) | def load_checkpoint(self, checkpoint_dir: Path) -> Dict[str, Any]:
    method save_complete_weights_from_checkpoint (line 307) | def save_complete_weights_from_checkpoint(

FILE: tango/integrations/torch/util.py
  function move_to_device (line 15) | def move_to_device(o: T, device: torch.device) -> T:
  function check_dataset (line 28) | def check_dataset(dataset, split: str):
  function check_dataloader (line 41) | def check_dataloader(dataloader: DataLoader):
  function set_seed_all (line 54) | def set_seed_all(seed: int):
  function resolve_device (line 67) | def resolve_device(device: Optional[Union[int, str, torch.device]] = Non...
  function peak_gpu_memory (line 87) | def peak_gpu_memory(reset: bool = False) -> Dict[int, int]:

FILE: tango/integrations/transformers/config.py
  class Config (line 6) | class Config(PretrainedConfig, Registrable):

FILE: tango/integrations/transformers/data.py
  function data_collator_with_tokenizer_factory (line 14) | def data_collator_with_tokenizer_factory(cls) -> Callable[..., DataColla...

FILE: tango/integrations/transformers/finetune.py
  class FinetuneWrapper (line 35) | class FinetuneWrapper(PreTrainedModel):
    method from_pretrained (line 41) | def from_pretrained(  # type: ignore
  function _add_special_tokens (line 68) | def _add_special_tokens(tokenizer: Tokenizer) -> None:
  function tokenize_data (line 79) | def tokenize_data(
  class TokenizeText2TextData (line 184) | class TokenizeText2TextData(Step):
    method run (line 196) | def run(  # type: ignore[override]
  class FinetuneStep (line 257) | class FinetuneStep(TorchTrainStep):
    method run (line 299) | def run(  # type: ignore[override]

FILE: tango/integrations/transformers/ia3.py
  class WithIA3Config (line 13) | class WithIA3Config:
  class WithIA3 (line 102) | class WithIA3(nn.Module):
    method __init__ (line 103) | def __init__(self, ia3_param_names: str, unfuse_size: Optional[int] = ...
    method scale_by_ia3 (line 114) | def scale_by_ia3(self, x):
  class LinearWithIA3 (line 130) | class LinearWithIA3(WithIA3):
    method __init__ (line 131) | def __init__(
    method forward (line 157) | def forward(self, x):
  class Conv1DWithIA3 (line 162) | class Conv1DWithIA3(WithIA3):
    method __init__ (line 163) | def __init__(
    method forward (line 190) | def forward(self, x):
  function modify_with_ia3 (line 199) | def modify_with_ia3(

FILE: tango/integrations/transformers/model.py
  function auto_model_wrapper_factory (line 11) | def auto_model_wrapper_factory(cls: type) -> Type[Model]:
  function flax_auto_model_wrapper_factory (line 41) | def flax_auto_model_wrapper_factory(cls: type) -> Type[FlaxModel]:

FILE: tango/integrations/transformers/run_generation.py
  function adjust_length_to_model (line 73) | def adjust_length_to_model(length, model):
  function _generate (line 89) | def _generate(
  function _generate_with_model_name (line 233) | def _generate_with_model_name(model_name: str, *args, **kwargs) -> Itera...
  class RunGeneration (line 244) | class RunGeneration(Step[Iterable[List[str]]]):
    method run (line 258) | def run(  # type: ignore
  class RunGenerationDataset (line 347) | class RunGenerationDataset(Step[DatasetDict]):
    method run (line 362) | def run(  # type: ignore

FILE: tango/integrations/transformers/soft_prompt.py
  function _get_bound_args_with_decorators (line 19) | def _get_bound_args_with_decorators(fn, *args, **kwargs):
  function add_soft_prompt (line 29) | def add_soft_prompt(
  function _with_soft_prompt (line 227) | def _with_soft_prompt(

FILE: tango/integrations/transformers/tokenizer.py
  class Tokenizer (line 7) | class Tokenizer(PreTrainedTokenizerBase, Registrable):

FILE: tango/integrations/wandb/flax_train_callback.py
  class WandbFlaxTrainCallback (line 14) | class WandbFlaxTrainCallback(TrainCallback):
    method __init__ (line 64) | def __init__(
    method state_dict (line 121) | def state_dict(self) -> Dict[str, Any]:
    method load_state_dict (line 124) | def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
    method pre_train_loop (line 127) | def pre_train_loop(self) -> None:
    method post_train_loop (line 156) | def post_train_loop(self, step: int, epoch: int) -> None:
    method log_batch (line 160) | def log_batch(self, step: int, epoch: int, train_metrics: Dict) -> None:
    method post_val_loop (line 166) | def post_val_loop(

FILE: tango/integrations/wandb/step_cache.py
  class WandbStepCache (line 21) | class WandbStepCache(RemoteStepCache):
    method __init__ (line 36) | def __init__(self, project: str, entity: str):
    method wandb_client (line 48) | def wandb_client(self) -> wandb.Api:
    method client (line 52) | def client(self):
    method wandb_project_url (line 59) | def wandb_project_url(self) -> str:
    method _step_artifact_name (line 67) | def _step_artifact_name(self, step: Union[Step, StepInfo]) -> str:
    method _step_result_remote (line 73) | def _step_result_remote(  # type: ignore
    method create_step_result_artifact (line 88) | def create_step_result_artifact(self, step: Step, objects_dir: Optiona...
    method get_step_result_artifact (line 91) | def get_step_result_artifact(self, step: Union[Step, StepInfo]) -> Opt...
    method _upload_step_remote (line 104) | def _upload_step_remote(self, step: Step, objects_dir: Optional[PathOr...
    method get_step_result_artifact_url (line 125) | def get_step_result_artifact_url(self, step: Union[Step, StepInfo]) ->...
    method use_step_result_artifact (line 133) | def use_step_result_artifact(self, step: Union[Step, StepInfo]) -> None:
    method _download_step_remote (line 143) | def _download_step_remote(self, step_result, target_dir: PathOrStr):
    method __len__ (line 149) | def __len__(self) -> int:

FILE: tango/integrations/wandb/torch_train_callback.py
  class WandbTrainCallback (line 15) | class WandbTrainCallback(TrainCallback):
    method __init__ (line 65) | def __init__(
    method state_dict (line 130) | def state_dict(self) -> Dict[str, Any]:
    method load_state_dict (line 133) | def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
    method pre_train_loop (line 136) | def pre_train_loop(self) -> None:
    method post_train_loop (line 174) | def post_train_loop(self, step: int, epoch: int) -> None:
    method log_batch (line 178) | def log_batch(
    method post_val_loop (line 196) | def post_val_loop(
    method _get_default_notes (line 209) | def _get_default_notes(self) -> str:

FILE: tango/integrations/wandb/util.py
  function is_missing_artifact_error (line 12) | def is_missing_artifact_error(err: WandbError):
  function check_environment (line 30) | def check_environment():
  class RunKind (line 47) | class RunKind(Enum):
  class ArtifactKind (line 52) | class ArtifactKind(Enum):

FILE: tango/integrations/wandb/workspace.py
  class WandbWorkspace (line 28) | class WandbWorkspace(Workspace):
    method __init__ (line 67) | def __init__(self, project: str, entity: Optional[str] = None):
    method __getstate__ (line 77) | def __getstate__(self):
    method wandb_client (line 87) | def wandb_client(self) -> wandb.Api:
    method entity (line 94) | def entity(self) -> str:
    method url (line 98) | def url(self) -> str:
    method from_parsed_url (line 102) | def from_parsed_url(cls, parsed_url: ParseResult) -> Workspace:
    method step_cache (line 110) | def step_cache(self) -> StepCache:
    method wandb_project_url (line 114) | def wandb_project_url(self) -> str:
    method _get_unique_id (line 122) | def _get_unique_id(self, step_or_unique_id: Union[Step, str]) -> str:
    method step_dir (line 129) | def step_dir(self, step_or_unique_id: Union[Step, str]) -> Path:
    method work_dir (line 135) | def work_dir(self, step: Step) -> Path:
    method step_info (line 140) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method step_starting (line 153) | def step_starting(self, step: Step) -> None:
    method step_finished (line 227) | def step_finished(self, step: Step, result: T) -> T:
    method step_failed (line 266) | def step_failed(self, step: Step, e: BaseException) -> None:
    method remove_step (line 295) | def remove_step(self, step_unique_id: str):
    method register_run (line 302) | def register_run(self, targets: Iterable[Step], name: Optional[str] = ...
    method _generate_run_suite_id (line 361) | def _generate_run_suite_id(self) -> str:
    method registered_runs (line 364) | def registered_runs(self) -> Dict[str, Run]:
    method registered_run (line 376) | def registered_run(self, name: str) -> Run:
    method _get_run_from_wandb_run (line 389) | def _get_run_from_wandb_run(
    method _get_updated_step_info (line 411) | def _get_updated_step_info(

FILE: tango/settings.py
  class TangoGlobalSettings (line 13) | class TangoGlobalSettings(FromParams):
    method default (line 67) | def default(cls) -> "TangoGlobalSettings":
    method find_or_default (line 80) | def find_or_default(cls, path: Optional[PathOrStr] = None) -> "TangoGl...
    method path (line 94) | def path(self) -> Optional[Path]:
    method from_file (line 101) | def from_file(cls, path: PathOrStr) -> "TangoGlobalSettings":
    method to_file (line 109) | def to_file(self, path: PathOrStr) -> None:
    method save (line 119) | def save(self) -> None:
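
The `find_or_default` method above looks for a settings file and falls back to built-in defaults when none exists. A minimal stdlib sketch of that search-then-default pattern (the file name `tango_settings.json`, the JSON format, and the search order here are illustrative assumptions, not Tango's actual conventions):

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical defaults standing in for TangoGlobalSettings' fields.
DEFAULTS = {"workspace": None, "include_package": []}

def find_or_default(path: Optional[Path] = None, search_dirs=(Path.cwd(),)) -> dict:
    """Load settings from an explicit path if given, else the first
    'tango_settings.json' found in search_dirs, else the defaults.
    File name and format are assumptions for illustration."""
    candidates = [path] if path else [d / "tango_settings.json" for d in search_dirs]
    for candidate in candidates:
        if candidate and candidate.is_file():
            # Merge so unspecified keys keep their default values.
            return {**DEFAULTS, **json.loads(candidate.read_text())}
    return dict(DEFAULTS)
```

Merging file contents over the defaults means a partial settings file only overrides the keys it mentions.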

FILE: tango/step.py
  function get_origin (line 46) | def get_origin(tp):  # type: ignore
  function get_args (line 49) | def get_args(tp):  # type: ignore
  class StepResources (line 65) | class StepResources(FromParams):
  class Step (line 115) | class Step(Registrable, Generic[T]):
    method __init__ (line 198) | def __init__(
    method class_name (line 292) | def class_name(self) -> str:
    method massage_kwargs (line 296) | def massage_kwargs(cls, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    method logger (line 321) | def logger(self) -> logging.Logger:
    method from_params (line 328) | def from_params(  # type: ignore[override]
    method run (line 451) | def run(self, **kwargs) -> T:
    method _run_with_work_dir (line 460) | def _run_with_work_dir(self, workspace: "Workspace", needed_by: Option...
    method work_dir (line 499) | def work_dir(self) -> Path:
    method workspace (line 516) | def workspace(self) -> "Workspace":
    method config (line 530) | def config(self) -> Dict[str, Any]:
    method det_hash_object (line 540) | def det_hash_object(self) -> Any:
    method resources (line 544) | def resources(self) -> StepResources:
    method unique_id (line 555) | def unique_id(self) -> str:
    method __str__ (line 596) | def __str__(self):
    method __hash__ (line 599) | def __hash__(self):
    method __eq__ (line 605) | def __eq__(self, other):
    method _replace_steps_with_results (line 615) | def _replace_steps_with_results(self, o: Any, workspace: "Workspace"):
    method result (line 639) | def result(
    method ensure_result (line 673) | def ensure_result(
    method _ordered_dependencies (line 694) | def _ordered_dependencies(self) -> Iterable["Step"]:
    method dependencies (line 718) | def dependencies(self) -> Set["Step"]:
    method recursive_dependencies (line 725) | def recursive_dependencies(self) -> Set["Step"]:
    method log_cache_hit (line 739) | def log_cache_hit(self, needed_by: Optional["Step"] = None) -> None:
    method log_starting (line 753) | def log_starting(self, needed_by: Optional["Step"] = None) -> None:
    method log_finished (line 766) | def log_finished(self, run_name: Optional[str] = None) -> None:
    method log_failure (line 779) | def log_failure(self, exception: Optional[BaseException] = None) -> None:
  class FunctionalStep (line 785) | class FunctionalStep(Step):
    method class_name (line 790) | def class_name(self) -> str:
    method run (line 793) | def run(self, *args, **kwargs):
  function step (line 800) | def step(
  class StepIndexer (line 860) | class StepIndexer(CustomDetHash):
    method __init__ (line 861) | def __init__(self, step: Step, key: Union[str, int]):
    method result (line 865) | def result(
    method det_hash_object (line 870) | def det_hash_object(self) -> Any:
  class WithUnresolvedSteps (line 874) | class WithUnresolvedSteps(CustomDetHash):
    method __init__ (line 947) | def __init__(self, function, *args, **kwargs):
    method with_resolved_steps (line 953) | def with_resolved_steps(
    method construct (line 986) | def construct(self, workspace: "Workspace"):
    method det_hash_object (line 997) | def det_hash_object(self) -> Any:
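
The `unique_id` and `det_hash_object` methods listed above tie a step's cache identity to a deterministic hash of its configuration. A minimal stdlib sketch of that idea, assuming a JSON-serializable config (the real implementation also folds in step versions and nested step references):

```python
import hashlib
import json

def det_hash(obj) -> str:
    """Deterministically hash a JSON-serializable object by hashing its
    canonical (sorted-keys, fixed-separator) JSON encoding."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()

def unique_id(class_name: str, config: dict) -> str:
    """Combine a step's class name with a hash of its config, mimicking
    the idea behind Step.unique_id (the exact scheme is illustrative)."""
    return f"{class_name}-{det_hash(config)[:16]}"

# Key ordering in the config does not change the id:
a = unique_id("AddNumbers", {"a": 1, "b": 2})
b = unique_id("AddNumbers", {"b": 2, "a": 1})
assert a == b
```

Because the id depends only on the class and config, two runs that configure a step identically resolve to the same cache entry.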

FILE: tango/step_cache.py
  class StepCache (line 18) | class StepCache(Registrable):
    method __contains__ (line 30) | def __contains__(self, step: Any) -> bool:
    method __getitem__ (line 42) | def __getitem__(self, step: Union[Step, StepInfo]) -> Any:
    method __setitem__ (line 47) | def __setitem__(self, step: Step, value: Any) -> None:
    method __delitem__ (line 52) | def __delitem__(self, step_unique_id: Union[Step, StepInfo]) -> None:
    method __len__ (line 57) | def __len__(self) -> int:
  class CacheMetadata (line 63) | class CacheMetadata(FromParams):

FILE: tango/step_caches/local_step_cache.py
  class LocalStepCache (line 20) | class LocalStepCache(StepCache):
    method __init__ (line 40) | def __init__(self, dir: PathOrStr):
    method _init_mem_caches (line 52) | def _init_mem_caches(self):
    method __getstate__ (line 56) | def __getstate__(self):
    method __setstate__ (line 64) | def __setstate__(self, state):
    method _add_to_cache (line 69) | def _add_to_cache(self, key: str, o: Any) -> None:
    method _get_from_cache (line 84) | def _get_from_cache(self, key: str) -> Optional[Any]:
    method _remove_from_cache (line 94) | def _remove_from_cache(self, key: str) -> None:
    method _metadata_path (line 105) | def _metadata_path(self, step_or_unique_id: Union[Step, StepInfo, str]...
    method __contains__ (line 108) | def __contains__(self, step: object) -> bool:
    method __getitem__ (line 123) | def __getitem__(self, step: Union[Step, StepInfo]) -> Any:
    method __setitem__ (line 134) | def __setitem__(self, step: Step, value: Any) -> None:
    method __delitem__ (line 163) | def __delitem__(self, step: Union[Step, StepInfo]) -> None:
    method __len__ (line 171) | def __len__(self) -> int:
    method step_dir (line 174) | def step_dir(self, step_or_unique_id: Union[Step, StepInfo, str]) -> P...

FILE: tango/step_caches/memory_step_cache.py
  class MemoryStepCache (line 13) | class MemoryStepCache(StepCache):
    method __init__ (line 21) | def __init__(self):
    method __getitem__ (line 24) | def __getitem__(self, step: Union[Step, StepInfo]) -> Any:
    method __setitem__ (line 27) | def __setitem__(self, step: Step, value: Any) -> None:
    method __delitem__ (line 38) | def __delitem__(self, step: Union[Step, StepInfo]) -> None:
    method __contains__ (line 44) | def __contains__(self, step: object) -> bool:
    method __len__ (line 50) | def __len__(self) -> int:

FILE: tango/step_caches/remote_step_cache.py
  class RemoteNotFoundError (line 22) | class RemoteNotFoundError(TangoError):
  class RemoteStepCache (line 30) | class RemoteStepCache(LocalStepCache):
    method __init__ (line 44) | def __init__(self, local_dir: Path):
    method _step_result_remote (line 48) | def _step_result_remote(self, step: Union[Step, StepInfo]):
    method _upload_step_remote (line 52) | def _upload_step_remote(self, step: Step, objects_dir: Path):
    method _download_step_remote (line 56) | def _download_step_remote(self, step_result, target_dir: PathOrStr) ->...
    method __len__ (line 60) | def __len__(self):
    method _acquire_step_lock_file (line 63) | def _acquire_step_lock_file(self, step: Union[Step, StepInfo], read_on...
    method __contains__ (line 68) | def __contains__(self, step: Any) -> bool:
    method __getitem__ (line 92) | def __getitem__(self, step: Union[Step, StepInfo]) -> Any:
    method __setitem__ (line 133) | def __setitem__(self, step: Step, value: Any) -> None:

FILE: tango/step_graph.py
  class StepGraph (line 12) | class StepGraph(Mapping[str, Step]):
    method __init__ (line 20) | def __init__(self, step_dict: Dict[str, Step]):
    method _is_ordered (line 36) | def _is_ordered(cls, step_dict: Dict[str, Step]):
    method _check_unsatisfiable_dependencies (line 46) | def _check_unsatisfiable_dependencies(cls, dependencies: Dict[str, Set...
    method _get_ordered_steps (line 66) | def _get_ordered_steps(cls, dependencies: Dict[str, Set[str]]) -> List...
    method _sanity_check (line 90) | def _sanity_check(self) -> None:
    method from_params (line 104) | def from_params(cls: Type["StepGraph"], params: Dict[str, Params]) -> ...
    method sub_graph (line 130) | def sub_graph(self, *step_names: str) -> "StepGraph":
    method _dict_is_ref (line 145) | def _dict_is_ref(d: Union[dict, Params]) -> bool:
    method _find_step_dependencies (line 154) | def _find_step_dependencies(cls, o: Any) -> Set[str]:
    method _replace_step_dependencies (line 170) | def _replace_step_dependencies(cls, o: Any, existing_steps: Mapping[st...
    method __getitem__ (line 194) | def __getitem__(self, name: str) -> Step:
    method __len__ (line 200) | def __len__(self) -> int:
    method __iter__ (line 206) | def __iter__(self) -> Iterator[str]:
    method ordered_steps (line 213) | def ordered_steps(cls, step_dict: Dict[str, Step]) -> List[Step]:
    method uncacheable_leaf_steps (line 230) | def uncacheable_leaf_steps(self) -> Set[Step]:
    method from_file (line 241) | def from_file(cls, filename: PathOrStr) -> "StepGraph":
    method to_config (line 245) | def to_config(self, include_unique_id: bool = False) -> Dict[str, Dict]:
    method to_file (line 285) | def to_file(self, filename: PathOrStr, include_unique_id: bool = False...
    method __repr__ (line 297) | def __repr__(self) -> str:
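
`StepGraph._get_ordered_steps` orders steps so that each appears after everything it depends on. A self-contained sketch of that ordering using Kahn's algorithm (illustrative only; the real method also reports which dependencies are unsatisfiable):

```python
from collections import deque
from typing import Dict, List, Set

def ordered_steps(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Topologically sort step names so every step follows its
    dependencies. Raises ValueError on a cycle. Mirrors the idea behind
    StepGraph._get_ordered_steps, not its exact code."""
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    ready = deque(sorted(n for n, d in remaining.items() if not d))
    order: List[str] = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for other, deps in remaining.items():
            if name in deps:
                deps.discard(name)
                if not deps:  # all of `other`'s dependencies are done
                    ready.append(other)
    if len(order) != len(remaining):
        raise ValueError("cycle or unsatisfiable dependency in step graph")
    return order

# "train" needs "data"; "eval" needs both:
print(ordered_steps({"data": set(), "train": {"data"}, "eval": {"data", "train"}}))
# → ['data', 'train', 'eval']
```

The same ordering is what lets an executor run cached steps' consumers without re-running their producers.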

FILE: tango/step_info.py
  function get_pip_packages (line 23) | def get_pip_packages() -> Optional[List[Tuple[str, str]]]:
  class StepState (line 39) | class StepState(StrEnum):
  class GitMetadata (line 60) | class GitMetadata(FromParams):
    method check_for_repo (line 72) | def check_for_repo(cls) -> Optional["GitMetadata"]:
  class TangoMetadata (line 84) | class TangoMetadata(FromParams):
  class PlatformMetadata (line 92) | class PlatformMetadata(FromParams):
  class EnvironmentMetadata (line 115) | class EnvironmentMetadata(FromParams):
  class StepInfo (line 154) | class StepInfo(FromParams):
    method start_time_local (line 247) | def start_time_local(self) -> Optional[datetime]:
    method end_time_local (line 255) | def end_time_local(self) -> Optional[datetime]:
    method duration (line 263) | def duration(self) -> Optional[timedelta]:
    method state (line 273) | def state(self) -> StepState:
    method to_json_dict (line 290) | def to_json_dict(self) -> Dict[str, Any]:
    method from_json_dict (line 297) | def from_json_dict(cls, json_dict: Dict[str, Any]) -> "StepInfo":
    method new_from_step (line 318) | def new_from_step(cls, step: Step, **kwargs) -> "StepInfo":
    method refresh (line 335) | def refresh(self):

FILE: tango/steps/dataset_remix.py
  class DatasetRemixStep (line 16) | class DatasetRemixStep(Step[DatasetDict]):
    method run (line 59) | def run(  # type: ignore
  class DatasetCombineStep (line 150) | class DatasetCombineStep(Step[DatasetDict]):
    method run (line 192) | def run(  # type: ignore

FILE: tango/steps/print.py
  class PrintStep (line 9) | class PrintStep(Step):
    method run (line 17) | def run(self, input: Any) -> str:  # type: ignore[override]

FILE: tango/steps/shell_step.py
  function check_path_existence (line 10) | def check_path_existence(path: PathOrStr):
  class ShellStep (line 15) | class ShellStep(Step):
    method run (line 35) | def run(  # type: ignore[override]
    method run_command (line 50) | def run_command(self, command: Union[str, List[str]], **kwargs):

FILE: tango/workspace.py
  class Run (line 37) | class Run(FromParams):
    method to_json_dict (line 60) | def to_json_dict(self) -> Dict[str, Any]:
    method from_json_dict (line 64) | def from_json_dict(cls, json_dict: Dict[str, Any]) -> "Run":
  class RunInfo (line 74) | class RunInfo(FromParams):
  class RunSort (line 98) | class RunSort(StrEnum):
  class StepInfoSort (line 103) | class StepInfoSort(StrEnum):
  class Workspace (line 108) | class Workspace(Registrable):
    method __init__ (line 129) | def __init__(self):
    method __getstate__ (line 132) | def __getstate__(self):
    method url (line 143) | def url(self) -> str:
    method from_url (line 151) | def from_url(cls, url: str) -> "Workspace":
    method from_parsed_url (line 170) | def from_parsed_url(cls, parsed_url: ParseResult) -> "Workspace":
    method step_cache (line 180) | def step_cache(self) -> StepCache:
    method work_dir (line 186) | def work_dir(self, step: Step) -> Path:
    method step_info (line 201) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method search_step_info (line 212) | def search_step_info(
    method num_steps (line 260) | def num_steps(self, *, match: Optional[str] = None, state: Optional[St...
    method step_starting (line 270) | def step_starting(self, step: Step) -> None:
    method step_finished (line 281) | def step_finished(self, step: Step, result: T) -> T:
    method step_failed (line 297) | def step_failed(self, step: Step, e: BaseException) -> None:
    method register_run (line 309) | def register_run(self, targets: Iterable[Step], name: Optional[str] = ...
    method search_registered_runs (line 320) | def search_registered_runs(
    method num_registered_runs (line 369) | def num_registered_runs(self, *, match: Optional[str] = None) -> int:
    method registered_runs (line 378) | def registered_runs(self) -> Dict[str, Run]:
    method registered_run (line 387) | def registered_run(self, name: str) -> Run:
    method step_result_for_run (line 397) | def step_result_for_run(self, run_name: str, step_name: str) -> Any:
    method step_result (line 410) | def step_result(self, step_name: str) -> Any:
    method remove_step (line 423) | def remove_step(self, step_unique_id: str):
    method capture_logs_for_run (line 430) | def capture_logs_for_run(self, name: str) -> ContextManager[None]:
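
`Workspace.from_url` parses a workspace URL and dispatches on its scheme (e.g. `local://`, `memory://`) to the matching workspace class via `from_parsed_url`. A hedged stdlib sketch of that dispatch pattern, using a plain dict in place of Tango's `Registrable` machinery (the factories here return strings purely for illustration):

```python
from urllib.parse import ParseResult, urlparse

# Hypothetical registry: URL scheme -> factory taking the parsed URL.
# In Tango this lookup goes through Workspace.by_name / Registrable.
WORKSPACE_SCHEMES = {
    "local": lambda p: f"LocalWorkspace(dir={p.netloc + p.path!r})",
    "memory": lambda p: "MemoryWorkspace()",
}

def workspace_from_url(url: str) -> str:
    """Dispatch a workspace URL to the factory registered for its scheme."""
    parsed: ParseResult = urlparse(url)
    try:
        factory = WORKSPACE_SCHEMES[parsed.scheme]
    except KeyError:
        raise ValueError(f"no workspace registered for scheme {parsed.scheme!r}") from None
    return factory(parsed)

print(workspace_from_url("local:///tmp/workspace"))
# → LocalWorkspace(dir='/tmp/workspace')
```

Keeping the scheme-to-class mapping in one registry is what lets subclasses like `WandbWorkspace` plug in their own `from_parsed_url` without the base class knowing about them.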

FILE: tango/workspaces/local_workspace.py
  class LocalWorkspace (line 30) | class LocalWorkspace(Workspace):
    method __init__ (line 55) | def __init__(self, dir: PathOrStr):
    method __getstate__ (line 96) | def __getstate__(self):
    method url (line 106) | def url(self) -> str:
    method from_parsed_url (line 110) | def from_parsed_url(cls, parsed_url: ParseResult) -> "Workspace":
    method step_dir (line 122) | def step_dir(self, step_or_unique_id: Union[Step, str]) -> Path:
    method step_cache (line 126) | def step_cache(self) -> StepCache:
    method work_dir (line 129) | def work_dir(self, step: Step) -> Path:
    method guess_step_dir_state (line 135) | def guess_step_dir_state(cls, dir: Path) -> Set[StepState]:
    method _fix_step_info (line 172) | def _fix_step_info(step_info: StepInfo) -> None:
    method step_info (line 182) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method _step_lock_file (line 231) | def _step_lock_file(self, step_or_unique_id: Union[Step, str]) -> Path:
    method step_starting (line 236) | def step_starting(self, step: Step) -> None:
    method step_finished (line 275) | def step_finished(self, step: Step, result: T) -> T:
    method step_failed (line 305) | def step_failed(self, step: Step, e: BaseException) -> None:
    method remove_step (line 325) | def remove_step(self, step_unique_id: str) -> None:
    method register_run (line 339) | def register_run(self, targets: Iterable[Step], name: Optional[str] = ...
    method registered_runs (line 372) | def registered_runs(self) -> Dict[str, Run]:
    method search_step_info (line 379) | def search_step_info(
    method registered_run (line 411) | def registered_run(self, name: str) -> Run:
    method _run_step_info_file (line 418) | def _run_step_info_file(self, name: str) -> Path:
    method _save_registered_run (line 421) | def _save_registered_run(self, name: str, all_steps: Iterable[Step]) -...
    method _load_registered_run (line 439) | def _load_registered_run(self, name: str) -> Dict[str, StepInfo]:
    method run_dir (line 464) | def run_dir(self, name: str) -> Path:
    method capture_logs_for_run (line 476) | def capture_logs_for_run(self, name: str):

FILE: tango/workspaces/memory_workspace.py
  class MemoryWorkspace (line 19) | class MemoryWorkspace(Workspace):
    method __init__ (line 29) | def __init__(self):
    method url (line 35) | def url(self) -> str:
    method from_parsed_url (line 39) | def from_parsed_url(cls, parsed_url: ParseResult) -> "Workspace":
    method step_cache (line 43) | def step_cache(self) -> StepCache:
    method step_info (line 46) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method step_starting (line 61) | def step_starting(self, step: Step) -> None:
    method step_finished (line 70) | def step_finished(self, step: Step, result: T) -> T:
    method step_failed (line 89) | def step_failed(self, step: Step, e: BaseException) -> None:
    method remove_step (line 101) | def remove_step(self, step_unique_id: str) -> None:
    method register_run (line 113) | def register_run(self, targets: Iterable[Step], name: Optional[str] = ...
    method registered_runs (line 125) | def registered_runs(self) -> Dict[str, Run]:
    method registered_run (line 128) | def registered_run(self, name: str) -> Run:

FILE: tango/workspaces/remote_workspace.py
  class RemoteWorkspace (line 24) | class RemoteWorkspace(Workspace):
    method cache (line 37) | def cache(self) -> RemoteStepCache:
    method steps_dir_name (line 42) | def steps_dir_name(self) -> str:
    method locks (line 47) | def locks(self) -> Dict:
    method steps_dir (line 51) | def steps_dir(self) -> Path:
    method url (line 56) | def url(self) -> str:
    method from_parsed_url (line 61) | def from_parsed_url(cls, parsed_url: ParseResult) -> Workspace:
    method step_cache (line 65) | def step_cache(self) -> RemoteStepCache:
    method step_dir (line 68) | def step_dir(self, step_or_unique_id: Union[Step, str]) -> Path:
    method work_dir (line 76) | def work_dir(self, step: Step) -> Path:
    method step_info (line 81) | def step_info(self, step_or_unique_id: Union[Step, str]) -> StepInfo:
    method _remote_lock (line 85) | def _remote_lock(self, step: Step):
    method _step_location (line 89) | def _step_location(self, step: Step) -> str:
    method step_starting (line 92) | def step_starting(self, step: Step) -> None:
    method step_finished (line 135) | def step_finished(self, step: Step, result: T) -> T:
    method step_failed (line 162) | def step_failed(self, step: Step, e: BaseException) -> None:
    method remove_step (line 177) | def remove_step(self, step_unique_id: str) -> None:
    method _get_run_step_info (line 193) | def _get_run_step_info(self, targets: Iterable[Step]) -> Tuple[Dict, D...
    method _save_run (line 221) | def _save_run(
    method register_run (line 226) | def register_run(self, targets: Iterable[Step], name: Optional[str] = ...
    method _save_run_log (line 232) | def _save_run_log(self, name: str, log_file: Path):
    method capture_logs_for_run (line 236) | def capture_logs_for_run(self, name: str) -> Generator[None, None, None]:
    method _update_step_info (line 246) | def _update_step_info(self, step_info: StepInfo):
    method _remove_step_info (line 250) | def _remove_step_info(self, step_info: StepInfo):

FILE: test_fixtures/integrations/common/__init__.py
  class GenerateData (line 9) | class GenerateData(Step):
    method run (line 13) | def run(self) -> DatasetDict:  # type: ignore[override]
  class RandomIterableDataset (line 23) | class RandomIterableDataset(IterableDataset):
    method __init__ (line 24) | def __init__(self, data):
    method __iter__ (line 27) | def __iter__(self):
  class GenerateStreamingData (line 32) | class GenerateStreamingData(Step):
    method run (line 36) | def run(self) -> IterableDatasetDict:  # type: ignore[override]

FILE: test_fixtures/integrations/fairscale/components.py
  class FeedForward (line 10) | class FeedForward(nn.Module):
    method __init__ (line 11) | def __init__(self):
    method forward (line 16) | def forward(self, x):
  class SimpleRegressionModel (line 21) | class SimpleRegressionModel(Model):
    method __init__ (line 22) | def __init__(self):
    method forward (line 28) | def forward(self, x, y):
  class SimpleRegressionDataStep (line 36) | class SimpleRegressionDataStep(Step):
    method run (line 40) | def run(self, seed: int = 317) -> DatasetDict:  # type: ignore

FILE: test_fixtures/integrations/flax/xsum.py
  class PreProcessing (line 16) | class PreProcessing(Step):
    method run (line 19) | def run(self, dataset):
  class TransformerWrapper (line 75) | class TransformerWrapper(FlaxWrapper):
    method train_metrics (line 76) | def train_metrics(self, state, batch, labels):
    method loss_helper (line 80) | def loss_helper(self, logits, labels, batch):
    method train_loss (line 100) | def train_loss(self, params, state, batch, dropout_rng, labels):
    method val_metrics (line 105) | def val_metrics(self, batch, logits, labels):
    method eval_metrics (line 110) | def eval_metrics(self, batch, logits, labels):

FILE: test_fixtures/integrations/torch/__init__.py
  class BasicRegression (line 7) | class BasicRegression(Model):
    method __init__ (line 8) | def __init__(self):
    method forward (line 14) | def forward(self, x, y=None):
    method _to_params (line 21) | def _to_params(self):

FILE: tests/common/dataset_dict_test.py
  function test_dataset_dict (line 4) | def test_dataset_dict():

FILE: tests/common/det_hash_test.py
  function test_normal_det_hash (line 4) | def test_normal_det_hash():
  function test_versioned_det_hash (line 32) | def test_versioned_det_hash():

FILE: tests/common/from_params_pep_563_test.py
  class Foo (line 11) | class Foo(FromParams):
    method __init__ (line 12) | def __init__(self, x: int):
  class Bar (line 16) | class Bar(FromParams):
    method __init__ (line 17) | def __init__(self, foo: Lazy[Foo]):
  class Baz (line 21) | class Baz(FromParams):
    method __init__ (line 22) | def __init__(self, bar: Lazy[Bar]):
  function test_infer_method_params (line 26) | def test_infer_method_params():
  function test_from_params (line 31) | def test_from_params():

FILE: tests/common/from_params_test.py
  class TestFromParams (line 37) | class TestFromParams(TangoTestCase):
    method test_takes_arg (line 38) | def test_takes_arg(self):
    method test_remove_optional (line 68) | def test_remove_optional(self):
    method test_from_params (line 80) | def test_from_params(self, input_type):
    method test_create_kwargs (line 88) | def test_create_kwargs(self):
    method test_extras (line 96) | def test_extras(self):
    method test_variable_length_tuple (line 145) | def test_variable_length_tuple(self):
    method test_union (line 154) | def test_union(self):
    method test_non_params_object_with_params (line 184) | def test_non_params_object_with_params(self):
    method test_crazy_nested_union (line 188) | def test_crazy_nested_union(self):
    method test_union_of_castable_types (line 215) | def test_union_of_castable_types(self):
    method test_invalid_type_conversions (line 233) | def test_invalid_type_conversions(self):
    method test_dict (line 243) | def test_dict(self):
    method test_dict_not_params (line 275) | def test_dict_not_params(self):
    method test_list (line 286) | def test_list(self):
    method test_tuple (line 314) | def test_tuple(self):
    method test_set (line 351) | def test_set(self):
    method test_kwargs_with_multiple_inheritance (line 392) | def test_kwargs_with_multiple_inheritance(self):
    method test_instantiating_with_multiple_inheritance (line 419) | def test_instantiating_with_multiple_inheritance(self):
    method test_only_infer_superclass_params_if_unknown (line 446) | def test_only_infer_superclass_params_if_unknown(self):
    method test_kwargs_are_passed_to_deeper_superclasses (line 476) | def test_kwargs_are_passed_to_deeper_superclasses(self):
    method test_lazy_construction_can_happen_multiple_times (line 508) | def test_lazy_construction_can_happen_multiple_times(self):
    method test_lazy_and_from_params_can_be_pickled (line 528) | def test_lazy_and_from_params_can_be_pickled(self):
    method test_optional_vs_required_lazy_objects (line 534) | def test_optional_vs_required_lazy_objects(self):
    method test_wrapper_kwargs_passed_down (line 574) | def test_wrapper_kwargs_passed_down(self):
    method test_iterable (line 587) | def test_iterable(self):
    method test_mapping (line 616) | def test_mapping(self):
    method test_custom_abc_mapping (line 648) | def test_custom_abc_mapping(self):
    method test_extra_parameters_are_not_allowed_when_there_is_no_constructor (line 672) | def test_extra_parameters_are_not_allowed_when_there_is_no_constructor...
    method test_explicit_kwargs_always_passed_to_constructor (line 679) | def test_explicit_kwargs_always_passed_to_constructor(self):
    method test_raises_when_there_are_no_implementations (line 699) | def test_raises_when_there_are_no_implementations(self):
    method test_from_params_raises_error_on_wrong_parameter_name_in_optional_union (line 727) | def test_from_params_raises_error_on_wrong_parameter_name_in_optional_...
    method test_from_params_handles_base_class_kwargs (line 741) | def test_from_params_handles_base_class_kwargs(self):
    method test_from_params_base_class_kwargs_crashes_if_params_not_handled (line 779) | def test_from_params_base_class_kwargs_crashes_if_params_not_handled(s...
    method test_from_params_handles_kwargs_in_non_from_params_registered_class (line 798) | def test_from_params_handles_kwargs_in_non_from_params_registered_clas...
    method test_from_params_passes_extras_to_non_from_params_registered_class (line 825) | def test_from_params_passes_extras_to_non_from_params_registered_class...
    method test_from_params_child_has_kwargs_base_implicit_constructor (line 854) | def test_from_params_child_has_kwargs_base_implicit_constructor(self):
    method test_from_params_has_args (line 865) | def test_from_params_has_args(self):
    method test_from_params_with_dataclass (line 873) | def test_from_params_with_dataclass(self):
    method test_to_params (line 883) | def test_to_params(self):
    method test_to_params_needs_custom_to_params (line 899) | def test_to_params_needs_custom_to_params(self):
    method test_type_hinting_generics_from_std_collections (line 914) | def test_type_hinting_generics_from_std_collections(self):
    method test_with_non_from_params_generics (line 929) | def test_with_non_from_params_generics(self):
    method test_with_union_pipe (line 944) | def test_with_union_pipe(self):
    method test_from_params_with_function (line 956) | def test_from_params_with_function(self):
    method test_from_params_passes_no_extra_args_in_factory_construction (line 979) | def test_from_params_passes_no_extra_args_in_factory_construction(self):
    method test_lazy_from_params_with_version (line 1013) | def test_lazy_from_params_with_version(self):
    method test_from_params_that_takes_step_directly (line 1054) | def test_from_params_that_takes_step_directly(self):
  class MyClass (line 1074) | class MyClass(FromParams):
    method __init__ (line 1075) | def __init__(self, my_int: int, my_bool: bool = False) -> None:
  class Foo (line 1080) | class Foo(FromParams):
    method __init__ (line 1081) | def __init__(self, a: int = 1) -> None:
  class Bar (line 1085) | class Bar(FromParams):
    method __init__ (line 1086) | def __init__(self, foo: Foo) -> None:
  class Baz (line 1090) | class Baz(FromParams):
    method __init__ (line 1091) | def __init__(self, bar: Lazy[Bar]) -> None:
    method bar (line 1095) | def bar(self):

FILE: tests/common/params_test.py
  class TestParams (line 18) | class TestParams(TangoTestCase):
    method test_load_from_file (line 20) | def test_load_from_file(self, extension):
    method test_replace_none (line 25) | def test_replace_none(self):
    method test_init_with_different_types (line 31) | def test_init_with_different_types(self):
    method test_bad_unicode_environment_variables (line 34) | def test_bad_unicode_environment_variables(self):
    method test_with_overrides (line 40) | def test_with_overrides(self):
    method test_bad_overrides (line 60) | def test_bad_overrides(self):
    method test_overrides (line 67) | def test_overrides(self, input_type):
    method test_as_flat_dict (line 85) | def test_as_flat_dict(self):
    method test_jsonnet_features (line 90) | def test_jsonnet_features(self):
    method test_regexes_with_backslashes (line 114) | def test_regexes_with_backslashes(self):
    method test_env_var_substitution (line 141) | def test_env_var_substitution(self):
    method test_as_ordered_dict (line 161) | def test_as_ordered_dict(self):
    method test_to_file (line 185) | def test_to_file(self):
    method test_infer_and_cast (line 199) | def test_infer_and_cast(self):
    method test_pop_choice (line 223) | def test_pop_choice(self):
    method test_remove_keys_from_params (line 239) | def test_remove_keys_from_params(self):

FILE: tests/common/registrable_test.py
  class TestRegistrable (line 9) | class TestRegistrable(TangoTestCase):
    method test_basic_functionality (line 10) | def test_basic_functionality(self):
    method test_registering_step_by_reserved_name (line 44) | def test_registering_step_by_reserved_name(self):
    method test_search_modules (line 51) | def test_search_modules(self):

FILE: tests/common/sequences_test.py
  function assert_equal_including_exceptions (line 15) | def assert_equal_including_exceptions(expected_fn, actual_fn):
  function test_shuffled_sequence (line 25) | def test_shuffled_sequence():
  function test_sliced_sequence (line 31) | def test_sliced_sequence():
  function test_concatenated_sequence (line 40) | def test_concatenated_sequence():
  function test_sqlite_sparse_sequence (line 85) | def test_sqlite_sparse_sequence():
  function test_mapped_sequence (line 106) | def test_mapped_sequence():

FILE: tests/common/util_test.py
  class TestResolveModuleName (line 18) | class TestResolveModuleName(TangoTestCase):
    method setup_method (line 19) | def setup_method(self):
    method teardown_method (line 24) | def teardown_method(self):
    method test_with_package_init_file (line 28) | def test_with_package_init_file(self):
    method test_with_submodule (line 35) | def test_with_submodule(self):
    method test_with_module_in_child_directory (line 42) | def test_with_module_in_child_directory(self):
  function test_find_submodules (line 49) | def test_find_submodules():
  function test_find_integrations (line 58) | def test_find_integrations():
  function test_could_be_class_name (line 74) | def test_could_be_class_name(name: str, result: bool):
  function test_threaded_generator (line 79) | def test_threaded_generator():

FILE: tests/end_to_end/test_dataset_dict_from_separate_steps.py
  class TrainData (line 9) | class TrainData(Step):
    method run (line 13) | def run(self) -> Sequence[int]:  # type: ignore
  class ValData (line 18) | class ValData(Step):
    method run (line 22) | def run(self) -> Sequence[int]:  # type: ignore
  class SaveData (line 27) | class SaveData(Step):
    method run (line 32) | def run(self, dataset_dict: DatasetDict) -> Any:  # type: ignore
  function test_experiment (line 36) | def test_experiment():

FILE: tests/end_to_end/test_lazy_input_with_another_step.py
  class Foo (line 9) | class Foo(FromParams):
  class GenerateNumberStep (line 14) | class GenerateNumberStep(Step):
    method run (line 19) | def run(self) -> float:  # type: ignore[override]
  class StepWithLazyInput (line 24) | class StepWithLazyInput(Step):
    method run (line 29) | def run(self, foo: Lazy[Foo]) -> float:  # type: ignore[override]
  function test_experiment (line 36) | def test_experiment():

FILE: tests/end_to_end/test_multicore_cli.py
  class TestExperiment (line 8) | class TestExperiment(TangoTestCase):
    method setup_method (line 9) | def setup_method(self):
    method teardown_method (line 35) | def teardown_method(self):
    method test_experiment (line 39) | def test_experiment(self, caplog):
    method test_experiment_with_overrides (line 53) | def test_experiment_with_overrides(self, caplog):

FILE: tests/end_to_end/test_non_cacheable_into_cacheable_multiple_runs.py
  class GiveMeANumber (line 8) | class GiveMeANumber(Step):
    method run (line 12) | def run(self, what_number: int) -> int:  # type: ignore
  class RandomInt (line 17) | class RandomInt(Step):
    method run (line 21) | def run(self, lower_bound: int, upper_bound: int) -> int:  # type: ignore
  class TestExperiment (line 25) | class TestExperiment(TangoTestCase):
    method test_experiment (line 26) | def test_experiment(self, caplog):

FILE: tests/end_to_end/test_registered_runs.py
  class ReturnANumber (line 6) | class ReturnANumber(Step):
    method run (line 11) | def run(self, what_number: int) -> int:  # type: ignore
  class TestExperiment (line 15) | class TestExperiment(TangoTestCase):
    method test_experiment_updates_latest_run_output (line 16) | def test_experiment_updates_latest_run_output(self, caplog):

FILE: tests/end_to_end/test_run_single_step.py
  class TestRunSingleStep (line 4) | class TestRunSingleStep(TangoTestCase):
    method test_run_single_step (line 5) | def test_run_single_step(self):

FILE: tests/end_to_end/test_step_indexing.py
  class TestStepIndexing (line 5) | class TestStepIndexing(TangoTestCase):
    method test_step_indexing (line 6) | def test_step_indexing(self):

FILE: tests/end_to_end/test_steps_that_fail.py
  class StepA (line 14) | class StepA(Step):
    method run (line 15) | def run(self, what_number: int) -> int:  # type: ignore
  class StepB (line 22) | class StepB(Step):
    method run (line 23) | def run(self, what_number: int) -> int:  # type: ignore
  class StepFail (line 33) | class StepFail(Step):
    method run (line 34) | def run(self, what_number: int) -> int:  # type: ignore
  class TestExperiment (line 44) | class TestExperiment(TangoTestCase):
    method test_experiment (line 45) | def test_experiment(self, caplog):

FILE: tests/end_to_end/test_uncacheable_leaf_steps.py
  class StoreNumberGlobally (line 9) | class StoreNumberGlobally(Step):
    method run (line 13) | def run(self, number: int) -> None:  # type: ignore
  class TestExperiment (line 18) | class TestExperiment(TangoTestCase):
    method test_experiment (line 19) | def test_experiment(self, caplog):
  class TestExperimentMulticore (line 39) | class TestExperimentMulticore(TangoTestCase):
    method test_experiment (line 40) | def test_experiment(self, caplog):

FILE: tests/executor_test.py
  class AdditionStep (line 10) | class AdditionStep(Step):
    method run (line 14) | def run(self, a: int, b: int) -> int:  # type: ignore
  class TestExecutor (line 18) | class TestExecutor(TangoTestCase):
    method test_executor (line 19) | def test_executor(self):
    method test_executor_with_failing_steps (line 30) | def test_executor_with_failing_steps(self):
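
The executor tests above (and the multicore executor tests below) exercise running a graph of steps in dependency order, where steps downstream of a failure are skipped rather than run. A dependency-free sketch of that idea (illustrative only; `run_graph` is an invented name, not Tango's `Executor` API):

```python
# Sketch of dependency-ordered step execution with failure propagation,
# the behavior the executor tests check. Not Tango's real Executor.
def run_graph(steps, deps):
    """steps: name -> zero-arg callable; deps: name -> list of dependency names.
    Returns (results, failed); steps downstream of a failure are skipped."""
    results, failed = {}, set()
    done = set()

    def visit(name):
        if name in done:
            return
        for d in deps.get(name, []):
            visit(d)  # run dependencies first (depth-first topological order)
        done.add(name)
        if any(d in failed for d in deps.get(name, [])):
            failed.add(name)  # skip: an upstream step failed
            return
        try:
            results[name] = steps[name]()
        except Exception:
            failed.add(name)

    for name in steps:
        visit(name)
    return results, failed
```

A real executor also caches completed results and, in the multicore case, launches ready steps in parallel, but the skip-downstream-of-failure logic is the same shape.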

FILE: tests/executors/multicore_executor_test.py
  class TestMulticoreExecutor (line 13) | class TestMulticoreExecutor(TangoTestCase):
    method setup_method (line 14) | def setup_method(self):
    method test_simple_execution_in_parallel (line 18) | def test_simple_execution_in_parallel(self):
    method test_more_processes_ready_than_parallelism (line 36) | def test_more_processes_ready_than_parallelism(self):
    method test_failing_step_no_downstream_task (line 55) | def test_failing_step_no_downstream_task(self, parallelism):
    method test_failing_step_with_downstream_task (line 88) | def test_failing_step_with_downstream_task(self, parallelism):
    method test_failing_step_with_further_downstream_task (line 121) | def test_failing_step_with_further_downstream_task(self, parallelism):
    method test_uncacheable_failing_step_no_downstream_task (line 153) | def test_uncacheable_failing_step_no_downstream_task(self):
    method test_uncacheable_failing_step_with_downstream_task (line 186) | def test_uncacheable_failing_step_with_downstream_task(self):
    method test_steps_with_their_own_multiprocessing (line 220) | def test_steps_with_their_own_multiprocessing(self, parallelism):

FILE: tests/format_test.py
  class TestFormat (line 9) | class TestFormat(TangoTestCase):
    method test_dill_format (line 11) | def test_dill_format(self, compress: Optional[str]):
    method test_iterable_dill_format (line 19) | def test_iterable_dill_format(self, compress: Optional[str]):
    method test_json_format (line 28) | def test_json_format(self, compress: Optional[str]):
    method test_iterable_json_format (line 36) | def test_iterable_json_format(self, compress: Optional[str]):
    method test_iterable_text_format (line 44) | def test_iterable_text_format(self):

FILE: tests/integrations/beaker/conftest.py
  function patched_cache_dir (line 15) | def patched_cache_dir(tmp_path, monkeypatch) -> Path:
  function patched_unique_id_suffix (line 21) | def patched_unique_id_suffix(monkeypatch) -> str:
  function patched_constants_prefix (line 28) | def patched_constants_prefix(monkeypatch) -> str:
  function beaker_workspace_name (line 39) | def beaker_workspace_name() -> str:
  function beaker_workspace (line 44) | def beaker_workspace(

FILE: tests/integrations/beaker/executor_test.py
  function test_from_params (line 14) | def test_from_params(beaker_workspace_name: str):
  function test_init_with_mem_workspace (line 34) | def test_init_with_mem_workspace(beaker_workspace_name: str):
  function settings (line 47) | def settings(beaker_workspace_name: str) -> TangoGlobalSettings:
  function test_beaker_executor (line 60) | def test_beaker_executor(

FILE: tests/integrations/beaker/step_cache_test.py
  function test_step_cache (line 5) | def test_step_cache(beaker_workspace: str):

FILE: tests/integrations/beaker/workspace_test.py
  function test_from_url (line 10) | def test_from_url(beaker_workspace: str):
  function test_direct_usage (line 16) | def test_direct_usage(beaker_workspace: str):
  function test_remove_step (line 31) | def test_remove_step(beaker_workspace: str):

FILE: tests/integrations/datasets/dataset_test.py
  class TestDatasets (line 14) | class TestDatasets(TangoTestCase):
    method test_from_params_and_convert_to_tango_dataset_dict (line 15) | def test_from_params_and_convert_to_tango_dataset_dict(self):
    method test_convert_to_tango_iterable_dataset_dict (line 28) | def test_convert_to_tango_iterable_dataset_dict(self):
    method test_load_concatenate_and_interleave (line 39) | def test_load_concatenate_and_interleave(self):
  function test_mapped_sequence_of_dataset (line 52) | def test_mapped_sequence_of_dataset():
  function test_datasets_dataset_remix (line 60) | def test_datasets_dataset_remix():

FILE: tests/integrations/fairscale/train_test.py
  class TestFairScaleTrain (line 10) | class TestFairScaleTrain(TangoTestCase):
    method setup_method (line 11) | def setup_method(self):
    method teardown_method (line 15) | def teardown_method(self):
    method test_train_tiny_gpt2 (line 57) | def test_train_tiny_gpt2(self, fsdp: bool, activation_checkpoint: bool...

FILE: tests/integrations/flax/data_test.py
  class TestDataStep (line 11) | class TestDataStep(TangoTestCase):
    method test_dataloader (line 12) | def test_dataloader(self) -> None:
    method test_sample_data (line 15) | def test_sample_data(self) -> None:

FILE: tests/integrations/flax/format_test.py
  class TestTorchFormat (line 8) | class TestTorchFormat(TangoTestCase):
    method test_read_write (line 9) | def test_read_write(self):

FILE: tests/integrations/flax/optim_test.py
  function test_all_optimizers_registered (line 4) | def test_all_optimizers_registered():
  function test_all_lr_schedulers_registered (line 8) | def test_all_lr_schedulers_registered():

FILE: tests/integrations/flax/train_test.py
  class TestTrainStep (line 5) | class TestTrainStep(TangoTestCase):
    method setup_method (line 6) | def setup_method(self):
    method teardown_method (line 10) | def teardown_method(self):
    method test_trainer (line 14) | def test_trainer(self):

FILE: tests/integrations/gs/step_cache_test.py
  class TestGSStepCache (line 14) | class TestGSStepCache(TangoTestCase):
    method setup_method (line 15) | def setup_method(self):
    method teardown_method (line 20) | def teardown_method(self):
    method test_step_cache (line 24) | def test_step_cache(self, gs_path):

FILE: tests/integrations/gs/workspace_test.py
  class TestGSWorkspace (line 16) | class TestGSWorkspace(TangoTestCase):
    method setup_method (line 17) | def setup_method(self):
    method teardown_method (line 24) | def teardown_method(self):
    method test_from_url (line 28) | def test_from_url(self, gs_path: str):
    method test_from_params (line 33) | def test_from_params(self, gs_path: str):
    method test_direct_usage (line 38) | def test_direct_usage(self, gs_path: str):
    method test_remove_step (line 52) | def test_remove_step(self):

FILE: tests/integrations/torch/data_test.py
  function test_dataloader_from_params (line 6) | def test_dataloader_from_params():
  function test_samplers_registered (line 16) | def test_samplers_registered():
  function test_dataloader_from_params_with_sampler (line 20) | def test_dataloader_from_params_with_sampler():
  function test_dataloader_from_params_with_batch_sampler (line 34) | def test_dataloader_from_params_with_batch_sampler():

FILE: tests/integrations/torch/det_hash_test.py
  function test_numpy_det_hash (line 7) | def test_numpy_det_hash():
  function test_torch_det_hash (line 13) | def test_torch_det_hash():

FILE: tests/integrations/torch/eval_test.py
  class TestEvalStep (line 4) | class TestEvalStep(TangoTestCase):
    method test_basic_eval (line 5) | def test_basic_eval(self):

FILE: tests/integrations/torch/format_test.py
  class TestTorchFormat (line 8) | class TestTorchFormat(TangoTestCase):
    method test_read_write (line 9) | def test_read_write(self):

FILE: tests/integrations/torch/optim_test.py
  function test_all_optimizers_registered (line 4) | def test_all_optimizers_registered():
  function test_all_lr_schedulers_registered (line 8) | def test_all_lr_schedulers_registered():

FILE: tests/integrations/torch/train_callback_test.py
  function test_stop_early_callback (line 19) | def test_stop_early_callback():

FILE: tests/integrations/torch/train_test.py
  class TestTrainStep (line 10) | class TestTrainStep(TangoTestCase):
    method setup_method (line 11) | def setup_method(self):
    method teardown_method (line 15) | def teardown_method(self):
    method test_basic_train (line 22) | def test_basic_train(self, with_validation: bool):
    method test_basic_train_with_epochs (line 48) | def test_basic_train_with_epochs(self, grad_acc: int):
    method test_basic_train_with_streaming_data (line 74) | def test_basic_train_with_streaming_data(self):
    method test_train_distributed (line 84) | def test_train_distributed(self):
    method test_train_distributed_with_epochs (line 108) | def test_train_distributed_with_epochs(self, grad_acc: int):

FILE: tests/integrations/torch/training_engine_test.py
  class DummyModel (line 21) | class DummyModel(Model):
    method __init__ (line 22) | def __init__(self):
    method forward (line 26) | def forward(self, x, y=None):
  class TestTorchTrainingEngine (line 32) | class TestTorchTrainingEngine(TangoTestCase):
    method test_grad_scaler (line 33) | def test_grad_scaler(self):
  class WorseningModel (line 54) | class WorseningModel(Model):
    method __init__ (line 55) | def __init__(self):
    method forward (line 61) | def forward(self, x, y):
  class StopOnStepCallback (line 67) | class StopOnStepCallback(TrainCallback):
    method __init__ (line 68) | def __init__(self, stop_on_step: int, *args, **kwargs):
    method post_val_loop (line 72) | def post_val_loop(
  function test_with_increasing_loss (line 79) | def test_with_increasing_loss():

FILE: tests/integrations/transformers/data_test.py
  function test_init_collator_no_tokenizer (line 7) | def test_init_collator_no_tokenizer():
  function test_init_collator_with_tokenizer (line 12) | def test_init_collator_with_tokenizer():

FILE: tests/integrations/transformers/finetune_test.py
  class TestTokenizeText2TextData (line 8) | class TestTokenizeText2TextData(TangoTestCase):
    method test_tokenize_seq2seq (line 9) | def test_tokenize_seq2seq(self):
    method test_tokenize_concat (line 26) | def test_tokenize_concat(self):

FILE: tests/integrations/transformers/ia3_test.py
  function test_ia3 (line 7) | def test_ia3():

FILE: tests/integrations/transformers/run_generation_test.py
  class TestRunGeneration (line 7) | class TestRunGeneration(TangoTestCase):
    method test_run_generation (line 8) | def test_run_generation(self):
    method test_run_generation_with_model (line 19) | def test_run_generation_with_model(self):
    method test_run_generation_dataset (line 33) | def test_run_generation_dataset(self):

FILE: tests/integrations/transformers/soft_prompt_test.py
  function test_soft_prompt (line 6) | def test_soft_prompt():
  function test_soft_prompt_twice (line 26) | def test_soft_prompt_twice():

FILE: tests/integrations/wandb/step_cache_test.py
  class SomeFakeStep (line 14) | class SomeFakeStep(Step):
    method run (line 18) | def run(self) -> int:  # type: ignore
  function test_step_cache_artifact_not_found (line 22) | def test_step_cache_artifact_not_found():
  function test_pickling (line 41) | def test_pickling(protocol: int):

FILE: tests/integrations/wandb/workspace_test.py
  class TestWandbWorkspace (line 23) | class TestWandbWorkspace(TangoTestCase):
    method setup_method (line 26) | def setup_method(self, monkeypatch):
    method test_pickle_workspace (line 44) | def test_pickle_workspace(self, protocol):
    method test_from_url (line 52) | def test_from_url(self):
  class TestWandbWorkspaceUsage (line 59) | class TestWandbWorkspaceUsage(TangoTestCase):
    method setup_method (line 62) | def setup_method(self, monkeypatch):
    method teardown_method (line 79) | def teardown_method(self):
    method test_direct_usage (line 94) | def test_direct_usage(self):
    method test_with_wandb_train_callback (line 151) | def test_with_wandb_train_callback(self, multicore: bool, distributed:...

FILE: tests/main_test.py
  class TestRun (line 16) | class TestRun(TangoTestCase):
    method clean_log_lines (line 17) | def clean_log_lines(
    method check_logs (line 38) | def check_logs(
    method test_version (line 60) | def test_version(self):
    method test_logging_all_levels (line 67) | def test_logging_all_levels(self, log_level: str, raise_error):
    method test_deterministic_experiment (line 114) | def test_deterministic_experiment(self):
    method test_experiment_with_memory_workspace (line 140) | def test_experiment_with_memory_workspace(self):
    method test_experiment_with_default_workspace (line 151) | def test_experiment_with_default_workspace(self):
    method test_random_experiment (line 160) | def test_random_experiment(self):
    method test_run_name (line 171) | def test_run_name(self):
    method test_experiment_with_logging_and_multiprocessing (line 191) | def test_experiment_with_logging_and_multiprocessing(
  class TestSettings (line 242) | class TestSettings(TangoTestCase):
    method setup_method (line 243) | def setup_method(self):
    method teardown_method (line 250) | def teardown_method(self):
    method settings (line 255) | def settings(self) -> TangoGlobalSettings:
    method test_settings_set_workspace (line 258) | def test_settings_set_workspace(self):
    method test_settings_set_include_package (line 266) | def test_settings_set_include_package(self):
    method test_settings_set_include_package_invalid (line 271) | def test_settings_set_include_package_invalid(self):
    method test_settings_set_environment (line 276) | def test_settings_set_environment(self):
    method test_settings_set_environment_blocked_var (line 281) | def test_settings_set_environment_blocked_var(self):

FILE: tests/step_caches/local_step_cache_test.py
  class DummyStep (line 11) | class DummyStep(Step):
    method run (line 12) | def run(self, x: int) -> int:  # type: ignore[override]
  class TestLocalStepCache (line 16) | class TestLocalStepCache(TangoTestCase):
    method test_pickling (line 30) | def test_pickling(self, protocol: int):

FILE: tests/step_graph_test.py
  class TestStepGraph (line 17) | class TestStepGraph(TangoTestCase):
    method test_ordered_steps (line 18) | def test_ordered_steps(self):
    method test_from_file (line 42) | def test_from_file(self):
    method test_missing_type (line 47) | def test_missing_type(self):
    method test_direct_construction (line 58) | def test_direct_construction(self):
    method test_direct_construction_missing_dependency (line 64) | def test_direct_construction_missing_dependency(self):
    method test_to_file (line 70) | def test_to_file(self):
    method test_to_file_without_config (line 81) | def test_to_file_without_config(self):
    method test_with_step_indexer (line 97) | def test_with_step_indexer(self):
    method test_with_forced_dependencies (line 110) | def test_with_forced_dependencies(self):

FILE: tests/step_info_test.py
  function test_step_info (line 11) | def test_step_info():
  function test_step_info_with_step_dependency (line 31) | def test_step_info_with_step_dependency():

FILE: tests/step_test.py
  class TestStep (line 15) | class TestStep(TangoTestCase):
    method test_from_params (line 16) | def test_from_params(self):
    method test_from_params_wrong_type (line 21) | def test_from_params_wrong_type(self):
    method test_step_with_from_params_input (line 25) | def test_step_with_from_params_input(self):
    method test_no_hash_arguments (line 38) | def test_no_hash_arguments(self):
    method test_skip_default_arguments (line 50) | def test_skip_default_arguments(self):
    method test_massage_kwargs (line 67) | def test_massage_kwargs(self):
    method test_default_args (line 87) | def test_default_args(self):
    method test_steps_in_params (line 98) | def test_steps_in_params(self):
    method test_functional_step (line 157) | def test_functional_step(self):
    method test_bound_functional_step (line 173) | def test_bound_functional_step(self):
    method test_bound_functional_step_missing_self (line 187) | def test_bound_functional_step_missing_self(self):

FILE: tests/steps/dataset_remix_test.py
  function test_dataset_remix_step (line 5) | def test_dataset_remix_step():

FILE: tests/steps/shell_step_test.py
  class TestShellStep (line 7) | class TestShellStep(TangoTestCase):
    method test_shell_step (line 8) | def test_shell_step(self):
    method test_shell_step_failure (line 14) | def test_shell_step_failure(self):
    method test_shell_step_with_output_path (line 19) | def test_shell_step_with_output_path(self, caplog):
    method test_shell_step_different_validation (line 25) | def test_shell_step_different_validation(self, caplog):
    method test_shell_step_in_config (line 42) | def test_shell_step_in_config(self, caplog):

FILE: tests/workspaces/local_workspace_test.py
  class AdditionStep (line 12) | class AdditionStep(Step):
    method run (line 13) | def run(self, a: int, b: int) -> int:  # type: ignore
  class TestLocalWorkspace (line 17) | class TestLocalWorkspace(TangoTestCase):
    method test_local_workspace_one_step (line 18) | def test_local_workspace_one_step(self):
    method test_local_workspace_two_steps (line 37) | def test_local_workspace_two_steps(self):
    method test_local_workspace_upgrade_v1_to_v2 (line 61) | def test_local_workspace_upgrade_v1_to_v2(self):
    method test_remove_step (line 78) | def test_remove_step(self):

FILE: tests/workspaces/memory_workspace_test.py
  function test_remove_step (line 5) | def test_remove_step():
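
Nearly every `Step` subclass indexed above defines only a `run()` method; the surrounding tests rely on results being cached under a deterministic hash of the step's type and inputs, so repeated runs skip completed work. A minimal, dependency-free sketch of that caching pattern (illustrative only; `CachingStep` and `result()` are invented names, not Tango's API, which hashes far more than the raw kwargs):

```python
import hashlib
import json

# Sketch of result caching keyed on a deterministic hash of a step's
# inputs -- the pattern the step cache tests exercise. Not Tango's API.
_CACHE = {}


class CachingStep:
    def run(self, **kwargs):  # subclasses override this
        raise NotImplementedError

    def result(self, **kwargs):
        # Deterministic unique id: class name plus canonical JSON of inputs.
        key = hashlib.sha256(
            (type(self).__name__ + json.dumps(kwargs, sort_keys=True)).encode()
        ).hexdigest()
        if key not in _CACHE:  # cache miss: actually run the step
            _CACHE[key] = self.run(**kwargs)
        return _CACHE[key]


class AdditionStep(CachingStep):
    calls = 0

    def run(self, a: int, b: int) -> int:
        AdditionStep.calls += 1
        return a + b
```

Calling `result(a=1, b=2)` twice runs the step once; changing any input produces a new key and a fresh run, which is what makes removing or invalidating a cached step (as in the `test_remove_step` cases above) safe.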
Condensed preview — 329 files, each showing path, character count, and a content snippet.
[
  {
    "path": ".dockerignore",
    "chars": 142,
    "preview": ".dockerignore\n**.pyc\n**/__pycache__\n.gitignore\n.git\n.coverage\n.mypy_cache\ndocs\nexamples\ntests\ntest_fixtures\nintegration_"
  },
  {
    "path": ".github/CONTRIBUTING.md",
    "chars": 10319,
    "preview": "# Contributing\n\nThanks for considering contributing! Please read this document to learn the various ways you can contrib"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.yml",
    "chars": 1989,
    "preview": "name: 🐛 Bug Report\ndescription: Create a report to help us reproduce and fix the bug\nlabels: 'bug'\n\nbody:\n- type: markdo"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/documentation.yml",
    "chars": 597,
    "preview": "name: 📚 Documentation\ndescription: Report an issue related to https://ai2-tango.readthedocs.io/latest\nlabels: 'documenta"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.yml",
    "chars": 926,
    "preview": "name: 🚀 Feature request\ndescription: Submit a proposal/request for a new feature\nlabels: 'feature request'\n\nbody:\n- type"
  },
  {
    "path": ".github/dependabot.yml",
    "chars": 216,
    "preview": "version: 2\nupdates:\n- package-ecosystem: pip\n  directory: \"/\"\n  schedule:\n    interval: \"daily\"\n  open-pull-requests-lim"
  },
  {
    "path": ".github/workflows/changelog.yml",
    "chars": 713,
    "preview": "name: Changelog\n\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.ref }}\n  cancel-in-progress: true\n\non:\n  pull_r"
  },
  {
    "path": ".github/workflows/docker.yml",
    "chars": 1618,
    "preview": "name: Docker\n\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.ref }}\n  cancel-in-progress: true\n\non:\n  pull_requ"
  },
  {
    "path": ".github/workflows/docker_testing.yml",
    "chars": 3030,
    "preview": "# This workflow is just for building our Docker image for GPU testing on Beaker,\n# and pushing it to Beaker. We only run"
  },
  {
    "path": ".github/workflows/integration_tests.yml",
    "chars": 5163,
    "preview": "name: Integration tests\n\non:\n  workflow_dispatch:\n    inputs:\n      test:\n        description: the integration test to r"
  },
  {
    "path": ".github/workflows/main.yml",
    "chars": 11547,
    "preview": "name: Main\n\nconcurrency:\n  group: ${{ github.workflow }}-${{ github.ref }}\n\non:\n  pull_request:\n    branches:\n      - \"*"
  },
  {
    "path": ".github/workflows/update_dependency_pr.yml",
    "chars": 814,
    "preview": "name: Update dependency PR\n\non:\n  pull_request:\n    types:\n      - opened\n    paths:\n      - \"pyproject.toml\"\n\npermissio"
  },
  {
    "path": ".gitignore",
    "chars": 532,
    "preview": "# build artifacts\n\n.eggs/\n.mypy_cache\nai2_tango.egg-info/\nbuild/\ndist/\npip-wheel-metadata/\nruns/\nworkspace/\n\n# dev tools"
  },
  {
    "path": ".readthedocs.yaml",
    "chars": 242,
    "preview": "version: 2\n\nsphinx:\n  configuration: docs/source/conf.py\n  fail_on_warning: true\n\nbuild:\n  os: ubuntu-22.04\n  tools:\n   "
  },
  {
    "path": "CHANGELOG.md",
    "chars": 41829,
    "preview": "# Changelog\n\nAll notable changes to this project will be documented in this file.\n\nThe format is based on [Keep a Change"
  },
  {
    "path": "CITATION.cff",
    "chars": 748,
    "preview": "cff-version: 1.2.0\nmessage: \"If you use this software, please cite it as below.\"\nauthors:\n- family-names: \"Groeneveld\"\n "
  },
  {
    "path": "Dockerfile",
    "chars": 324,
    "preview": "# This Dockerfile can be used to build a Docker image suitable for tango projects.\n\nARG BASE_IMAGE=ghcr.io/allenai/pytor"
  },
  {
    "path": "Dockerfile.test",
    "chars": 636,
    "preview": "# This Dockerfile is for building an image suitable for running tango's GPU tests and integration tests.\n# There are no "
  },
  {
    "path": "LICENSE",
    "chars": 11357,
    "preview": "                                 Apache License\n                           Version 2.0, January 2004\n                   "
  },
  {
    "path": "Makefile",
    "chars": 615,
    "preview": ".PHONY : docs\ndocs :\n\trm -rf docs/build/\n\tsphinx-autobuild -b html --watch tango/ --watch examples/ docs/source/ docs/bu"
  },
  {
    "path": "README.md",
    "chars": 8264,
    "preview": "<div align=\"center\">\n<br>\n<img src=\"https://raw.githubusercontent.com/allenai/tango/main/docs/source/_static/tango_final"
  },
  {
    "path": "RELEASE_PROCESS.md",
    "chars": 774,
    "preview": "# GitHub Release Process\n\n## Steps\n\n1. Update the version in `tango/version.py`.\n\n2. Run the release script:\n\n    ```bas"
  },
  {
    "path": "docs/.gitignore",
    "chars": 6,
    "preview": "build\n"
  },
  {
    "path": "docs/Makefile",
    "chars": 638,
    "preview": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line, and also\n# from the "
  },
  {
    "path": "docs/make.bat",
    "chars": 804,
    "preview": "@ECHO OFF\r\n\r\npushd %~dp0\r\n\r\nREM Command file for Sphinx documentation\r\n\r\nif \"%SPHINXBUILD%\" == \"\" (\r\n\tset SPHINXBUILD=sp"
  },
  {
    "path": "docs/source/_static/css/custom.css",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "docs/source/api/commands.rst",
    "chars": 50,
    "preview": "Commands\n========\n\n.. automodule:: tango.__main__\n"
  },
  {
    "path": "docs/source/api/components/executor.rst",
    "chars": 218,
    "preview": "Executor\n========\n\nBase class\n----------\n\n.. autoclass:: tango.executor.Executor\n   :members: \n\n.. autoclass:: tango.exe"
  },
  {
    "path": "docs/source/api/components/format.rst",
    "chars": 232,
    "preview": "Format\n======\n\nBase class\n----------\n\n.. autoclass:: tango.format.Format\n   :members: \n   :private-members:\n\nImplementat"
  },
  {
    "path": "docs/source/api/components/index.rst",
    "chars": 200,
    "preview": "Components\n==========\n\nThe core components of **AI2 Tango**.\n\n.. toctree::\n   :maxdepth: 2\n   :caption: Components\n\n   s"
  },
  {
    "path": "docs/source/api/components/step.rst",
    "chars": 357,
    "preview": "Step\n====\n\nBase class\n----------\n\n.. autoclass:: tango.step.Step\n   :members: \n   :special-members:\n   :exclude-members:"
  },
  {
    "path": "docs/source/api/components/step_cache.rst",
    "chars": 345,
    "preview": "StepCache\n=========\n\nBase class\n----------\n\n.. autoclass:: tango.step_cache.StepCache\n   :members: \n   :special-members:"
  },
  {
    "path": "docs/source/api/components/step_graph.rst",
    "chars": 77,
    "preview": "StepGraph\n=========\n\n.. autoclass:: tango.step_graph.StepGraph\n   :members: \n"
  },
  {
    "path": "docs/source/api/components/step_info.rst",
    "chars": 532,
    "preview": "StepInfo\n========\n\n.. autoclass:: tango.step_info.StepInfo\n   :member-order: bysource\n   :members:\n\n.. autoclass:: tango"
  },
  {
    "path": "docs/source/api/components/workspace.rst",
    "chars": 489,
    "preview": "Workspace\n=========\n\nBase class\n----------\n\n.. autoclass:: tango.workspace.Workspace\n   :members:\n\nImplementations\n-----"
  },
  {
    "path": "docs/source/api/det_hash.rst",
    "chars": 517,
    "preview": "Deterministic Hashing\n=====================\n\nIn order to detect whether a :class:`~tango.step.Step` has to be re-run or "
  },
  {
    "path": "docs/source/api/exceptions.rst",
    "chars": 176,
    "preview": "Exceptions\n==========\n\n.. autoexception:: tango.common.exceptions.TangoError\n   :members:\n\n.. automodule:: tango.common."
  },
  {
    "path": "docs/source/api/integrations/beaker.rst",
    "chars": 565,
    "preview": "🧪 Beaker\n=========\n\n.. automodule:: tango.integrations.beaker\n\nReference\n---------\n\n.. autoclass:: tango.integrations.be"
  },
  {
    "path": "docs/source/api/integrations/datasets.rst",
    "chars": 598,
    "preview": "🤗 Datasets\n===========\n\n.. automodule:: tango.integrations.datasets\n\nReference\n---------\n\n.. autofunction:: tango.integr"
  },
  {
    "path": "docs/source/api/integrations/fairscale.rst",
    "chars": 300,
    "preview": "🔥 FairScale\n============\n\n.. automodule:: tango.integrations.fairscale\n\nReference\n---------\n\n.. autoclass:: tango.integr"
  },
  {
    "path": "docs/source/api/integrations/flax.rst",
    "chars": 958,
    "preview": "Flax\n=======\n\n.. automodule:: tango.integrations.flax\n\nReference\n---------\n\nTrain step\n~~~~~~~~~~\n\n.. autoclass:: tango."
  },
  {
    "path": "docs/source/api/integrations/gs.rst",
    "chars": 208,
    "preview": "☁️ Google Cloud Storage\n=======================\n\n.. automodule:: tango.integrations.gs\n\nReference\n---------\n\n.. autoclas"
  },
  {
    "path": "docs/source/api/integrations/index.rst",
    "chars": 202,
    "preview": "Integrations\n============\n\n.. automodule:: tango.integrations\n\n.. toctree::\n   :maxdepth: 2\n   :caption: Integrations\n\n "
  },
  {
    "path": "docs/source/api/integrations/torch.rst",
    "chars": 1436,
    "preview": "🔥 PyTorch\n==========\n\n.. automodule:: tango.integrations.torch\n\nReference\n---------\n\nTrain step\n~~~~~~~~~~\n\n.. autoclass"
  },
  {
    "path": "docs/source/api/integrations/transformers.rst",
    "chars": 164,
    "preview": "🤗 Transformers\n===============\n\n.. automodule:: tango.integrations.transformers\n    :members:\n\n.. autofunction:: tango.i"
  },
  {
    "path": "docs/source/api/integrations/wandb.rst",
    "chars": 340,
    "preview": "⚖️ Weights & Biases\n===================\n \n.. automodule:: tango.integrations.wandb\n\nReference\n---------\n\n.. autoclass:: "
  },
  {
    "path": "docs/source/api/logging.rst",
    "chars": 533,
    "preview": "Logging\n=======\n\n.. automodule:: tango.common.logging\n\nReference\n---------\n\n.. autodata:: tango.common.logging.TANGO_LOG"
  },
  {
    "path": "docs/source/api/sequences.rst",
    "chars": 545,
    "preview": "Sequences\n=========\n\nThis module contains some utilities to make sequences out of other sequences. All of these are lazy"
  },
  {
    "path": "docs/source/api/settings.rst",
    "chars": 435,
    "preview": "Global settings\n---------------\n\nSome command-line options can set globally in a ``tango.yml`` or ``tango.yaml`` setting"
  },
  {
    "path": "docs/source/api/utilities.rst",
    "chars": 93,
    "preview": "Utilities\n=========\n\n.. automodule:: tango.common\n   :members:\n   :exclude-members: det_hash\n"
  },
  {
    "path": "docs/source/conf.py",
    "chars": 5064,
    "preview": "# Configuration file for the Sphinx documentation builder.\n#\n# This file only contains a selection of the most common op"
  },
  {
    "path": "docs/source/examples/euler.md",
    "chars": 6933,
    "preview": "```{include} ../../../examples/euler/README.md\n```\n\n## Running the experiment\n\nIf you haven't already, clone the [tango "
  },
  {
    "path": "docs/source/examples/eval_p3.md",
    "chars": 548,
    "preview": "```{include} ../../../examples/eval_p3/README.md\n```\n\n## `RougeScoreStep`\n\n`RougeScoreStep` is defined in `eval.py`:\n\n``"
  },
  {
    "path": "docs/source/examples/index.rst",
    "chars": 242,
    "preview": "Examples\n========\n\nReal-world examples of using Tango.\nYou can find all of these `on GitHub <https://github.com/allenai/"
  },
  {
    "path": "docs/source/examples/train_lm.md",
    "chars": 983,
    "preview": "# Fine-tuning a language model\n\n```{include} ../../../examples/train_lm/README.md\n:start-after: <!-- start overview -->\n"
  },
  {
    "path": "docs/source/faq.md",
    "chars": 103,
    "preview": "# FAQ\n\n```{include} ../../README.md\n:start-after: <!-- start faq -->\n:end-before: <!-- end faq -->\n```\n"
  },
  {
    "path": "docs/source/first_steps.md",
    "chars": 14767,
    "preview": "# First Steps\n\n## What is a Step?\n\nTango is a Python library for choreographing machine learning research experiments by"
  },
  {
    "path": "docs/source/index.md",
    "chars": 1112,
    "preview": "# **AI2 Tango**\n\n```{include} ../../README.md\n:start-after: <!-- start tagline -->\n:end-before: <!-- end tagline -->\n```"
  },
  {
    "path": "docs/source/installation.md",
    "chars": 131,
    "preview": "Installation\n============\n\n```{include} ../../README.md\n:start-after: <!-- start install -->\n:end-before: <!-- end insta"
  },
  {
    "path": "examples/euler/README.md",
    "chars": 226,
    "preview": "Euler\n=====\n\nThis is a toy example that proves Euler's identity using Tango. You can use this to play with the concept o"
  },
  {
    "path": "examples/euler/complex_arithmetic.py",
    "chars": 1392,
    "preview": "import cmath\nfrom typing import Tuple, Union\n\nfrom tango import Step\n\nComplexOrTuple = Union[complex, Tuple[float, float"
  },
  {
    "path": "examples/euler/euler.jsonnet",
    "chars": 549,
    "preview": "local i = [0.0, 1.0];\nlocal pi = [3.1415926535, 0.0];\n\n{\n    \"steps\": {\n        \"i_times_pi\": {\n            \"type\": \"cmu"
  },
  {
    "path": "examples/euler/euler_general.jsonnet",
    "chars": 1017,
    "preview": "local i = [0.0, 1.0];\nlocal pi = [3.1415926535, 0.0];\n\n{\n    \"steps\": {\n        \"cos\": {\n            \"type\": \"ccos\",\n   "
  },
  {
    "path": "examples/euler/run.sh",
    "chars": 95,
    "preview": "#!/bin/bash\n\ntango run euler_general.jsonnet -d workspace --include-package complex_arithmetic\n"
  },
  {
    "path": "examples/eval_p3/README.md",
    "chars": 579,
    "preview": "# Evaluating T0\n\nThis example uses the `transformers::run_generation_dataset` step to run the\n[T0 model](https://api.sem"
  },
  {
    "path": "examples/eval_p3/config.jsonnet",
    "chars": 2222,
    "preview": "local model = \"bigscience/T0_3B\";\nlocal batch_size = 8;\n\nlocal datasets = [\n    'xsum_DOC_boils_down_to_simple_idea_that"
  },
  {
    "path": "examples/eval_p3/eval.py",
    "chars": 1046,
    "preview": "import logging\nfrom typing import Dict\n\nfrom torch import Tensor\nfrom torchmetrics.text.rouge import ROUGEScore\n\nfrom ta"
  },
  {
    "path": "examples/finetune/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/finetune/config.jsonnet",
    "chars": 4577,
    "preview": "##################\n# Model settings #\n##################\n\nlocal pretrained_model = \"t5-base\";\nlocal load_with_low_cpu_me"
  },
  {
    "path": "examples/finetune/snli_steps.py",
    "chars": 3670,
    "preview": "from typing import Union\n\nimport datasets as ds\n\nfrom tango.integrations.datasets import DatasetsFormat\nfrom tango.step "
  },
  {
    "path": "examples/finetune/test.py",
    "chars": 1802,
    "preview": "import typing\n\nimport datasets as ds\nimport pytest\n\nfrom tango.common import Params\nfrom tango.common.testing import Tan"
  },
  {
    "path": "examples/finetune_resnet/.gitignore",
    "chars": 32,
    "preview": "data/\nresults/\nextra_testing.py\n"
  },
  {
    "path": "examples/finetune_resnet/config.jsonnet",
    "chars": 2147,
    "preview": "local input_size = 224;\nlocal batch_size = 32;\nlocal num_classes = 2;\nlocal val_size = 0.05;\nlocal model = \"resnet\";\nloc"
  },
  {
    "path": "examples/finetune_resnet/resnet_steps.py",
    "chars": 5220,
    "preview": "from typing import Any, Dict, List, Optional\n\nimport datasets\nimport torch\nfrom cached_path import cached_path\nfrom PIL "
  },
  {
    "path": "examples/flax/config.jsonnet",
    "chars": 1906,
    "preview": "{\n    \"steps\": {\n        \"data\": {\n            \"type\": \"datasets::load\",\n            \"path\": \"xsum\",\n        },\n        "
  },
  {
    "path": "examples/flax/run.sh",
    "chars": 74,
    "preview": "#!/bin/bash\n\ntango run config.jsonnet -d workspace --include-package xsum\n"
  },
  {
    "path": "examples/flax/xsum.py",
    "chars": 7085,
    "preview": "import logging\nfrom typing import List, Optional\n\nimport jax\nimport jax.numpy as jnp\nimport nltk\nimport numpy as np\nimpo"
  },
  {
    "path": "examples/train_lm/.gitignore",
    "chars": 9,
    "preview": "runs\nrun\n"
  },
  {
    "path": "examples/train_lm/README.md",
    "chars": 1376,
    "preview": "# Fine-tuning a language model\n\n<!-- start overview -->\n\nThis Tango example showcases how you could train or fine-tune a"
  },
  {
    "path": "examples/train_lm/config.jsonnet",
    "chars": 4712,
    "preview": "##################\n# Model settings #\n##################\n\nlocal pretrained_model = \"gpt2\";\n# With 'fsdp' and 'activation"
  },
  {
    "path": "examples/train_lm/test.py",
    "chars": 1820,
    "preview": "from tango.common import Params\nfrom tango.common.testing import run_experiment\n\n\ndef test_small_experiment():\n    model"
  },
  {
    "path": "examples/train_lm/tokenize_step.py",
    "chars": 2091,
    "preview": "import datasets\n\nfrom tango import Step\nfrom tango.integrations.datasets import DatasetsFormat\nfrom tango.integrations.t"
  },
  {
    "path": "integration_tests/README.md",
    "chars": 671,
    "preview": "# Integration tests\n\nThese are a collection of longer running end-to-end tests of various parts of the Tango library.\n\nT"
  },
  {
    "path": "integration_tests/fairscale_benchmarks/README.md",
    "chars": 960,
    "preview": "# FairScale Benchmarks\n\nThis integration test is for checking the performance of the `FairScaleTrainingEngine` with vari"
  },
  {
    "path": "integration_tests/fairscale_benchmarks/config.jsonnet",
    "chars": 10103,
    "preview": "##################\n# Model settings #\n##################\n\nlocal pretrained_model = \"gpt2\";\n# local pretrained_model = \"E"
  },
  {
    "path": "integration_tests/fairscale_benchmarks/run.sh",
    "chars": 113,
    "preview": "#!/bin/sh\n\ntango run integration_tests/fairscale_benchmarks/config.jsonnet -i examples/train_lm/tokenize_step.py\n"
  },
  {
    "path": "pyproject.toml",
    "chars": 4157,
    "preview": "[build-system]\nrequires = [\"setuptools\", \"wheel\"]\nbuild-backend = \"setuptools.build_meta\"\n\n[project]\nname = \"ai2-tango\"\n"
  },
  {
    "path": "scripts/entrypoint.sh",
    "chars": 650,
    "preview": "#!/bin/bash\n\n# Exit script if any commands fail.\nset -e\nset -o pipefail\n\n# Check that the environment variable has been "
  },
  {
    "path": "scripts/hash_extras.py",
    "chars": 210,
    "preview": "\"\"\"\nUsed in CI to create a unique ID for any set of install extras.\n\"\"\"\n\nimport sys\n\n\ndef main():\n    extras = sys.argv["
  },
  {
    "path": "scripts/prepare_changelog.py",
    "chars": 944,
    "preview": "from datetime import datetime\nfrom pathlib import Path\n\nfrom tango.version import VERSION\n\n\ndef main():\n    changelog = "
  },
  {
    "path": "scripts/prepare_citation_cff.py",
    "chars": 582,
    "preview": "from datetime import datetime\nfrom pathlib import Path\n\nfrom tango.version import VERSION\n\n\ndef main():\n    citation = P"
  },
  {
    "path": "scripts/release.sh",
    "chars": 555,
    "preview": "#!/bin/bash\n\nset -e\n\nTAG=$(python -c 'from tango.version import VERSION; print(\"v\" + VERSION)')\n\nread -p \"Creating new r"
  },
  {
    "path": "scripts/release_notes.py",
    "chars": 2557,
    "preview": "# encoding: utf-8\n\n\"\"\"\nPrepares markdown release notes for GitHub releases.\n\"\"\"\n\nimport os\nfrom typing import List, Opti"
  },
  {
    "path": "tango/__init__.py",
    "chars": 1133,
    "preview": "\"\"\"\nA Python library for choreographing your machine learning research.\n\"\"\"\n\n__all__ = [\n    \"cleanup_cli\",\n    \"DillFor"
  },
  {
    "path": "tango/__main__.py",
    "chars": 19180,
    "preview": "\"\"\"\nThe Tango CLI is the recommended tool to run experiments with.\nIt also comes with several other useful commands.\n\nYo"
  },
  {
    "path": "tango/cli.py",
    "chars": 7416,
    "preview": "import logging\nimport multiprocessing as mp\nimport os\nimport sys\nimport warnings\nfrom contextlib import contextmanager, "
  },
  {
    "path": "tango/common/__init__.py",
    "chars": 671,
    "preview": "from .aliases import PathOrStr\nfrom .dataset_dict import DatasetDict, DatasetDictBase, IterableDatasetDict\nfrom .det_has"
  },
  {
    "path": "tango/common/aliases.py",
    "chars": 549,
    "preview": "from enum import Enum, unique\nfrom os import PathLike\nfrom typing import Set, Union\n\nPathOrStr = Union[str, PathLike]\n\n\n"
  },
  {
    "path": "tango/common/dataset_dict.py",
    "chars": 1787,
    "preview": "from dataclasses import dataclass, field\nfrom typing import Any, Generic, Iterable, Iterator, Mapping, Sequence, TypeVar"
  },
  {
    "path": "tango/common/det_hash.py",
    "chars": 6074,
    "preview": "import collections\nimport hashlib\nimport io\nfrom abc import abstractmethod\nfrom typing import Any, MutableMapping, Optio"
  },
  {
    "path": "tango/common/exceptions.py",
    "chars": 2578,
    "preview": "from typing import TYPE_CHECKING, Any, Optional, Set, Tuple, Union\n\nif TYPE_CHECKING:\n    from tango.step import Step\n  "
  },
  {
    "path": "tango/common/file_lock.py",
    "chars": 2837,
    "preview": "import os\nimport warnings\nfrom typing import Optional\n\nfrom filelock import AcquireReturnProxy\nfrom filelock import File"
  },
  {
    "path": "tango/common/from_params.py",
    "chars": 36817,
    "preview": "import collections.abc\nimport inspect\nimport logging\nfrom copy import deepcopy\nfrom pathlib import Path\nfrom typing impo"
  },
  {
    "path": "tango/common/lazy.py",
    "chars": 5946,
    "preview": "import copy\nimport inspect\nfrom typing import Any, Callable, Dict, Generic, Optional, Type, TypeVar, Union, cast\n\nfrom ."
  },
  {
    "path": "tango/common/logging.py",
    "chars": 27037,
    "preview": "\"\"\"\nTango makes heavy use of the :mod:`logging` module from the standard library to convey information to users.\nWhen yo"
  },
  {
    "path": "tango/common/params.py",
    "chars": 22721,
    "preview": "import copy\nimport json\nimport logging\nimport os\nimport zlib\nfrom collections import OrderedDict\nfrom collections.abc im"
  },
  {
    "path": "tango/common/registrable.py",
    "chars": 16052,
    "preview": "\"\"\"\n:class:`Registrable` is a \"mixin\" for endowing\nany base class with a named registry for its subclasses and a decorat"
  },
  {
    "path": "tango/common/remote_utils.py",
    "chars": 1274,
    "preview": "import logging\nfrom typing import Union\n\nfrom tango.step import Step\nfrom tango.step_info import StepInfo\n\nlogger = logg"
  },
  {
    "path": "tango/common/sequences.py",
    "chars": 12494,
    "preview": "import bisect\nimport os\nimport random\nimport shutil\nfrom collections import abc\nfrom os import PathLike\nfrom typing impo"
  },
  {
    "path": "tango/common/testing/__init__.py",
    "chars": 5360,
    "preview": "import logging\nimport os\nimport shutil\nimport tempfile\nfrom contextlib import contextmanager\nfrom pathlib import Path\nfr"
  },
  {
    "path": "tango/common/testing/steps.py",
    "chars": 4548,
    "preview": "import logging\nimport multiprocessing as mp\nimport random\nimport time\nfrom string import ascii_letters\nfrom typing impor"
  },
  {
    "path": "tango/common/tqdm.py",
    "chars": 3629,
    "preview": "\"\"\"\nCopied over from ``allennlp.common.tqdm.Tqdm``.\n\nWraps tqdm so we can add configurable global defaults for certain t"
  },
  {
    "path": "tango/common/util.py",
    "chars": 10415,
    "preview": "import importlib\nimport pkgutil\nimport signal\nimport string\nimport sys\nimport traceback\nfrom collections import OrderedD"
  },
  {
    "path": "tango/executor.py",
    "chars": 7257,
    "preview": "import logging\nimport warnings\nfrom dataclasses import dataclass, field\nfrom typing import TYPE_CHECKING, Dict, List, Op"
  },
  {
    "path": "tango/executors/__init__.py",
    "chars": 118,
    "preview": "\"\"\"\nBuilt-in :class:`~tango.executor.Executor` implementations.\n\"\"\"\nfrom .multicore_executor import MulticoreExecutor\n"
  },
  {
    "path": "tango/executors/multicore_executor.py",
    "chars": 13932,
    "preview": "import logging\nimport os\nimport subprocess\nimport time\nfrom tempfile import NamedTemporaryFile\nfrom typing import Dict, "
  },
  {
    "path": "tango/format.py",
    "chars": 19305,
    "preview": "import bz2\nimport dataclasses\nimport gzip\nimport importlib\nimport json\nimport logging\nimport lzma\nfrom abc import abstra"
  },
  {
    "path": "tango/integrations/__init__.py",
    "chars": 768,
    "preview": "\"\"\"\nIn :mod:`tango.integrations` we provide many ready-to-use `component <../components/index.html>`_\nimplementations fo"
  },
  {
    "path": "tango/integrations/beaker/__init__.py",
    "chars": 1063,
    "preview": "\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"beaker\" extra\n    (e.g. ``pip inst"
  },
  {
    "path": "tango/integrations/beaker/common.py",
    "chars": 6800,
    "preview": "import atexit\nimport json\nimport logging\nimport os.path\nimport tempfile\nimport time\nimport urllib\nimport urllib.parse\nfr"
  },
  {
    "path": "tango/integrations/beaker/entrypoint.sh",
    "chars": 4089,
    "preview": "#!/bin/bash\n#\n# This is the entrypoint script that the Beaker Executor uses when it runs a step\n# on Beaker.\n# It will w"
  },
  {
    "path": "tango/integrations/beaker/executor.py",
    "chars": 42650,
    "preview": "import json\nimport logging\nimport os\nimport threading\nimport time\nimport uuid\nimport warnings\nfrom abc import abstractme"
  },
  {
    "path": "tango/integrations/beaker/step_cache.py",
    "chars": 4037,
    "preview": "import logging\nfrom pathlib import Path\nfrom typing import Optional, Union\n\nfrom beaker import Beaker\nfrom beaker import"
  },
  {
    "path": "tango/integrations/beaker/workspace.py",
    "chars": 15958,
    "preview": "import json\nimport logging\nimport os\nimport random\nfrom collections import OrderedDict\nfrom pathlib import Path\nfrom typ"
  },
  {
    "path": "tango/integrations/datasets/__init__.py",
    "chars": 11483,
    "preview": "\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"datasets\" extra\n    (e.g. ``pip in"
  },
  {
    "path": "tango/integrations/fairscale/__init__.py",
    "chars": 1880,
    "preview": "\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"fairscale\" extra\n    (e.g. ``pip i"
  },
  {
    "path": "tango/integrations/fairscale/fsdp_config.py",
    "chars": 2908,
    "preview": "from dataclasses import asdict, dataclass\nfrom typing import Any, Dict, Optional\n\nimport torch\nfrom fairscale.nn.data_pa"
  },
  {
    "path": "tango/integrations/fairscale/module_wrapper.py",
    "chars": 4272,
    "preview": "import re\nfrom typing import Optional, Set\n\nimport torch\nimport torch.nn as nn\nfrom fairscale.nn.checkpoint import check"
  },
  {
    "path": "tango/integrations/fairscale/training_engine.py",
    "chars": 5813,
    "preview": "import logging\nfrom pathlib import Path\nfrom typing import Any, Dict, List, Optional, Union\n\nimport torch\nfrom fairscale"
  },
  {
    "path": "tango/integrations/flax/__init__.py",
    "chars": 755,
    "preview": "from tango.common.exceptions import IntegrationMissingError\n\ntry:\n    import flax\nexcept ModuleNotFoundError:\n    raise "
  },
  {
    "path": "tango/integrations/flax/data.py",
    "chars": 1912,
    "preview": "import logging\nfrom typing import Generic, TypeVar\n\nimport jax.random\nimport numpy as np\nfrom datasets import Dataset\nfr"
  },
  {
    "path": "tango/integrations/flax/eval.py",
    "chars": 7655,
    "preview": "import logging\nfrom collections import defaultdict\nfrom itertools import islice\nfrom typing import Dict, List, Optional,"
  },
  {
    "path": "tango/integrations/flax/eval_callback.py",
    "chars": 2600,
    "preview": "from pathlib import Path\nfrom typing import Any, Dict\n\nfrom tango.common.dataset_dict import DatasetDictBase\nfrom tango."
  },
  {
    "path": "tango/integrations/flax/format.py",
    "chars": 687,
    "preview": "from pathlib import Path\nfrom typing import Generic, TypeVar\n\nfrom flax.training import checkpoints\n\nfrom tango.common.a"
  },
  {
    "path": "tango/integrations/flax/model.py",
    "chars": 481,
    "preview": "from flax import linen as nn\n\nfrom tango.common.registrable import Registrable\n\n\nclass Model(nn.Module, Registrable):\n  "
  },
  {
    "path": "tango/integrations/flax/optim.py",
    "chars": 3048,
    "preview": "from inspect import isfunction\nfrom typing import Callable, Type\n\nimport optax\n\nfrom tango.common.registrable import Reg"
  },
  {
    "path": "tango/integrations/flax/train.py",
    "chars": 25052,
    "preview": "import logging\nimport time\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Any, DefaultD"
  },
  {
    "path": "tango/integrations/flax/train_callback.py",
    "chars": 6731,
    "preview": "import logging\nfrom pathlib import Path\nfrom typing import Any, Dict, Optional\n\nfrom tango.common.dataset_dict import Da"
  },
  {
    "path": "tango/integrations/flax/train_config.py",
    "chars": 3166,
    "preview": "from dataclasses import asdict, dataclass\nfrom pathlib import Path\nfrom typing import Any, Dict, Optional\n\n\n@dataclass\nc"
  },
  {
    "path": "tango/integrations/flax/util.py",
    "chars": 502,
    "preview": "from typing import Any, Union\n\nimport jax\n\n\ndef get_PRNGkey(seed: int = 42) -> Union[Any, jax._src.random.KeyArray]:\n   "
  },
  {
    "path": "tango/integrations/flax/wrapper.py",
    "chars": 1250,
    "preview": "from abc import abstractmethod\nfrom typing import Dict\n\nfrom tango.common.registrable import Registrable\n\n\nclass FlaxWra"
  },
  {
    "path": "tango/integrations/gs/__init__.py",
    "chars": 699,
    "preview": "\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"gs\" extra\n    (e.g. ``pip install "
  },
  {
    "path": "tango/integrations/gs/common.py",
    "chars": 19315,
    "preview": "\"\"\"\nClasses and utility functions for GSWorkspace and GSStepCache.\n\"\"\"\nimport atexit\nimport datetime\nimport json\nimport "
  },
  {
    "path": "tango/integrations/gs/step_cache.py",
    "chars": 3740,
    "preview": "import logging\nfrom pathlib import Path\nfrom typing import Optional, Union\n\nfrom tango.common import PathOrStr\nfrom tang"
  },
  {
    "path": "tango/integrations/gs/workspace.py",
    "chars": 18313,
    "preview": "import json\nimport random\nfrom pathlib import Path\nfrom typing import (\n    Dict,\n    Generator,\n    Iterable,\n    List,"
  },
  {
    "path": "tango/integrations/torch/__init__.py",
    "chars": 5063,
    "preview": "# -*- coding: UTF-8 -*-\n\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"torch\" ext"
  },
  {
    "path": "tango/integrations/torch/data.py",
    "chars": 3648,
    "preview": "from typing import Any, Dict, Generic, List, Optional, TypeVar, Union\n\nimport torch\n\nfrom tango.common.lazy import Lazy\n"
  },
  {
    "path": "tango/integrations/torch/eval.py",
    "chars": 6976,
    "preview": "from collections import defaultdict\nfrom itertools import islice\nfrom typing import Dict, List, Optional, Sequence\n\nimpo"
  },
  {
    "path": "tango/integrations/torch/eval_callback.py",
    "chars": 2719,
    "preview": "from pathlib import Path\nfrom typing import Any, Dict\n\nfrom tango.common.dataset_dict import DatasetDictBase\nfrom tango."
  },
  {
    "path": "tango/integrations/torch/exceptions.py",
    "chars": 352,
    "preview": "from tango.common.exceptions import TangoError\n\n\nclass StopEarly(TangoError):\n    \"\"\"\n    Callbacks can raise this excep"
  },
  {
    "path": "tango/integrations/torch/format.py",
    "chars": 1168,
    "preview": "from pathlib import Path\nfrom typing import Generic, TypeVar\n\nimport dill\nimport torch\n\nfrom tango.common.aliases import"
  },
  {
    "path": "tango/integrations/torch/model.py",
    "chars": 404,
    "preview": "import torch\n\nfrom tango.common.registrable import Registrable\n\n\nclass Model(torch.nn.Module, Registrable):\n    \"\"\"\n    "
  },
  {
    "path": "tango/integrations/torch/optim.py",
    "chars": 2432,
    "preview": "from typing import Type\n\nimport torch\n\nfrom tango.common.registrable import Registrable\n\n\nclass Optimizer(torch.optim.Op"
  },
  {
    "path": "tango/integrations/torch/train.py",
    "chars": 33758,
    "preview": "import logging\nimport math\nimport os\nimport shutil\nfrom itertools import islice\nfrom typing import Any, Dict, List, Opti"
  },
  {
    "path": "tango/integrations/torch/train_callback.py",
    "chars": 9190,
    "preview": "import logging\nfrom pathlib import Path\nfrom typing import Any, Dict, List, Optional\n\nfrom tango.common.dataset_dict imp"
  },
  {
    "path": "tango/integrations/torch/train_config.py",
    "chars": 5630,
    "preview": "from dataclasses import asdict, dataclass\nfrom pathlib import Path\nfrom typing import Any, Dict, List, Optional\n\nimport "
  },
  {
    "path": "tango/integrations/torch/training_engine.py",
    "chars": 11634,
    "preview": "import os\nimport tempfile\nfrom abc import abstractmethod\nfrom pathlib import Path\nfrom typing import Any, Dict, Optional"
  },
  {
    "path": "tango/integrations/torch/util.py",
    "chars": 4211,
    "preview": "import random\nimport warnings\nfrom collections import UserDict\nfrom typing import Dict, Optional, TypeVar, Union\n\nimport"
  },
  {
    "path": "tango/integrations/transformers/__init__.py",
    "chars": 5721,
    "preview": "\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"transformers\" extra\n    (e.g. ``pi"
  },
  {
    "path": "tango/integrations/transformers/config.py",
    "chars": 496,
    "preview": "from transformers import AutoConfig, PretrainedConfig\n\nfrom tango.common import Registrable\n\n\nclass Config(PretrainedCon"
  },
  {
    "path": "tango/integrations/transformers/data.py",
    "chars": 1207,
    "preview": "from dataclasses import fields, is_dataclass\nfrom typing import Callable\n\nfrom transformers.data import data_collator as"
  },
  {
    "path": "tango/integrations/transformers/finetune.py",
    "chars": 19927,
    "preview": "import logging\nfrom os import PathLike\nfrom typing import List, Optional, Union, cast\n\nimport datasets as ds\nfrom transf"
  },
  {
    "path": "tango/integrations/transformers/ia3.py",
    "chars": 10497,
    "preview": "import re\nfrom dataclasses import dataclass\nfrom typing import Optional\n\nimport torch\nimport torch.nn as nn\nimport torch"
  },
  {
    "path": "tango/integrations/transformers/model.py",
    "chars": 2567,
    "preview": "from typing import Optional, Type\n\nfrom transformers.models.auto import modeling_auto\n\nfrom tango.common.exceptions impo"
  },
  {
    "path": "tango/integrations/transformers/optim.py",
    "chars": 700,
    "preview": "import torch\nfrom transformers import optimization as transformers_optim\n\nfrom tango.integrations.torch.optim import LRS"
  },
  {
    "path": "tango/integrations/transformers/run_generation.py",
    "chars": 19390,
    "preview": "import logging\nimport typing\nfrom typing import Any, Dict, Iterable, List, Optional, Sequence, Set, Union, cast\n\nimport "
  },
  {
    "path": "tango/integrations/transformers/soft_prompt.py",
    "chars": 9783,
    "preview": "import inspect\nimport logging\nimport random\nfrom typing import Any, Dict, Optional\n\nimport torch\nfrom torch import nn\nfr"
  },
  {
    "path": "tango/integrations/transformers/tokenizer.py",
    "chars": 580,
    "preview": "from transformers import AutoTokenizer\nfrom transformers.tokenization_utils_base import PreTrainedTokenizerBase\n\nfrom ta"
  },
  {
    "path": "tango/integrations/wandb/__init__.py",
    "chars": 1598,
    "preview": "\"\"\"\n.. important::\n    To use this integration you should install ``tango`` with the \"wandb\" extra\n    (e.g. ``pip insta"
  },
  {
    "path": "tango/integrations/wandb/flax_train_callback.py",
    "chars": 6289,
    "preview": "from typing import Any, Dict, List, Optional\n\nimport jax\nimport wandb\nfrom flax import jax_utils\n\nfrom tango.common.exce"
  },
  {
    "path": "tango/integrations/wandb/step_cache.py",
    "chars": 5701,
    "preview": "import logging\nfrom typing import Any, Optional, Union\n\nimport wandb\nfrom retry import retry\nfrom wandb.errors import Er"
  },
  {
    "path": "tango/integrations/wandb/torch_train_callback.py",
    "chars": 7885,
    "preview": "from typing import Any, Dict, List, Optional\n\nimport torch\nimport wandb\n\nfrom tango.common.exceptions import Configurati"
  },
  {
    "path": "tango/integrations/wandb/util.py",
    "chars": 1575,
    "preview": "import os\nimport re\nimport warnings\nfrom enum import Enum\n\nfrom wandb.errors import Error as WandbError\n\n_API_KEY_WARNIN"
  },
  {
    "path": "tango/integrations/wandb/workspace.py",
    "chars": 18683,
    "preview": "import logging\nimport tempfile\nfrom datetime import datetime\nfrom pathlib import Path\nfrom typing import Any, Dict, Iter"
  },
  {
    "path": "tango/py.typed",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "tango/settings.py",
    "chars": 3899,
    "preview": "from dataclasses import dataclass\nfrom pathlib import Path\nfrom typing import Any, ClassVar, Dict, List, Optional\n\nimpor"
  },
  {
    "path": "tango/step.py",
    "chars": 39603,
    "preview": "import inspect\nimport itertools\nimport logging\nimport random\nimport re\nimport warnings\nfrom abc import abstractmethod\nfr"
  },
  {
    "path": "tango/step_cache.py",
    "chars": 2051,
    "preview": "import logging\nfrom abc import abstractmethod\nfrom dataclasses import dataclass\nfrom typing import Any, TypeVar, Union\n\n"
  },
  {
    "path": "tango/step_caches/__init__.py",
    "chars": 184,
    "preview": "\"\"\"\nBuilt-in :class:`~tango.step_cache.StepCache` implementations.\n\"\"\"\n\nfrom .local_step_cache import LocalStepCache\nfro"
  },
  {
    "path": "tango/step_caches/local_step_cache.py",
    "chars": 7490,
    "preview": "import collections\nimport logging\nimport os\nimport shutil\nimport warnings\nimport weakref\nfrom pathlib import Path\nfrom t"
  },
  {
    "path": "tango/step_caches/memory_step_cache.py",
    "chars": 1631,
    "preview": "import logging\nimport warnings\nfrom typing import Any, Dict, Union\n\nfrom tango.step import Step\nfrom tango.step_cache im"
  },
  {
    "path": "tango/step_caches/remote_step_cache.py",
    "chars": 6269,
    "preview": "import logging\nimport os\nimport shutil\nimport tempfile\nfrom abc import abstractmethod\nfrom pathlib import Path\nfrom typi"
  },
  {
    "path": "tango/step_graph.py",
    "chars": 12293,
    "preview": "import logging\nfrom typing import Any, Dict, Iterator, List, Mapping, Set, Type, Union\n\nfrom tango.common import PathOrS"
  },
  {
    "path": "tango/step_info.py",
    "chars": 9744,
    "preview": "import getpass\nimport logging\nimport os\nimport platform\nimport socket\nimport sys\nfrom dataclasses import dataclass, fiel"
  },
  {
    "path": "tango/steps/__init__.py",
    "chars": 316,
    "preview": "\"\"\"\nBuilt-in :class:`~tango.step.Step` implementations that are not tied to any particular\nintegration.\n\"\"\"\n\n__all__ = ["
  },
  {
    "path": "tango/steps/dataset_remix.py",
    "chars": 7606,
    "preview": "import collections\nimport random\nimport re\nfrom typing import Any, Dict, List, Mapping, Sequence\n\nfrom tango.common.data"
  },
  {
    "path": "tango/steps/print.py",
    "chars": 704,
    "preview": "import logging\nfrom typing import Any\n\nfrom tango.common.logging import cli_logger\nfrom tango.step import Step\n\n\n@Step.r"
  },
  {
    "path": "tango/steps/shell_step.py",
    "chars": 2572,
    "preview": "import os\nimport subprocess\nfrom typing import List, Optional, Union\n\nfrom tango.common import PathOrStr, RegistrableFun"
  },
  {
    "path": "tango/version.py",
    "chars": 237,
    "preview": "_MAJOR = \"1\"\n_MINOR = \"3\"\n_PATCH = \"2\"\n# This is mainly for pre-releases which have the suffix \"rc[0-9]+\".\n_SUFFIX = \"\"\n"
  },
  {
    "path": "tango/workspace.py",
    "chars": 16271,
    "preview": "import logging\nfrom abc import abstractmethod\nfrom contextlib import contextmanager\nfrom dataclasses import dataclass\nfr"
  },
  {
    "path": "tango/workspaces/__init__.py",
    "chars": 180,
    "preview": "\"\"\"\nBuilt-in :class:`~tango.workspace.Workspace` implementations.\n\"\"\"\n\nfrom .local_workspace import LocalWorkspace\nfrom "
  },
  {
    "path": "tango/workspaces/local_workspace.py",
    "chars": 18659,
    "preview": "import json\nimport logging\nimport os\nfrom datetime import datetime\nfrom pathlib import Path\nfrom typing import Dict, Ite"
  },
  {
    "path": "tango/workspaces/memory_workspace.py",
    "chars": 4629,
    "preview": "import copy\nfrom typing import Dict, Iterable, Iterator, Optional, TypeVar, Union\nfrom urllib.parse import ParseResult\n\n"
  },
  {
    "path": "tango/workspaces/remote_workspace.py",
    "chars": 8690,
    "preview": "import logging\nimport tempfile\nimport warnings\nfrom abc import abstractmethod\nfrom contextlib import contextmanager\nfrom"
  },
  {
    "path": "test_fixtures/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "test_fixtures/beaker/nvidia_smi.yml",
    "chars": 476,
    "preview": "# Used to test that GPUs in a cluster are available. Submit this to beaker with:\n# $ beaker experiment create test_fixtu"
  },
  {
    "path": "test_fixtures/common/params_example.jsonnet",
    "chars": 338,
    "preview": "{\n    \"model\": {\n        \"type\": \"classifier\",\n        \"num_classes\": 3,\n        \"layers\": [\n            {\n             "
  },
  {
    "path": "test_fixtures/common/params_example.yaml",
    "chars": 154,
    "preview": "model:\n  type: classifier\n  num_classes: 3\n  layers:\n    - type: ff\n      activation: relu\n    - type: ff\n      activati"
  },
  {
    "path": "test_fixtures/experiment/hello_world.jsonnet",
    "chars": 276,
    "preview": "{\n    \"steps\": {\n        \"hello\": {\"type\": \"string\", \"result\": \"Hello\"},\n        \"hello_world\": {\n            \"type\": \"c"
  },
  {
    "path": "test_fixtures/experiment/logging_check.jsonnet",
    "chars": 659,
    "preview": "{\n    \"steps\": {\n        \"stringA\": {\"type\": \"logging-step\", \"string\": \"This is a logging test.\", \"num_log_lines\": 5},\n "
  }
]

// ... and 129 more files (download for full content)
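The listing above is a JSON array of records, one per extracted file, each with a `path`, a `chars` size, and a truncated `preview` string. A minimal sketch of consuming such a manifest programmatically (the two records here are illustrative stand-ins, with sizes taken from the listing above):

```python
import json

# Each record in the manifest describes one extracted file:
#   {"path": ..., "chars": ..., "preview": ...}
manifest = json.loads("""
[
  {"path": "tango/version.py", "chars": 237, "preview": "_MAJOR = \\"1\\""},
  {"path": "tango/step.py", "chars": 39603, "preview": "import inspect"}
]
""")

# Aggregate statistics computed directly from the records.
total_chars = sum(entry["chars"] for entry in manifest)
largest = max(manifest, key=lambda e: e["chars"])

print(total_chars)      # 39840
print(largest["path"])  # tango/step.py
```

The same pattern (load, then reduce over the `chars` field) reproduces the file-count and total-size figures quoted in the summary below the listing.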

About this extraction

This page contains the full source code of the allenai/tango GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction covers 329 files (1.2 MB, approximately 305.9k tokens) and includes a symbol index of 1,387 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
