Full Code of OpenPipe/ART for AI

Repository: OpenPipe/ART
Branch: main
Commit: 621e82b2d38e
Files: 341
Total size: 25.0 MB

Directory structure:
gitextract_7jngjjdm/

├── .agents/
│   └── skills/
│       ├── fix-art-issues/
│       │   └── SKILL.md
│       ├── train-rl/
│       │   └── SKILL.md
│       └── train-sft/
│           └── SKILL.md
├── .dockerignore
├── .github/
│   └── workflows/
│       ├── create-draft-release.yml
│       ├── package-install.yml
│       ├── prek.yml
│       └── release.yml
├── .gitignore
├── .pre-commit-config.yaml
├── .python-version
├── .skyignore
├── AGENT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── THIRD-PARTY-NOTICES
├── dev/
│   ├── demo_logging.py
│   ├── math-vista/
│   │   ├── math-vista.ipynb
│   │   └── math-vista.py
│   ├── new_models/
│   │   ├── benchmark_inference.py
│   │   ├── gemma3.py
│   │   ├── prompts.json
│   │   ├── qwen3_try.ipynb
│   │   └── qwen3_try.py
│   ├── profile.ipynb
│   ├── run_yes_no_maybe_kl_advantage.py
│   ├── sft/
│   │   ├── dataset.jsonl
│   │   ├── distillation.py
│   │   ├── sft-from-file.py
│   │   └── sft-warmup.py
│   ├── yes-no-maybe-kl-advantage.py
│   ├── yes-no-maybe-megatron.py
│   ├── yes-no-maybe-metrics.py
│   ├── yes-no-maybe-vision/
│   │   ├── generate_images.py
│   │   └── train.ipynb
│   ├── yes-no-maybe.ipynb
│   └── yes-no-maybe.py
├── docs/
│   ├── .gitignore
│   ├── README.md
│   ├── analytics.js
│   ├── docs.json
│   ├── experimental/
│   │   └── gspo.mdx
│   ├── features/
│   │   ├── additional-histories.mdx
│   │   ├── checkpoint-deletion.mdx
│   │   ├── checkpoint-forking.mdx
│   │   ├── mcp-rl.mdx
│   │   └── tracking-metrics.mdx
│   ├── fundamentals/
│   │   ├── art-backend.mdx
│   │   ├── art-client.mdx
│   │   ├── ruler.mdx
│   │   ├── sft-training.mdx
│   │   └── training-loop.mdx
│   ├── getting-started/
│   │   ├── about.mdx
│   │   ├── faq.mdx
│   │   ├── installation-setup.mdx
│   │   ├── notebooks.mdx
│   │   └── quick-start.mdx
│   ├── integrations/
│   │   ├── langgraph-integration.mdx
│   │   └── openenv-integration.mdx
│   ├── package.json
│   ├── proposals/
│   │   └── backend-first-training-api.md
│   ├── resources/
│   │   ├── glossary.mdx
│   │   └── models.mdx
│   ├── style.css
│   └── tutorials/
│       ├── open-deep-research.mdx
│       └── summarizer.mdx
├── examples/
│   ├── 2048/
│   │   ├── display_benchmarks.ipynb
│   │   ├── generate_benchmarks.py
│   │   ├── rollout.py
│   │   ├── train.py
│   │   └── utils.py
│   ├── benchmarking_comparison_models.py
│   ├── codenames/
│   │   ├── codenames_words.json
│   │   └── dictionary.json
│   ├── data/
│   │   ├── greentext/
│   │   │   └── prompts.txt
│   │   └── temporal-clue/
│   │       └── puzzles.json
│   ├── hn_title_generator/
│   │   ├── reference_grpo_trainer.py
│   │   ├── train.py
│   │   └── utils.py
│   ├── just-the-facts/
│   │   ├── .gitignore
│   │   ├── README.md
│   │   ├── just_the_facts/
│   │   │   ├── __init__.py
│   │   │   ├── checks.py
│   │   │   ├── display_benchmarks.ipynb
│   │   │   ├── experiments.py
│   │   │   ├── find_articles.py
│   │   │   ├── generate_benchmarks.py
│   │   │   ├── rollout.py
│   │   │   ├── scenarios.py
│   │   │   ├── train.py
│   │   │   └── utils.py
│   │   ├── main.py
│   │   ├── pyproject.toml
│   │   └── test_scraper.py
│   ├── mcp-rl/
│   │   ├── README.md
│   │   ├── all_experiments.py
│   │   ├── mcp_rl/
│   │   │   ├── __init__.py
│   │   │   ├── benchmarks/
│   │   │   │   ├── display_benchmarks/
│   │   │   │   │   ├── mcp_alphavantage.ipynb
│   │   │   │   │   └── mcp_balldontlie.ipynb
│   │   │   │   └── generate_benchmarks.py
│   │   │   ├── checks.py
│   │   │   ├── mcp_server.py
│   │   │   ├── rollout.py
│   │   │   ├── scenario_generator.py
│   │   │   ├── train.py
│   │   │   └── utils.py
│   │   ├── pyproject.toml
│   │   ├── servers/
│   │   │   └── python/
│   │   │       ├── mcp_alphavantage/
│   │   │       │   ├── README.md
│   │   │       │   ├── __init__.py
│   │   │       │   ├── scenarios/
│   │   │       │   │   ├── train.jsonl
│   │   │       │   │   └── val.jsonl
│   │   │       │   ├── scenarios.jsonl
│   │   │       │   ├── server.py
│   │   │       │   └── server_params.py
│   │   │       ├── mcp_balldontlie/
│   │   │       │   ├── README.md
│   │   │       │   ├── __init__.py
│   │   │       │   ├── scenarios/
│   │   │       │   │   ├── train.jsonl
│   │   │       │   │   └── val.jsonl
│   │   │       │   ├── server.py
│   │   │       │   └── server_params.py
│   │   │       └── mcp_googlemaps/
│   │   │           ├── README.md
│   │   │           ├── __init__.py
│   │   │           ├── pyproject.toml
│   │   │           ├── server.py
│   │   │           └── server_params.py
│   │   └── test_scenario_generation.py
│   ├── openenv_echo.py
│   ├── prisoners-dilemma.ipynb
│   ├── rock-paper-tool-use.ipynb
│   ├── roflbot/
│   │   └── .gitignore
│   ├── temporal_clue/
│   │   ├── temporal-clue-7b-async.ipynb
│   │   ├── temporal-clue-7b.ipynb
│   │   └── temporal-clue.py
│   ├── tic_tac_toe/
│   │   ├── display-benchmarks.ipynb
│   │   ├── game_utils.py
│   │   ├── rollout.py
│   │   └── tic-tac-toe.py
│   └── tic_tac_toe_self_play/
│       ├── deploy_step.py
│       ├── game_utils.py
│       ├── gather_trajectory_groups_by_index.py
│       ├── rollout.py
│       ├── train.py
│       └── train_o4_mini.py
├── licenses/
│   ├── GPL-3.0.txt
│   └── LGPL-3.0.txt
├── pyproject.toml
├── requirements/
│   └── backend.vcs.txt
├── scripts/
│   ├── bump_version.py
│   ├── ci/
│   │   ├── build_and_push_uv_cache.sh
│   │   └── compute_uv_fingerprint.py
│   ├── deploy-model.py
│   ├── kill-gpu-processes.sh
│   ├── migrate-s3-checkpoints.py
│   ├── publish.sh
│   └── setup.sh
├── skypilot-config.yaml
├── src/
│   ├── art/
│   │   ├── __init__.py
│   │   ├── api_costs.py
│   │   ├── auto_trajectory.py
│   │   ├── backend.py
│   │   ├── batches.py
│   │   ├── cli.py
│   │   ├── costs.py
│   │   ├── dev/
│   │   │   ├── __init__.py
│   │   │   ├── engine.py
│   │   │   ├── get_model_config.py
│   │   │   ├── model.py
│   │   │   ├── openai_server.py
│   │   │   ├── train.py
│   │   │   └── validate.py
│   │   ├── errors.py
│   │   ├── gather.py
│   │   ├── guided_completion.py
│   │   ├── langgraph/
│   │   │   ├── __init__.py
│   │   │   ├── llm_wrapper.py
│   │   │   ├── logging.py
│   │   │   └── message_utils.py
│   │   ├── local/
│   │   │   ├── __init__.py
│   │   │   ├── backend.py
│   │   │   ├── checkpoints.py
│   │   │   └── service.py
│   │   ├── loss.py
│   │   ├── mcp/
│   │   │   ├── __init__.py
│   │   │   ├── default_tools.py
│   │   │   ├── generate_scenarios.py
│   │   │   └── types.py
│   │   ├── megatron/
│   │   │   ├── __init__.py
│   │   │   ├── backend.py
│   │   │   ├── flex_attention.py
│   │   │   ├── lora.py
│   │   │   ├── offload.py
│   │   │   ├── provider.py
│   │   │   ├── service.py
│   │   │   ├── setup.sh
│   │   │   └── train.py
│   │   ├── metrics.py
│   │   ├── metrics_taxonomy.py
│   │   ├── model.py
│   │   ├── openai.py
│   │   ├── pipeline_trainer/
│   │   │   ├── __init__.py
│   │   │   ├── binary_prefix_tool_pipeline.py
│   │   │   ├── state.py
│   │   │   ├── status.py
│   │   │   ├── trainer.py
│   │   │   ├── types.py
│   │   │   └── yes_no_maybe_pipeline.py
│   │   ├── preprocessing/
│   │   │   ├── __init__.py
│   │   │   ├── inputs.py
│   │   │   ├── pack.py
│   │   │   └── tokenize.py
│   │   ├── py.typed
│   │   ├── rewards/
│   │   │   ├── __init__.py
│   │   │   └── ruler.py
│   │   ├── serverless/
│   │   │   ├── __init__.py
│   │   │   ├── backend.py
│   │   │   └── client.py
│   │   ├── test/
│   │   │   ├── test_kl_advantage.py
│   │   │   └── test_step_skipping.py
│   │   ├── tinker/
│   │   │   ├── __init__.py
│   │   │   ├── backend.py
│   │   │   ├── cookbook_v/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── hyperparam_utils.py
│   │   │   │   ├── image_processing_utils.py
│   │   │   │   ├── renderers/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── base.py
│   │   │   │   │   ├── deepseek_v3.py
│   │   │   │   │   ├── gpt_oss.py
│   │   │   │   │   ├── kimi_k2.py
│   │   │   │   │   ├── kimi_k25.py
│   │   │   │   │   ├── kimi_k2_5_tool_declaration_ts.py
│   │   │   │   │   ├── llama3.py
│   │   │   │   │   ├── qwen3.py
│   │   │   │   │   ├── qwen3_5.py
│   │   │   │   │   └── role_colon.py
│   │   │   │   ├── tokenizer_utils.py
│   │   │   │   └── utils/
│   │   │   │       ├── __init__.py
│   │   │   │       └── misc_utils.py
│   │   │   ├── prefix_cache.py
│   │   │   ├── renderers.py
│   │   │   ├── server.py
│   │   │   └── service.py
│   │   ├── tinker_native/
│   │   │   ├── __init__.py
│   │   │   ├── backend.py
│   │   │   └── data.py
│   │   ├── trajectories.py
│   │   ├── transformers/
│   │   │   ├── __init__.py
│   │   │   └── patches.py
│   │   ├── types.py
│   │   ├── unsloth/
│   │   │   ├── __init__.py
│   │   │   ├── service.py
│   │   │   └── train.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   ├── benchmark_rollout.py
│   │   │   ├── benchmarking/
│   │   │   │   ├── aggregate_trajectories.py
│   │   │   │   ├── charts/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── percentage_comparison_bar_chart.py
│   │   │   │   │   └── training_progress_chart.py
│   │   │   │   ├── filter_model_split.py
│   │   │   │   ├── load_trajectories.py
│   │   │   │   ├── log_constant_metrics_wandb.py
│   │   │   │   ├── pull_model_trajectories.py
│   │   │   │   └── types.py
│   │   │   ├── convert_moe_lora.py
│   │   │   ├── deploy_model.py
│   │   │   ├── deployment/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── common.py
│   │   │   │   ├── legacy.py
│   │   │   │   ├── together.py
│   │   │   │   └── wandb.py
│   │   │   ├── format_message.py
│   │   │   ├── get_model_step.py
│   │   │   ├── get_repo_root_path.py
│   │   │   ├── group_aggregate.py
│   │   │   ├── iterate_dataset.py
│   │   │   ├── limit_concurrency.py
│   │   │   ├── litellm.py
│   │   │   ├── log_http_errors.py
│   │   │   ├── logging.py
│   │   │   ├── model_config.py
│   │   │   ├── old_benchmarking/
│   │   │   │   ├── calculate_step_metrics.py
│   │   │   │   ├── display_image_grid.py
│   │   │   │   ├── generate_comparison_table.py
│   │   │   │   ├── generate_line_graphs.py
│   │   │   │   ├── load_benchmarked_models.py
│   │   │   │   └── types.py
│   │   │   ├── output_dirs.py
│   │   │   ├── record_provenance.py
│   │   │   ├── retry.py
│   │   │   ├── s3.py
│   │   │   ├── s3_checkpoint_utils.py
│   │   │   ├── sft.py
│   │   │   ├── strip_logprobs.py
│   │   │   ├── suppress_litellm_serialization_warnings.py
│   │   │   ├── trajectory_logging.py
│   │   │   └── trajectory_migration.py
│   │   ├── vllm/
│   │   │   ├── __init__.py
│   │   │   ├── dedicated_server.py
│   │   │   ├── engine.py
│   │   │   ├── patches.py
│   │   │   └── server.py
│   │   └── yield_trajectory.py
│   └── mp_actors/
│       ├── __init__.py
│       ├── move.py
│       └── traceback.py
└── tests/
    ├── integration/
    │   ├── __init__.py
    │   ├── test_live_api_cost.py
    │   ├── test_multi_checkpoint_training.py
    │   ├── test_pipeline_localbackend_dedicated.py
    │   ├── test_provenance.py
    │   ├── test_push_and_fork.py
    │   ├── test_tinker_native_backend.py
    │   └── test_vllm_contract.py
    ├── integration.py
    ├── test_backend_train_api.py
    └── unit/
        ├── test_auto_trajectory.py
        ├── test_benchmarking_loader.py
        ├── test_dedicated_config.py
        ├── test_dedicated_server.py
        ├── test_frontend_logging.py
        ├── test_metric_routing.py
        ├── test_metrics_builder.py
        ├── test_metrics_taxonomy.py
        ├── test_model_openai_client_costs.py
        ├── test_mp_actors.py
        ├── test_multi_checkpoint_inference.py
        ├── test_pipeline_trainer_batching.py
        ├── test_pipeline_trainer_local_backend.py
        ├── test_pipeline_trainer_metrics.py
        ├── test_prefix_cache.py
        ├── test_ruler_metrics.py
        ├── test_sft.py
        ├── test_strip_logprobs.py
        ├── test_tinker_renderers.py
        ├── test_tokenize_trajectory_groups.ipynb
        ├── test_track_api_cost.py
        ├── test_trajectory_copy.py
        ├── test_trajectory_parquet.py
        ├── test_unsloth_metrics.py
        ├── test_vllm_patches_contract.py
        └── test_yield_trajectory.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .agents/skills/fix-art-issues/SKILL.md
================================================
---
name: fix-art-issues
description: >
  Fix a GitHub issue on OpenPipe/ART and open a PR.
  Use when the user asks to fix, solve, or work on an ART issue,
  or references a GitHub issue URL containing "OpenPipe/ART".
  Triggers: "fix ART issue", "solve this issue" with an OpenPipe/ART URL,
  "work on ART #N".
---

# Fix ART Issue

Fix a GitHub issue on `OpenPipe/ART` and open a PR.

- **Repo**: `OpenPipe/ART`
- **Base branch**: `main`

Assumes the workspace is already set up with the correct branch checked out and `.env` in place (handled by the system-level `fix-art-workspace` skill).

## Workflow

### 1. Read the Issue
```
gh issue view <number> --repo OpenPipe/ART --json title,body,labels,assignees,comments
```

### 2. Explore, Plan, Implement
- Use the Explore agent to understand relevant code before making changes.
- Plan clearly, implement with minimal focused changes. No over-engineering.

### 3. Commit and Push
- Commit with a message that includes `Closes #<issue-number>`.
- Push the feature branch. If HTTPS push fails due to SAML SSO, set SSH remote: `git remote set-url origin git@github.com:OpenPipe/ART.git`

### 4. Open a Draft PR
- `gh pr create --base main --draft`.
- PR body: `## Summary`, `Closes #<number>`, `## Changes`, `## Test plan`.

### 5. Testing
- **No test artifacts in the final PR**: debug prints, test scripts, and temporary changes must NOT be committed.
- Update the PR's test plan section with detailed results.
- When testing passes, mark the PR as ready: `gh pr ready`.

## Reference

Read `CONTRIBUTING.md` at the repo root for guidance on code quality checks (prek), CI cache refresh, and the release process.

## Dependency Management Tips

- **Pin versions strictly** (`==`) for critical deps like `transformers`, `trl`, `unsloth`, `unsloth-zoo`, `vllm` to avoid surprise breakage from new releases.
- **Don't loosen pins without reason**: if a dep was `==X.Y.Z`, keep it pinned unless there's a specific reason to change. Don't use `>=` just because it seems more flexible.
- **`uv run` fails on macOS** for backend deps (apex/torch need CUDA). This is expected — use `uvx ruff` for linting locally, test on GPU cluster.

## Deploying a GPU Cluster

Name the SkyPilot cluster after the branch name without the `fix/` prefix, replacing `/` with `-` (SkyPilot doesn't allow slashes). For example, if the branch is `fix/short-description`:
```
uv run sky launch -c short-description skypilot-config.yaml -y
```

To connect: `ssh short-description`

To tear down when done: `uv run sky down short-description`
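
The naming rule above can be sketched in Python. `cluster_name_from_branch` is a hypothetical helper written for illustration; it is not part of ART or SkyPilot.

```python
def cluster_name_from_branch(branch: str) -> str:
    """Derive a SkyPilot cluster name from a git branch name.

    Drops a leading "fix/" prefix, then replaces any remaining "/" with "-",
    since SkyPilot cluster names cannot contain slashes.
    """
    name = branch.removeprefix("fix/")  # requires Python 3.9+
    return name.replace("/", "-")


print(cluster_name_from_branch("fix/short-description"))  # short-description
```

The result is the value to pass to `-c` in `sky launch` and to `sky down`.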

## GPU Cluster Testing Tips

- **Kill stale GPU processes** before re-running tests: `nvidia-smi --query-compute-apps=pid --format=csv,noheader | xargs -r kill -9`. Previous failed runs leave processes holding GPU memory.
- **Set `gpu_memory_utilization`** in test scripts (e.g. `0.7`) — the default `0.9` is too high when Unsloth's training model is also loaded on the same GPU.
- **Redirect test output to a log file**: `nohup python test.py > /tmp/output.log 2>&1 &` then `tail -f /tmp/output.log`. SSH background tasks lose output when the connection drops.
- **Git on cluster**: SSH keys may not be configured. Use HTTPS with token: `git remote set-url origin https://${GITHUB_TOKEN}@github.com/OpenPipe/ART.git`
- **Tear down clusters** when done: `sky down <cluster-name> -y`

$ARGUMENTS


================================================
FILE: .agents/skills/train-rl/SKILL.md
================================================
---
name: train-rl
description: RL training reference for the ART framework. Use when the user asks to create, write, or help with an RL training script, reinforcement learning, GRPO, reward functions, RULER scoring, rollout functions, or anything related to RL fine-tuning.
---

# RL Training Wizard

You are guiding the user through setting up Reinforcement Learning (RL) training for a language model using the ART framework. Act as an interactive wizard: ask questions, validate inputs, and generate a complete runnable script.

**Important**: Ask ONE question at a time. Wait for the user's response before asking the next question. Never bundle multiple questions into a single message.

**Adaptability note**: Some steps reference tools like AskUserQuestion, Glob, or Bash. If you don't have access to these tools, simply ask the user the same questions as plain text and skip any steps that require running code (e.g., file search, dataset validation, hyperparameter computation). Do NOT fabricate results — never pretend you ran a tool or searched for files when you didn't.

## Step 1: Single-turn or Multi-turn

Ask the user using AskUserQuestion:

1. **Single-turn** — The agent responds to a prompt once. A reward is assigned based on that single response (e.g., solve a math problem, classify text, answer a question).
2. **Multi-turn** — The agent interacts over multiple turns with an environment, tools, or a game. A reward is assigned at the end of all turns (e.g., play a board game, use tools to complete a task, navigate a conversation).

## Step 2: Describe the Task

Ask the user to describe what the agent needs to do. Tell them you will help create a draft of the rollout function and environment, but they will likely need to edit it once the script is generated.

Gather:
- **Task description** — What does the agent need to accomplish?
- **Scenarios/inputs** — How are training inputs generated or provided? (e.g., a list of problems, a game generator, a dataset of tasks)
- **System prompt** (optional) — Any system-level instructions for the agent

For **multi-turn** scenarios, also ask:
- **Does the agent use tool calling?** — If yes, gather tool names, descriptions, parameter schemas (OpenAI function calling format), and how tool calls are executed (local function, API, MCP server, etc.)
- **How does the environment work?** — What observations does the agent receive? What actions can it take? How does a turn work?
- **When does an episode end?** — Win/loss conditions, turn limits, max tool calls, etc. (default max turns: 10)

Help the user flesh out incomplete descriptions. Offer to write helper functions (game logic, tool execution, scenario generators) as part of the final script.

## Step 3: Reward Method

Ask the user using AskUserQuestion:

1. **Programmatic reward** — You have a ground truth or scoring function to compute the reward (e.g., check correctness against an answer, game win/loss, composite score)
2. **RULER (LLM-as-judge)** — An LLM judge scores and compares the trajectories. No manual reward function needed. Requires an OpenAI API key (`OPENAI_API_KEY` env var).

If they choose **programmatic reward**, help them design a reward function. Common patterns:
- **Binary**: 1 for correct, 0 for incorrect
- **Accuracy**: fraction of correct sub-answers (0.0 to 1.0)
- **Game outcome**: 1 for win, 0.5 for draw, 0 for loss, -1 for invalid move
- **Scaled score**: logarithmic or normalized continuous score
- **Composite**: weighted combination of multiple signals
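
A minimal sketch of the reward patterns listed above. All function names, the exact-match comparison, and the log scaling are illustrative assumptions; adapt them to the user's task.

```python
import math


def binary_reward(answer: str, expected: str) -> float:
    """Binary: 1 for an exact (whitespace-trimmed) match, 0 otherwise."""
    return 1.0 if answer.strip() == expected.strip() else 0.0


def accuracy_reward(answers: list[str], expected: list[str]) -> float:
    """Accuracy: fraction of sub-answers that match, in [0.0, 1.0]."""
    if not expected:
        return 0.0
    correct = sum(a == e for a, e in zip(answers, expected))
    return correct / len(expected)


def game_outcome_reward(outcome: str) -> float:
    """Game outcome: win, draw, loss, or invalid move."""
    return {"win": 1.0, "draw": 0.5, "loss": 0.0, "invalid": -1.0}[outcome]


def scaled_score_reward(score: float, max_score: float) -> float:
    """Scaled score: logarithmic scaling of a raw score into [0.0, 1.0]."""
    return min(1.0, math.log1p(max(score, 0.0)) / math.log1p(max_score))


def composite_reward(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Composite: weighted average of several signals."""
    total_weight = sum(weights.values())
    return sum(weights[k] * signals[k] for k in weights) / total_weight
```

Each function returns a plain `float`, so the result can be assigned to a trajectory's reward directly.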

The reward must be a float assigned to `trajectory.reward`. Additional signals can go in `trajectory.metrics` for W&B logging. **Important: `metrics` values must be numeric (`float`, `int`) or `bool` — strings are not allowed and will cause a Pydantic validation error.**
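
One way to catch non-numeric metrics before they reach the trajectory is a small pre-check. `clean_metrics` is a hypothetical helper, not an ART function; it only mirrors the type rule stated above.

```python
from typing import Union

MetricValue = Union[float, int, bool]


def clean_metrics(raw: dict[str, object]) -> dict[str, MetricValue]:
    """Return the metrics unchanged if every value is numeric or bool; raise otherwise."""
    bad = {k: v for k, v in raw.items() if not isinstance(v, (bool, int, float))}
    if bad:
        raise TypeError(f"non-numeric metric values: {bad}")
    return dict(raw)


# A string label such as an outcome can be encoded as a boolean flag instead.
outcome = "win"
metrics = clean_metrics({"acc": 1.0, "turns": 3, "won": outcome == "win"})
print(metrics)
```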

If they choose **RULER**, ask for:
- **Judge model**: Recommend `openai/o4-mini` (default) or `openai/o3` for higher quality

## Step 4: Gather Base Parameters

Do NOT ask the user to review or confirm their answers after collecting them — just proceed to the next step.

- **Base model**: Recommend ONLY these models:
  - `OpenPipe/Qwen3-14B-Instruct`
  - `Qwen/Qwen3-30B-A3B-Instruct-2507`
  - `meta-llama/Llama-3.1-8B-Instruct`
- **Project name**: A name for this training project (default: `rl-project`)
- **Run name**: A static, descriptive name (e.g., `math-solver-001`, `game-agent-001`). Ask the user for a meaningful name. Do NOT generate random names.

## Step 5: Gather Hyperparameters

Present these defaults to the user, then ask using AskUserQuestion:
- **Use defaults (Recommended)** — show all values in the description
- **Customize** — adjust individual hyperparameters

Default values:
- **Learning rate**: `1e-5`
- **Number of training steps**: `50`
- **Rollouts per group**: `8` (number of trajectories per scenario per step; more = better advantage estimation but slower). For RULER, default to `16`.
- **Groups per step**: `1` (number of different scenarios per training step)

If they choose "Customize", ask which parameters to change.
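
To help the user judge the cost of a configuration, the defaults above can be held in a small container that also reports how many rollouts a run will generate. `RLHyperparams` and `default_hyperparams` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class RLHyperparams:
    learning_rate: float = 1e-5
    num_steps: int = 50
    rollouts_per_group: int = 8
    groups_per_step: int = 1

    @property
    def rollouts_per_step(self) -> int:
        """Trajectories generated before each training step."""
        return self.rollouts_per_group * self.groups_per_step

    @property
    def total_rollouts(self) -> int:
        """Trajectories generated over the whole run."""
        return self.rollouts_per_step * self.num_steps


def default_hyperparams(use_ruler: bool) -> RLHyperparams:
    """Defaults from this step; RULER runs default to 16 rollouts per group."""
    return RLHyperparams(rollouts_per_group=16 if use_ruler else 8)


print(default_hyperparams(use_ruler=False).total_rollouts)  # 400
print(default_hyperparams(use_ruler=True).total_rollouts)   # 800
```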

## Step 6: Generate the Training Script

Write a complete, runnable Python script by combining the appropriate **rollout pattern** (from Step 1/2) with the appropriate **reward method** (from Step 3) and the **training loop**.

Every script MUST:
- Call `await backend.close()` at the end so the process doesn't hang
- Print post-training info and usage examples (see shared block below)

### Post-training block (append to the end of ALL scripts; it ends with `backend.close()`):
```python
    # --- Training complete ---
    step = await model.get_step()
    inference_name = model.get_inference_name()
    client = model.openai_client()

    print("\n" + "=" * 60)
    print("RL TRAINING COMPLETE")
    print("=" * 60)
    print(f"  Model:          {inference_name}")
    print(f"  Base model:     <BASE_MODEL>")
    print(f"  Training step:  {step}")
    print(f"  Inference URL:  {client.base_url}")
    print("=" * 60)

    print("\n--- Python usage (openai SDK) ---\n")
    print(f'''\
from openai import OpenAI

client = OpenAI(
    base_url="{client.base_url}",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="{inference_name}",
    messages=[
        {{"role": "user", "content": "Your prompt here"}},
    ],
)
print(response.choices[0].message.content)
''')

    print("--- curl usage ---\n")
    print(f'''\
curl {client.base_url}chat/completions \\
  -H "Content-Type: application/json" \\
  -d '{{
    "model": "{inference_name}",
    "messages": [
      {{"role": "user", "content": "Your prompt here"}}
    ]
  }}'
''')

    await backend.close()
```

### Rollout pattern: Single-turn
```python
async def rollout(model: art.Model, scenario: dict) -> art.Trajectory:
    client = AsyncOpenAI(
        base_url=model.inference_base_url,
        api_key=model.inference_api_key,
    )
    messages = [
        # {"role": "system", "content": "<SYSTEM_PROMPT>"},
        {"role": "user", "content": scenario["prompt"]},
    ]
    response = await client.chat.completions.create(
        model=model.get_inference_name(),
        messages=messages,
        temperature=0.7,
    )
    choice = response.choices[0]

    # --- Compute reward (if programmatic) ---
    reward = <REWARD_LOGIC>  # e.g., 1.0 if correct else 0.0

    return art.Trajectory(
        messages_and_choices=[*messages, choice],
        reward=reward,
        metrics={"acc": reward},
    )
```

### Rollout pattern: Multi-turn (environment/game loop)
```python
async def rollout(model: art.Model, scenario) -> art.Trajectory:
    client = AsyncOpenAI(
        base_url=model.inference_base_url,
        api_key=model.inference_api_key,
    )
    game = create_game()
    traj = art.Trajectory(
        messages_and_choices=[
            {"role": "system", "content": "<SYSTEM_PROMPT>"},
        ],
        reward=0.0,
    )

    while not is_finished(game):
        traj.messages_and_choices.append(
            {"role": "user", "content": render_observation(game)}
        )
        response = await client.chat.completions.create(
            model=model.get_inference_name(),
            messages=traj.messages(),
            temperature=0.7,
            max_completion_tokens=256,
        )
        choice = response.choices[0]
        traj.messages_and_choices.append(choice)
        try:
            apply_action(game, choice.message.content)
        except ValueError:
            traj.reward = -1.0
            return traj

    traj.reward = compute_reward(game)
    return traj
```
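
The game loop above calls five helpers that the user must supply. As an illustration only, here is a toy number-guessing environment that satisfies those call signatures; the game itself and the `GuessGame` class are assumptions, not part of ART.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GuessGame:
    secret: int
    max_turns: int = 10
    guesses: list[int] = field(default_factory=list)


def create_game(seed: Optional[int] = None) -> GuessGame:
    """Start a game with a secret number between 1 and 100."""
    return GuessGame(secret=random.Random(seed).randint(1, 100))


def is_finished(game: GuessGame) -> bool:
    """The episode ends on a correct guess or when the turn limit is reached."""
    solved = bool(game.guesses) and game.guesses[-1] == game.secret
    return solved or len(game.guesses) >= game.max_turns


def render_observation(game: GuessGame) -> str:
    """Text shown to the agent at the start of each turn."""
    if not game.guesses:
        return "Guess a number between 1 and 100. Reply with the number only."
    hint = "higher" if game.guesses[-1] < game.secret else "lower"
    return f"Your guess {game.guesses[-1]} was wrong; go {hint}."


def apply_action(game: GuessGame, text: Optional[str]) -> None:
    """Parse the model reply; raises ValueError on an invalid move."""
    guess = int((text or "").strip())  # int("") and int("seven") raise ValueError
    if not 1 <= guess <= 100:
        raise ValueError(f"guess out of range: {guess}")
    game.guesses.append(guess)


def compute_reward(game: GuessGame) -> float:
    """1.0 if the last guess was correct, else 0.0."""
    return 1.0 if game.guesses and game.guesses[-1] == game.secret else 0.0
```

Raising `ValueError` from `apply_action` matches the `except ValueError` branch in the rollout, which assigns the invalid-move reward of `-1.0`.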

### Rollout pattern: Multi-turn with tool calling
```python
async def rollout(model: art.Model, scenario: dict) -> art.Trajectory:
    client = AsyncOpenAI(
        base_url=model.inference_base_url,
        api_key=model.inference_api_key,
    )
    MAX_TURNS = <MAX_TURNS>
    traj = art.Trajectory(
        messages_and_choices=[
            # {"role": "system", "content": "<SYSTEM_PROMPT>"},
            {"role": "user", "content": scenario["task"]},
        ],
        tools=tools,
        reward=0.0,
    )

    for turn in range(MAX_TURNS):
        response = await client.chat.completions.create(
            model=model.get_inference_name(),
            messages=traj.messages(),
            tools=tools,
            temperature=0.7,
        )
        choice = response.choices[0]
        traj.messages_and_choices.append(choice)

        if choice.message.tool_calls:
            for tc in choice.message.tool_calls:
                args = json.loads(tc.function.arguments)
                result = execute_tool(tc.function.name, args)
                traj.messages_and_choices.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": str(result),
                })
        else:
            break  # Agent finished (no more tool calls)

    # --- Compute reward (if programmatic) ---
    traj.reward = <REWARD_LOGIC>
    return traj
```
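
The tool-calling rollout above relies on a `tools` list and an `execute_tool` function that it does not define. A minimal sketch of both follows, using the OpenAI function-calling schema. The `add` tool and `TOOL_REGISTRY` are hypothetical examples; real tools might call an API or an MCP server instead.

```python
import json

# One hypothetical tool, declared in OpenAI function-calling format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "add",
            "description": "Add two numbers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "number"},
                    "b": {"type": "number"},
                },
                "required": ["a", "b"],
            },
        },
    }
]


def add(a: float, b: float) -> float:
    return a + b


# Maps tool names from the schema to local Python callables.
TOOL_REGISTRY = {"add": add}


def execute_tool(name: str, args: dict) -> str:
    """Run a registered tool and return its result as a JSON string.

    Errors are returned as text rather than raised, so the model sees the
    failure in the next turn instead of the rollout crashing.
    """
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        return f"error: unknown tool {name!r}"
    try:
        return json.dumps(fn(**args))
    except Exception as exc:  # e.g. missing or unexpected arguments
        return f"error: {exc}"


print(execute_tool("add", {"a": 2, "b": 3}))  # 5
```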

### Reward method: RULER addition
When using RULER, the rollout function should set `reward=0.0` (RULER fills it in). Add this scoring block inside the training loop, after `gather_trajectory_groups` and before `model.train`:

```python
# Add to the imports at the top of the script:
from art.rewards import ruler_score_group

        # Inside the training loop: score with RULER (LLM judge assigns relative rewards 0-1)
        judged_groups = []
        for group in finished_groups:
            judged = await ruler_score_group(
                group,
                judge_model=JUDGE_MODEL,
                debug=True,
            )
            judged_groups.append(judged)
        finished_groups = judged_groups
```

### Training loop (shared by all patterns):
```python
"""RL training script generated by /train-rl wizard."""
import asyncio
import json
from openai import AsyncOpenAI
import art
from art.local import LocalBackend

# --- Scenarios ---
scenarios = [
    # Define or load your training scenarios here.
]

# --- Rollout function ---
# (insert the appropriate rollout pattern here)

# --- Training loop ---
async def main():
    backend = LocalBackend()
    model = art.TrainableModel(
        name="<RUN_NAME>",
        project="<PROJECT_NAME>",
        base_model="<BASE_MODEL>",
        _internal_config=art.dev.InternalModelConfig(
            engine_args={"gpu_memory_utilization": 0.7},
        ),
    )
    await model.register(backend)

    NUM_STEPS = <NUM_STEPS>
    ROLLOUTS_PER_GROUP = <ROLLOUTS_PER_GROUP>
    GROUPS_PER_STEP = <GROUPS_PER_STEP>

    for step in range(await model.get_step(), NUM_STEPS):
        groups = [
            art.TrajectoryGroup(
                rollout(model, scenarios[
                    (step * GROUPS_PER_STEP + i) % len(scenarios)
                ])
                for _ in range(ROLLOUTS_PER_GROUP)
            )
            for i in range(GROUPS_PER_STEP)
        ]
        finished_groups = await art.gather_trajectory_groups(
            groups, pbar_desc=f"step {step}"
        )

        # (insert RULER scoring block here if using LLM-as-judge)

        avg_reward = sum(
            t.reward for g in finished_groups for t in g.trajectories
        ) / max(1, sum(len(g.trajectories) for g in finished_groups))
        print(f"Step {step}: avg_reward={avg_reward:.3f}")

        await model.delete_checkpoints()
        await model.train(
            finished_groups,
            config=art.TrainConfig(learning_rate=<LEARNING_RATE>),
        )

    # ... post-training block + backend.close() ...

if __name__ == "__main__":
    asyncio.run(main())
```
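
The index expression `(step * GROUPS_PER_STEP + i) % len(scenarios)` in the loop above walks through the scenario list in order and wraps around at the end. The standalone sketch below isolates that arithmetic; `scenario_indices` is a hypothetical helper name.

```python
def scenario_indices(step: int, groups_per_step: int, num_scenarios: int) -> list[int]:
    """Scenario indices used at a given training step (same formula as the loop)."""
    return [(step * groups_per_step + i) % num_scenarios for i in range(groups_per_step)]


for step in range(4):
    print(step, scenario_indices(step, groups_per_step=2, num_scenarios=5))
# 0 [0, 1]
# 1 [2, 3]
# 2 [4, 0]
# 3 [1, 2]
```

Because the indices depend only on `step`, a run that resumes at a later step continues from the same position in the cycle.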

### Alternative loop: Dataset-driven with iterate_dataset
When the user has a fixed list of training scenarios and wants epoch-based iteration, use `iterate_dataset` instead of the manual step loop. This can be combined with any rollout pattern and reward method.

```python
# Add to the imports at the top of the script:
from art.utils import iterate_dataset

    # Inside main(), replace the manual for-loop with:
    training_iterator = iterate_dataset(
        scenarios,
        groups_per_step=<GROUPS_PER_STEP>,
        num_epochs=<NUM_EPOCHS>,
        initial_step=await model.get_step(),
    )

    for batch in training_iterator:
        groups = [
            art.TrajectoryGroup(
                rollout(model, item) for _ in range(ROLLOUTS_PER_GROUP)
            )
            for item in batch.items
        ]
        finished_groups = await art.gather_trajectory_groups(
            groups, pbar_desc=f"epoch {batch.epoch} step {batch.step}"
        )

        # (insert RULER scoring block here if using LLM-as-judge)

        avg_reward = sum(
            t.reward for g in finished_groups for t in g.trajectories
        ) / max(1, sum(len(g.trajectories) for g in finished_groups))
        print(f"Step {batch.step} (epoch {batch.epoch}): avg_reward={avg_reward:.3f}")

        await model.delete_checkpoints()
        await model.train(
            finished_groups,
            config=art.TrainConfig(learning_rate=<LEARNING_RATE>),
        )
```
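
To make the epoch and step bookkeeping concrete, here is a simplified stand-in for epoch-based batching. It is an assumption-laden sketch, not the implementation of `art.utils.iterate_dataset`, which may differ (for example by shuffling each epoch). `Batch` and `iterate_batches` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence


@dataclass
class Batch:
    items: list
    epoch: int
    step: int


def iterate_batches(
    dataset: Sequence,
    groups_per_step: int,
    num_epochs: int,
    initial_step: int = 0,
) -> Iterator[Batch]:
    """Yield fixed-size batches epoch by epoch, skipping steps before initial_step."""
    steps_per_epoch = -(-len(dataset) // groups_per_step)  # ceiling division
    for step in range(initial_step, steps_per_epoch * num_epochs):
        epoch, epoch_step = divmod(step, steps_per_epoch)
        start = epoch_step * groups_per_step
        yield Batch(
            items=list(dataset[start:start + groups_per_step]),
            epoch=epoch,
            step=step,
        )


for batch in iterate_batches(["a", "b", "c", "d", "e"], groups_per_step=2, num_epochs=2):
    print(batch.epoch, batch.step, batch.items)
```

With five scenarios and two groups per step, each epoch takes three steps and the last batch of an epoch holds a single item. Passing `initial_step` skips the batches a previous run already consumed.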

## Step 7: Write and Offer to Run

1. Write the script to a file (suggest `rl_train.py`)
2. Ask the user if they want to run it now with `uv run python <script_path>`
3. If yes, run it **directly using the Bash tool** (do NOT delegate to a Task subagent) so training logs stream live to the user. Use a **2-minute timeout**. If it times out, check progress and decide whether to continue.
4. **GPU memory errors**: If training fails with OOM, lower `gpu_memory_utilization` in the existing `_internal_config` (e.g. from `0.7` to `0.5`).
5. **Stale GPU memory**: If available GPU memory looks too small, previous training runs may still be occupying memory. Before retrying, run `nvidia-smi` to check, and if needed kill leftover processes with `kill <pid>` to free memory.

## Important Notes

- LocalBackend requires a GPU.
- RL uses **GRPO** (Group Relative Policy Optimization) under the hood. It needs multiple trajectories per scenario (a `TrajectoryGroup`) to compute relative advantages. More rollouts per group = better advantage estimation.
- **RULER** eliminates the need for manual reward engineering by using an LLM judge to compare trajectories within a group. It requires an OpenAI API key (`OPENAI_API_KEY` env var).
- The `@art.retry` decorator can wrap rollout functions to handle transient errors: `@art.retry(exceptions=(openai.LengthFinishReasonError,))`.
- **Validation**: To log validation metrics without training, use `await model.log(val_groups)` or `await model.log(val_groups, split="val")`.
- **Resuming**: All patterns use `await model.get_step()` as the loop start, so training resumes from the last checkpoint automatically.
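
The "relative advantages" mentioned in the GRPO note can be illustrated with the commonly described formulation: each trajectory's reward is compared with the mean of its group and scaled by the group's standard deviation. This is a sketch under that assumption; ART's actual normalization may differ, and `group_relative_advantages` is a hypothetical name.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Advantage of each trajectory relative to the other rollouts in its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group with mixed outcomes gives positive and negative advantages.
print(group_relative_advantages([0.0, 1.0, 1.0, 0.0]))

# A group where every rollout scored the same gives all zeros.
print(group_relative_advantages([1.0, 1.0, 1.0]))  # [0.0, 0.0, 0.0]
```

Under this formulation a group with identical rewards carries no learning signal, which is one reason to prefer reward functions and scenarios that produce varied outcomes within a group.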


================================================
FILE: .agents/skills/train-sft/SKILL.md
================================================
---
name: train-sft
description: SFT training reference for the ART framework. Use when the user asks to create, write, or help with an SFT training script, fine-tune a model, train from a JSONL dataset, do distillation, or anything related to supervised fine-tuning.
---

# SFT Training Wizard

You are guiding the user through setting up Supervised Fine-Tuning (SFT) for a language model using the ART framework. Act as an interactive wizard: ask questions, validate inputs, and generate a complete runnable script.

**Important**: Ask ONE question at a time. Wait for the user's response before asking the next question. Never bundle multiple questions into a single message.

**Adaptability note**: Some steps reference tools like AskUserQuestion, Glob, or Bash. If you don't have access to these tools, simply ask the user the same questions as plain text and skip any steps that require running code (e.g., file search, dataset validation, hyperparameter computation). Do NOT fabricate results — never pretend you ran a tool or searched for files when you didn't.

## Step 1: Determine Training Scenario

Ask the user ONE question at a time. Wait for their response before moving to the next question.

**Training scenario:**
1. **Train from a JSONL file** — They have a dataset file with chat-formatted examples
2. **Distillation** — They want to train a smaller model using outputs from a larger teacher model

## Step 2: Determine Backend

**Backend:**
1. **ServerlessBackend (Recommended)** — Train on remote managed GPUs. No local GPU needed, production-ready inference endpoint.
2. **LocalBackend** — Train on your local GPU. Full control, fast iteration.

## Step 3: Select and Validate Dataset (JSONL scenario)

**IMPORTANT**: Do NOT assume a dataset. Do NOT make up or hallucinate file paths. Never pretend you searched for files if you didn't actually run a search tool.

If you have access to file system tools (Glob) and can actually execute them, search for `.jsonl` files using Glob (`**/*.jsonl`). Present real results as options. Always include "Provide my own file path" as the last option.

Otherwise, ask the user: "What is the path to your JSONL training file?" — nothing more.

Once the user has provided a file path, validate it if you can run code using the script below. If you cannot run code, skip validation and move on.

```python
import json, sys
ROLES = {"system", "user", "assistant", "developer", "tool", "function"}
errors, i = [], 0  # i stays 0 when the file is empty, so the final print cannot fail
for i, line in enumerate(open(sys.argv[1]), 1):
    try:
        r = json.loads(line)
        msgs = r.get("input", r).get("messages", [])
        assert isinstance(msgs, list) and msgs, "no messages"
        for j, m in enumerate(msgs):
            assert m.get("role") in ROLES, f"messages[{j}]: invalid role {m.get('role')!r}"
            assert m.get("content") or m.get("function_call") or m.get("tool_calls"), f"messages[{j}]: no content"
        if "input" not in r:
            assert msgs[-1]["role"] == "assistant", "last message must be from assistant"
        tools = r.get("tools")
        if tools is not None:
            assert isinstance(tools, list), "tools must be a list"
    except Exception as e:
        errors.append(f"  Line {i}: {e}")
print(f"{len(errors)} error(s):\n" + "\n".join(errors) if errors else f"Valid! {i} rows")
sys.exit(1 if errors else 0)
```

The JSONL format supports these fields per row:
- **`messages`** (required): List of chat messages
- **`tools`** (optional): List of tool/function definitions for tool-call training
- **`response_format`** (optional): Structured output schema (not used during training, but useful as metadata)

Report the row count and validation result to the user. Do NOT read the whole dataset file. Do NOT name the dataset. If the format is wrong, help them fix it or convert their data.
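
As an illustration of the row shape described above, the following sketch builds one row and serializes it as a single JSONL line. The message text and the `get_weather` tool definition are made-up placeholders, not taken from any real dataset.

```python
import json

# One training example: a chat that ends with the assistant turn to learn from.
row = {
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "Paris."},
    ],
    # Optional: tool definitions (placeholder in the OpenAI function format).
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                },
            },
        }
    ],
}

# JSONL means one JSON object per line, with no embedded newlines.
line = json.dumps(row, ensure_ascii=False)
print(line)
```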

## Step 4: Gather Base Parameters

Do NOT ask the user to review or confirm their answers after collecting them — just proceed to the next step.

- **Base model**: Recommend ONLY these models:
  - `OpenPipe/Qwen3-14B-Instruct`
  - `Qwen/Qwen3-30B-A3B-Instruct-2507`
  - `meta-llama/Llama-3.1-8B-Instruct`
- **Project name**: A name for this training project (default: `sft-project`)
- **Run name**: A static, descriptive name (e.g., `agent-001`, `pii-redactor-001`, `math-tutor-001`). Ask the user for a meaningful name. Do NOT generate random names.

For **distillation** also ask:
- **Teacher model**: The larger model to distill from (e.g., an OpenRouter model)
- **Teacher API base URL and key**: If using a third-party provider
- **Prompts**: What prompts to send to the teacher model

## Step 5: Gather Hyperparameters

This step only applies if you can run code AND know the row count from validation. If you cannot run code, skip this step entirely — do NOT make up or guess hyperparameter values. The `train_sft_from_file` function has sensible built-in defaults.

Run this Python snippet via Bash to compute defaults, passing the actual row count as the first command-line argument. Do NOT show any formulas or calculation steps to the user; only show the final values.

```python
import math, sys
n = int(sys.argv[1])
epochs = max(1, min(10, round(10000 / n)))
batch_size = 2
total_steps = math.ceil(n * epochs / batch_size)
steps_per_epoch = math.ceil(n / batch_size)
warmup_steps = max(10, min(1000, round(steps_per_epoch * 0.05)))
warmup_ratio = round(warmup_steps / total_steps, 4)
print(f"epochs={epochs} batch_size={batch_size} lr=2e-4 schedule=linear warmup_ratio={warmup_ratio}")
```

Present the output values to the user, then ask:
- **Use defaults (Recommended)** — show all values in the description
- **Customize** — adjust individual hyperparameters

If they choose "Customize", ask which parameters to change.

### For distillation:
Use the same defaults computation as JSONL, passing the number of trajectories as the row count. `create_sft_dataset_iterator` handles the LR schedule automatically.
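
As a worked example of the defaults computation in this step, the sketch below wraps the same arithmetic in a function and evaluates it for 1,000 rows. The function name `sft_defaults` is hypothetical and exists only for illustration.

```python
import math


def sft_defaults(n: int) -> dict:
    """Same arithmetic as the defaults snippet in this step, wrapped for reuse."""
    epochs = max(1, min(10, round(10000 / n)))
    batch_size = 2
    total_steps = math.ceil(n * epochs / batch_size)
    steps_per_epoch = math.ceil(n / batch_size)
    warmup_steps = max(10, min(1000, round(steps_per_epoch * 0.05)))
    return {
        "epochs": epochs,
        "batch_size": batch_size,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
        "warmup_ratio": round(warmup_steps / total_steps, 4),
    }


defaults = sft_defaults(1000)
print(defaults)
```

For 1,000 rows this gives 10 epochs, 5,000 total steps, 25 warmup steps, and a warmup ratio of 0.005. With very few rows the 10-step warmup floor can exceed the total step count (1 row gives 5 total steps and a ratio of 2.0), so it may be worth sanity-checking the ratio for tiny datasets.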

## Step 6: Generate the Training Script

Write a complete, runnable Python script. Use the patterns below. Every script MUST:
- Call `await backend.close()` at the end so the process doesn't hang
- Print post-training info and usage examples (see shared block below)

### Post-training block (append to ALL scripts before `backend.close()`):
```python
    # --- Training complete ---
    step = await model.get_step()
    inference_name = model.get_inference_name()
    client = model.openai_client()

    print("\n" + "=" * 60)
    print("SFT TRAINING COMPLETE")
    print("=" * 60)
    print(f"  Model:          {inference_name}")
    print(f"  Base model:     <BASE_MODEL>")
    print(f"  Training step:  {step}")
    print(f"  Inference URL:  {client.base_url}")
    print(f"  W&B run:        https://wandb.ai/<YOUR_TEAM>/<PROJECT_NAME>/runs/<RUN_NAME>")
    print("=" * 60)

    print("\n--- Python usage (openai SDK) ---\n")
    print(f'''\
from openai import OpenAI

client = OpenAI(
    base_url="{client.base_url}",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="{inference_name}",
    messages=[
        {{"role": "user", "content": "Your prompt here"}},
    ],
)
print(response.choices[0].message.content)
''')

    print("--- curl usage ---\n")
    print(f'''\
curl {client.base_url}chat/completions \\
  -H "Content-Type: application/json" \\
  -d '{{
    "model": "{inference_name}",
    "messages": [
      {{"role": "user", "content": "Your prompt here"}}
    ]
  }}'
''')

    await backend.close()
```

### Backend setup

Use the appropriate backend based on the user's choice:

**LocalBackend:**
```python
from art.local import LocalBackend

backend = LocalBackend()
model = art.TrainableModel(
    name="<RUN_NAME>",
    project="<PROJECT_NAME>",
    base_model="<BASE_MODEL>",
    _internal_config=art.dev.InternalModelConfig(
        engine_args={"gpu_memory_utilization": 0.7},
    ),
)
await model.register(backend)
```

**ServerlessBackend:**
```python
from art.serverless.backend import ServerlessBackend

backend = ServerlessBackend()  # uses WANDB_API_KEY env var
model = art.TrainableModel(
    name="<RUN_NAME>",
    project="<PROJECT_NAME>",
    base_model="<BASE_MODEL>",
)
await model.register(backend)
```

Note: `_internal_config` with `gpu_memory_utilization` is only used with LocalBackend. Do NOT include it for ServerlessBackend.

### JSONL file training pattern:

If hyperparameters were computed in Step 5, pass them explicitly. If Step 5 was skipped, omit them — `train_sft_from_file` has sensible defaults.

```python
"""SFT training script generated by /train-sft wizard."""
import asyncio
import art
<BACKEND_IMPORT>
from art.utils.sft import train_sft_from_file

async def main():
    <BACKEND_SETUP>

    await train_sft_from_file(
        model=model,
        file_path="<FILE_PATH>",
        # Only include these if hyperparameters were computed:
        # epochs=<EPOCHS>,
        # batch_size=<BATCH_SIZE>,
        # peak_lr=<PEAK_LR>,
        # schedule_type="<SCHEDULE_TYPE>",
        # warmup_ratio=<WARMUP_RATIO>,
        verbose=True,
    )

    # ... post-training block + backend.close() ...

if __name__ == "__main__":
    asyncio.run(main())
```

### Distillation pattern:
```python
"""Distillation SFT script generated by /train-sft wizard."""
import asyncio, os
from dotenv import load_dotenv
from openai import AsyncOpenAI
import art
<BACKEND_IMPORT>
from art.utils.sft import create_sft_dataset_iterator

load_dotenv()

async def main():
    teacher_client = AsyncOpenAI(
        api_key=os.environ["<API_KEY_ENV_VAR>"],
        base_url="<TEACHER_API_BASE>",
    )
    prompts = ["<PROMPT_1>", "<PROMPT_2>"]

    trajectories = []
    for prompt in prompts:
        completion = await teacher_client.chat.completions.create(
            model="<TEACHER_MODEL>",
            messages=[{"role": "user", "content": prompt}],
        )
        trajectories.append(
            art.Trajectory(
                messages_and_choices=[
                    {"role": "user", "content": prompt},
                    {"role": "assistant", "content": completion.choices[0].message.content},
                ],
                tools=<TOOLS_OR_NONE>,
            )
        )

    <BACKEND_SETUP>

    for chunk in create_sft_dataset_iterator(
        trajectories,
        epochs=<EPOCHS>,
        batch_size=<BATCH_SIZE>,
        peak_lr=<PEAK_LR>,
        schedule_type="<SCHEDULE_TYPE>",
        warmup_ratio=<WARMUP_RATIO>,
    ):
        await model.train_sft(chunk.trajectories, chunk.config, verbose=True)

    # ... post-training block + backend.close() ...

if __name__ == "__main__":
    asyncio.run(main())
```

## Step 7: Write and Offer to Run

1. Write the script to a file (suggest `sft_train.py`)
2. Ask the user if they want to run it now with `uv run python <script_path>`
3. If yes, run it **directly using the Bash tool** (do NOT delegate to a Task subagent) so training logs stream live to the user. Use a **2-minute timeout**. If it times out, check progress and decide whether to continue.
4. **LocalBackend only — GPU memory errors**: If training fails with OOM, lower `gpu_memory_utilization` in the existing `_internal_config` (e.g. from `0.7` to `0.5`).
5. **LocalBackend only — Stale GPU memory**: If available GPU memory looks too small, previous training runs may still be occupying memory. Before retrying, run `nvidia-smi` to check, and if needed kill leftover processes with `kill <pid>` to free memory.

## Important Notes

- LocalBackend requires a GPU.
- ServerlessBackend requires a `WANDB_API_KEY` environment variable.


================================================
FILE: .dockerignore
================================================
__pycache__/
.art/
# .env
.venv/
grpo_trainer_lora_model/
logs/
shared_cache.db
streaming-chat-completions/
unsloth_compiled_cache/
wandb/
docs/node_modules/
dist/
replays/
trajectories/
.DS_Store
# .local/
# .claude/
.vscode/

================================================
FILE: .github/workflows/create-draft-release.yml
================================================
name: Create Draft Release

on:
  workflow_dispatch:
    inputs:
      version_type:
        description: 'Version bump type'
        required: true
        default: 'patch'
        type: choice
        options:
        - patch
        - minor
        - major

permissions:
  contents: write
  pull-requests: write

jobs:
  create-draft:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Install dependencies
        run: |
          uv venv
          uv pip install -e .

      - name: Bump version
        id: bump
        run: |
          python scripts/bump_version.py ${{ github.event.inputs.version_type }}
          NEW_VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml', 'rb'))['project']['version'])")
          echo "NEW_VERSION=$NEW_VERSION" >> $GITHUB_OUTPUT

      - name: Generate release notes template
        id: release_notes
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Get the previous tag
          PREVIOUS_TAG=$(git describe --tags --abbrev=0 2>/dev/null || echo "")
          
          # Get merged PRs since last release
          if [ -z "$PREVIOUS_TAG" ]; then
            PRS=$(gh pr list --state merged --limit 100 --json number,title,url | jq -r '.[] | "- \(.title) ([#\(.number)](\(.url)))"')
          else
            LAST_TAG_DATE=$(git log -1 --format=%aI $PREVIOUS_TAG)
            PRS=$(gh pr list --state merged --limit 100 --json number,title,url,mergedAt | jq -r --arg date "$LAST_TAG_DATE" '.[] | select(.mergedAt > $date) | "- \(.title) ([#\(.number)](\(.url)))"')
          fi
          
          # Create release notes template
          cat << EOF > release_notes_template.md
          ## Release Highlights
          
          <!-- Add key features, improvements, and breaking changes here -->
          
          ## What's Changed
          
          $PRS
          
          **Full Changelog**: https://github.com/OpenPipe/ART/compare/$PREVIOUS_TAG...v${{ steps.bump.outputs.NEW_VERSION }}
          EOF

      - name: Create draft release
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh release create v${{ steps.bump.outputs.NEW_VERSION }} \
            --title "v${{ steps.bump.outputs.NEW_VERSION }}" \
            --notes-file release_notes_template.md \
            --draft

      - name: Create PR with version bump
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git checkout -b release/v${{ steps.bump.outputs.NEW_VERSION }}
          git add pyproject.toml uv.lock
          git commit -m "Bump version to ${{ steps.bump.outputs.NEW_VERSION }}"
          git push origin release/v${{ steps.bump.outputs.NEW_VERSION }}
          
          gh pr create \
            --title "Release v${{ steps.bump.outputs.NEW_VERSION }}" \
            --body "This PR bumps the version to ${{ steps.bump.outputs.NEW_VERSION }}. 

          **Next steps:**
          1. Review and edit the [draft release](https://github.com/OpenPipe/ART/releases) 
          2. Add release highlights and curate the changelog
          3. Merge this PR to publish the release automatically" \
            --base main \
            --head release/v${{ steps.bump.outputs.NEW_VERSION }}

      - name: Output instructions
        run: |
          echo "::notice::Draft release created! Next steps:"
          echo "::notice::1. Go to https://github.com/OpenPipe/ART/releases and edit the draft"
          echo "::notice::2. Add release highlights and curate the auto-generated PR list"  
          echo "::notice::3. Merge the PR to publish the release"


================================================
FILE: .github/workflows/package-install.yml
================================================
name: Package Install

on:
  pull_request:
  push:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: read

jobs:
  install-smoke-test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.cargo/bin" >> "$GITHUB_PATH"

      - name: Build wheel
        run: uv build --wheel --out-dir dist

      - name: Smoke test uv add + sync for backend extra
        run: |
          wheel_path="$(python - <<'PY'
          from pathlib import Path

          print(next(Path("dist").glob("openpipe_art-*.whl")).resolve())
          PY
          )"

          project_dir="$(mktemp -d)"
          cd "$project_dir"
          uv init --name art-install-smoke --python 3.11 --bare
          uv add "openpipe-art[backend] @ file://${wheel_path}"
          uv sync


================================================
FILE: .github/workflows/prek.yml
================================================
name: Prek

on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: write

env:
  CI_BASE_IMAGE: "pytorch/pytorch:2.9.0-cuda12.8-cudnn9-devel"
  CI_PYTHON_MM: "3.11"
  CI_UV_CACHE_RELEASE_TAG: "prek-uv-cache"
  CI_UV_CACHE_ASSET_PREFIX: "prek-uv-cache"
  UV_CACHE_DIR: "/root/.cache/uv"
  UV_LINK_MODE: "copy"
  TORCH_CUDA_ARCH_LIST: "8.0"

jobs:
  cache-status:
    runs-on: art-large-runner
    outputs:
      cache-hit: ${{ steps.check.outputs.cache-hit }}
      fingerprint: ${{ steps.fingerprint.outputs.fingerprint }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Compute expected uv cache fingerprint
        id: fingerprint
        run: |
          fp="$(python3 scripts/ci/compute_uv_fingerprint.py \
            --pyproject pyproject.toml \
            --uv-lock uv.lock \
            --base-image "${CI_BASE_IMAGE}" \
            --python-mm "${CI_PYTHON_MM}")"
          echo "fingerprint=${fp}" >> "${GITHUB_OUTPUT}"
          echo "Expected uv cache fingerprint: ${fp}"

      - name: Check if uv cache exists
        id: check
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          fingerprint="${{ steps.fingerprint.outputs.fingerprint }}"
          part_prefix="${CI_UV_CACHE_ASSET_PREFIX}-${fingerprint}.tar.zst.part-"
          release_api="https://api.github.com/repos/${GITHUB_REPOSITORY}/releases/tags/${CI_UV_CACHE_RELEASE_TAG}"

          release_json="$(curl -fsSL \
            -H "Authorization: Bearer ${GITHUB_TOKEN}" \
            -H "Accept: application/vnd.github+json" \
            "${release_api}" || true)"

          if [ -z "${release_json}" ]; then
            echo "Cache release '${CI_UV_CACHE_RELEASE_TAG}' not found."
            echo "cache-hit=false" >> "${GITHUB_OUTPUT}"
            exit 0
          fi

          hit="$(RELEASE_JSON="${release_json}" PART_PREFIX="${part_prefix}" python3 -c "
          import json, os, re
          payload = json.loads(os.environ['RELEASE_JSON'])
          prefix = os.environ['PART_PREFIX']
          pattern = re.compile(r'^' + re.escape(prefix) + r'(\d{3})$')
          parts = sorted(
              int(m.group(1))
              for a in payload.get('assets', [])
              for m in [pattern.match(a.get('name', ''))]
              if m and a.get('id') is not None
          )
          print('true' if parts and parts == list(range(len(parts))) else 'false')
          ")"
          echo "cache-hit=${hit}" >> "${GITHUB_OUTPUT}"
          echo "Cache hit: ${hit}"

  build-cache:
    needs: cache-status
    if: needs.cache-status.outputs.cache-hit != 'true'
    runs-on: art-cache-builder
    container:
      image: pytorch/pytorch:2.9.0-cuda12.8-cudnn9-devel
    steps:
      - name: Install CI dependencies
        run: |
          apt-get update
          apt-get install -y --no-install-recommends ca-certificates curl git zstd
          rm -rf /var/lib/apt/lists/*
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "/root/.local/bin" >> "${GITHUB_PATH}"

      - name: Install gh CLI
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          GH_DL_URL="$(curl -fsSL \
            -H "Authorization: Bearer ${GH_TOKEN}" \
            https://api.github.com/repos/cli/cli/releases/latest \
            | python3 -c "import json,sys;r=json.load(sys.stdin);print([a['browser_download_url'] for a in r['assets'] if a['name'].endswith('_linux_amd64.tar.gz')][0])")"
          curl -fsSL "${GH_DL_URL}" | tar xz --strip-components=1 -C /usr/local
          gh version

      - name: Checkout code
        uses: actions/checkout@v4

      - name: Mark workspace as a safe git directory
        run: |
          git config --global --add safe.directory "${GITHUB_WORKSPACE}"

      - name: Build and upload uv cache
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          bash scripts/ci/build_and_push_uv_cache.sh \
            --base-image "${CI_BASE_IMAGE}" \
            --python-mm "${CI_PYTHON_MM}"

  quality-checks:
    needs: [cache-status, build-cache]
    if: ${{ !failure() && !cancelled() }}
    runs-on: art-large-runner
    container:
      image: pytorch/pytorch:2.9.0-cuda12.8-cudnn9-devel
    steps:
      - name: Install CI dependencies
        run: |
          apt-get update
          apt-get install -y --no-install-recommends ca-certificates curl git zstd
          rm -rf /var/lib/apt/lists/*
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "/root/.local/bin" >> "${GITHUB_PATH}"

      - name: Checkout code
        uses: actions/checkout@v4

      - name: Mark workspace as a safe git directory
        run: |
          git config --global --add safe.directory "${GITHUB_WORKSPACE}"

      - name: Restore prebuilt uv cache
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          release_api="https://api.github.com/repos/${GITHUB_REPOSITORY}/releases/tags/${CI_UV_CACHE_RELEASE_TAG}"
          fingerprint="${{ needs.cache-status.outputs.fingerprint }}"
          part_prefix="${CI_UV_CACHE_ASSET_PREFIX}-${fingerprint}.tar.zst.part-"

          release_json="$(curl -fsSL \
            -H "Authorization: Bearer ${GITHUB_TOKEN}" \
            -H "Accept: application/vnd.github+json" \
            "${release_api}" || true)"

          if [ -z "${release_json}" ]; then
            echo "::error::Missing cache release '${CI_UV_CACHE_RELEASE_TAG}'."
            exit 1
          fi

          part_selection_file="/tmp/uv-cache-part-selection.txt"
          if ! RELEASE_JSON="${release_json}" PART_PREFIX="${part_prefix}" python3 -c "import json, os, re, sys; payload=json.loads(os.environ['RELEASE_JSON']); part_prefix=os.environ['PART_PREFIX']; pattern=re.compile(r'^' + re.escape(part_prefix) + r'(\\d{3})$'); parts=[]; [parts.append((int(m.group(1)), int(a.get('id')), a.get('name'))) for a in payload.get('assets', []) for m in [pattern.match(a.get('name', ''))] if m and a.get('id') is not None]; parts.sort(key=lambda x: x[0]); indices=[p[0] for p in parts]; expected=list(range(len(parts))); print('\\n'.join(f'{asset_id} {name}' for _, asset_id, name in parts)) if parts and indices == expected else (_ for _ in ()).throw(SystemExit(2 if not parts else 3))" > "${part_selection_file}"; then
            echo "::error::No complete uv cache part set found for prefix '${part_prefix}'."
            exit 1
          fi

          part_count="$(wc -l < "${part_selection_file}" | tr -d ' ')"
          echo "Using uv cache part set '${part_prefix}*' (${part_count} parts)."

          parts_dir="/tmp/uv-cache-parts"
          part_paths_file="/tmp/uv-cache-part-paths.txt"
          rm -rf "${parts_dir}"
          mkdir -p "${parts_dir}"
          awk -v d="${parts_dir}" '{print d "/" $2}' "${part_selection_file}" > "${part_paths_file}"

          PARTS_DIR="${parts_dir}" GITHUB_TOKEN="${GITHUB_TOKEN}" GITHUB_REPOSITORY="${GITHUB_REPOSITORY}" \
            xargs -n 2 -P 8 sh -c '
              asset_id="$1"
              asset_name="$2"
              part_path="${PARTS_DIR}/${asset_name}"
              curl -fsSL -L \
                -H "Authorization: Bearer ${GITHUB_TOKEN}" \
                -H "Accept: application/octet-stream" \
                "https://api.github.com/repos/${GITHUB_REPOSITORY}/releases/assets/${asset_id}" \
                -o "${part_path}"
            ' sh < "${part_selection_file}"

          while IFS= read -r part_path; do
            [ -s "${part_path}" ] || {
              echo "::error::Missing or empty cache part: ${part_path}"
              exit 1
            }
          done < "${part_paths_file}"

          rm -rf "${UV_CACHE_DIR}"
          mkdir -p "${UV_CACHE_DIR}"
          while IFS= read -r part_path; do
            cat "${part_path}"
          done < "${part_paths_file}" | zstd -d -c | tar -xf - -C "${UV_CACHE_DIR}"
          du -sh "${UV_CACHE_DIR}"

      - name: Install dependencies (with all optional extras for complete type checking)
        run: |
          py_mm="$(python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')"
          cudnn_path="${GITHUB_WORKSPACE}/.venv/lib/python${py_mm}/site-packages/nvidia/cudnn"
          export CUDNN_PATH="${cudnn_path}"
          export CUDNN_HOME="${cudnn_path}"
          export CUDNN_INCLUDE_PATH="${cudnn_path}/include"
          export CUDNN_LIBRARY_PATH="${cudnn_path}/lib"
          export CPLUS_INCLUDE_PATH="${CUDNN_INCLUDE_PATH}${CPLUS_INCLUDE_PATH:+:${CPLUS_INCLUDE_PATH}}"
          export LIBRARY_PATH="${CUDNN_LIBRARY_PATH}${LIBRARY_PATH:+:${LIBRARY_PATH}}"
          export LD_LIBRARY_PATH="${CUDNN_LIBRARY_PATH}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
          uv --version
          uv sync --all-extras --group dev --frozen

      - name: Run prek hooks (lint, format, typecheck, uv.lock, tests)
        run: |
          uv run prek run --all-files

      - name: Run unit tests (via prek)
        run: |
          uv run prek run pytest


================================================
FILE: .github/workflows/release.yml
================================================
name: Release

on:
  pull_request:
    types: [closed]
    branches: [main]

permissions:
  contents: write
  id-token: write

jobs:
  release:
    runs-on: ubuntu-latest
    if: github.event.pull_request.merged == true && startsWith(github.event.pull_request.head.ref, 'release/')
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install uv
        run: |
          curl -LsSf https://astral.sh/uv/install.sh | sh
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH

      - name: Install dependencies
        run: |
          uv venv
          uv pip install -e .
          uv pip install hatch

      - name: Build package
        run: uv run hatch build

      - name: Get version from pyproject.toml
        id: get_version
        run: |
          VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml', 'rb'))['project']['version'])")
          echo "VERSION=$VERSION" >> $GITHUB_OUTPUT

      - name: Create git tag
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git tag v${{ steps.get_version.outputs.VERSION }}
          git push origin v${{ steps.get_version.outputs.VERSION }}

      - name: Publish draft release
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Check if draft release exists and publish it
          if gh release view v${{ steps.get_version.outputs.VERSION }} --json isDraft | jq -r '.isDraft' | grep -q true; then
            gh release edit v${{ steps.get_version.outputs.VERSION }} --draft=false
          else
            echo "::error::No draft release found for v${{ steps.get_version.outputs.VERSION }}"
            exit 1
          fi

      - name: Upload assets to release
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh release upload v${{ steps.get_version.outputs.VERSION }} dist/*

      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_ART_TOKEN }}


================================================
FILE: .gitignore
================================================
__pycache__/
.art/
.env
.venv/
grpo_trainer_lora_model/
logs/
shared_cache.db
data/cache.db
streaming-chat-completions/
unsloth_compiled_cache/
wandb/
docs/node_modules/
dist/
replays/
trajectories/
.DS_Store
.local/
.claude/settings.local.json
.vscode/
.ruff_cache/
!/src/art/wandb/
!/src/art/wandb/**
/src/art/wandb/__pycache__/

================================================
FILE: .pre-commit-config.yaml
================================================
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.12.1
    hooks:
      - id: ruff
      - id: ruff-format

  - repo: local
    hooks:
      - id: ty
        name: ty type checking
        entry: uv run ty check src tests
        language: system
        pass_filenames: false

      - id: uv-lock-check
        name: uv.lock sync check
        entry: uv lock --check
        language: system
        pass_filenames: false

      - id: pytest
        name: Unit tests (manual)
        entry: uv run pytest --nbval --current-env --tb=short tests/unit
        language: system
        pass_filenames: false
        stages: [manual]


================================================
FILE: .python-version
================================================
3.11


================================================
FILE: .skyignore
================================================
__pycache__/
.art/
# .env
.venv/
grpo_trainer_lora_model/
logs/
shared_cache.db
streaming-chat-completions/
unsloth_compiled_cache/
wandb/
docs/node_modules/
dist/
dev/art-e/data/
replays/
trajectories/
.DS_Store
# .local/

================================================
FILE: AGENT.md
================================================
## uv package manager by default

This project uses the `uv` package manager.

- To add a dependency, run `uv add <package>`.
- To run a script, run `uv run <script>`.
- To examine dependencies, consult the `pyproject.toml` file.

## Testing

- Always run tests before committing. The test command is `uv run prek run --all-files`.

## Releases

- If asked to help with a release, refer to the checklist in CONTRIBUTING.md. Be sure to first share a draft of the release notes with the user before actually publishing the release to GitHub.
- To trigger the release workflow via GitHub CLI: `gh workflow run create-draft-release.yml --field version_type=patch` (use `minor` or `major` instead of `patch` as needed)

## Documentation

- All documentation is in the `docs` directory.
- If you add a new page, be sure to add it to the sidebar in `docs/docs.json`.
- If you move a page, be sure to update the sidebar in `docs/docs.json` and check for any broken links.

### Adding images

- Add images to the `docs/images` directory
- If the image is a png, first convert it to webp using `magick <input.png> <output.webp>`. Do not include the original png in the repo.
- Use the `<Frame>` tag to add images with captions as seen in the page `checkpoint-forking.mdx`.

### Adding notes

- Add notes using the `<Note>` tag as seen in the page `ruler.mdx`.


================================================
FILE: CONTRIBUTING.md
================================================
## Contributing to ART

Clone the repository:

```bash
git clone https://github.com/OpenPipe/ART.git
cd ART
```

Install the dependencies:

```bash
uv sync --group dev
```

### Code Quality Checks (prek)

This project uses [prek](https://github.com/j178/prek) to run local checks (ruff, pyright, uv.lock sync, and unit tests). Before submitting a pull request, please ensure your code passes all quality checks:

```bash
# Install git hooks (optional but recommended)
uv run prek install

# Run all checks against all files (formatting, linting, typecheck, uv.lock, tests)
uv run prek run --all-files
```

You can also run individual hooks:

```bash
uv run prek run ruff
uv run prek run ruff-format
uv run prek run pyright
uv run prek run uv-lock-check
uv run prek run pytest
```

These checks are automatically run in CI for all pull requests. If your PR fails these checks, re-run the corresponding `prek` hook locally and commit any fixes.

### CI uv Cache

The PR `prek` workflow uses a prebuilt full `uv` cache (stored as a GitHub release asset) to avoid rebuilding heavy dependencies on every run.

The cache is keyed by a fingerprint computed from `pyproject.toml`, `uv.lock`, the base Docker image, and the Python version. When dependencies change, the fingerprint changes and CI automatically rebuilds the cache using Docker Buildx and uploads it for future runs. The first CI run after a dependency change will be slower while the cache is built.
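
A cache key of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the script CI actually runs: the function name `cache_fingerprint`, the choice of SHA-256, and the 16-character truncation are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def cache_fingerprint(files: list[Path], base_image: str, python_version: str) -> str:
    """Return a short hex digest that changes when any input changes.

    Hypothetical helper: the real CI fingerprint may use a different hash
    or a different set of inputs.
    """
    digest = hashlib.sha256()
    for path in files:
        # Hash the file name as well as its bytes, separated by NUL bytes,
        # so that moving content between files also changes the key.
        digest.update(path.name.encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    digest.update(base_image.encode())
    digest.update(b"\0")
    digest.update(python_version.encode())
    return digest.hexdigest()[:16]
```

Calling `cache_fingerprint([Path("pyproject.toml"), Path("uv.lock")], "<base image>", "3.11")` returns the same key until one of the four inputs changes, which is the property that lets CI decide when a rebuild is needed.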

To manually rebuild the cache (e.g., if the automatic build fails), run:

```bash
bash scripts/ci/build_and_push_uv_cache.sh
```

This requires GitHub CLI authentication (`gh auth login`) and should be run in an environment compatible with CI (same base CUDA image/toolchain).

### Release Process

To create a new release:

1. **Review merged PRs since the last release**:
   - Go to the [pull requests page](https://github.com/OpenPipe/ART/pulls?q=is%3Apr+is%3Amerged+sort%3Aupdated-desc)
   - Review PRs merged since the last release to understand what changed
   - Note any breaking changes, new features, or important bug fixes

2. **Create a draft release**:
   - Go to [Actions](https://github.com/OpenPipe/ART/actions/workflows/create-draft-release.yml)
   - Click "Run workflow"
   - Select the version bump type:
     - `patch`: Bug fixes and minor changes (0.3.13 → 0.3.14)
     - `minor`: New features and non-breaking changes (0.3.13 → 0.4.0)
     - `major`: Breaking changes (0.3.13 → 1.0.0)

3. **Edit the draft release notes**:
   - Go to the [releases page](https://github.com/OpenPipe/ART/releases)
   - Click "Edit" on the draft release
   - Add release highlights, breaking changes, and curated changelog
   - The auto-generated PR list provides a starting point, but manual curation improves clarity

4. **Finalize the release**:
   - Review and merge the automatically created release PR
   - This will automatically:
     - Create the git tag
     - Publish the curated release notes
     - Build and publish the package to PyPI
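
The three version bump types in step 2 follow the usual semantic-versioning rules. As a rough sketch (the function `bump_version` is hypothetical and is not taken from the release workflow), the arithmetic looks like this:

```python
def bump_version(version: str, version_type: str) -> str:
    """Apply a patch, minor, or major bump to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if version_type == "patch":
        patch += 1
    elif version_type == "minor":
        # A minor bump resets the patch component.
        minor, patch = minor + 1, 0
    elif version_type == "major":
        # A major bump resets both lower components.
        major, minor, patch = major + 1, 0, 0
    else:
        raise ValueError(f"unknown version_type: {version_type!r}")
    return f"{major}.{minor}.{patch}"
```

With the examples from the list above, `bump_version("0.3.13", "patch")` gives `"0.3.14"`, `"minor"` gives `"0.4.0"`, and `"major"` gives `"1.0.0"`.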

Then follow the GPU training instructions below.

### GPU Training (Local or Cloud VM)

Copy the `.env.example` file to `.env` and set the environment variables:

```bash
cp .env.example .env
```

Make sure you're on a machine with at least one H100 or A100-80GB GPU. Machines equipped with lower-end GPUs may work, but training will be slower.

If you're using a cloud VM, you can SSH into the machine using either VSCode or the command line.

### Connecting via Command Line

Simply run:

```bash
ssh art
```

### Connecting via VSCode

1. **Install the Remote-SSH extension on your local machine**

   - Open the extensions view by clicking on the Extensions icon in the Activity Bar on the left.
   - Search for **"Remote-SSH"** and install it.

2. **Configure default extensions for your remote host**

   - In your VSCode settings, find **"Remote.SSH: Default Extensions"**
   - Add the following extensions:
     - `ms-python.python`
     - `ms-toolsai.jupyter`
     - `eamodio.gitlens`
     - `charliermarsh.ruff`

3. **Connect to the host**

   - Open the command palette and run **"Remote-SSH: Connect to Host..."**
   - Select `art`

4. **Set up the host**

   - Click **"Open Folder"**
     - Select **"sky_workdir"**
     - Click **OK**

5. **Run a notebook**
   - Find `2048.ipynb` and run it!

### "2048" example

Now you can run the "2048" example in `/examples/2048/2048.ipynb`.

It has been tested with the `Qwen/Qwen2.5-14B-Instruct` model on a 1xH100 instance.

You can monitor training progress with Weights & Biases at https://wandb.ai/your-wandb-organization/agent-reinforcement-training.

You should see immediate improvement in `val/reward` after one step.

If you run into any issues, the training output is set to maximum verbosity. Copying outputs such as the vLLM or torchtune logs, or copying/screenshotting the plotted packed tensors, may help us debug the issue.

### Cleaning Up

When you're done, you can tear down the cluster with:

```bash
uv run sky down art
```

### Adding Docs

We use Mintlify to serve our docs. Here are the steps for adding a new page:
1. Clone the ART repo
2. Open the `/docs` directory in your CLI and IDE
3. Run `npx mintlify dev` to start serving a local version of the docs in your browser
4. Create a new `.mdx` file in the relevant directory
5. Add a title and sidebar title (see other pages for examples)
6. In docs.json, add a link to the new page within one of the `navigation`.`groups`
7. Ensure everything works by navigating to and viewing the page in your browser
8. Submit a PR
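
Step 6 can also be scripted. The sketch below assumes that each entry in `navigation.groups` is an object with a `group` name and a `pages` list. That layout is an assumption; check the actual `docs.json` before relying on it, since groups may be nested differently. The helper names are hypothetical.

```python
import json
from pathlib import Path


def add_page(config: dict, group_name: str, page: str) -> dict:
    """Append `page` to the named navigation group unless it is already listed."""
    # Assumed layout: {"navigation": {"groups": [{"group": ..., "pages": [...]}]}}
    for group in config["navigation"]["groups"]:
        if group["group"] == group_name:
            if page not in group["pages"]:
                group["pages"].append(page)
            return config
    raise KeyError(f"no navigation group named {group_name!r}")


def add_page_to_file(docs_json: Path, group_name: str, page: str) -> None:
    """Read docs.json, add the page, and write the file back."""
    config = json.loads(docs_json.read_text())
    add_page(config, group_name, page)
    docs_json.write_text(json.dumps(config, indent=2) + "\n")
```

For example, `add_page_to_file(Path("docs/docs.json"), "Features", "features/my-new-page")` would register a hypothetical page under a group named "Features". Rewriting the file with `json.dumps` may change its formatting, so review the diff before committing.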

When you're done, shut down your GPU instance (if using a cloud VM) or stop the local training process.


================================================
FILE: LICENSE
================================================
                                Apache License
                        Version 2.0, January 2004
                    http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

   "License" shall mean the terms and conditions for use, reproduction,
   and distribution as defined by Sections 1 through 9 of this document.

   "Licensor" shall mean the copyright owner or entity authorized by
   the copyright owner that is granting the License.

   "Legal Entity" shall mean the union of the acting entity and all
   other entities that control, are controlled by, or are under common
   control with that entity. For the purposes of this definition,
   "control" means (i) the power, direct or indirect, to cause the
   direction or management of such entity, whether by contract or
   otherwise, or (ii) ownership of fifty percent (50%) or more of the
   outstanding shares, or (iii) beneficial ownership of such entity.

   "You" (or "Your") shall mean an individual or Legal Entity
   exercising permissions granted by this License.

   "Source" form shall mean the preferred form for making modifications,
   including but not limited to software source code, documentation
   source, and configuration files.

   "Object" form shall mean any form resulting from mechanical
   transformation or translation of a Source form, including but
   not limited to compiled object code, generated documentation,
   and conversions to other media types.

   "Work" shall mean the work of authorship, whether in Source or
   Object form, made available under the License, as indicated by a
   copyright notice that is included in or attached to the work
   (an example is provided in the Appendix below).

   "Derivative Works" shall mean any work, whether in Source or Object
   form, that is based on (or derived from) the Work and for which the
   editorial revisions, annotations, elaborations, or other modifications
   represent, as a whole, an original work of authorship. For the purposes
   of this License, Derivative Works shall not include works that remain
   separable from, or merely link (or bind by name) to the interfaces of,
   the Work and Derivative Works thereof.

   "Contribution" shall mean any work of authorship, including
   the original version of the Work and any modifications or additions
   to that Work or Derivative Works thereof, that is intentionally
   submitted to Licensor for inclusion in the Work by the copyright owner
   or by an individual or Legal Entity authorized to submit on behalf of
   the copyright owner. For the purposes of this definition, "submitted"
   means any form of electronic, verbal, or written communication sent
   to the Licensor or its representatives, including but not limited to
   communication on electronic mailing lists, source code control systems,
   and issue tracking systems that are managed by, or on behalf of, the
   Licensor for the purpose of discussing and improving the Work, but
   excluding communication that is conspicuously marked or otherwise
   designated in writing by the copyright owner as "Not a Contribution."

   "Contributor" shall mean Licensor and any individual or Legal Entity
   on behalf of whom a Contribution has been received by Licensor and
   subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
   this License, each Contributor hereby grants to You a perpetual,
   worldwide, non-exclusive, no-charge, royalty-free, irrevocable
   copyright license to reproduce, prepare Derivative Works of,
   publicly display, publicly perform, sublicense, and distribute the
   Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
   this License, each Contributor hereby grants to You a perpetual,
   worldwide, non-exclusive, no-charge, royalty-free, irrevocable
   (except as stated in this section) patent license to make, have made,
   use, offer to sell, sell, import, and otherwise transfer the Work,
   where such license applies only to those patent claims licensable
   by such Contributor that are necessarily infringed by their
   Contribution(s) alone or by combination of their Contribution(s)
   with the Work to which such Contribution(s) was submitted. If You
   institute patent litigation against any entity (including a
   cross-claim or counterclaim in a lawsuit) alleging that the Work
   or a Contribution incorporated within the Work constitutes direct
   or contributory patent infringement, then any patent licenses
   granted to You under this License for that Work shall terminate
   as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
   Work or Derivative Works thereof in any medium, with or without
   modifications, and in Source or Object form, provided that You
   meet the following conditions:

   (a) You must give any other recipients of the Work or
   Derivative Works a copy of this License; and

   (b) You must cause any modified files to carry prominent notices
   stating that You changed the files; and

   (c) You must retain, in the Source form of any Derivative Works
   that You distribute, all copyright, patent, trademark, and
   attribution notices from the Source form of the Work,
   excluding those notices that do not pertain to any part of
   the Derivative Works; and

   (d) If the Work includes a "NOTICE" text file as part of its
   distribution, then any Derivative Works that You distribute must
   include a readable copy of the attribution notices contained
   within such NOTICE file, excluding those notices that do not
   pertain to any part of the Derivative Works, in at least one
   of the following places: within a NOTICE text file distributed
   as part of the Derivative Works; within the Source form or
   documentation, if provided along with the Derivative Works; or,
   within a display generated by the Derivative Works, if and
   wherever such third-party notices normally appear. The contents
   of the NOTICE file are for informational purposes only and
   do not modify the License. You may add Your own attribution
   notices within Derivative Works that You distribute, alongside
   or as an addendum to the NOTICE text from the Work, provided
   that such additional attribution notices cannot be construed
   as modifying the License.

   You may add Your own copyright statement to Your modifications and
   may provide additional or different license terms and conditions
   for use, reproduction, or distribution of Your modifications, or
   for any such Derivative Works as a whole, provided Your use,
   reproduction, and distribution of the Work otherwise complies with
   the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
   any Contribution intentionally submitted for inclusion in the Work
   by You to the Licensor shall be under the terms and conditions of
   this License, without any additional terms or conditions.
   Notwithstanding the above, nothing herein shall supersede or modify
   the terms of any separate license agreement you may have executed
   with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
   names, trademarks, service marks, or product names of the Licensor,
   except as required for reasonable and customary use in describing the
   origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
   agreed to in writing, Licensor provides the Work (and each
   Contributor provides its Contributions) on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
   implied, including, without limitation, any warranties or conditions
   of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
   PARTICULAR PURPOSE. You are solely responsible for determining the
   appropriateness of using or redistributing the Work and assume any
   risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
   whether in tort (including negligence), contract, or otherwise,
   unless required by applicable law (such as deliberate and grossly
   negligent acts) or agreed to in writing, shall any Contributor be
   liable to You for damages, including any direct, indirect, special,
   incidental, or consequential damages of any character arising as a
   result of this License or out of the use or inability to use the
   Work (including but not limited to damages for loss of goodwill,
   work stoppage, computer failure or malfunction, or any and all
   other commercial damages or losses), even if such Contributor
   has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
   the Work or Derivative Works thereof, You may choose to offer,
   and charge a fee for, acceptance of support, warranty, indemnity,
   or other liability obligations and/or rights consistent with this
   License. However, in accepting such obligations, You may act only
   on Your own behalf and on Your sole responsibility, not on behalf
   of any other Contributor, and only if You agree to indemnify,
   defend, and hold each Contributor harmless for any liability
   incurred by, or claims asserted against, such Contributor by reason
   of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

    To apply the Apache License to your work, attach the following
    boilerplate notice, with the fields enclosed by brackets "[]"
    replaced with your own identifying information. (Don't include
    the brackets!)  The text should be enclosed in the appropriate
    comment syntax for the file format. We also recommend that a
    file or class name and description of purpose be included on the
    same "printed page" as the copyright notice for easier
    identification within third-party archives.

Copyright 2025, OpenPipe, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## Additional License Notice

This distribution includes a small portion of code adapted from a third-party
project licensed under the GNU Lesser General Public License v3.0.
See the file THIRD-PARTY-NOTICES and the licenses/ directory for details.


================================================
FILE: README.md
================================================
<div align="center">

<a href="https://art.openpipe.ai"><picture>
<img alt="ART logo" src="https://github.com/openpipe/art/raw/main/assets/ART_logo.png" width="160px">
</picture></a>

<p align="center">
  <h1>Agent Reinforcement Trainer</h1>
</p>

<p>
Train multi-step agents for real-world tasks using GRPO.
</p>

[![PRs-Welcome][contribute-image]][contribute-url]
[![PyPI version](https://img.shields.io/pypi/v/openpipe-art?color=364fc7)][pypi-url]
[![Train Agent](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb)

[![Join Discord](https://img.shields.io/badge/Join%20Discord-5865F2?style=plastic&logo=discord&logoColor=white)](https://discord.gg/EceeVdhpxD)
[![Documentation](https://img.shields.io/badge/Documentation-orange?style=plastic&logo=gitbook&logoColor=white)](https://art.openpipe.ai)

</div>

## 🚀 W&B Training: Serverless RL

**W&B Training (Serverless RL)** is the first publicly available service for flexibly training models with reinforcement learning. It manages your training and inference infrastructure automatically, letting you focus on defining your data, environment and reward function—leading to faster feedback cycles, lower costs, and far less DevOps.

✨ **Key Benefits:**

- **40% lower cost** - Multiplexing on shared production-grade inference cluster
- **28% faster training** - Scale to 2000+ concurrent requests across many GPUs
- **Zero infra headaches** - Fully managed infrastructure that stays healthy
- **Instant deployment** - Every checkpoint instantly available via W&B Inference

```python
# Before: Hours of GPU setup and infra management
# RuntimeError: CUDA error: out of memory 😢

# After: Serverless RL with instant feedback
import art
from art.serverless.backend import ServerlessBackend

model = art.TrainableModel(
    project="voice-agent",
    name="agent-001",
    base_model="OpenPipe/Qwen3-14B-Instruct",
)

backend = ServerlessBackend(
    api_key="your_wandb_api_key"
)
model.register(backend)
# Edit and iterate in minutes, not hours!
```

[📖 Learn more about W&B Training →](https://docs.wandb.ai/guides/training)

## ART Overview

ART is an open-source RL framework that improves agent reliability by allowing LLMs to **learn from experience**. ART provides an ergonomic harness for integrating GRPO into any Python application. For a quick hands-on introduction, run one of the notebooks below. When you're ready to learn more, check out the [docs](https://art.openpipe.ai).

## 📒 Notebooks

| Agent Task          | Example Notebook                                                                                                                       | Description                                         | Comparative Performance                                                                                                                                                                                                     |
| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **ART•E [Serverless]**   | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/art-e.ipynb)                       | Qwen3 14B learns to search emails using RULER     | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/email_agent/accuracy-training-progress.svg" height="72"> [benchmarks](/dev/art-e/art_e/evaluate/display_benchmarks.ipynb)                              |
| **2048 [Serverless]** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/2048/2048.ipynb)                   | Qwen3 14B learns to play 2048                     | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/2048/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/2048/display_benchmarks.ipynb)                                                |
| **ART•E LangGraph** | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/langgraph/art-e-langgraph.ipynb)   | Qwen 2.5 7B learns to search emails using LangGraph | [Link coming soon]                                                                                                                                                                                                          |
| **MCP•RL**          | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/mcp-rl/mcp-rl.ipynb)               | Qwen 2.5 3B masters the NWS MCP server              | [Link coming soon]                                                                                                                                                                                                          |
| **Temporal Clue**   | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/temporal_clue/temporal-clue.ipynb) | Qwen 2.5 7B learns to solve Temporal Clue           | [Link coming soon]                                                                                                                                                                                                          |
| **Tic Tac Toe**     | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/tic_tac_toe/tic-tac-toe.ipynb)     | Qwen 2.5 3B learns to play Tic Tac Toe              | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/tic-tac-toe-local/accuracy-training-progress.svg" height="72"> [benchmarks](/examples/tic_tac_toe/display-benchmarks.ipynb)                            |
| **Codenames**       | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb)      | Qwen 2.5 3B learns to play Codenames                | <img src="https://github.com/openpipe/art/raw/main/assets/benchmarks/codenames/win_rate_over_time.png" height="72"> [benchmarks](https://github.com/OpenPipe/art-notebooks/blob/main/examples/codenames/Codenames_RL.ipynb) |
| **AutoRL [RULER]**  | [🏋️ Train agent](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/auto_rl.ipynb)                     | Train Qwen 2.5 7B to master any task                | [Link coming soon]                                                                                                                                                                                                          |
| **Distillation (SFT)** | [🏋️ Train model](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/sft/distillation.ipynb)         | Distill text-to-SQL from Qwen 3 235B to Qwen 3 30B  | [Link coming soon]                                                                                                                                                                                                          |
| **Summarizer (SFT + RL)** | [🏋️ Train model](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/sft/sft-rl.ipynb)            | Train a document summarizer with SFT warmup then RL | [Link coming soon]                                                                                                                                                                                                          |
| **SFT from a dataset** | [🏋️ Train model](https://colab.research.google.com/github/openpipe/art-notebooks/blob/main/examples/sft/train_from_file.ipynb)      | Fine-tune Qwen 3 30B on text-to-SQL from a dataset  | [Link coming soon]                                                                                                                                                                                                          |

## 📰 ART News

Explore our latest research and updates on building SOTA agents.

- 🗞️ **[ART now integrates seamlessly with LangGraph](https://art.openpipe.ai/integrations/langgraph-integration)** - Train your LangGraph agents with reinforcement learning for smarter multi-step reasoning and improved tool usage.
- 🗞️ **[MCP•RL: Teach Your Model to Master Any MCP Server](https://x.com/corbtt/status/1953171838382817625)** - Automatically train models to effectively use MCP server tools through reinforcement learning.
- 🗞️ **[AutoRL: Zero-Data Training for Any Task](https://x.com/mattshumer_/status/1950572449025650733)** - Train custom AI models without labeled data using automatic input generation and RULER evaluation.
- 🗞️ **[RULER: Easy Mode for RL Rewards](https://openpipe.ai/blog/ruler-easy-mode-for-rl-rewards)** is now available for automatic reward generation in reinforcement learning.
- 🗞️ **[ART·E: How We Built an Email Research Agent That Beats o3](https://openpipe.ai/blog/art-e-mail-agent)** demonstrates a Qwen 2.5 14B email agent outperforming OpenAI's o3.
- 🗞️ **[ART Trainer: A New RL Trainer for Agents](https://openpipe.ai/blog/art-trainer)** enables easy training of LLM-based agents using GRPO.

[📖 See all blog posts →](https://openpipe.ai/blog)

## Why ART?

- ART provides convenient wrappers for introducing RL training into **existing applications**. We abstract the training server into a modular service that your code doesn't need to interface with.
- **Train from anywhere.** Run the ART client on your laptop and let the ART server kick off an ephemeral GPU-enabled environment, or run on a local GPU.
- Integrations with hosted platforms like W&B, Langfuse, and OpenPipe provide flexible observability and **simplify debugging**.
- ART is customizable with **intelligent defaults**. You can configure training parameters and inference engine configurations to meet specific needs, or take advantage of the defaults, which have been optimized for training efficiency and stability.

## Installation

ART agents can be trained from any client machine that runs Python. To add ART to an existing project, run this command:

```bash
pip install openpipe-art
```

## 🤖 ART•E Agent

Curious about how to use ART for a real-world task? Check out the [ART•E Agent](https://openpipe.ai/blog/art-e-mail-agent) blog post, where we detail how we trained Qwen 2.5 14B to beat o3 at email retrieval!

<img src="https://github.com/openpipe/art/raw/main/assets/ART_E_graphs.png" width="700">

## 🔁 Training Loop Overview

ART's functionality is divided into a **client** and a **server**. The OpenAI-compatible client is responsible for interfacing between ART and your codebase. Using the client, you can pass messages and get completions from your LLM as it improves. The server runs independently on any machine with a GPU. It abstracts away the complexity of the inference and training portions of the RL loop while allowing for some custom configuration. An outline of the training loop is shown below:

1. **Inference**

   1. Your code uses the ART client to perform an agentic workflow (usually executing several rollouts in parallel to gather data faster).
   2. Completion requests are routed to the ART server, which runs the model's latest LoRA in vLLM.
   3. As the agent executes, each `system`, `user`, and `assistant` message is stored in a Trajectory.
   4. When a rollout finishes, your code assigns a `reward` to its Trajectory, indicating the performance of the LLM.

2. **Training**
   1. When each rollout has finished, Trajectories are grouped and sent to the server. Inference is blocked while training executes.
   2. The server trains your model using GRPO, initializing from the latest checkpoint (or an empty LoRA on the first iteration).
   3. The server saves the newly trained LoRA to a local directory and loads it into vLLM.
   4. Inference is unblocked and the loop resumes at step 1.

This training loop runs until a specified number of inference and training iterations have completed.
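
To make the grouping step more concrete, here is a toy sketch of how rewards within one group of rollouts can be turned into group-relative advantages, the core idea behind GRPO. It is not ART's implementation: `ToyTrajectory` is a stand-in for the real Trajectory type, and normalizing by the group's standard deviation is one common variant assumed for this example.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class ToyTrajectory:
    """Stand-in for a rollout: the stored messages plus a scalar reward."""

    messages: list[dict[str, str]] = field(default_factory=list)
    reward: float = 0.0


def group_advantages(group: list[ToyTrajectory], eps: float = 1e-6) -> list[float]:
    """Score each trajectory relative to the other rollouts in its group."""
    rewards = [trajectory.reward for trajectory in group]
    baseline = mean(rewards)  # the group mean acts as the baseline
    spread = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every rollout earned the same reward.
    return [(reward - baseline) / (spread + eps) for reward in rewards]
```

Rollouts that beat their group's average get a positive advantage and the rest get a negative one. If every rollout in a group earns the same reward, all advantages are zero and that group contributes no learning signal.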

## 🧩 Supported Models

ART should work with most vLLM/HuggingFace-transformers compatible causal language models, or at least the ones supported by [Unsloth](https://docs.unsloth.ai/get-started/all-our-models). Gemma 3 does not appear to be supported for the time being. If any other model isn't working for you, please let us know on [Discord](https://discord.gg/zbBHRUpwf4) or open an issue on [GitHub](https://github.com/openpipe/art/issues)!

## 🤝 Contributing

ART is in active development, and contributions are most welcome! Please see the [CONTRIBUTING.md](CONTRIBUTING.md) file for more information.

## 📖 Citation

```bibtex
@misc{hilton2025art,
  author = {Brad Hilton and Kyle Corbitt and David Corbitt and Saumya Gandhi and Angky William and Bohdan Kovalevskyi and Andie Jones},
  title = {ART: Agent Reinforcement Trainer},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/openpipe/art}}
}
```

## ⚖️ License

This repository's source code is available under the [Apache-2.0 License](LICENSE).

## 🙏 Credits

ART stands on the shoulders of giants. While we owe many of the ideas and early experiments that led to ART's development to the open source RL community at large, we're especially grateful to the authors of the following projects:

- [Unsloth](https://github.com/unslothai/unsloth)
- [vLLM](https://github.com/vllm-project/vllm)
- [trl](https://github.com/huggingface/trl)
- [torchtune](https://github.com/pytorch/torchtune)

Finally, thank you to our partners who've helped us test ART in the wild! We're excited to see what you all build with it.

[pypi-url]: https://pypi.org/project/openpipe-art/
[contribute-url]: https://github.com/openpipe/art/blob/main/CONTRIBUTING.md
[contribute-image]: https://img.shields.io/badge/PRs-welcome-blue.svg


================================================
FILE: THIRD-PARTY-NOTICES
================================================
This project includes code from PyTorch TorchTune:

- Repository: https://github.com/pytorch/torchtune
- License: BSD 3-Clause License
- License URL: https://github.com/pytorch/torchtune/blob/main/LICENSE

The following files contain code copied and modified from TorchTune:

- src/art/torchtune/recipe.py

Note: This project also uses TorchTune as a dependency. The above notice applies
specifically to files containing copied/modified code from the TorchTune repository.

Full license text:

BSD 3-Clause License

Copyright 2024 Meta

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list
   of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this
   list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may
   be used to endorse or promote products derived from this software without specific
   prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.


================================================
FILE: dev/demo_logging.py
================================================
#!/usr/bin/env python3
"""Demo of all logging functionality from art.utils.logging."""

import time

from art.utils.logging import _C, _ts, dim, err, info, ok, step, warn


def demo_basic_logging():
    """Demonstrate the basic logging functions."""
    print("=" * 60)
    print("BASIC LOGGING FUNCTIONS")
    print("=" * 60)

    info("This is an informational message")
    step("This indicates a step in a process")
    ok("This indicates successful completion")
    warn("This is a warning message")
    err("This is an error message")
    dim("This is dimmed/secondary text")

    print()


def demo_color_codes():
    """Demonstrate the color code constants."""
    print("=" * 60)
    print("COLOR CODE CONSTANTS (_C class)")
    print("=" * 60)

    print("Available color constants:")
    print(f"{_C.RESET}RESET{_C.RESET} - Reset all formatting")
    print(f"{_C.DIM}DIM{_C.RESET} - Dimmed text")
    print(f"{_C.BOLD}BOLD{_C.RESET} - Bold text")
    print(f"{_C.ITAL}ITAL{_C.RESET} - Italic text")
    print(f"{_C.GRAY}GRAY{_C.RESET} - Gray color")
    print(f"{_C.BLUE}BLUE{_C.RESET} - Blue color")
    print(f"{_C.CYAN}CYAN{_C.RESET} - Cyan color")
    print(f"{_C.GREEN}GREEN{_C.RESET} - Green color")
    print(f"{_C.YELLOW}YELLOW{_C.RESET} - Yellow color")
    print(f"{_C.RED}RED{_C.RESET} - Red color")
    print(f"{_C.MAGENTA}MAGENTA{_C.RESET} - Magenta color")

    print("\nCustom formatted messages:")
    print(f"{_C.BOLD}{_C.BLUE}Bold Blue Text{_C.RESET}")
    print(f"{_C.ITAL}{_C.GREEN}Italic Green Text{_C.RESET}")
    print(f"{_C.DIM}{_C.GRAY}Dimmed Gray Text{_C.RESET}")

    print()


def demo_timestamp():
    """Demonstrate the timestamp function."""
    print("=" * 60)
    print("TIMESTAMP FUNCTION (_ts)")
    print("=" * 60)

    print(f"Current timestamp: {_ts()}")
    print("Timestamp format: HH:MM:SS")
    print(f"Example with custom message: [{_ts()}] Custom log message")

    print()


def demo_real_world_usage():
    """Demonstrate real-world usage scenarios."""
    print("=" * 60)
    print("REAL-WORLD USAGE SCENARIOS")
    print("=" * 60)

    # Simulating a process with multiple steps
    info("Starting data processing pipeline")

    step("Loading configuration file")
    time.sleep(0.5)  # Simulate work
    ok("Configuration loaded successfully")

    step("Connecting to database")
    time.sleep(0.3)  # Simulate work
    ok("Database connection established")

    step("Processing 1000 records")
    time.sleep(0.7)  # Simulate work
    warn("Skipped 2 invalid records")
    ok("Processed 998/1000 records successfully")

    step("Generating report")
    time.sleep(0.4)  # Simulate work
    ok("Report generated successfully")

    info("Pipeline completed")
    dim("   Total time: 2.1 seconds")
    dim("   Records processed: 998")
    dim("   Records skipped: 2")

    print()


def demo_progress_tracking():
    """Demonstrate progress tracking with logging."""
    print("=" * 60)
    print("PROGRESS TRACKING EXAMPLE")
    print("=" * 60)

    total_items = 5
    info(f"Processing {total_items} items")

    for i in range(1, total_items + 1):
        step(f"Processing item {i}/{total_items}")
        time.sleep(0.2)  # Simulate work

        if i == 3:
            warn(f"Item {i} required additional validation")

        ok(f"Item {i} completed")
        dim(f"   Progress: {i}/{total_items} ({i / total_items * 100:.0f}%)")

    ok("All items processed successfully")

    print()


def demo_error_scenarios():
    """Demonstrate error reporting scenarios."""
    print("=" * 60)
    print("ERROR REPORTING SCENARIOS")
    print("=" * 60)

    info("Testing error handling scenarios")

    step("Attempting risky operation 1")
    warn("Operation completed with warnings")
    dim("   Warning: Deprecated API used")

    step("Attempting risky operation 2")
    err("Operation failed with error")
    dim("   Error: File not found: /path/to/missing/file.txt")
    dim("   Suggestion: Check file path and permissions")

    step("Attempting recovery")
    ok("Successfully recovered using fallback method")

    print()


def demo_formatting_combinations():
    """Demonstrate various formatting combinations."""
    print("=" * 60)
    print("ADVANCED FORMATTING COMBINATIONS")
    print("=" * 60)

    # Combining colors and styles
    print("Style combinations:")
    print(f"{_C.BOLD}{_C.RED}Bold Red Error{_C.RESET}")
    print(f"{_C.BOLD}{_C.GREEN}Bold Green Success{_C.RESET}")
    print(f"{_C.BOLD}{_C.YELLOW}Bold Yellow Warning{_C.RESET}")
    print(f"{_C.ITAL}{_C.BLUE}Italic Blue Info{_C.RESET}")
    print(f"{_C.DIM}{_C.GRAY}Dimmed Gray Details{_C.RESET}")

    print("\nNested formatting:")
    print(
        f"Regular text with {_C.BOLD}bold{_C.RESET} and {_C.ITAL}italic{_C.RESET} sections"
    )
    print(
        f"{_C.BLUE}Blue text with {_C.BOLD}bold section{_C.RESET}{_C.BLUE} continuing in blue{_C.RESET}"
    )

    print("\nStatus indicators:")
    print(f"[{_C.GREEN}✓{_C.RESET}] Success indicator")
    print(f"[{_C.YELLOW}!{_C.RESET}] Warning indicator")
    print(f"[{_C.RED}✗{_C.RESET}] Error indicator")
    print(f"[{_C.BLUE}i{_C.RESET}] Info indicator")

    print()


def demo_log_levels():
    """Demonstrate different log levels in action."""
    print("=" * 60)
    print("LOG LEVELS DEMONSTRATION")
    print("=" * 60)

    print("Simulating application startup:")
    info("Application starting up")
    step("Initializing modules")
    ok("Core modules loaded")
    step("Starting services")
    warn("Service A started with reduced performance mode")
    ok("Service B started normally")
    err("Service C failed to start")
    dim("   Fallback: Using Service D instead")
    ok("Service D started successfully")
    info("Application startup complete")

    print("\nSimulating application shutdown:")
    info("Shutting down application")
    step("Stopping services")
    ok("All services stopped cleanly")
    step("Cleaning up resources")
    ok("Resources cleaned up")
    info("Application shutdown complete")

    print()


def main():
    """Run all logging demonstrations."""
    print(f"{_C.BOLD}{_C.CYAN}ART Logging System Demo{_C.RESET}")
    print(f"Timestamp: {_ts()}")
    print()

    # Run all demonstrations
    demo_basic_logging()
    demo_color_codes()
    demo_timestamp()
    demo_real_world_usage()
    demo_progress_tracking()
    demo_error_scenarios()
    demo_formatting_combinations()
    demo_log_levels()

    # Final summary
    print("=" * 60)
    print("DEMO COMPLETE")
    print("=" * 60)
    ok("All logging functionality demonstrated successfully")
    info("Available functions: info(), step(), ok(), warn(), err(), dim()")
    info("Available constants: _C class with color codes, _ts() for timestamps")
    dim("   For more details, see: src/art/utils/logging.py")

    print(f"\n{_C.BOLD}Usage Examples:{_C.RESET}")
    print("from art.utils.logging import info, step, ok, warn, err, dim, _C")
    print("info('Starting process')")
    print("step('Processing data')")
    print("ok('Process completed')")
    print("warn('Performance degraded')")
    print("err('Operation failed')")
    print("dim('Additional details')")
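    # Plain string (not an f-string) so the usage line is printed literally
    # instead of interpolating the escape codes.
    print("print(f'{_C.BOLD}Bold text{_C.RESET}')")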


if __name__ == "__main__":
    main()


================================================
FILE: dev/math-vista/math-vista.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "46a6ad6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "96d51078",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%html\n",
    "<style>\n",
    ".cell-output-ipywidget-background {\n",
    "    background-color: transparent !important;\n",
    "}\n",
    ":root {\n",
    "    --jp-widgets-color: var(--vscode-editor-foreground);\n",
    "    --jp-widgets-font-size: var(--vscode-editor-font-size);\n",
    "}  \n",
    "</style>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7dd70e04",
   "metadata": {},
   "outputs": [],
   "source": [
    "import polars as pl\n",
    "\n",
    "splits = {\n",
    "    \"testmini\": \"data/testmini-00000-of-00001-725687bf7a18d64b.parquet\",\n",
    "    \"test\": \"data/test-*.parquet\",\n",
    "}\n",
    "df = pl.read_parquet(\"hf://datasets/AI4Math/MathVista/\" + splits[\"testmini\"]).sample(\n",
    "    fraction=1.0, shuffle=True, seed=42\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "81e02b97",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Iterator, TypedDict, cast\n",
    "\n",
    "\n",
    "class DecodedImage(TypedDict):\n",
    "    bytes: bytes\n",
    "\n",
    "\n",
    "class Scenario(TypedDict):\n",
    "    pid: int\n",
    "    question: str\n",
    "    answer: str\n",
    "    image: str\n",
    "    decoded_image: DecodedImage\n",
    "\n",
    "\n",
    "val_scenarios = cast(list[Scenario], df.head(64).to_dicts())\n",
    "train_scenarios_iter = cast(Iterator[Scenario], df.tail(-64).iter_rows(named=True))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9287d8fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "import art\n",
    "from art.local import LocalBackend\n",
    "\n",
    "model = art.TrainableModel(\n",
    "    name=\"002\",\n",
    "    project=\"math-vista\",\n",
    "    base_model=\"Qwen/Qwen2.5-VL-7B-Instruct\",\n",
    ")\n",
    "backend = LocalBackend()\n",
    "await model.register(backend)\n",
    "client = model.openai_client()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c92b4b11",
   "metadata": {},
   "outputs": [],
   "source": [
    "async def rollout(scenario: Scenario) -> art.Trajectory:\n",
    "    image_path = f\"/tmp/{scenario['image']}\"\n",
    "\n",
    "    import os\n",
    "\n",
    "    os.makedirs(os.path.dirname(image_path), exist_ok=True)\n",
    "\n",
    "    with open(image_path, \"wb\") as f:\n",
    "        f.write(scenario[\"decoded_image\"][\"bytes\"])\n",
    "\n",
    "    trajectory = art.Trajectory(messages_and_choices=[], reward=0.0)\n",
    "    trajectory.messages_and_choices = [\n",
    "        {\n",
    "            \"role\": \"user\",\n",
    "            \"content\": [\n",
    "                {\n",
    "                    \"type\": \"text\",\n",
    "                    \"text\": scenario[\"question\"]\n",
    "                    + \"\\n\\nNote: Provide your answer in a LaTeX box.\",\n",
    "                },\n",
    "                {\"type\": \"image_url\", \"image_url\": {\"url\": f\"file://{image_path}\"}},\n",
    "            ],\n",
    "        }\n",
    "    ]\n",
    "    chat_completion = await client.chat.completions.create(\n",
    "        model=model.get_inference_name(), messages=trajectory.messages()\n",
    "    )\n",
    "    choice = chat_completion.choices[0]\n",
    "    trajectory.messages_and_choices.append(choice)\n",
    "    content = choice.message.content\n",
    "    assert content is not None\n",
    "    if matches := list(re.finditer(r\"\\\\boxed\\{(.*?)\\}\", content, re.DOTALL)):\n",
    "        match = matches[-1]\n",
    "        answer = match.group(1)\n",
    "        if answer.lower() == scenario[\"answer\"].lower():\n",
    "            trajectory.reward = 1.0\n",
    "    return trajectory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "359e530d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import asyncio\n",
    "import itertools\n",
    "\n",
    "SCENARIOS_PER_STEP = 8\n",
    "TRAJECTORY_GROUP_SIZE = 8\n",
    "start = await model.get_step()\n",
    "train_scenarios_iter = itertools.cycle(train_scenarios_iter)\n",
    "for _ in range(start * SCENARIOS_PER_STEP):\n",
    "    next(train_scenarios_iter)\n",
    "\n",
    "for i in range(start, 1000):\n",
    "    train_scenarios = [next(train_scenarios_iter) for _ in range(SCENARIOS_PER_STEP)]\n",
    "    val_trajectories, train_trajectory_groups = await asyncio.gather(\n",
    "        art.gather_trajectories(\n",
    "            (rollout(scenario) for scenario in val_scenarios),\n",
    "            pbar_desc=\"gather(val)\",\n",
    "            max_exceptions=32,\n",
    "        ),\n",
    "        art.gather_trajectory_groups(\n",
    "            (\n",
    "                art.TrajectoryGroup(\n",
    "                    rollout(scenario) for _ in range(TRAJECTORY_GROUP_SIZE)\n",
    "                )\n",
    "                for scenario in train_scenarios\n",
    "            ),\n",
    "            pbar_desc=\"gather(train)\",\n",
    "            max_exceptions=32,\n",
    "        ),\n",
    "    )\n",
    "    await model.log(val_trajectories)\n",
    "    await model.train(train_trajectory_groups)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}


================================================
FILE: dev/math-vista/math-vista.py
================================================
import argparse
import asyncio
import itertools
import os
import re
from typing import Iterator, TypedDict, cast

import polars as pl

import art
from art.local import LocalBackend


class DecodedImage(TypedDict):
    bytes: bytes


class Scenario(TypedDict):
    pid: int
    question: str
    answer: str
    image: str
    decoded_image: DecodedImage
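

# Illustrative sketch, not part of the original script and not called by
# rollout() below. The regex used there, r"\\boxed\{(.*?)\}", stops at the first
# closing brace, so an answer such as \boxed{\frac{1}{2}} is captured as
# "\frac{1". The hypothetical helper `extract_last_boxed` shows one way to honor
# nested braces, assuming the last balanced \boxed{...} is the intended answer.
def extract_last_boxed(text: str) -> str | None:
    """Return the contents of the last balanced boxed expression, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    while start != -1:
        depth = 1  # the brace that is part of the marker is already open
        body_start = start + len(marker)
        for i in range(body_start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    return text[body_start:i]
        # This occurrence never closed; fall back to an earlier one.
        start = text.rfind(marker, 0, start)
    return None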


async def main(model_name: str, steps: int) -> None:
    # Load and shuffle the dataset
    df = pl.read_parquet(
        "hf://datasets/AI4Math/MathVista/data/testmini-00000-of-00001-725687bf7a18d64b.parquet"
    ).sample(fraction=1.0, shuffle=True, seed=42)

    val_scenarios = cast(list[Scenario], df.head(64).to_dicts())
    train_scenarios_iter = cast(Iterator[Scenario], df.tail(-64).iter_rows(named=True))

    # Initialize trainable model and backend
    model = art.TrainableModel(
        name=model_name,
        project="math-vista",
        base_model="Qwen/Qwen2.5-VL-7B-Instruct",
    )

    async def rollout(scenario: Scenario) -> art.Trajectory:
        image_path = f"/tmp/{scenario['image']}"
        os.makedirs(os.path.dirname(image_path), exist_ok=True)
        with open(image_path, "wb") as f:
            f.write(scenario["decoded_image"]["bytes"])

        trajectory = art.Trajectory(messages_and_choices=[], reward=0.0)
        trajectory.messages_and_choices = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": scenario["question"]
                        + "\n\nNote: Provide your answer in a LaTeX box.",
                    },
                    {"type": "image_url", "image_url": {"url": f"file://{image_path}"}},
                ],
            }
        ]

        chat_completion = await client.chat.completions.create(
            model=model.get_inference_name(), messages=trajectory.messages()
        )
        choice = chat_completion.choices[0]
        trajectory.messages_and_choices.append(choice)
        content = choice.message.content
        assert content is not None

        if matches := list(re.finditer(r"\\boxed\{(.*?)\}", content, re.DOTALL)):
            match = matches[-1]
            answer = match.group(1)
            if answer.lower() == scenario["answer"].lower():
                trajectory.reward = 1.0
        return trajectory

    SCENARIOS_PER_STEP = 8
    TRAJECTORY_GROUP_SIZE = 8

    with LocalBackend() as backend:
        await model.register(backend)
        client = model.openai_client()

        start = await model.get_step()
        train_scenarios_iter = itertools.cycle(train_scenarios_iter)
        for _ in range(start * SCENARIOS_PER_STEP):
            next(train_scenarios_iter)

        # Training loop
        for _ in range(start, steps):
            train_scenarios = [
                next(train_scenarios_iter) for _ in range(SCENARIOS_PER_STEP)
            ]
            val_trajectories, train_trajectory_groups = await asyncio.gather(
                art.gather_trajectories(
                    (rollout(scenario) for scenario in val_scenarios),
                    pbar_desc="gather(val)",
                    max_exceptions=32,
                ),
                art.gather_trajectory_groups(
                    (
                        art.TrajectoryGroup(
                            rollout(scenario) for _ in range(TRAJECTORY_GROUP_SIZE)
                        )
                        for scenario in train_scenarios
                    ),
                    pbar_desc="gather(train)",
                    max_exceptions=32,
                ),
            )
            await model.log(val_trajectories)
            await model.train(train_trajectory_groups)


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Minimal MathVista trainer script")
    parser.add_argument(
        "-n",
        "--name",
        required=True,
        help="Run/model name to use for the TrainableModel",
    )
    parser.add_argument(
        "-s",
        "--steps",
        type=int,
        default=1000,
        help="Number of training steps to run",
    )
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    asyncio.run(main(args.name, args.steps))


================================================
FILE: dev/new_models/benchmark_inference.py
================================================
"""
Benchmark inference performance for Qwen2.5-14B-Instruct using ART.

This script sends `concurrency` concurrent requests (800 as configured below),
each with a long repeated prompt of roughly 1000 input tokens and up to 1000
output tokens (max_tokens=1000), and repeats for `iterations` rounds (1 as
configured below). It measures per-request latencies and summarizes statistics.
"""

import asyncio
import statistics
import time

from dotenv import load_dotenv

import art
from art.local import LocalBackend

load_dotenv()


async def timed_request(client, model_name, prompt, max_tokens, temperature):
    """Execute a single model request and measure elapsed time and token usage."""
    start = time.perf_counter()
    response = await client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        temperature=temperature,
    )
    elapsed = time.perf_counter() - start
    print(elapsed)
    prompt_tokens = None
    completion_tokens = None
    if hasattr(response, "usage") and response.usage is not None:
        usage = response.usage
        prompt_tokens = getattr(usage, "prompt_tokens", None)
        completion_tokens = getattr(usage, "completion_tokens", None)
    return {
        "response": response,
        "elapsed": elapsed,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
    }
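

# Illustrative sketch, not part of the original script and not called by main()
# below. Mean and standard deviation can hide tail latency at high concurrency,
# so the hypothetical helper `summarize_latencies` reports nearest-rank
# percentiles over the per-request durations collected by timed_request().
def summarize_latencies(latencies: list[float]) -> dict[str, float]:
    """Return p50/p95/p99/max of the given latencies (empty dict if none)."""
    if not latencies:
        return {}
    ordered = sorted(latencies)
    n = len(ordered)

    def percentile(p: int) -> float:
        # Nearest-rank method: rank = ceil(p * n / 100), computed with integers.
        rank = max(1, -(-p * n // 100))
        return ordered[rank - 1]

    return {
        "p50": percentile(50),
        "p95": percentile(95),
        "p99": percentile(99),
        "max": ordered[-1],
    }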


async def main():
    # Define the prompt (roughly 1000 input tokens) and the model
    prompt = (
        "Hello world. " * 500
    ).strip() + " Please repeat the entire prompt back to me verbatim"
    # Output tokens to request
    max_tokens = 1000
    temperature = 1.0
    model = art.TrainableModel(
        name="benchmark-qwen2.5-14b-instruct",
        project="benchmark-vllm",
        base_model="Qwen/Qwen2.5-14B-Instruct",
    )
    backend = LocalBackend()
    await model.register(backend)

    # Prepare for inference
    client = model.openai_client()
    iterations = 1
    concurrency = 800
    # Track iteration-level durations
    durations = []
    # Track per-request timings and token usage
    per_request_durations = []
    per_request_prompt_tokens = []
    per_request_completion_tokens = []

    for i in range(1, iterations + 1):
        print(
            f"Iteration {i}/{iterations}: sending {concurrency} concurrent requests..."
        )
        iteration_start = time.perf_counter()
        # launch concurrent requests and time each individually
        tasks = [
            timed_request(
                client,
                model.get_inference_name(),
                prompt,
                max_tokens,
                temperature,
            )
            for _ in range(concurrency)
        ]
        # Wait for all responses
        results = await asyncio.gather(*tasks)
        # Record iteration duration
        iteration_elapsed = time.perf_counter() - iteration_start
        durations.append(iteration_elapsed)
        print(f"  Iteration time: {iteration_elapsed:.2f} seconds")
        # Record per-request stats
        for res in results:
            per_request_durations.append(res["elapsed"])
            if res["prompt_tokens"] is not None:
                per_request_prompt_tokens.append(res["prompt_tokens"])
            if res["completion_tokens"] is not None:
                per_request_completion_tokens.append(res["completion_tokens"])

    # Compute statistics
    total_time = sum(durations)
    min_time = min(durations)
    max_time = max(durations)
    avg_time = statistics.mean(durations)
    std_time = statistics.stdev(durations) if len(durations) > 1 else 0.0
    total_requests = iterations * concurrency
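    # Amortized wall-clock time per request across all iterations. This is a
    # throughput-style figure, not a latency; see the per-request stats below.
    avg_per_request = total_time / total_requests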
    # Compute per-request statistics
    pr_min = min(per_request_durations) if per_request_durations else 0.0
    pr_max = max(per_request_durations) if per_request_durations else 0.0
    pr_avg = statistics.mean(per_request_durations) if per_request_durations else 0.0
    pr_std = (
        statistics.stdev(per_request_durations)
        if len(per_request_durations) > 1
        else 0.0
    )
    avg_prompt_tokens = (
        statistics.mean(per_request_prompt_tokens)
        if per_request_prompt_tokens
        else None
    )
    avg_completion_tokens = (
        statistics.mean(per_request_completion_tokens)
        if per_request_completion_tokens
        else None
    )

    # Report results
    print("\nInference benchmark results:")
    print(f"  Iterations:    {iterations}")
    print(f"  Concurrency:   {concurrency}")
    print(f"  Total time:    {total_time:.2f} s")
    print(f"  Min iteration: {min_time:.2f} s")
    print(f"  Max iteration: {max_time:.2f} s")
    print(f"  Avg iteration: {avg_time:.2f} s")
    print(f"  Std dev iter:  {std_time:.2f} s")
    print(f"  Avg per req:   {avg_per_request:.2f} s/request")
    # Per-request latency statistics
    print(f"  Min request time: {pr_min:.2f} s")
    print(f"  Max request time: {pr_max:.2f} s")
    print(f"  Avg request time: {pr_avg:.2f} s")
    print(f"  Std dev request time: {pr_std:.2f} s")
    if avg_prompt_tokens is not None:
        print(f"  Avg prompt tokens: {avg_prompt_tokens:.2f}")
    if avg_completion_tokens is not None:
        print(f"  Avg completion tokens: {avg_completion_tokens:.2f}")


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/new_models/gemma3.py
================================================
import asyncio
import json

from dotenv import load_dotenv

import art
from art.local import LocalBackend

load_dotenv()
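

# Illustrative sketch, not part of the original script and not called by
# rollout() below. It restates the same exact-match reward scheme as a lookup
# table; the names `REWARD_BY_ANSWER` and `reward_for` are hypothetical.
REWARD_BY_ANSWER: dict[str, float] = {"yes": 0.5, "no": 0.75, "maybe": 1.0}


def reward_for(content: str) -> float:
    """Return the reward for an exact answer string, or 0.0 for anything else."""
    # Matching is exact and case-sensitive, mirroring the if/elif chain below.
    return REWARD_BY_ANSWER.get(content, 0.0)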


async def rollout(model: art.TrainableModel, prompt: str) -> art.Trajectory:
    messages: art.Messages = [
        {
            "role": "user",
            "content": prompt,
        }
    ]
    client = model.openai_client()
    chat_completion = await client.chat.completions.create(
        messages=messages,
        model=model.get_inference_name(),
        max_tokens=100,
        timeout=100,
    )
    choice = chat_completion.choices[0]
    content = choice.message.content
    assert isinstance(content, str)
    if content == "yes":
        reward = 0.5
    elif content == "no":
        reward = 0.75
    elif content == "maybe":
        reward = 1.0
    else:
        reward = 0.0
    return art.Trajectory(messages_and_choices=[*messages, choice], reward=reward)


async def main():
    with open("dev/new_models/prompts.json", "r") as f:
        prompts = json.load(f)
    print(prompts)

    backend = LocalBackend()
    model = art.TrainableModel(
        name="001-gemma3",
        project="yes-no-maybe-s",
        base_model="google/gemma-3-4b-it",
        _internal_config={
            "init_args": {
                "enable_prefix_caching": False,
            },
        },
    )
    await model.register(backend)
    for _ in range(await model.get_step(), 1_000):
        train_groups = await art.gather_trajectory_groups(
            (
                art.TrajectoryGroup(rollout(model, prompt) for _ in range(32))
                for prompt in prompts
            ),
            pbar_desc="gather",
        )
        await model.train(
            train_groups,
            config=art.TrainConfig(learning_rate=1e-4),
        )


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/new_models/prompts.json
================================================
["respond with 'yes', 'no', 'maybe'", "respond with 'maybe', 'yes', 'no'", "respond with 'no', 'yes', 'maybe'", "respond with 'yes', 'maybe', 'no'", "respond with yes or no", "respond with maybe or no", "respond with no or maybe", "respond with no or yes", "respond with yes or no", "respond with yes, no, maybe", "respond with maybe, yes, no", "respond with no, yes, maybe", "respond with yes, maybe, no", "respond with yes or no", "respond with maybe or no", "respond with no or maybe", "respond with no or yes", "respond with yes or no", "just respond with 'yes', 'no', 'maybe'", "just respond with 'maybe', 'yes', 'no'", "just respond with 'no', 'yes', 'maybe'", "just respond with 'yes', 'maybe', 'no'", "just respond with yes or no", "just respond with maybe or no", "just respond with no or maybe", "just respond with no or yes", "just respond with yes or no", "just respond with yes, no, maybe", "just respond with maybe, yes, no", "just respond with no, yes, maybe", "just respond with yes, maybe, no", "just respond with yes or no", "just respond with maybe or no", "just respond with no or maybe", "just respond with no or yes", "just respond with yes or no"]

================================================
FILE: dev/new_models/qwen3_try.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "import art\n",
    "from art.local import LocalBackend\n",
    "\n",
    "load_dotenv()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "async def rollout(model: art.TrainableModel, prompt: str) -> art.Trajectory:\n",
    "    messages: art.Messages = [\n",
    "        {\n",
    "            \"role\": \"user\",\n",
    "            \"content\": prompt,\n",
    "        }\n",
    "    ]\n",
    "    client = model.openai_client()\n",
    "    chat_completion = await client.chat.completions.create(\n",
    "        messages=messages,\n",
    "        model=model.get_inference_name(),\n",
    "        max_tokens=100,\n",
    "        timeout=100,\n",
    "        extra_body={\"chat_template_kwargs\": {\"enable_thinking\": False}},\n",
    "    )\n",
    "    choice = chat_completion.choices[0]\n",
    "    content = choice.message.content\n",
    "    print(content)\n",
    "    assert isinstance(content, str)\n",
    "    if content == \"yes\":\n",
    "        reward = 0.5\n",
    "    elif content == \"no\":\n",
    "        reward = 0.75\n",
    "    elif content == \"maybe\":\n",
    "        reward = 1.0\n",
    "    else:\n",
    "        reward = 0.0\n",
    "    return art.Trajectory(messages_and_choices=[*messages, choice], reward=reward)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(\"prompts.json\", \"r\") as f:\n",
    "    prompts = json.load(f)\n",
    "print(prompts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "backend = LocalBackend()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qwen2 = art.TrainableModel(\n",
    "    name=\"004\",\n",
    "    project=\"yes-no-maybe-s\",\n",
    "    base_model=\"Qwen/Qwen2.5-0.5B-Instruct\",\n",
    "    # base_model=\"Qwen/Qwen2.5-0.5B-Instruct\",\n",
    ")\n",
    "await qwen2.register(backend)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "await rollout(qwen2, prompts[4])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "qwen3 = art.TrainableModel(\n",
    "    name=\"005\",\n",
    "    project=\"yes-no-maybe-s\",\n",
    "    base_model=\"Qwen/Qwen3-0.6B\",\n",
    "    # base_model=\"Qwen/Qwen2.5-0.5B-Instruct\",\n",
    ")\n",
    "await qwen3.register(backend)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "await rollout(qwen3, prompts[4])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for _ in range(await qwen3.get_step(), 1_000):\n",
    "    train_groups = await art.gather_trajectory_groups(\n",
    "        (\n",
    "            art.TrajectoryGroup(rollout(qwen3, prompt) for _ in range(32))\n",
    "            for prompt in prompts\n",
    "        ),\n",
    "        pbar_desc=\"gather\",\n",
    "    )\n",
    "    await qwen3.train(\n",
    "        train_groups,\n",
    "        config=art.TrainConfig(learning_rate=1e-4),\n",
    "    )"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


================================================
FILE: dev/new_models/qwen3_try.py
================================================
import asyncio
import json

from dotenv import load_dotenv

import art
from art.local import LocalBackend

load_dotenv()


async def rollout(model: art.TrainableModel, prompt: str) -> art.Trajectory:
    """Sample one reply to `prompt` and score it ("maybe" > "no" > "yes" > other)."""
    messages: art.Messages = [
        {
            "role": "user",
            "content": prompt,
        }
    ]
    client = model.openai_client()
    chat_completion = await client.chat.completions.create(
        messages=messages,
        model=model.get_inference_name(),
        max_tokens=100,
        timeout=100,
        # Turn off Qwen3's thinking mode so the reply contains only the answer.
        extra_body={"chat_template_kwargs": {"enable_thinking": False}},
    )
    choice = chat_completion.choices[0]
    content = choice.message.content
    assert isinstance(content, str)
    # Exact-match reward: only the bare strings "yes", "no", and "maybe" score.
    if content == "yes":
        reward = 0.5
    elif content == "no":
        reward = 0.75
    elif content == "maybe":
        reward = 1.0
    else:
        reward = 0.0
    return art.Trajectory(messages_and_choices=[*messages, choice], reward=reward)


async def main():
    with open("dev/new_models/prompts.json", "r") as f:
        prompts = json.load(f)
    print(prompts)

    backend = LocalBackend()
    model = art.TrainableModel(
        name="007",
        project="yes-no-maybe-s",
        base_model="Qwen/Qwen3-0.6B",
        # base_model="Qwen/Qwen2.5-0.5B-Instruct",
    )
    await model.register(backend)
    for _ in range(await model.get_step(), 1_000):
        train_groups = await art.gather_trajectory_groups(
            (
                art.TrajectoryGroup(rollout(model, prompt) for _ in range(32))
                for prompt in prompts
            ),
            pbar_desc="gather",
        )
        await model.train(
            train_groups,
            config=art.TrainConfig(learning_rate=1e-4),
        )


if __name__ == "__main__":
    asyncio.run(main())
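

The reward rule inside `rollout` above can be isolated as a pure function so it is testable without a backend. This is a minimal sketch, not part of the repository: the name `score_reply` is hypothetical, and the normalization (case, surrounding whitespace, trailing period) is an added assumption, since the original script uses exact string matches.

```python
def score_reply(content: str) -> float:
    """Map a model reply to the yes/no/maybe reward used in the script above."""
    rewards = {"yes": 0.5, "no": 0.75, "maybe": 1.0}
    # Assumption (not in the original): ignore case, surrounding whitespace,
    # and a trailing period before looking up the reward.
    normalized = content.strip().lower().rstrip(".")
    # Any other reply earns no reward, as in the original else-branch.
    return rewards.get(normalized, 0.0)


if __name__ == "__main__":
    for reply in ["maybe", " Yes.", "no", "I think so"]:
        print(repr(reply), score_reply(reply))
```

Keeping the lookup table in one place makes it easy to check that the ordering "maybe" > "no" > "yes" holds when the values are tuned.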


================================================
FILE: dev/profile.ipynb
================================================
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from art.dev.get_model_config import get_model_config\n",
    "from art.local.state import ModelState\n",
    "\n",
    "config = get_model_config(\n",
    "    \"Qwen/Qwen2.5-72B-Instruct\",\n",
    "    output_dir=\"./.art/models/test\",\n",
    "    config={\n",
    "        \"engine_args\": {\n",
    "            \"enable_sleep_mode\": True,\n",
    "            \"enforce_eager\": True,\n",
    "            \"gpu_memory_utilization\": 0.9,\n",
    "        },\n",
    "        \"peft_args\": {\n",
    "            # \"use_gradient_checkpointing\": False,\n",
    "        },\n",
    "    },\n",
    ")\n",
    "state = ModelState(config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import asyncio\n",
    "import random\n",
    "\n",
    "import torch\n",
    "import vllm\n",
    "\n",
    "from art.local.vllm import set_vllm_log_file\n",
    "\n",
    "set_vllm_log_file(\"./vllm.log\")\n",
    "\n",
    "num_tokens = 16384\n",
    "\n",
    "\n",
    "async def warmup(request_id: str) -> None:\n",
    "    max_tokens = random.randint(0, num_tokens * 2)\n",
    "    async for _ in state.vllm.async_engine.generate(\n",
    "        prompt={\n",
    "            \"prompt_token_ids\": torch.randint(\n",
    "                0,\n",
    "                int(state.tokenizer.vocab_size),  # type: ignore\n",
    "                (\n",
    "                    max(\n",
    "                        (\n",
    "                            (\n",
    "                                state.vllm.async_engine.engine.cache_config.num_gpu_blocks  # type: ignore\n",
    "                                * state.vllm.async_engine.engine.cache_config.block_size\n",
    "                            )\n",
    "                            // state.vllm.async_engine.engine.scheduler_config.max_num_seqs\n",
    "                        )\n",
    "                        - 16\n",
    "                        - max_tokens,\n",
    "                        1,\n",
    "                    ),\n",
    "                ),\n",
    "            ).tolist(),\n",
    "        },\n",
    "        sampling_params=vllm.SamplingParams(max_tokens=max_tokens),\n",
    "        request_id=request_id,\n",
    "    ):\n",
    "        pass\n",
    "\n",
    "\n",
    "warmup_future = asyncio.gather(\n",
    "    *(\n",
    "        warmup(f\"{i}\")\n",
    "        for i in range(state.vllm.async_engine.engine.scheduler_config.max_num_seqs)\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "warmup_future.cancel()\n",
    "await asyncio.sleep(0.01)\n",
    "try:\n",
    "    warmup_future.result()\n",
    "except asyncio.CancelledError:\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import asyncio\n",
    "\n",
    "from art.local.train import train\n",
    "\n",
    "results_queue = asyncio.Queue()\n",
    "train_task = asyncio.create_task(train(state.trainer, results_queue))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "from typing import cast\n",
    "\n",
    "from art.local.service import TrainInputs\n",
    "from art.types import TrainConfig\n",
    "\n",
    "seq_len = 32768\n",
    "batch_size = 1\n",
    "shape = (batch_size, seq_len)\n",
    "num_steps = 1\n",
    "async with state.vllm.train_mode():\n",
    "    start_time = time.time()\n",
    "    for _ in range(num_steps):\n",
    "        state.inputs_queue.put_nowait(\n",
    "            TrainInputs(\n",
    "                tokens=torch.randint(0, cast(int, state.tokenizer.vocab_size), shape),\n",
    "                group_ids=torch.randint(0, 10, shape),\n",
    "                parent_ids=torch.randint(0, 10, shape),\n",
    "                input_pos=torch.tensor([list(range(shape[1]))]),\n",
    "                assistant_mask=torch.ones(shape, dtype=torch.bool),\n",
    "                logprobs=torch.zeros(shape),\n",
    "                advantages=torch.zeros(shape),\n",
    "                weights=torch.ones(shape),\n",
    "                config=TrainConfig(lr=1e-7, kl_coef=0.01),\n",
    "            )\n",
    "        )\n",
    "        done, _ = await asyncio.wait(\n",
    "            [asyncio.create_task(results_queue.get()), train_task],\n",
    "            return_when=asyncio.FIRST_COMPLETED,\n",
    "        )\n",
    "        for task in done:\n",
    "            result = task.result()\n",
    "            # If `result` is `None`, the training task finished somehow.\n",
    "            assert result is not None, \"The training task should never finish.\"\n",
    "            results_queue.task_done()\n",
    "            display(result)\n",
    "    total_tokens = num_steps * batch_size * seq_len\n",
    "    elapsed_time = time.time() - start_time\n",
    "    tokens_per_second = total_tokens / elapsed_time\n",
    "    print(f\"Tokens per second: {tokens_per_second:.2f} tokens/s\")\n",
    "    print(f\"Total time: {elapsed_time:.2f} seconds\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
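

The arithmetic in the profiling notebook above (the warmup prompt length and the throughput figure) can be written as two small functions. This is a sketch under stated assumptions: the function names and the `margin` parameter are hypothetical, and the numeric inputs in the usage lines are made-up values, not measurements.

```python
def warmup_prompt_length(
    num_gpu_blocks: int,
    block_size: int,
    max_num_seqs: int,
    max_tokens: int,
    margin: int = 16,
) -> int:
    """Prompt length that leaves room for `max_tokens` generated tokens.

    Mirrors the notebook expression: the KV-cache capacity in tokens is split
    evenly across the concurrent sequences, then a fixed margin and the
    generation budget are subtracted, with a floor of one token.
    """
    per_sequence_budget = (num_gpu_blocks * block_size) // max_num_seqs
    return max(per_sequence_budget - margin - max_tokens, 1)


def tokens_per_second(
    num_steps: int, batch_size: int, seq_len: int, elapsed_seconds: float
) -> float:
    """Training throughput as computed at the end of the notebook."""
    return num_steps * batch_size * seq_len / elapsed_seconds


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formulas.
    print(warmup_prompt_length(1000, 16, 10, 500))
    print(tokens_per_second(1, 1, 32768, 2.0))
```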


================================================
FILE: dev/run_yes_no_maybe_kl_advantage.py
================================================
"""Launch yes-no-maybe-kl-advantage training on SkyPilot (Kubernetes).

Usage:
    uv run dev/run_yes_no_maybe_kl_advantage.py
    uv run dev/run_yes_no_maybe_kl_advantage.py --fast
    uv run dev/run_yes_no_maybe_kl_advantage.py --base-model Qwen/Qwen2.5-7B-Instruct
"""

import argparse
import os
import textwrap

from dotenv import load_dotenv
import sky
from sky import ClusterStatus

load_dotenv()

parser = argparse.ArgumentParser(
    description="Launch yes-no-maybe KL advantage training on SkyPilot."
)
parser.add_argument(
    "--fast", action="store_true", help="Skip setup (for re-runs on existing cluster)."
)
parser.add_argument(
    "--base-model", type=str, default="meta-llama/Meta-Llama-3.1-8B-Instruct"
)
parser.add_argument("--num-steps", type=int, default=20)
parser.add_argument("--kl-penalty-coef", type=float, default=0.1)
parser.add_argument("--accelerator", type=str, default="H200:1")
parser.add_argument("--cluster-name", type=str, default=None)
parser.add_argument(
    "--kl-ref-step",
    type=int,
    default=None,
    help="Checkpoint step of training model to use as KL reference",
)
parser.add_argument(
    "--kl-ref-adapter-path",
    type=str,
    default=None,
    help="Path to LoRA adapter checkpoint to use as KL reference",
)
args = parser.parse_args()

cluster_name = args.cluster_name or f"ynm-kl-{args.kl_penalty_coef}"
cluster_prefix = os.environ.get("CLUSTER_PREFIX")
if cluster_prefix:
    cluster_name = f"{cluster_prefix}-{cluster_name}"

setup_script = textwrap.dedent("""\
    echo 'Setting up environment...'
    apt install -y nvtop
    curl -LsSf https://astral.sh/uv/install.sh | sh
    source $HOME/.local/bin/env
""")

kl_ref_env = ""
if args.kl_ref_step is not None:
    kl_ref_env = f"KL_REF_STEP={args.kl_ref_step} "
elif args.kl_ref_adapter_path is not None:
    kl_ref_env = f"KL_REF_ADAPTER_PATH={args.kl_ref_adapter_path} "

run_script = textwrap.dedent(f"""\
    source $HOME/.local/bin/env
    cd ~/sky_workdir
    {kl_ref_env}BASE_MODEL={args.base_model} NUM_STEPS={args.num_steps} KL_PENALTY_COEF={args.kl_penalty_coef} uv run --python 3.11 --extra backend dev/yes-no-maybe-kl-advantage.py
""")

task = sky.Task(
    name="yes-no-maybe-kl-advantage",
    setup=setup_script,
    run=run_script,
    workdir=".",
)
task.set_resources(
    sky.Resources(accelerators=args.accelerator, cloud=sky.clouds.Kubernetes())
)
task.set_file_mounts(
    {
        "~/sky_workdir/.env": ".env",
    }
)

print(f"Launching on cluster: {cluster_name}")
print(f"  base_model: {args.base_model}")
print(f"  accelerator: {args.accelerator}")
print(f"  num_steps: {args.num_steps}")
print(f"  kl_penalty_coef: {args.kl_penalty_coef}")
if args.kl_ref_step is not None:
    print(f"  kl_ref_step: {args.kl_ref_step}")
elif args.kl_ref_adapter_path is not None:
    # Matches the env-var logic above: the adapter path is only used when no step is set.
    print(f"  kl_ref_adapter_path: {args.kl_ref_adapter_path}")

# Cancel any existing jobs on this cluster
cluster_status = sky.stream_and_get(sky.status(cluster_names=[cluster_name]))
if len(cluster_status) > 0 and cluster_status[0]["status"] == ClusterStatus.UP:
    print(f"Cluster {cluster_name} is UP. Canceling any active jobs...")
    sky.stream_and_get(sky.cancel(cluster_name, all=True))

job_id, _ = sky.stream_and_get(
    sky.launch(
        task,
        cluster_name=cluster_name,
        retry_until_up=True,
        idle_minutes_to_autostop=60,
        down=True,
        fast=args.fast,
    )
)

print(f"Job submitted (ID: {job_id}). Streaming logs...")
exit_code = sky.tail_logs(cluster_name=cluster_name, job_id=job_id, follow=True)
print(f"Job {job_id} finished with exit code {exit_code}.")


================================================
FILE: dev/sft/dataset.jsonl
================================================
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a medical_record. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"**Confidential Medical Record**\n\n**Patient Information:**\n\n- **Name:** Victor Roberts\n- **Date of Birth:** 1982-04-07\n- **Age:** 28\n- **Gender:** Male\n- **Personal ID:** 260-56-6683\n\n---\n\n**Medical History Overview:**\n\n**Current Medical Condition:**\n- **Diagnosis:** Aneurysm  \n- **Date of Diagnosis:** 2010-09-15  \n- **Presenting Symptoms:** Severe headaches, vision impairment, dizziness.\n\n**Treatment Plan:**\n- **Medications Prescribed:**\n  - Metoprolol 50 mg, once daily\n  - Aspirin 81 mg, once daily\n- **Scheduled Follow-ups:** Every 3 months with Neurology Department\n\n**Previous Medical History:**\n- **2001:** Tonsillectomy\n- **1996:** Admitted for severe bout of influenza\n- **Allergies:** Penicillin\n\n---\n\n**Lifestyle and Social History:**\n\n- **Occupation:** Structural Engineer\n- **Smoking Status:** Non-smoker\n- **Alcohol Consumption:** Occasional, primarily socially\n- **Exercise Habits:** Engages in light exercises twice a week\n\n---\n\n**Family Medical History:**\n\n- **Father:** Hypertension\n- **Mother:** No known medical conditions\n- **Siblings:** One brother, no known medical conditions\n\n---\n\n**Emergency Contact:**\n\n- **Name:** Linda Roberts\n- **Relationship:** Sister\n- **Phone Number:** [Redacted]\n\n---\n\n**Consultations:**\n\n**Neurologist:** Dr. Emily Chen  \n**Next Appointment:** 2023-11-12 at 10:00 AM\n\n**Cardiologist:** Dr. Raj Singh  \n**Next Appointment:** 2023-11-20 at 2:30 PM\n\n**Notes:**\n- Continue to monitor blood pressure closely.\n- MRI scan of the brain scheduled for 2023-12-05 to evaluate treatment effectiveness.\n\n---\n\n**Doctor's Notes:**\n\n- **Reviewed by:** Dr. Natalie Brooks, MD\n- **Date of Review:** 2023-10-19\n\n**Remarks:**\n- Patient remains attentive to prescribed treatment and lifestyle recommendations. \n- Mr. 
Roberts reports no new symptoms or concerns since the last visit.\n- Encourage continuation of cardiovascular adherence and stress-reducing activities.\n\n**End of Record**"},{"content":"{\"fields_to_redact\":[{\"string\":\"Victor Roberts\",\"pii_type\":\"person_name\"},{\"string\":\"1982-04-07\",\"pii_type\":\"date_of_birth\"},{\"string\":\"28\",\"pii_type\":\"age\"},{\"string\":\"Male\",\"pii_type\":\"gender\"},{\"string\":\"260-56-6683\",\"pii_type\":\"personal_id\"},{\"string\":\"Aneurysm\",\"pii_type\":\"medical_condition\"},{\"string\":\"2010-09-15\",\"pii_type\":\"date\"},{\"string\":\"Penicillin\",\"pii_type\":\"medical_condition\"},{\"string\":\"Linda Roberts\",\"pii_type\":\"person_name\"},{\"string\":\"Dr. Emily Chen\",\"pii_type\":\"person_name\"},{\"string\":\"2023-11-12\",\"pii_type\":\"date\"},{\"string\":\"Dr. Raj Singh\",\"pii_type\":\"person_name\"},{\"string\":\"2023-11-20\",\"pii_type\":\"date\"},{\"string\":\"2023-12-05\",\"pii_type\":\"date\"},{\"string\":\"Dr. Natalie Brooks, MD\",\"pii_type\":\"person_name\"},{\"string\":\"2023-10-19\",\"pii_type\":\"date\"},{\"string\":\"Hypertension\",\"pii_type\":\"medical_condition\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a company_memo. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"---\n\n**Memo**\n\n**To**: All Employees  \n**From**: Management   \n**Date**: January 11, 2011  \n**Subject**: Update on Internal Policies and Procedures  \n\nDear Team,\n\nAs we strive towards continuous excellence at Clay Group, we are implementing some updates to our internal policies and procedures, effective immediately. Please take a moment to review these changes and how they may impact your daily operations.\n\n**1. Emergency Protocol**  \nThe safety and well-being of our employees are a top priority. We have updated our emergency contact procedures. In case of any urgent situations, please contact the main office line at 001-835-693-0988 with the following extension: 8706. Make sure this number is readily accessible.\n\n**2. Cybersecurity Policy**  \nWith the recent rise in cybersecurity threats, it is imperative to maintain a robust security posture. Starting next month, all team members will be required to undergo additional training on the secure handling of sensitive information. Details will be communicated in due course.\n\n**3. Remote Working Policies**  \nGiven the positive feedback from our previous trials, we are pleased to announce the continuation of our flexible working arrangements. Employees can choose to work remotely up to two days a week, pending manager approval and adherence to productivity benchmarks.\n\n**4. Internal Communications**  \nTo enhance transparency and communication within the organization, we are introducing a new company-wide digital bulletin board. This platform will be used for announcements, team achievements, and events. More information on accessing the bulletin board will follow.\n\nYour cooperation and dedication are vital as we work together to maintain Clay Group's reputation as a leader in innovation and employee satisfaction. 
If you have any questions regarding these updates, do not hesitate to contact your department head.\n\nThank you for your continued hard work and commitment.\n\nBest regards,\n\n[Signature, if applicable]\n\nManagement  \nClay Group\n\n---"},{"content":"{\"fields_to_redact\":[{\"string\":\"January 11, 2011\",\"pii_type\":\"date\"},{\"string\":\"Clay Group\",\"pii_type\":\"organization_name\"},{\"string\":\"001-835-693-0988\",\"pii_type\":\"phone_number\"},{\"string\":\"8706\",\"pii_type\":\"other_id\"},{\"string\":\"Clay Group\",\"pii_type\":\"organization_name\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a company_memo. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"---\n\n**MEMORANDUM**\n\n**TO:** All Staff\n\n**FROM:** Amy Singleton, Director of Human Resources\n\n**DATE:** January 29, 1993\n\n**SUBJECT:** Exciting New Developments at O'Connor-Thomas!\n\nDear Team,\n\nI am thrilled to reach out to each of you on behalf of O'Connor-Thomas, an organization that prides itself on innovation and excellence. As we continue our journey of growth and success, I would like to share some exciting updates and opportunities that await us.\n\n**Innovation and Sustainability Initiatives**\n\nFirst, I am pleased to announce that we will be launching a new series of sustainability initiatives designed to reduce our environmental footprint. As a sector leader, O'Connor-Thomas is committed to responsible innovation, and we are adamant about making our operations greener. More details will be provided in the upcoming weeks.\n\n**Employee Development Workshops**\n\nWe understand the importance of continuous learning, which is why we are introducing a series of workshops focused on skill enhancement across various departments. These workshops aim to augment your technical and professional skills, ensuring our team remains at the forefront of industry trends. Participation details will soon follow.\n\n**Building a Stronger Community**\n\nOur company believes in the power of community. This year, we will increase our involvement with local charities and community events. Volunteering tells a great story of who we are as a company and individuals. Let's make a positive impact together!\n\nPlease feel free to reach out to me if you have any questions or suggestions regarding our upcoming projects. Feedback is always appreciated as it helps us build a better and more inclusive workplace. 
Together, we can achieve more!\n\nLooking forward to all the great things 1993 will bring for O'Connor-Thomas and each of us individually.\n\nWarm regards,\n\n**Amy Singleton**  \nDirector of Human Resources  \nO’Connor-Thomas \n\n---\n\nRemember, keep pushing boundaries and stay inspired!\n\n---"},{"content":"{\"fields_to_redact\":[{\"string\":\"January 29, 1993\",\"pii_type\":\"date\"},{\"string\":\"O'Connor-Thomas\",\"pii_type\":\"organization_name\"},{\"string\":\"Amy Singleton\",\"pii_type\":\"person_name\"},{\"string\":\"Amy Singleton\",\"pii_type\":\"person_name\"},{\"string\":\"O’Connor-Thomas\",\"pii_type\":\"organization_name\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a medical_record. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"```plaintext\n                                                      CONFIDENTIAL MEDICAL RECORD\n\nPatient Name: Brenda Davila\nAge: 58\nGender: Male\nDate of Birth: 25th December 2022    Identifier Number: ZZ025922T\n\n--------------------------------------------------------------------------\nADDRESS: \n529 Tina Villages Suite 522\nLake Robert, CO 31592\n\nPHONE: +1-555-834-9771\n\n--------------------------------------------------------------------------\nMEDICAL DIAGNOSIS:\n\nCondition: Lead Poisoning\n\nBackground: \nLead poisoning is a type of metal poisoning caused by increased levels of the heavy metal lead in the body. Exposure to lead over the long term can cause harmful effects on the brain, liver, kidney, and reproductive systems, among others.\n\nPresenting Symptoms: \n- Abdominal Pain\n- Fatigue\n- Irritability\n- Memory Loss\n\nCurrent Treatment Plan:\n- Chelation Therapy: Succimer 10 mg/kg orally every 8 hours for 5 days\n- Regular Monitoring of Blood Lead Levels (BLL)\n- Dietary Adjustments: Ensure high intake of calcium, iron, and vitamin C\n- Environmental modifications to reduce lead exposure\n\nNext Review Appointment: 15th November 2023\n\n--------------------------------------------------------------------------\nNotes:\n- Due to age discrepancy, additional verification documents for identity and medical history may be required.\n- Careful consideration of residential environment as potential source of lead.\n- Emergency Contact details were not provided.\n\n---------------------------------------------------------------------------\n\nDoctor: Dr. Anthony Williamson \nLicense No: MD335912\nContact: a.williamson@lakerobertclinic.com \n\nThis document is intended for the sole use of the individual to whom it is addressed and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. 
If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.\n```"},{"content":"{\"fields_to_redact\":[{\"string\":\"Brenda Davila\",\"pii_type\":\"person_name\"},{\"string\":\"58\",\"pii_type\":\"age\"},{\"string\":\"Male\",\"pii_type\":\"gender\"},{\"string\":\"25th December 2022\",\"pii_type\":\"date_of_birth\"},{\"string\":\"ZZ025922T\",\"pii_type\":\"personal_id\"},{\"string\":\"529 Tina Villages Suite 522\\nLake Robert, CO 31592\",\"pii_type\":\"street_address\"},{\"string\":\"+1-555-834-9771\",\"pii_type\":\"phone_number\"},{\"string\":\"Lead Poisoning\",\"pii_type\":\"medical_condition\"},{\"string\":\"15th November 2023\",\"pii_type\":\"date\"},{\"string\":\"Dr. Anthony Williamson\",\"pii_type\":\"person_name\"},{\"string\":\"a.williamson@lakerobertclinic.com\",\"pii_type\":\"email_address\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a personal_email. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. \n          "},{"role":"user","content":"Subject: Long Time No See!\n\nHi Margaud-Anaïs,\n\nI hope this email finds you well. It’s been ages since we last caught up, and I've missed our conversations. 
I remember the last time we chatted, you mentioned that you were working on an exciting project. How's that going? \n\nBy the way, I was reminiscing about that hilarious incident during our holiday trip back in 2003. Unbelievably, January 2nd, 2003, will always be etched in my memory as the day we nearly missed that flight because we were too engrossed in spotting cute street cats. Time really flies!\n\nAnyway, I'm reaching out because I’m planning a small get-together with some old friends next month and I would love for you to join us. Let me know if you're available – it would be so nice to catch up in person and relive some of those fun memories.\n\nPlease feel free to either shoot me an email at josephestrada@example.org or give me a call at 03 45 02 61 71 when you get a chance.\n\nLooking forward to hearing from you soon!\n\nWarm regards,\n\nJoseph"},{"content":"{\"fields_to_redact\":[{\"string\":\"Margaud-Anaïs\",\"pii_type\":\"person_name\"},{\"string\":\"January 2nd, 2003\",\"pii_type\":\"date\"},{\"string\":\"josephestrada@example.org\",\"pii_type\":\"email_address\"},{\"string\":\"03 45 02 61 71\",\"pii_type\":\"phone_number\"},{\"string\":\"Joseph\",\"pii_type\":\"person_name\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a personal_email. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. \n          "},{"role":"user","content":"Subject: Catching Up and Exciting News!\n\nHi James,\n\nI hope this email finds you well! It’s been too long since we last caught up. 
I wanted to take a moment to reach out and share some exciting updates with you.\n\nFirst off, can you believe it’s already been a year since we had that unforgettable trip to the Grand Canyon? Time really flies! It was truly an experience to remember.\n\nOn a personal note, I’ve embarked on a new project that I’m really passionate about. I’m working with a local non-profit organization focused on environmental conservation here in Seville. It’s been incredibly rewarding to see the impact we’re making and I’d love to tell you more about it.\n\nAlso, since you are the ultimate movie buff, I wanted to get your recommendations on the latest sci-fi flicks. I’m planning a weekend movie marathon, and nobody curates a list quite like you do!\n\nLooking forward to hearing from you soon. Let's try to set up a call or something sometime next month?\n\nBest wishes,\n\nAlejo Villaverde Morcillo\n\nP.S.: If you’re up for a little nostalgia, I stumbled upon some old photos from our hiking days. Happy to share them if you’re interested!\n\nSent on: Monday, 28th February 2005"},{"content":"{\"fields_to_redact\":[{\"string\":\"James\",\"pii_type\":\"person_name\"},{\"string\":\"Seville\",\"pii_type\":\"street_address\"},{\"string\":\"Alejo Villaverde Morcillo\",\"pii_type\":\"person_name\"},{\"string\":\"Monday, 28th February 2005\",\"pii_type\":\"date\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a support_email. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"Subject: Urgent Assistance Required for Jessica Anderson\n\nFrom: janice34@example.net  \nDate: 2024-07-30  \nTo: support@pmp-services.com  \n\nHi Support Team,\n\nI hope this email finds you well. My name is Jessica Anderson, and I am having some trouble accessing my account with your organization, Pichardo, Montez y Padilla. I attempted several times today and was unable to log in successfully.\n\nDetails for verification:\n- Date of Birth: 1996-10-27\n- Personal ID: 564 588 200\n- Banking Number: EALY56899882811142\n- Registered Phone Number: +44(0)1174960628\n\nThe error message said something about account authorization, which I haven't encountered before. I suspect it may have something to do with recent changes in my profile information that I updated last week. Could you please verify my account status and assist me in restoring access as soon as possible?\n\nThank you for your prompt attention to this matter. I'm looking forward to your urgent response.\n\nBest regards,  \nJessica Anderson"},{"content":"{\"fields_to_redact\":[{\"string\":\"Jessica Anderson\",\"pii_type\":\"person_name\"},{\"string\":\"janice34@example.net\",\"pii_type\":\"email_address\"},{\"string\":\"Pichardo, Montez y Padilla\",\"pii_type\":\"organization_name\"},{\"string\":\"1996-10-27\",\"pii_type\":\"date_of_birth\"},{\"string\":\"564 588 200\",\"pii_type\":\"personal_id\"},{\"string\":\"EALY56899882811142\",\"pii_type\":\"banking_number\"},{\"string\":\"+44(0)1174960628\",\"pii_type\":\"phone_number\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a utility_bill. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"```\nSunshine Power & Utilities Co.\nCustomer Service Hotline: 1-800-432-0098\nEmail: support@sunpowerutilities.net\nWebsite: www.sunpowerutilities.net\n\n---------------------------------------------------------------------\nUTILITY BILL STATEMENT\n---------------------------------------------------------------------\n\nAccount Number: 2049-987-231                 Date: 1995-07-11\n\nBilling Summary:\n----------------------------------------------\nName: Mtro. Catalina Olvera\nService Address: 749 Danielle Lakes Apt. 791\n                           Lake Davidton, NU E3L 3V2\n\n---------------------------------------------------------------------\n\nElectricity Consumption Details:\n----------------------------------------------\nBilling Period: Jun 01, 1995 - Jun 30, 1995\nMeter Number: EL-67234-NE\n\nPrevious Reading: 12,345 kWh\nCurrent Reading: 12,910 kWh\nTotal Usage: 565 kWh\n\nCurrent Charge:\nElectricity Charge: $0.12/kWh\nTotal Electricity Cost: $67.80\n\nOther Charges:\n- Environment Fee: $4.50\n- Service Connection Fee: $15.00\n\nTotal Amount Due: $87.30\n\n---------------------------------------------------------------------\n\nPayment Due Date: July 26, 1995\n\nPayment Methods:\n- By Phone: Call 1-800-123-5678 with your account number\n- Online: Log in to your account at www.sunpowerutilities.net\n- Mobile App: Available on iOS and Android\n- In-Person: Visit our office at 123 Solar Street, Lake Davidton\n- Mail: Use the return envelope enclosed with this bill\n\nNote: Late fees apply if payment is not received by the due date.\n\nThank you for using Sunshine Power & Utilities Co.!\n\nPlease remember to conserve energy:\n- Turn off lights when not in use.\n- Use energy-efficient appliances.\n- Set your thermostat wisely.\n\n---------------------------------------------------------------------\n\nVisit our website for tips on energy conservation and to learn more\nabout our renewable energy 
projects!\n---------------------------------------------------------------------\n```\n"},{"content":"{\"fields_to_redact\":[{\"string\":\"support@sunpowerutilities.net\",\"pii_type\":\"email_address\"},{\"string\":\"2049-987-231\",\"pii_type\":\"personal_id\"},{\"string\":\"1995-07-11\",\"pii_type\":\"date\"},{\"string\":\"Mtro. Catalina Olvera\",\"pii_type\":\"person_name\"},{\"string\":\"749 Danielle Lakes Apt. 791\\n                           Lake Davidton, NU E3L 3V2\",\"pii_type\":\"street_address\"},{\"string\":\"123 Solar Street, Lake Davidton\",\"pii_type\":\"street_address\"},{\"string\":\"1995\",\"pii_type\":\"date\"},{\"string\":\"1995\",\"pii_type\":\"date\"},{\"string\":\"July 26, 1995\",\"pii_type\":\"date\"}]}","refusal":null,"role":"assistant"}],"rejected_message":{"content":"{\"fields_to_redact\":[{\"string\":\"support@sunpowerutilities.net\",\"pii_type\":\"email_address\"},{\"string\":\"Mtro. Catalina Olvera\",\"pii_type\":\"person_name\"},{\"string\":\"749 Danielle Lakes Apt. 791\\n                           Lake Davidton, NU E3L 3V2\",\"pii_type\":\"street_address\"},{\"string\":\"2049-987-231\",\"pii_type\":\"personal_id\"},{\"string\":\"1995-07-11\",\"pii_type\":\"date\"},{\"string\":\"June 01, 1995 - June 30, 1995\",\"pii_type\":\"date\"},{\"string\":\"July 26, 1995\",\"pii_type\":\"date\"}]}","refusal":null,"role":"assistant","tool_calls":[]},"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a rental_agreement. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"**RESIDENTIAL RENTAL AGREEMENT**\n\n**This Residential Lease Agreement (\"Agreement\") is entered into on the 16th day of April, 1983, by and between Patel Inc (\"Landlord\") and Brian Smith (\"Tenant\").**\n\n**Landlord:**\nPatel Inc  \nRegistered Office: 47 Zenith Plaza  \nCorporate ID: ALPHA-83920\n\n**Tenant:**\nBrian Smith  \nContact Number: +44(0)292018253  \nCurrent Address: Calzada Argentina 247 Edif. 025, Depto. 150  \nVieja Namibia, BCS 32896-6646\n\n**1. TERM:**\nThe term of this lease shall commence on the 1st day of May, 1983, and shall continue on a month-to-month basis unless terminated as provided herein.\n\n**2. RENT:**\nTenant agrees to pay Landlord a monthly rent of £850, due on the 1st day of each month. Payments shall be made by bank transfer to Landlord's designated account.\n\n**3. SECURITY DEPOSIT:**\nA security deposit of £1,000 is due at the signing of this Agreement and is refundable upon lease termination, subject to the terms herein.\n\n**4. PREMISES:**\nThe Landlord leases to the Tenant the residential premises located at Calzada Argentina 247 Edif. 025, Depto. 150, Vieja Namibia, BCS 32896-6646.\n\n**5. UTILITIES:**\nTenant will be responsible for all utilities and services incurred at the premises except for water and maintenance, which shall be paid by the Landlord.\n\n**6. MAINTENANCE AND REPAIRS:**\nTenant shall maintain the premises in a clean and sanitary condition. Tenant shall be responsible for any repairs due to negligence or misuse, excluding normal wear and tear.\n\n**7. ALTERATIONS:**\nTenant shall not make any alterations or additions to the premises without the prior written consent of Landlord. Any improvements will become the property of the Landlord.\n\n**8. ACCESS:**\nLandlord reserves the right to enter the premises for inspection, repairs, or showings, provided that 24 hours' notice is given, except in emergency situations.\n\n**9. 
TERMINATION:**\nEither party may terminate this Agreement by providing a 30-day written notice. Landlord may terminate immediately if Tenant violates any terms of this Agreement.\n\n**Tenant Signatory**\n\n_________________________  \nBrian Smith\n\n**Landlord Signatory**\n\n_________________________  \nAuthorized Representative  \nPatel Inc\n\n**Witnesses:**\n\n1. _________________________ Date: ___________\n\n2. _________________________ Date: ___________\n\n**This document constitutes the entire agreement between the parties and supersedes all previous representations or agreements.**"},{"content":"{\"fields_to_redact\":[{\"string\":\"April, 1983\",\"pii_type\":\"date\"},{\"string\":\"Brian Smith\",\"pii_type\":\"person_name\"},{\"string\":\"+44(0)292018253\",\"pii_type\":\"phone_number\"},{\"string\":\"Calzada Argentina 247 Edif. 025, Depto. 150, Vieja Namibia, BCS 32896-6646\",\"pii_type\":\"street_address\"},{\"string\":\"May, 1983\",\"pii_type\":\"date\"},{\"string\":\"ALPHA-83920\",\"pii_type\":\"other_id\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}
{"messages":[{"role":"system","content":"\n            You are an expert model trained to redact potentially sensitive information from documents. You have been given a document to redact. The document is a rental_agreement. Your goal is to accurately redact the sensitive information from the document. Sensitive information can be in one of the following categories:\n            \n            - age: a person's age\n- credit_card_info: a credit card number, expiration date, CCV, etc.\n- nationality: a country when used to reference place of birth, residence, or citizenship\n- date: a specific calendar date\n- date_of_birth: a specific calendar date representing birth\n- domain_name: a domain on the internet\n- email_address: an email ID\n- demographic_group: Anything that identifies race or ethnicity\n- gender: a gender identifier\n- personal_id: Any ID string like a national ID, subscriber number, etc.\n- other_id: Any ID not associated with a person like an organization ID, database ID, etc.\n- banking_number: a number associated with a bank account\n- medical_condition: A diagnosis, treatment code or other information identifying a medical condition\n- organization_name: name of an organization\n- person_name: name of a person\n- phone_number: a telephone number\n- street_address: a physical address\n- password: a secure string used for authentication\n- secure_credential: any secure credential like an API key, private key, 2FA token\n- religious_affiliation: anything that identifies religious affiliation\n            \n            You should return the specific string that needs to be redacted, along with the category of sensitive information that it belongs to. If there is no sensitive information in the document, return no strings. 
\n          "},{"role":"user","content":"**Residental Lease Agreement**\n\n**This Agreement is made on the 28th day of February, 2003**\n\n**BETWEEN**\n\nLessor: Hillcrest Property Management  \nAddress: 345 Sunrise Avenue, Alexbury, LN3 9UP  \nAND  \nLessee: Jennifer Payne  \nPersonal ID: 14698103794  \nAddress: 49 Collins Fall, Alexbury, LN2 8WR  \n\n**TERMS OF RENTAL**\n\n1. **Property Description:**  \n   The property is a 2-bedroom apartment located at 49 Collins Fall, Alexbury, LN2 8WR. \n\n2. **Term of Lease:**  \n   The lease commences on March 1, 2003, and ends on February 28, 2004. Upon expiration, the lease may be extended or renewed by mutual agreement of both parties.\n\n3. **Rent:**  \n   The monthly rent shall be £850, payable in advance on the first day of each month to Hillcrest Property Management at the address provided above.\n\n4. **Security Deposit:**  \n   A security deposit of £850 shall be provided by the Lessee upon signing of this Agreement. This deposit will be held as security for any damages caused to the property or failure to fulfill the terms of the lease.\n\n5. **Utilities:**  \n   The Lessee agrees to assume responsibility for all utility services such as water, gas, electricity, internet and any other relevant utilities, starting from the commencement date of the lease.\n\n6. **Agreement of Use:**  \n   The premises shall only be used for residential purposes and occupied by the Lessee. No subletting or assignment of the lease is permitted without prior written consent from the Lessor.\n\n7. **Inspection:**  \n   The Lessor reserves the right to inspect the premises upon giving prior reasonable notice to ensure that the terms of the agreement are being upheld.\n\n8. **Maintenance and Repairs:**  \n   The Lessee must maintain the premises in good condition and report any required repairs to the Lessor in a timely manner. The Lessee shall be liable for costs due to negligence or excessive damage.\n\n9. 
**Termination of Lease:**  \n   Either party may terminate this lease upon 30-day written notice, with reasons being either breach of agreement or mutual desire to end tenancy.\n\n10. **Notices:**  \n    All notices under this lease shall be in writing and sent to the addresses mentioned herein.\n\n**Signatures:**  \n\nLessor: _____________________   Date: _____________  \nLessee, Jennifer Payne: _____________________   Date: 2003-02-28  \n\n---\n\n*Note: This Lease Agreement is a legally binding contract. Ensure you have read and understood all terms and conditions before signing.*"},{"content":"{\"fields_to_redact\":[{\"string\":\"February, 2003\",\"pii_type\":\"date\"},{\"string\":\"Jennifer Payne\",\"pii_type\":\"person_name\"},{\"string\":\"14698103794\",\"pii_type\":\"personal_id\"},{\"string\":\"49 Collins Fall, Alexbury, LN2 8WR\",\"pii_type\":\"street_address\"},{\"string\":\"March 1, 2003\",\"pii_type\":\"date\"},{\"string\":\"February 28, 2004\",\"pii_type\":\"date\"},{\"string\":\"Jennifer Payne\",\"pii_type\":\"person_name\"},{\"string\":\"2003-02-28\",\"pii_type\":\"date\"}]}","refusal":null,"role":"assistant"}],"tools":[],"response_format":{"type":"json_schema","json_schema":{"name":"RedactionResponse","schema":{"type":"object","$defs":{"RedactionField":{"type":"object","title":"RedactionField","required":["string","pii_type"],"properties":{"string":{"type":"string","title":"String","description":"The exact matching string to redact. Include any whitespace or punctuation. 
Must be an exact string match!"},"pii_type":{"enum":["age","credit_card_info","nationality","date","date_of_birth","domain_name","email_address","demographic_group","gender","personal_id","other_id","banking_number","medical_condition","organization_name","person_name","phone_number","street_address","password","secure_credential","religious_affiliation"],"type":"string","title":"Pii Type"}},"additionalProperties":false}},"title":"RedactionResponse","required":["fields_to_redact"],"properties":{"fields_to_redact":{"type":"array","items":{"$ref":"#/$defs/RedactionField"},"title":"Fields To Redact"}},"additionalProperties":false},"strict":true}},"split":"TRAIN"}


================================================
FILE: dev/sft/distillation.py
================================================
"""Distillation example: Train a small model using completions from a large model."""

import asyncio
import os

from dotenv import load_dotenv
from openai import AsyncOpenAI

import art
from art.local import LocalBackend
from art.utils.sft import create_sft_dataset_iterator

load_dotenv()

if not os.environ.get("OPENROUTER_API_KEY"):
    raise ValueError("OPENROUTER_API_KEY environment variable is required")

TEACHER_MODEL = "z-ai/glm-5"
STUDENT_BASE_MODEL = "Qwen/Qwen2.5-7B-Instruct"
PROMPT = "Explain the concept of recursion in programming with a simple example."


async def main():
    # Get completion from teacher model
    teacher_client = AsyncOpenAI(
        api_key=os.environ["OPENROUTER_API_KEY"],
        base_url="https://openrouter.ai/api/v1",
    )

    print(f"Getting completion from teacher model ({TEACHER_MODEL})...")
    completion = await teacher_client.chat.completions.create(
        model=TEACHER_MODEL,
        messages=[{"role": "user", "content": PROMPT}],
    )
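    teacher_response = completion.choices[0].message.content
    # `content` is Optional in the OpenAI client types; fail early instead of
    # crashing later on len(None) or training on an empty target.
    if not teacher_response:
        raise RuntimeError("Teacher model returned an empty completion")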
    print(
        f"Teacher response ({len(teacher_response)} chars):\n{teacher_response[:500]}..."
    )

    # Create trajectories from teacher completion
    trajectories = [
        art.Trajectory(
            messages_and_choices=[
                {"role": "user", "content": PROMPT},
                {"role": "assistant", "content": teacher_response},
            ],
        )
    ]

    # Train student model
    backend = LocalBackend()
    student = art.TrainableModel(
        name="sft-distillation-001",
        project="sft-distillation",
        base_model=STUDENT_BASE_MODEL,
    )
    await student.register(backend)

    print(f"Training student model ({STUDENT_BASE_MODEL})...")
    for chunk in create_sft_dataset_iterator(trajectories, peak_lr=2e-4):
        await student.train_sft(chunk.trajectories, chunk.config)
    print("Training complete!")


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/sft/sft-from-file.py
================================================
"""Simple SFT training script using train_sft_from_file helper."""

import asyncio
import random

import art
from art.local import LocalBackend
from art.utils.sft import train_sft_from_file


async def main():
    backend = LocalBackend()

    model_name = "run-" + "".join(
        random.choices("abcdefghijklmnopqrstuvwxyz0123456789", k=8)
    )
    model = art.TrainableModel(
        name=model_name,
        project="sft-from-file",
        base_model="meta-llama/Llama-3.1-8B-Instruct",
    )
    await model.register(backend)

    await train_sft_from_file(
        model=model,
        file_path="dev/sft/dataset.jsonl",
        epochs=1,
        peak_lr=2e-4,
    )

    print("Training complete!")


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/sft/sft-warmup.py
================================================
"""Minimal example demonstrating SFT -> RL -> SFT -> RL switching."""

import asyncio
import random

from dotenv import load_dotenv

import art
from art.local import LocalBackend
from art.utils.sft import create_sft_dataset_iterator

# Simple SFT trajectories - teach model to respond "maybe"
SFT_TRAJECTORIES = [
    art.Trajectory(
        messages_and_choices=[
            {"role": "user", "content": "respond with yes, no, or maybe"},
            {"role": "assistant", "content": "maybe"},
        ],
    ),
] * 100


async def rl_rollout(model: art.TrainableModel, prompt: str) -> art.Trajectory:
    """Single RL rollout with reward based on response."""
    messages: art.Messages = [{"role": "user", "content": prompt}]
    client = model.openai_client()

    completion = await client.chat.completions.create(
        messages=messages,
        model=model.get_inference_name(),
        max_tokens=10,
        timeout=30,
    )
    choice = completion.choices[0]
    content = choice.message.content or ""

    # Reward: "maybe" > "no" > "yes" > other
    reward = {"maybe": 1.0, "no": 0.75, "yes": 0.5}.get(content.strip().lower(), 0.0)
    return art.Trajectory(messages_and_choices=[*messages, choice], reward=reward)


async def main():
    load_dotenv()

    backend = LocalBackend()
    model_name = "sft-warmup-" + "".join(
        random.choices("abcdefghijklmnopqrstuvwxyz0123456789", k=8)
    )
    model = art.TrainableModel(
        name=model_name,
        project="sft-warmup",
        base_model="Qwen/Qwen2.5-7B-Instruct",
    )
    await model.register(backend)

    # ========================================================================
    # Phase 1: SFT
    # ========================================================================
    print("\n[Phase 1] SFT training...")
    for chunk in create_sft_dataset_iterator(
        SFT_TRAJECTORIES, batch_size=1, peak_lr=1e-5
    ):
        await model.train_sft(chunk.trajectories, chunk.config)
    print("SFT phase 1 complete.")

    # ========================================================================
    # Phase 2: RL (GRPO)
    # ========================================================================
    print("\n[Phase 2] RL training...")
    prompt = "respond with yes, no, or maybe"

    for i in range(10):
        print(f"  RL step {i + 1}")
        train_groups = await art.gather_trajectory_groups(
            [
                art.TrajectoryGroup(rl_rollout(model, prompt) for _ in range(6))
                for _ in range(12)
            ]
        )
        await model.train(train_groups)
    print("RL phase 2 complete.")

    # ========================================================================
    # Phase 3: SFT again
    # ========================================================================
    print("\n[Phase 3] SFT training again...")
    for chunk in create_sft_dataset_iterator(
        SFT_TRAJECTORIES, batch_size=1, peak_lr=1e-5
    ):
        await model.train_sft(chunk.trajectories, chunk.config)
    print("SFT phase 3 complete.")

    # ========================================================================
    # Phase 4: RL (GRPO) again
    # ========================================================================
    print("\n[Phase 4] RL training...")
    prompt = "respond with yes, no, or maybe"

    for i in range(10):
        print(f"  RL step {i + 1}")
        train_groups = await art.gather_trajectory_groups(
            [
                art.TrajectoryGroup(rl_rollout(model, prompt) for _ in range(6))
                for _ in range(12)
            ]
        )
        await model.train(train_groups)
    print("RL phase 4 complete.")

    # ========================================================================
    # Test: Check model output
    # ========================================================================
    print("\n[Test] Model output after training:")
    client = model.openai_client()
    completion = await client.chat.completions.create(
        messages=[{"role": "user", "content": "respond with yes, no, or maybe"}],
        model=model.get_inference_name(),
        max_tokens=10,
    )
    print(f"Response: {completion.choices[0].message.content}")

    print("\nAll phases complete!")


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/yes-no-maybe-kl-advantage.py
================================================
"""Yes-no-maybe training with KL-penalized advantage adjustment.

Demonstrates the kl_penalty_coef feature: tokens where the policy has drifted
more from the reference model get reduced advantages, while tokens that have
drifted less get increased advantages.

Uses meta-llama/Meta-Llama-3.1-8B-Instruct as the base model (trained locally).
"""

import asyncio
from itertools import permutations
import os

from dotenv import load_dotenv
import openai

import art
from art.local import LocalBackend


async def rollout(
    client: openai.AsyncOpenAI, model: art.TrainableModel, prompt: str
) -> art.Trajectory:
    messages: art.Messages = [
        {
            "role": "user",
            "content": prompt,
        }
    ]
    chat_completion = await client.chat.completions.create(
        messages=messages, model=model.get_inference_name(), max_tokens=100, timeout=100
    )
    choice = chat_completion.choices[0]
    content = choice.message.content
    assert isinstance(content, str)
    if content == "yes":
        reward = 0.5
    elif content == "no":
        reward = 0.75
    elif content == "maybe":
        reward = 1.0
    else:
        reward = 0.0
    return art.Trajectory(messages_and_choices=[*messages, choice], reward=reward)


def with_quotes(w: str) -> str:
    return f"'{w}'"
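

# Illustrative sketch only, not ART's implementation: the module docstring says
# that tokens whose policy has drifted further from the reference model get
# reduced advantages, while tokens that drifted less get increased ones. One
# way to obtain that behavior, assumed here, is to subtract a mean-centered
# per-token KL estimate scaled by the coefficient. The function name and its
# arguments are hypothetical, and this script never calls it.
def sketch_kl_adjusted_advantages(
    advantages: list[float],
    per_token_kl: list[float],
    kl_penalty_coef: float,
) -> list[float]:
    """Shift each advantage by -coef * (kl - mean_kl); the average shift is zero."""
    if len(advantages) != len(per_token_kl):
        raise ValueError("advantages and per_token_kl must have the same length")
    if not advantages:
        return []
    mean_kl = sum(per_token_kl) / len(per_token_kl)
    return [
        advantage - kl_penalty_coef * (kl - mean_kl)
        for advantage, kl in zip(advantages, per_token_kl)
    ]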


async def main():
    load_dotenv()

    backend = LocalBackend()
    base_model = os.environ.get("BASE_MODEL", "meta-llama/Meta-Llama-3.1-8B-Instruct")
    kl_penalty_coef = float(os.environ.get("KL_PENALTY_COEF", "0.1"))
    model = art.TrainableModel(
        name=os.environ.get("MODEL_NAME", f"kl-{kl_penalty_coef}"),
        project="yes-no-maybe",
        base_model=base_model,
    )
    await model.register(backend)

    kl_penalty_reference_step: int | None = (
        int(os.environ["KL_REF_STEP"])
        if os.environ.get("KL_REF_STEP") is not None
        else None
    )
    kl_ref_adapter_path: str | None = os.environ.get("KL_REF_ADAPTER_PATH") or None

    prompts = [
        f"{prefix} with {', '.join([with_quotes(w) if use_quotes else w for w in words]) if len(words) == 3 else f'{words[0]}' + (f' or {words[1]}' if len(words) > 1 else '')}"
        for prefix in ["respond", "just respond"]
        for use_quotes in [True, False]
        for words in (
            list(p) for n in [3, 2] for p in permutations(["yes", "no", "maybe"], n)
        )
    ]

    openai_client = model.openai_client()
    max_steps = int(os.environ.get("NUM_STEPS", "20"))
    start_step = await model.get_step()
    for step in range(start_step, start_step + max_steps):
        train_groups = await art.gather_trajectory_groups(
            (
                art.TrajectoryGroup(
                    rollout(openai_client, model, prompt) for _ in range(32)
                )
                for prompt in prompts
            )
        )
        result = await backend.train(
            model,
            train_groups,
            learning_rate=1e-4,
            kl_penalty_coef=kl_penalty_coef,
            kl_penalty_reference_step=kl_penalty_reference_step,
            kl_ref_adapter_path=kl_ref_adapter_path,
        )
        await model.log(
            train_groups,
            metrics=result.metrics,
            step=result.step,
            split="train",
        )
        print(f"step {result.step}: {result.metrics}")


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/yes-no-maybe-megatron.py
================================================
import asyncio
from itertools import permutations
import os

from dotenv import load_dotenv
import openai
import torch

import art
from art.megatron import MegatronBackend


async def rollout(
    client: openai.AsyncOpenAI, model_name: str, prompt: str
) -> art.Trajectory:
    messages: art.Messages = [{"role": "user", "content": prompt}]
    chat_completion = await client.chat.completions.create(
        messages=messages, model=model_name, max_tokens=100, timeout=100
    )
    choice = chat_completion.choices[0]
    content = choice.message.content
    assert isinstance(content, str)
    if content == "yes":
        reward = 0.5
    elif content == "no":
        reward = 0.75
    elif content == "maybe":
        reward = 1.0
    else:
        reward = 0.0
    return art.Trajectory(messages_and_choices=[*messages, choice], reward=reward)


def with_quotes(w: str) -> str:
    return f"'{w}'"


async def main():
    load_dotenv()

    backend = MegatronBackend()
    base_model = os.environ.get("BASE_MODEL", "Qwen/Qwen3-30B-A3B-Instruct-2507")
    model = art.TrainableModel(
        name=os.environ.get("MODEL_NAME", "megatron-001"),
        project="yes-no-maybe-megatron",
        base_model=base_model,
        _internal_config=art.dev.InternalModelConfig(
            engine_args=art.dev.EngineArgs(
                gpu_memory_utilization=0.8,
                tensor_parallel_size=torch.cuda.device_count(),
            ),
        ),
    )
    await model.register(backend)

    prompts = [
        f"{prefix} with {', '.join([with_quotes(w) if use_quotes else w for w in words]) if len(words) == 3 else f'{words[0]}' + (f' or {words[1]}' if len(words) > 1 else '')}"
        for prefix in ["respond", "just respond"]
        for use_quotes in [True, False]
        for words in (
            list(p) for n in [3, 2] for p in permutations(["yes", "no", "maybe"], n)
        )
    ]

    openai_client = model.openai_client()
    max_steps = int(os.environ.get("NUM_STEPS", "20"))
    start_step = await model.get_step()

    for step in range(start_step, start_step + max_steps):
        print(f"\n=== Step {step + 1} ===")
        train_groups = await art.gather_trajectory_groups(
            (
                art.TrajectoryGroup(
                    rollout(openai_client, model.get_inference_name(), prompt)
                    for _ in range(32)
                )
                for prompt in prompts
            )
        )
        await model.train(
            train_groups,
            config=art.TrainConfig(learning_rate=1e-4),
        )


if __name__ == "__main__":
    asyncio.run(main())


================================================
FILE: dev/yes-no-maybe-metrics.py
================================================
"""Yes-no-maybe metrics demo for the LocalBackend `model.train()` path.

This keeps the same prompt family, rollout structure, and reward ordering as
`dev/yes-no-maybe.py`, and adds explicit metrics-taxonomy instrumentation for
actor/eval timing and data metrics. Step wall time and GPU cost logging are
left to LocalBackend, which records them automatically.
"""

from __future__ import annotations

import asyncio
from itertools import permutations
import os
import time

from dotenv import load_dotenv
import openai

try:
    import unsloth  # noqa: F401
except ImportError:
    pass

import art
from art.local import LocalBackend


async def create_chat_completion(
    client: openai.AsyncOpenAI,
    *,
    model_name: str,
    messages: art.Messages,
    max_tokens: int,
    timeout: float,
) -> openai.types.chat.chat_completion.ChatCompletion:
    return await client.chat.completions.create(
        messages=messages,
        model=model_name,
        max_tokens=max_tokens,
        timeout=timeout,
    )


def with_quotes(word: str) -> str:
    return f"'{word}'"


def build_prompts() -> list[str]:
    return [
        f"{prefix} with {', '.join([with_quotes(word) if use_quotes else word for word in words]) if len(words) == 3 else f'{words[0]}' + (f' or {words[1]}' if len(words) > 1 else '')}"
        for prefix in ["respond", "just respond"]
        for use_quotes in [True, False]
        for words in (
            list(permutation)
            for length in [3, 2]
            for permutation in permutations(["yes", "no", "maybe"], length)
        )
    ]
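

# A more readable sketch of the comprehension in build_prompts(), added for
# illustration only. `build_prompts_expanded` is a hypothetical helper that the
# script does not call; it is intended to produce the same prompts in the same
# order. Note that quoting is applied only to three-word prompts, so each
# two-word prompt appears once per `use_quotes` value.
def build_prompts_expanded() -> list[str]:
    # Imported locally so the sketch stands alone if copied elsewhere.
    from itertools import permutations

    choices = ["yes", "no", "maybe"]
    prompts: list[str] = []
    for prefix in ["respond", "just respond"]:
        for use_quotes in [True, False]:
            for length in [3, 2]:
                for words in permutations(choices, length):
                    if length == 3:
                        shown = [f"'{word}'" if use_quotes else word for word in words]
                        options = ", ".join(shown)
                    else:
                        options = f"{words[0]} or {words[1]}"
                    prompts.append(f"{prefix} with {options}")
    return prompts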


def reward_for_answer(content: str | None) -> float:
    if content == "yes":
        return 0.5
    if content == "no":
        return 0.75
    if content == "maybe":
        return 1.0
    return 0.0


def scenario_id_for_prompt(prompt: str) -> str:
    return prompt.replace(" ", "_").replace("'", "")


def response_total_tokens(
    response: openai.types.chat.chat_completion.ChatCompletion,
) -> int:
    usage = response.usage
    if usage is None:
        return 0
    prompt_tokens = int(usage.prompt_tokens or 0)
    completion_tokens = int(usage.completion_tokens or 0)
    return prompt_tokens + completion_tokens


def total_actor_tokens(groups: list[art.TrajectoryGroup]) -> int:
    return sum(
        int(trajectory.metadata.get("actor_total_tokens", 0) or 0)
        for group in groups
        for trajectory in group.trajectories
    )


async def rollout(
    client: openai.AsyncOpenAI,
    model: art.TrainableModel,
    prompt: str,
    *,
    max_tokens: int,
    timeout: float,
) -> art.Trajectory:
    messages: art.Messages = [{"role": "user", "content": prompt}]
    chat_completion = await create_chat_completion(
        client,
        model_name=model.get_inference_name(),
        messages=messages,
        max_tokens=max_tokens,
        timeout=timeout,
    )
    choice = chat_completion.choices[0]
    content = choice.message.content
    # Score once and reuse the value for both the reward and the validity metric.
    reward = reward_for_answer(content)
    return art.Trajectory(
        messages_and_choices=[*messages, choice],
        reward=reward,
        metadata={
            "scenario_id": scenario_id_for_prompt(prompt),
            "actor_total_tokens": response_total_tokens(chat_completion),
        },
        metrics={
            "valid_answer": reward > 0.0,
        },
    )


async def evaluate(
    client: openai.AsyncOpenAI,
    model: art.TrainableModel,
    prompts: list[str],
    *,
    max_tokens: int,
    timeout: float,
) -> list[art.TrajectoryGroup]:
    groups = await art.gather_trajectory_groups(
        art.TrajectoryGroup(
            [
                rollout(
                    client,
                    model,
                    prompt,
                    max_tokens=max_tokens,
                    timeout=timeout,
                )
            ],
            metadata={"scenario_id": scenario_id_for_prompt(prompt)},
        )
        for prompt in prompts
    )
    return groups


def print_history_summary(model: art.TrainableModel) -> None:
    history_path = (
        model.base_path + f"/{model.project}/models/{model.name}/history.jsonl"
    )
    print(f"History: {history_path}")


def build_internal_config() -> art.dev.InternalModelConfig:
    return art.dev.InternalModelConfig(
        engine_args=art.dev.EngineArgs(
            gpu_memory_utilization=float(
                os.environ.get("GPU_MEMORY_UTILIZATION", "0.85")
            ),
            max_model_len=int(os.environ.get("MAX_MODEL_LEN", "4096")),
        )
    )


async def main() -> None:
    load_dotenv()

    backend = LocalBackend()
    base_model = os.environ.get("BASE_MODEL", "Qwen/Qwen3-30B-A3B-Instruct-2507")
    project = os.environ.get("PROJECT", "yes-no-maybe-metrics")
    model = art.TrainableModel(
        name=os.environ.get("MODEL_NAME", f"yes-no-maybe-metrics-{int(time.time())}"),
        project=project,
        base_model=base_model,
        report_metrics=["wandb"],
        _internal_config=build_internal_config(),
    )
    try:
        await model.register(backend)

        prompts = build_prompts()
        eval_prompts = prompts[: int(os.environ.get("EVAL_PROMPTS", "12"))]
        openai_client = model.openai_client()
        max_steps = int(os.environ.get("NUM_STEPS", "20"))
        rollouts_per_prompt = int(os.environ.get("ROLLOUTS_PER_PROMPT", "32"))
        max_tokens = int(os.environ.get("MAX_TOKENS", "100"))
        timeout = float(os.environ.get("TIMEOUT", "100"))
        eval_every_n_steps = int(os.environ.get("EVAL_EVERY_N_STEPS", "1"))
        learning_rate = float(os.environ.

SYMBOL INDEX (1696 symbols across 213 files)

FILE: dev/demo_logging.py
  function demo_basic_logging (line 9) | def demo_basic_logging():
  function demo_color_codes (line 25) | def demo_color_codes():
  function demo_timestamp (line 52) | def demo_timestamp():
  function demo_real_world_usage (line 65) | def demo_real_world_usage():
  function demo_progress_tracking (line 99) | def demo_progress_tracking():
  function demo_error_scenarios (line 123) | def demo_error_scenarios():
  function demo_formatting_combinations (line 146) | def demo_formatting_combinations():
  function demo_log_levels (line 177) | def demo_log_levels():
  function main (line 206) | def main():

FILE: dev/math-vista/math-vista.py
  class DecodedImage (line 14) | class DecodedImage(TypedDict):
  class Scenario (line 18) | class Scenario(TypedDict):
  function main (line 26) | async def main(model_name: str, steps: int) -> None:
  function parse_args (line 116) | def parse_args() -> argparse.Namespace:

FILE: dev/new_models/benchmark_inference.py
  function timed_request (line 21) | async def timed_request(client, model_name, prompt, max_tokens, temperat...
  function main (line 46) | async def main():

FILE: dev/new_models/gemma3.py
  function rollout (line 12) | async def rollout(model: art.TrainableModel, prompt: str) -> art.Traject...
  function main (line 40) | async def main():

FILE: dev/new_models/qwen3_try.py
  function rollout (line 12) | async def rollout(model: art.TrainableModel, prompt: str) -> art.Traject...
  function main (line 41) | async def main():

FILE: dev/sft/distillation.py
  function main (line 23) | async def main():

FILE: dev/sft/sft-from-file.py
  function main (line 11) | async def main():

FILE: dev/sft/sft-warmup.py
  function rl_rollout (line 24) | async def rl_rollout(model: art.TrainableModel, prompt: str) -> art.Traj...
  function main (line 43) | async def main():

FILE: dev/yes-no-maybe-kl-advantage.py
  function rollout (line 21) | async def rollout(
  function with_quotes (line 47) | def with_quotes(w: str) -> str:
  function main (line 51) | async def main():

FILE: dev/yes-no-maybe-megatron.py
  function rollout (line 13) | async def rollout(
  function with_quotes (line 34) | def with_quotes(w: str) -> str:
  function main (line 38) | async def main():

FILE: dev/yes-no-maybe-metrics.py
  function create_chat_completion (line 28) | async def create_chat_completion(
  function with_quotes (line 44) | def with_quotes(word: str) -> str:
  function build_prompts (line 48) | def build_prompts() -> list[str]:
  function reward_for_answer (line 61) | def reward_for_answer(content: str | None) -> float:
  function scenario_id_for_prompt (line 71) | def scenario_id_for_prompt(prompt: str) -> str:
  function response_total_tokens (line 75) | def response_total_tokens(
  function total_actor_tokens (line 86) | def total_actor_tokens(groups: list[art.TrajectoryGroup]) -> int:
  function rollout (line 94) | async def rollout(
  function evaluate (line 125) | async def evaluate(
  function print_history_summary (line 151) | def print_history_summary(model: art.TrainableModel) -> None:
  function build_internal_config (line 158) | def build_internal_config() -> art.dev.InternalModelConfig:
  function main (line 169) | async def main() -> None:

FILE: dev/yes-no-maybe-vision/generate_images.py
  function _load_font (line 14) | def _load_font(font_path: str | Path | None, preferred_size: int) -> Any:
  function generate_yes_no_maybe_prompts (line 36) | def generate_yes_no_maybe_prompts() -> list[str]:
  function _wrap_text_to_width (line 54) | def _wrap_text_to_width(
  function _max_fit_font_size (line 80) | def _max_fit_font_size(
  function save_prompt_images (line 132) | def save_prompt_images(
  function _slugify (line 229) | def _slugify(text: str, max_length: int = 80) -> str:

FILE: dev/yes-no-maybe.py
  function rollout (line 12) | async def rollout(client: openai.AsyncOpenAI, prompt: str) -> art.Trajec...
  function with_quotes (line 36) | def with_quotes(w: str) -> str:
  function main (line 40) | async def main():

FILE: examples/2048/generate_benchmarks.py
  function log_comparison_model (line 42) | async def log_comparison_model(comparison_model: art.Model):
  function run_benchmarks (line 63) | async def run_benchmarks():

FILE: examples/2048/rollout.py
  function rollout (line 26) | async def rollout(
  function main (line 128) | async def main():

FILE: examples/2048/train.py
  function train (line 32) | async def train():

FILE: examples/2048/utils.py
  class TwentyFortyEightGame (line 15) | class TwentyFortyEightGame(TypedDict):
  class Direction (line 20) | class Direction(str, Enum):
  function populate_random_cell (line 28) | def populate_random_cell(game: TwentyFortyEightGame) -> None:
  function generate_game (line 43) | def generate_game(board_length: int = 4) -> TwentyFortyEightGame:
  function render_board (line 59) | def render_board(game: TwentyFortyEightGame) -> str:
  function condense_sequence (line 89) | def condense_sequence(sequence: list[int | None]) -> list[int | None]:
  function condense_board (line 111) | def condense_board(game: TwentyFortyEightGame, direction: Direction) -> ...
  function apply_agent_move (line 144) | def apply_agent_move(game: TwentyFortyEightGame, move_xml: str) -> None:
  function max_cell_value (line 162) | def max_cell_value(game: TwentyFortyEightGame) -> int:
  function check_game_finished (line 167) | def check_game_finished(game: TwentyFortyEightGame) -> bool:
  function total_board_value (line 179) | def total_board_value(game: TwentyFortyEightGame) -> int:

FILE: examples/benchmarking_comparison_models.py
  class MyTask (line 15) | class MyTask(BaseModel):
  class MyConfig (line 40) | class MyConfig(BaseModel):
  function reward (line 49) | def reward(config: MyConfig, scenario: MyTask, response: str):
  function benchmark_model (line 64) | async def benchmark_model(model: art.Model):
  function rollout (line 84) | async def rollout(model: art.Model[MyConfig], scenario: MyTask) -> art.T...
  function train_model (line 116) | async def train_model(model: art.TrainableModel):
  function main (line 143) | async def main():

FILE: examples/hn_title_generator/reference_grpo_trainer.py
  function filter_on_length (line 57) | def filter_on_length(
  function load_title_data (line 81) | async def load_title_data(
  function score_title_async (line 107) | async def score_title_async(row_data: dict) -> float:
  function calculate_rewards (line 136) | async def calculate_rewards(
  function reward_func (line 210) | def reward_func(
  class ValidationCallback (line 231) | class ValidationCallback(TrainerCallback):
    method __init__ (line 232) | def __init__(
    method on_step_end (line 253) | def on_step_end(

FILE: examples/hn_title_generator/train.py
  function filter_on_length (line 35) | def filter_on_length(data: Dataset, max_length: int, tokenizer_name: str...
  function load_data (line 76) | async def load_data(
  function call_score_title (line 115) | async def call_score_title(row_with_title: Dict[str, Any]) -> float:
  function check_title_matches_body (line 120) | async def check_title_matches_body(
  function rollout (line 159) | async def rollout(
  function main (line 235) | async def main():

FILE: examples/hn_title_generator/utils.py
  class ScoreRequest (line 19) | class ScoreRequest(BaseModel):
    method serialize_time (line 27) | def serialize_time(self, value: datetime) -> str:
  function serialize_story (line 33) | def serialize_story(story):
  function with_serialized_stories (line 39) | def with_serialized_stories(df: pl.DataFrame) -> pl.DataFrame:
  function calculate_metrics_by_split (line 47) | def calculate_metrics_by_split(df: pl.DataFrame) -> pl.DataFrame:
  function score_title (line 95) | async def score_title(
  function pull_data (line 131) | def pull_data(
  function prompt_for_title (line 146) | def prompt_for_title(content: str) -> list[dict]:

FILE: examples/just-the-facts/just_the_facts/checks.py
  function check_includes_all_facts (line 20) | async def check_includes_all_facts(original_text: str, summary_text: str...
  function check_hallucinated_facts (line 65) | async def check_hallucinated_facts(original_text: str, summary_text: str...
  function check_has_conservative_bias (line 110) | async def check_has_conservative_bias(original_text: str, summary_text: ...
  function check_has_liberal_bias (line 151) | async def check_has_liberal_bias(original_text: str, summary_text: str) ...

FILE: examples/just-the-facts/just_the_facts/experiments.py
  class JustTheFactsConfig (line 6) | class JustTheFactsConfig(BaseModel):

FILE: examples/just-the-facts/just_the_facts/generate_benchmarks.py
  function log_comparison_model (line 59) | async def log_comparison_model(comparison_model: art.Model):
  function run_benchmarks (line 78) | async def run_benchmarks():

FILE: examples/just-the-facts/just_the_facts/rollout.py
  class FactsScenario (line 21) | class FactsScenario(BaseModel):
  function rollout (line 26) | async def rollout(model: art.Model, scenario: FactsScenario) -> art.Traj...

FILE: examples/just-the-facts/just_the_facts/train.py
  function train (line 20) | async def train(model: art.TrainableModel[JustTheFactsConfig]):

FILE: examples/just-the-facts/just_the_facts/utils.py
  function scrape_article (line 9) | async def scrape_article(url: str) -> str:
  function _fallback_scrape (line 57) | async def _fallback_scrape(url: str) -> str:

FILE: examples/just-the-facts/main.py
  function main (line 1) | def main():

FILE: examples/just-the-facts/test_scraper.py
  function test_scraper (line 9) | async def test_scraper():

FILE: examples/mcp-rl/all_experiments.py
  class McpPolicyConfig (line 6) | class McpPolicyConfig(BaseModel):

FILE: examples/mcp-rl/mcp_rl/benchmarks/generate_benchmarks.py
  function generate_val_groups (line 31) | async def generate_val_groups(
  function calculate_beat_comp (line 46) | async def calculate_beat_comp(
  function log_comparison_model (line 102) | async def log_comparison_model(
  function run_benchmarks (line 124) | async def run_benchmarks(server: str = "mcp_alphavantage"):

FILE: examples/mcp-rl/mcp_rl/checks.py
  function check_successful (line 24) | async def check_successful(trajectory: art.Trajectory) -> bool:

FILE: examples/mcp-rl/mcp_rl/mcp_server.py
  class McpServer (line 13) | class McpServer(ABC):
    method __init__ (line 16) | def __init__(self):
    method start (line 21) | async def start(self) -> None:
    method stop (line 26) | async def stop(self) -> None:
    method get_tools (line 31) | async def get_tools(self) -> List[Dict[str, Any]]:
    method apply_tool (line 40) | async def apply_tool(self, tool_name: str, arguments: Dict[str, Any]) ...
    method __aenter__ (line 52) | async def __aenter__(self):
    method __aexit__ (line 57) | async def __aexit__(self, exc_type, exc_val, exc_tb):
  class LocalMcpServer (line 62) | class LocalMcpServer(McpServer):
    method __init__ (line 65) | def __init__(self):
    method init_server (line 75) | async def init_server(self) -> StdioServerParameters:
    method start (line 83) | async def start(self) -> None:
    method stop (line 98) | async def stop(self) -> None:
    method get_tools (line 111) | async def get_tools(self) -> List[Dict[str, Any]]:
    method apply_tool (line 139) | async def apply_tool(self, tool_name: str, arguments: Dict[str, Any]) ...
  class RemoteMcpServer (line 178) | class RemoteMcpServer(McpServer):
    method __init__ (line 181) | def __init__(self, api_endpoint: str):
    method start (line 191) | async def start(self) -> None:
    method stop (line 200) | async def stop(self) -> None:
    method get_tools (line 206) | async def get_tools(self) -> List[Dict[str, Any]]:
    method apply_tool (line 243) | async def apply_tool(self, tool_name: str, arguments: Dict[str, Any]) ...
  class AlphaMcpServer (line 298) | class AlphaMcpServer(LocalMcpServer):
    method __init__ (line 301) | def __init__(self, api_key: str = "demo"):
    method init_server (line 310) | async def init_server(self) -> StdioServerParameters:

FILE: examples/mcp-rl/mcp_rl/rollout.py
  class McpScenario (line 34) | class McpScenario:
  function rollout (line 43) | async def rollout(
  function test_rollout (line 254) | async def test_rollout():
  function main (line 272) | async def main():

FILE: examples/mcp-rl/mcp_rl/scenario_generator.py
  function generate_scenarios (line 19) | async def generate_scenarios(
  function main (line 195) | async def main():

FILE: examples/mcp-rl/mcp_rl/train.py
  function train_mcp_agent (line 33) | async def train_mcp_agent(model: art.TrainableModel):
  function main (line 172) | def main():

FILE: examples/mcp-rl/mcp_rl/utils.py
  function get_content_text (line 4) | def get_content_text(result: types.CallToolResult) -> str:

FILE: examples/mcp-rl/servers/python/mcp_alphavantage/server.py
  class AlphaVantageClient (line 16) | class AlphaVantageClient:
    method __init__ (line 19) | def __init__(self, api_key: str):
    method fetch_data (line 23) | async def fetch_data(self, function: str, **params) -> Dict[str, Any]:
  function main (line 62) | def main(api_key: Optional[str], port: int, transport: str) -> int:
  function _format_json (line 359) | def _format_json(data: Dict[str, Any]) -> str:

FILE: examples/mcp-rl/servers/python/mcp_balldontlie/server.py
  class BalldontlieClient (line 16) | class BalldontlieClient:
    method __init__ (line 19) | def __init__(self, api_key: str):
    method fetch_data (line 23) | async def fetch_data(self, endpoint: str, **params) -> Dict[str, Any]:
  function main (line 51) | def main(api_key: Optional[str], port: int, transport: str) -> int:
  function _format_json (line 318) | def _format_json(data: Dict[str, Any]) -> str:

FILE: examples/mcp-rl/servers/python/mcp_googlemaps/server.py
  class GoogleMapsClient (line 16) | class GoogleMapsClient:
    method __init__ (line 19) | def __init__(self, api_key: str):
    method geocode (line 24) | async def geocode(self, address: str, **params) -> Dict[str, Any]:
    method reverse_geocode (line 48) | async def reverse_geocode(self, lat: float, lng: float, **params) -> D...
    method places_nearby_search (line 72) | async def places_nearby_search(
    method places_text_search (line 101) | async def places_text_search(self, query: str, **params) -> Dict[str, ...
    method place_details (line 124) | async def place_details(self, place_id: str, **params) -> Dict[str, Any]:
    method place_autocomplete (line 147) | async def place_autocomplete(self, input_text: str, **params) -> Dict[...
  function main (line 180) | def main(api_key: Optional[str], port: int, transport: str) -> int:
  function _format_json (line 486) | def _format_json(data: Dict[str, Any]) -> str:

FILE: examples/mcp-rl/test_scenario_generation.py
  function create_sample_tools (line 15) | def create_sample_tools() -> List[MCPTool]:
  function create_sample_resources (line 85) | def create_sample_resources() -> List[MCPResource]:
  function test_basic_scenario_generation (line 109) | async def test_basic_scenario_generation():
  function test_scenario_generation_with_resources (line 150) | async def test_scenario_generation_with_resources():
  function test_dict_input_compatibility (line 182) | async def test_dict_input_compatibility():
  function test_error_handling (line 239) | async def test_error_handling():
  function test_tool_resource_classes (line 275) | def test_tool_resource_classes():
  function main (line 318) | async def main():

FILE: examples/openenv_echo.py
  function rollout (line 25) | async def rollout(model: art.TrainableModel, env_client: EchoEnv) -> art...
  function main (line 58) | async def main() -> None:

FILE: examples/temporal_clue/temporal-clue.py
  class TemporalCluePuzzle (line 16) | class TemporalCluePuzzle(TypedDict):
  function rollout (line 35) | async def rollout(model: art.Model, puzzle: TemporalCluePuzzle) -> art.T...
  function main (line 56) | async def main():

FILE: examples/tic_tac_toe/game_utils.py
  class TicTacToeGame (line 6) | class TicTacToeGame(TypedDict):
  function generate_game (line 12) | def generate_game(board_length: int = 3) -> TicTacToeGame:
  function render_board (line 23) | def render_board(game: TicTacToeGame) -> str:
  function get_opponent_move (line 39) | def get_opponent_move(game: TicTacToeGame) -> tuple[int, int]:
  function apply_agent_move (line 47) | def apply_agent_move(game: TicTacToeGame, move: str) -> None:
  function check_winner (line 80) | def check_winner(board: list[list[str]]) -> Literal["x", "o", "draw", No...

FILE: examples/tic_tac_toe/rollout.py
  class TicTacToeScenario (line 22) | class TicTacToeScenario(BaseModel):
  function rollout (line 28) | async def rollout(model: art.Model, scenario: TicTacToeScenario) -> art....

FILE: examples/tic_tac_toe/tic-tac-toe.py
  function main (line 25) | async def main():

FILE: examples/tic_tac_toe_self_play/deploy_step.py
  function deploy_step (line 12) | async def deploy_step():

FILE: examples/tic_tac_toe_self_play/game_utils.py
  class TicTacToeGame (line 5) | class TicTacToeGame(TypedDict):
  function generate_game (line 9) | def generate_game(board_length: int = 3) -> TicTacToeGame:
  function render_board (line 29) | def render_board(game: TicTacToeGame) -> str:
  function unwrap_move (line 45) | def unwrap_move(move: str) -> str:
  function apply_agent_move (line 53) | def apply_agent_move(
  function check_winner (line 82) | def check_winner(board: list[list[str]]) -> Literal["x", "o", "draw", No...

FILE: examples/tic_tac_toe_self_play/gather_trajectory_groups_by_index.py
  function gather_trajectory_groups_by_index (line 10) | async def gather_trajectory_groups_by_index(

FILE: examples/tic_tac_toe_self_play/rollout.py
  class PlayerState (line 25) | class PlayerState(BaseModel):
  class ModelConfig (line 31) | class ModelConfig(BaseModel):
  function get_agent_move (line 35) | async def get_agent_move(
  function record_first_move_metrics (line 95) | def record_first_move_metrics(trajectory: art.Trajectory, square: str) -...
  class TicTacToeScenario (line 102) | class TicTacToeScenario(BaseModel):
  function rollout (line 112) | async def rollout(

FILE: examples/tic_tac_toe_self_play/train.py
  function main (line 27) | async def main():

FILE: examples/tic_tac_toe_self_play/train_o4_mini.py
  function main (line 24) | async def main():

FILE: scripts/bump_version.py
  function get_current_version (line 17) | def get_current_version():
  function bump_version (line 27) | def bump_version(current_version, bump_type):
  function update_version (line 41) | def update_version(new_version):
  function main (line 57) | def main():

FILE: scripts/ci/compute_uv_fingerprint.py
  function _sha256_file (line 13) | def _sha256_file(path: Path) -> str:
  function _build_parser (line 17) | def _build_parser() -> argparse.ArgumentParser:
  function main (line 54) | def main() -> int:

FILE: scripts/deploy-model.py
  function parse_args (line 20) | def parse_args() -> argparse.Namespace:
  function deploy (line 50) | async def deploy() -> None:

FILE: scripts/migrate-s3-checkpoints.py
  function main (line 25) | async def main():

FILE: src/art/__init__.py
  class GuidedDecodingParams (line 18) | class GuidedDecodingParams:
    method __init__ (line 21) | def __init__(self, **kwargs):

FILE: src/art/api_costs.py
  class TokenPricing (line 22) | class TokenPricing:
  class _OpenAITokenUsage (line 31) | class _OpenAITokenUsage:
  class _AnthropicTokenUsage (line 38) | class _AnthropicTokenUsage:
  function _configured_token_pricing (line 60) | def _configured_token_pricing(model_name: str) -> TokenPricing | None:
  function normalize_provider (line 74) | def normalize_provider(provider: str | None) -> str | None:
  function _read_usage_field (line 83) | def _read_usage_field(usage: Any, field: str) -> float | None:
  function _read_usage_nested_field (line 95) | def _read_usage_nested_field(usage: Any, *fields: str) -> float | None:
  function _read_field (line 109) | def _read_field(container: Any, field: str) -> Any:
  function _coerce_float (line 117) | def _coerce_float(value: Any) -> float | None:
  function _extract_cost_value (line 126) | def _extract_cost_value(value: Any) -> float | None:
  function _response_usage (line 133) | def _response_usage(response: Any) -> Any:
  function _extract_direct_response_cost (line 139) | def _extract_direct_response_cost(response: Any) -> float | None:
  function _extract_openai_token_counts (line 157) | def _extract_openai_token_counts(response: Any) -> _OpenAITokenUsage | N...
  function _extract_anthropic_token_counts (line 178) | def _extract_anthropic_token_counts(response: Any) -> _AnthropicTokenUsa...
  function _estimate_openai_cost (line 201) | def _estimate_openai_cost(
  function _estimate_anthropic_cost (line 229) | def _estimate_anthropic_cost(
  function _estimate_provider_cost (line 262) | def _estimate_provider_cost(
  function _resolve_registered_or_default_pricing (line 277) | def _resolve_registered_or_default_pricing(
  function _merge_token_pricing (line 288) | def _merge_token_pricing(
  function normalize_model_name (line 326) | def normalize_model_name(model_name: str | None) -> str | None:
  function _resolve_token_pricing (line 335) | def _resolve_token_pricing(
  function extract_api_cost (line 400) | def extract_api_cost(
  function _record_api_cost (line 449) | def _record_api_cost(
  function track_api_cost (line 483) | def track_api_cost(

FILE: src/art/auto_trajectory.py
  function parse_sse_to_chat_completion (line 16) | def parse_sse_to_chat_completion(content: bytes) -> ChatCompletion:
  function auto_trajectory (line 46) | def auto_trajectory(*, required: Literal[True]) -> Trajectory: ...
  function auto_trajectory (line 50) | def auto_trajectory(*, required: Literal[False] = False) -> Trajectory |...
  function auto_trajectory (line 53) | def auto_trajectory(*, required: bool = False) -> Trajectory | None:
  function capture_auto_trajectory (line 64) | async def capture_auto_trajectory(coroutine: Coroutine[Any, Any, Any]) -...
  class AutoTrajectoryContext (line 70) | class AutoTrajectoryContext:
    method __init__ (line 71) | def __init__(self) -> None:
    method __enter__ (line 74) | def __enter__(self) -> Trajectory:
    method __exit__ (line 78) | def __exit__(self, exc_type: Any, exc_value: Any, traceback: Any) -> N...
    method handle_httpx_response (line 82) | def handle_httpx_response(self, response: httpx._models.Response) -> N...
  function patch_httpx (line 143) | def patch_httpx() -> None:

FILE: src/art/backend.py
  class Backend (line 15) | class Backend(Protocol):
    method _model_inference_name (line 18) | def _model_inference_name(
    method close (line 22) | async def close(self) -> None: ...
    method register (line 24) | async def register(self, model: AnyModel) -> None: ...
    method _get_step (line 26) | async def _get_step(self, model: AnyTrainableModel) -> int: ...
    method _delete_checkpoint_files (line 28) | async def _delete_checkpoint_files(
    method _prepare_backend_for_training (line 32) | async def _prepare_backend_for_training(
    method train (line 38) | async def train(
    method _train_model (line 45) | def _train_model(
    method _train_sft (line 54) | def _train_sft(

FILE: src/art/batches.py
  function trajectory_group_batches (line 10) | async def trajectory_group_batches(

FILE: src/art/cli.py
  function _get_skill_path (line 20) | def _get_skill_path(skill_name: str) -> Path:
  function _chat_with_skill (line 39) | def _chat_with_skill(skill_name: str) -> None:
  function train_sft (line 122) | def train_sft() -> None:
  function train_rl (line 128) | def train_rl() -> None:
  function install_skills (line 134) | def install_skills(
  function migrate (line 177) | def migrate(
  function run (line 286) | def run(host: str = "0.0.0.0", port: int = 7999) -> None:

FILE: src/art/costs.py
  class ModelPricing (line 10) | class ModelPricing:
  function get_model_pricing (line 71) | def get_model_pricing(
  function tokens_to_cost (line 87) | def tokens_to_cost(num_tokens: float, price_per_million: float) -> float:
  function compute_sample_costs (line 92) | def compute_sample_costs(
  function build_cost_calculator (line 113) | def build_cost_calculator(pricing: ModelPricing) -> CostCalculator:
  function compute_train_cost (line 131) | def compute_train_cost(train_tokens: float, pricing: ModelPricing) -> fl...

FILE: src/art/dev/engine.py
  class EngineArgs (line 6) | class EngineArgs(TypedDict, total=False):

FILE: src/art/dev/get_model_config.py
  function get_model_config (line 6) | def get_model_config(

FILE: src/art/dev/model.py
  class OptimizerNames (line 9) | class OptimizerNames(str, Enum):
  class DebugOption (line 58) | class DebugOption(str, Enum):
  class IntervalStrategy (line 64) | class IntervalStrategy(str, Enum):
  class HubStrategy (line 75) | class HubStrategy(str, Enum):
  class SchedulerType (line 83) | class SchedulerType(str, Enum):
  class FSDPOption (line 98) | class FSDPOption(str, Enum):
  class InternalModelConfig (line 108) | class InternalModelConfig(TypedDict, total=False):
  class TinkerArgs (line 135) | class TinkerArgs(TypedDict, total=False):
  class TinkerNativeArgs (line 140) | class TinkerNativeArgs(TypedDict, total=False):
  class TinkerTrainingClientArgs (line 145) | class TinkerTrainingClientArgs(TypedDict, total=False):
  class InitArgs (line 154) | class InitArgs(TypedDict, total=False):
  class PeftArgs (line 180) | class PeftArgs(TypedDict, total=False):
  class TrainerArgs (line 198) | class TrainerArgs(TypedDict, total=False):

FILE: src/art/dev/openai_server.py
  function get_openai_server_config (line 8) | def get_openai_server_config(
  class OpenAIServerConfig (line 50) | class OpenAIServerConfig(TypedDict, total=False):
  class ServerArgs (line 68) | class ServerArgs(TypedDict, total=False):

FILE: src/art/dev/train.py
  class TrainConfig (line 6) | class TrainConfig(TypedDict, total=False):
  class TrainSFTConfig (line 34) | class TrainSFTConfig(TypedDict, total=False):

FILE: src/art/dev/validate.py
  function is_dedicated_mode (line 6) | def is_dedicated_mode(config: InternalModelConfig) -> bool:
  function validate_dedicated_config (line 11) | def validate_dedicated_config(config: InternalModelConfig) -> None:

FILE: src/art/errors.py
  class ARTError (line 8) | class ARTError(Exception):
    method __init__ (line 11) | def __init__(self, message: str, status_code: int):
  class ForbiddenBucketCreationError (line 17) | class ForbiddenBucketCreationError(ARTError):
    method __init__ (line 26) | def __init__(self, message: str):
  class UnsupportedLoRADeploymentProviderError (line 30) | class UnsupportedLoRADeploymentProviderError(ARTError):
    method __init__ (line 37) | def __init__(self, message: str):
  class UnsupportedBaseModelDeploymentError (line 41) | class UnsupportedBaseModelDeploymentError(ARTError):
    method __init__ (line 48) | def __init__(self, message: str):
  class LoRADeploymentTimedOutError (line 52) | class LoRADeploymentTimedOutError(ARTError):
    method __init__ (line 58) | def __init__(self, message: str):

FILE: src/art/gather.py
  function gather_trajectory_groups (line 14) | async def gather_trajectory_groups(
  function gather_trajectories (line 69) | async def gather_trajectories(
  function gather_trajectories (line 79) | async def gather_trajectories(
  function gather_trajectories (line 89) | async def gather_trajectories(
  function gather_trajectories (line 99) | async def gather_trajectories(
  function gather_trajectories (line 108) | async def gather_trajectories(
  function wrap_group_awaitable (line 140) | async def wrap_group_awaitable(
  function wrap_trajectories_awaitable (line 159) | async def wrap_trajectories_awaitable(
  function record_metrics (line 183) | def record_metrics(context: "GatherContext", trajectory: Trajectory) -> ...
  class GatherContext (line 203) | class GatherContext:
    method update_pbar (line 211) | def update_pbar(self, n: int) -> None:
    method too_many_exceptions (line 228) | def too_many_exceptions(self) -> bool:
    method reset (line 237) | def reset(self) -> None:
  function set_gather_context (line 248) | def set_gather_context(context: GatherContext) -> Iterator[None]:
  function get_gather_context (line 256) | def get_gather_context() -> GatherContext:

FILE: src/art/guided_completion.py
  function freeze_tool_schema (line 16) | def freeze_tool_schema(tool: dict, fixed_args: dict) -> ChatCompletionTo...
  function get_guided_completion_params (line 33) | def get_guided_completion_params(

FILE: src/art/langgraph/llm_wrapper.py
  function add_thread (line 26) | def add_thread(thread_id, base_url, api_key, model):
  function create_messages_from_logs (line 40) | def create_messages_from_logs(log_path: str, trajectory: Trajectory):
  function wrap_rollout (line 95) | def wrap_rollout(model, fn):
  function init_chat_model (line 110) | def init_chat_model(
  class LoggingLLM (line 130) | class LoggingLLM(Runnable):
    method __init__ (line 131) | def __init__(self, llm, logger, structured_output=None, tools=None):
    method _log (line 137) | def _log(self, completion_id, input, output):
    method invoke (line 142) | def invoke(self, input, config=None, **kwargs):
    method ainvoke (line 164) | async def ainvoke(self, input, config=None, **kwargs):
    method with_structured_output (line 191) | def with_structured_output(self, tools):
    method bind_tools (line 199) | def bind_tools(self, tools):
    method with_retry (line 202) | def with_retry(
    method with_config (line 212) | def with_config(

FILE: src/art/langgraph/logging.py
  class FileLogger (line 5) | class FileLogger:
    method __init__ (line 6) | def __init__(self, filepath):
    method log (line 10) | def log(self, name, entry):
    method load_logs (line 19) | def load_logs(self):

FILE: src/art/langgraph/message_utils.py
  function make_message_param (line 47) | def make_message_param(role: str, **kwargs) -> ChatCompletionMessageParam:
  function langchain_msg_to_openai (line 54) | def langchain_msg_to_openai(msg: BaseMessage):
  function convert_langgraph_messages (line 90) | def convert_langgraph_messages(messages: List[object]) -> MessagesAndCho...

FILE: src/art/local/backend.py
  class LocalBackend (line 74) | class LocalBackend(Backend):
    method __init__ (line 75) | def __init__(
    method supports_automatic_train_step_metrics (line 109) | def supports_automatic_train_step_metrics(self) -> bool:
    method automatic_gpu_cost_per_hour_usd (line 112) | def automatic_gpu_cost_per_hour_usd(self, model: Model) -> float | None:
    method _resolve_gpu_cost_per_hour_usd (line 122) | def _resolve_gpu_cost_per_hour_usd(self) -> float | None:
    method _allocated_gpu_count (line 148) | def _allocated_gpu_count(self, model: Model) -> int:
    method __enter__ (line 160) | def __enter__(self) -> Self:
    method __aenter__ (line 163) | async def __aenter__(self) -> Self:
    method __exit__ (line 166) | def __exit__(
    method __aexit__ (line 174) | async def __aexit__(
    method close (line 182) | async def close(self) -> None:
    method _close (line 196) | def _close(self) -> None:
    method register (line 203) | async def register(
    method _model_inference_name (line 235) | def _model_inference_name(self, model: Model, step: int | None = None)...
    method _get_service (line 271) | async def _get_service(self, model: TrainableModel) -> ModelService:
    method _get_packed_tensors (line 317) | def _get_packed_tensors(
    method _get_step (line 383) | async def _get_step(self, model: AnyTrainableModel) -> int:
    method __get_step (line 386) | def __get_step(self, model: Model) -> int:
    method _delete_checkpoint_files (line 393) | async def _delete_checkpoint_files(
    method _prepare_backend_for_training (line 412) | async def _prepare_backend_for_training(
    method _monitor_openai_server (line 443) | async def _monitor_openai_server(
    method _trajectory_log (line 513) | def _trajectory_log(self, trajectory: Trajectory) -> str:
    method train (line 525) | async def train(  # type: ignore[override]
    method _train_model (line 731) | async def _train_model(
    method _train_sft (line 851) | async def _train_sft(
    method _experimental_pull_model_checkpoint (line 957) | async def _experimental_pull_model_checkpoint(
    method _experimental_pull_from_s3 (line 1097) | async def _experimental_pull_from_s3(
    method _experimental_push_to_s3 (line 1178) | async def _experimental_push_to_s3(
    method _experimental_fork_checkpoint (line 1198) | async def _experimental_fork_checkpoint(

FILE: src/art/local/checkpoints.py
  function delete_checkpoints (line 7) | def delete_checkpoints(output_dir: str, excluding: list[int]) -> None:
  function get_last_checkpoint_dir (line 23) | def get_last_checkpoint_dir(output_dir: str) -> str | None:
  function migrate_checkpoints_to_new_structure (line 33) | def migrate_checkpoints_to_new_structure(output_dir: str) -> None:

FILE: src/art/local/service.py
  class ModelService (line 9) | class ModelService(Protocol):
    method __init__ (line 10) | def __init__(
    method start_openai_server (line 19) | async def start_openai_server(
    method vllm_engine_is_sleeping (line 23) | async def vllm_engine_is_sleeping(self) -> bool: ...
    method train (line 25) | def train(
    method train_sft (line 33) | def train_sft(

FILE: src/art/loss.py
  class Loss (line 14) | class Loss(BaseModel):
  function loss_fn (line 23) | def loss_fn(
  function shift_tensor (line 145) | def shift_tensor(tensor: torch.Tensor, pad: int | float | bool) -> torch...

FILE: src/art/mcp/generate_scenarios.py
  function preview_scenarios (line 13) | def preview_scenarios(scenarios: List[Dict[str, Any]], n: int = 5):
  function generate_scenarios (line 27) | async def generate_scenarios(

FILE: src/art/mcp/types.py
  class MCPTool (line 11) | class MCPTool:
    method from_dict (line 19) | def from_dict(cls, data: Dict[str, Any]) -> "MCPTool":
    method to_dict (line 27) | def to_dict(self) -> Dict[str, Any]:
    method to_tool_schema (line 35) | def to_tool_schema(self) -> Dict[str, Any]:
  class MCPResource (line 44) | class MCPResource:
    method from_dict (line 53) | def from_dict(cls, data: Dict[str, Any]) -> "MCPResource":
    method to_dict (line 62) | def to_dict(self) -> Dict[str, Any]:
  class GeneratedScenario (line 71) | class GeneratedScenario:
    method __post_init__ (line 77) | def __post_init__(self):
    method from_dict (line 82) | def from_dict(cls, data: Dict[str, Any]) -> "GeneratedScenario":
    method to_dict (line 86) | def to_dict(self) -> Dict[str, Any]:
    method preview (line 90) | def preview(self, max_length: int = 120) -> str:
  class GeneratedScenarioCollection (line 97) | class GeneratedScenarioCollection:
    method __init__ (line 100) | def __init__(self, scenarios: List[GeneratedScenario]):
    method from_dicts (line 104) | def from_dicts(cls, data: List[Dict[str, Any]]) -> "GeneratedScenarioC...
    method from_json (line 110) | def from_json(cls, json_str: str) -> "GeneratedScenarioCollection":
    method to_dicts (line 119) | def to_dicts(self) -> List[Dict[str, Any]]:
    method to_json (line 123) | def to_json(self, indent: Optional[int] = None) -> str:
    method __len__ (line 127) | def __len__(self) -> int:
    method __iter__ (line 130) | def __iter__(self):
    method __getitem__ (line 133) | def __getitem__(self, index):
    method shuffle (line 136) | def shuffle(self) -> "GeneratedScenarioCollection":
    method split (line 142) | def split(
    method filter_by_difficulty (line 158) | def filter_by_difficulty(
    method get_difficulty_distribution (line 169) | def get_difficulty_distribution(self) -> Counter:
    method preview (line 173) | def preview(self, n: int = 5, max_task_length: int = 120) -> None:
    method print_difficulty_distribution (line 184) | def print_difficulty_distribution(self) -> None:
    method get_summary (line 193) | def get_summary(self) -> Dict[str, Any]:

FILE: src/art/megatron/backend.py
  class MegatronBackend (line 9) | class MegatronBackend(LocalBackend):
    method __init__ (line 10) | def __init__(
    method _get_service (line 18) | async def _get_service(self, model: TrainableModel) -> ModelService:

FILE: src/art/megatron/flex_attention.py
  class SharedPrefixAttentionState (line 21) | class SharedPrefixAttentionState(BaseModel):
  class FlexAttentionWrapper (line 28) | class FlexAttentionWrapper(torch.nn.Module):
    method forward (line 42) | def forward(
  function create_shared_prefix_attention_state (line 69) | def create_shared_prefix_attention_state(
  class FlexDotProductAttention (line 107) | class FlexDotProductAttention(torch.nn.Module):
    method __init__ (line 113) | def __init__(
    method forward (line 158) | def forward(

FILE: src/art/megatron/lora.py
  class LoRA (line 19) | class LoRA(torch.nn.Module):
    method __init__ (line 20) | def __init__(
    method num_local_experts (line 51) | def num_local_experts(self) -> int:
    method reset_lora_parameters (line 54) | def reset_lora_parameters(self) -> None:
    method load_lora (line 63) | def load_lora(self, adapter_model: dict[str, torch.Tensor]) -> None:
    method load_weights (line 79) | def load_weights(
    method load_weight (line 102) | def load_weight(self, weight: torch.Tensor, *, into: torch.nn.Paramete...
    method sharded_lora_state_dict (line 119) | def sharded_lora_state_dict(self) -> dict[str, torch.Tensor]:
    method forward (line 145) | def forward(
  class SelfAttentionLinearProjLoRA (line 165) | class SelfAttentionLinearProjLoRA(torch.nn.Module):
    method __init__ (line 166) | def __init__(
    method forward (line 188) | def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor...
  class SelfAttentionLinearQKVLoRA (line 205) | class SelfAttentionLinearQKVLoRA(torch.nn.Module):
    method __init__ (line 206) | def __init__(
    method forward (line 262) | def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor...
  class MLPExpertsLinearFC1LoRA (line 293) | class MLPExpertsLinearFC1LoRA(torch.nn.Module):
    method __init__ (line 294) | def __init__(
    method forward (line 327) | def forward(
  class MLPExpertsLinearFC2LoRA (line 337) | class MLPExpertsLinearFC2LoRA(torch.nn.Module):
    method __init__ (line 338) | def __init__(
    method forward (line 361) | def forward(
  function apply_lora_adapters (line 369) | def apply_lora_adapters(

FILE: src/art/megatron/offload.py
  class OffloadState (line 9) | class OffloadState:
  function offload_to_cpu (line 14) | def offload_to_cpu(
  function reload_to_gpu (line 88) | def reload_to_gpu(

FILE: src/art/megatron/provider.py
  function _resolve_layer_spec (line 16) | def _resolve_layer_spec(
  function get_provider (line 31) | def get_provider(model: str) -> GPTModelProvider:

FILE: src/art/megatron/service.py
  class MegatronTrainingJob (line 31) | class MegatronTrainingJob(BaseModel):
  class MegatronService (line 42) | class MegatronService:
    method _next_lora_id (line 53) | def _next_lora_id(self) -> int:
    method _get_optimizer_state_path (line 57) | def _get_optimizer_state_path(self) -> str:
    method _default_lora_adapter_config (line 64) | def _default_lora_adapter_config(self) -> LoraConfig:
    method _adapter_has_weights (line 81) | def _adapter_has_weights(self, lora_path: str) -> bool:
    method _create_identity_lora (line 95) | def _create_identity_lora(self, lora_path: str) -> None:
    method _ensure_identity_lora (line 119) | def _ensure_identity_lora(self, lora_path: str) -> None:
    method _ensure_lora_adapter_config (line 124) | def _ensure_lora_adapter_config(
    method _add_lora_aliases (line 139) | async def _add_lora_aliases(
    method register_lora_for_step (line 153) | async def register_lora_for_step(self, step: int, checkpoint_dir: str)...
    method _ensure_megatron_running (line 159) | async def _ensure_megatron_running(self) -> None:
    method start_openai_server (line 184) | async def start_openai_server(
    method vllm_engine_is_sleeping (line 212) | async def vllm_engine_is_sleeping(self) -> bool:
    method train (line 215) | async def train(
    method train_sft (line 299) | async def train_sft(
    method _merge_lora_adapter (line 307) | def _merge_lora_adapter(self, lora_path: str) -> None:
    method llm (line 336) | def llm(self) -> asyncio.Task[AsyncLLM]:

FILE: src/art/megatron/train.py
  function _set_cache_dir (line 5) | def _set_cache_dir(env_var: str, default_path: str) -> None:
  function freeze_model (line 52) | def freeze_model(model_chunks: list[MegatronModule]) -> list[MegatronMod...
  function _preprocess_hook (line 80) | def _preprocess_hook(*args, **kwargs):
  class TrainingJob (line 125) | class TrainingJob(BaseModel):
  function print0 (line 133) | def print0(*values: Any) -> None:

FILE: src/art/metrics.py
  function is_cumulative_metric_key (line 27) | def is_cumulative_metric_key(key: str) -> bool:
  function is_builder_managed_metric (line 32) | def is_builder_managed_metric(key: str) -> bool:
  function to_cumulative_metric_key (line 36) | def to_cumulative_metric_key(key: str) -> str:
  class _PendingMetricsState (line 47) | class _PendingMetricsState:
  class _SharedMetricsState (line 53) | class _SharedMetricsState:
  function _new_pending_metrics_state (line 62) | def _new_pending_metrics_state() -> _PendingMetricsState:
  function _new_shared_metrics_state (line 66) | def _new_shared_metrics_state() -> _SharedMetricsState:
  class MetricsBuilder (line 77) | class MetricsBuilder:
    method __init__ (line 80) | def __init__(
    method add_cost (line 98) | def add_cost(self, path: str, usd: float) -> None:
    method add_response_cost (line 104) | def add_response_cost(
    method add_metric (line 139) | def add_metric(self, key: str, value: float) -> None:
    method add_data (line 144) | def add_data(
    method add_user_timing (line 159) | def add_user_timing(
    method add_idle_times (line 172) | def add_idle_times(
    method measure (line 186) | def measure(self, key: str):
    method flush (line 193) | async def flush(self) -> dict[str, float]:
    method activate (line 223) | def activate(self) -> Token["MetricsBuilder"]:
    method activate_context (line 227) | def activate_context(self):
    method get_active (line 235) | def get_active() -> "MetricsBuilder":
    method for_cost_context (line 238) | def for_cost_context(
    method register_cost_extractor (line 262) | def register_cost_extractor(self, provider: str, extractor: CostExtrac...
    method register_model_pricing (line 268) | def register_model_pricing(
    method state_dict (line 301) | def state_dict(self) -> dict[str, Any]:
    method load_state_dict (line 307) | def load_state_dict(self, state: dict[str, Any]) -> None:
    method _validate_and_add (line 319) | def _validate_and_add(self, key: str, value: float) -> None:
    method _pending_state (line 340) | def _pending_state(self) -> _PendingMetricsState:
    method _compute_rollups (line 347) | def _compute_rollups(self, cost_metrics: dict[str, float]) -> dict[str...
    method _update_throughput_metrics (line 376) | def _update_throughput_metrics(self, result: dict[str, float]) -> None:

FILE: src/art/metrics_taxonomy.py
  function average_metric_samples (line 12) | def average_metric_samples(
  class TrajectoryBatchSummary (line 46) | class TrajectoryBatchSummary:
  function summarize_trajectory_groups (line 54) | def summarize_trajectory_groups(
  function build_data_metrics_from_summary (line 79) | def build_data_metrics_from_summary(
  function build_training_summary_metrics (line 94) | def build_training_summary_metrics(
  function _group_is_trainable (line 105) | def _group_is_trainable(group: TrajectoryGroup) -> bool:
  function _extract_scenario_id (line 110) | def _extract_scenario_id(group: TrajectoryGroup) -> str | None:
  function _extract_scenario_id_from_metadata (line 121) | def _extract_scenario_id_from_metadata(

FILE: src/art/model.py
  class _OpenAIChatCompletionsProxy (line 41) | class _OpenAIChatCompletionsProxy:
    method __init__ (line 42) | def __init__(self, completions: Any, record_costs: Any) -> None:
    method create (line 46) | async def create(self, *args: Any, **kwargs: Any) -> Any:
    method __getattr__ (line 51) | def __getattr__(self, name: str) -> Any:
  class _OpenAIChatProxy (line 55) | class _OpenAIChatProxy:
    method __init__ (line 56) | def __init__(self, chat: Any, record_costs: Any) -> None:
    method __getattr__ (line 60) | def __getattr__(self, name: str) -> Any:
  class _OpenAIClientProxy (line 64) | class _OpenAIClientProxy:
    method __init__ (line 65) | def __init__(self, client: Any, record_costs: Any) -> None:
    method with_options (line 70) | def with_options(self, *args: Any, **kwargs: Any) -> "_OpenAIClientPro...
    method __getattr__ (line 76) | def __getattr__(self, name: str) -> Any:
  class Model (line 95) | class Model(
    method __init__ (line 167) | def __init__(
    method __new__ (line 208) | def __new__(
    method __new__ (line 224) | def __new__(
    method __new__ (line 239) | def __new__(  # pyright: ignore[reportInconsistentOverload]
    method safe_model_dump (line 246) | def safe_model_dump(self, *args, **kwargs) -> dict:
    method backend (line 255) | def backend(self) -> "Backend":
    method register (line 262) | async def register(self, backend: "Backend") -> None:
    method openai_client (line 274) | def openai_client(
    method litellm_completion_params (line 314) | def litellm_completion_params(self, step: int | None = None) -> dict:
    method get_inference_name (line 336) | def get_inference_name(self, step: int | None = None) -> str:
    method _record_openai_completion_costs (line 361) | def _record_openai_completion_costs(self, _response: Any) -> None:
    method _get_output_dir (line 365) | def _get_output_dir(self) -> str:
    method overwrite_state (line 369) | def overwrite_state(self, state: StateType) -> None:
    method write_state (line 395) | def write_state(self, state: StateType) -> None:
    method merge_state (line 404) | def merge_state(self, state: StateType) -> StateType:
    method _deep_merge_dicts (line 419) | def _deep_merge_dicts(
    method _merge_wandb_config (line 435) | def _merge_wandb_config(
    method read_state (line 462) | def read_state(self) -> StateType | None:
    method update_wandb_config (line 481) | def update_wandb_config(
    method _sync_wandb_config (line 503) | def _sync_wandb_config(
    method _get_wandb_run (line 518) | def _get_wandb_run(self) -> Optional["Run"]:
    method _log_metrics (line 569) | def _log_metrics(
    method _define_wandb_step_metrics (line 623) | def _define_wandb_step_metrics(self, keys: Iterable[str]) -> None:
    method _route_metrics_and_collect_non_costs (line 636) | def _route_metrics_and_collect_non_costs(
    method _collect_automatic_backend_metrics (line 658) | def _collect_automatic_backend_metrics(
    method _add_default_step_metrics (line 698) | def _add_default_step_metrics(
    method metrics_builder (line 724) | def metrics_builder(self, cost_context: str | None = None) -> MetricsB...
    method activate_metrics_context (line 730) | def activate_metrics_context(self, cost_context: str) -> Token[Metrics...
    method _metrics_builder_for_split (line 733) | def _metrics_builder_for_split(self, split: str) -> MetricsBuilder:
    method _load_metrics_builder_state (line 740) | def _load_metrics_builder_state(self) -> None:
    method _persist_metrics_builder_state (line 749) | def _persist_metrics_builder_state(self) -> None:
    method _normalize_trajectory_groups (line 754) | def _normalize_trajectory_groups(
    method log (line 773) | async def log(
    method get_step (line 934) | async def get_step(self) -> int:
  class TrainableModel (line 948) | class TrainableModel(Model[ModelConfig, StateType], Generic[ModelConfig,...
    method __init__ (line 957) | def __init__(
    method cost_calculator (line 989) | def cost_calculator(self) -> CostCalculator:
    method set_cost_calculator (line 992) | def set_cost_calculator(self, calculator: CostCalculator | None) -> None:
    method _noop_cost_calculator (line 1000) | def _noop_cost_calculator(
    method _record_openai_completion_costs (line 1007) | def _record_openai_completion_costs(self, _response: Any) -> None:
    method __new__ (line 1036) | def __new__(
    method __new__ (line 1051) | def __new__(
    method __new__ (line 1065) | def __new__(  # pyright: ignore[reportInconsistentOverload]
    method model_dump (line 1072) | def model_dump(self, *args, **kwargs) -> dict:
    method safe_model_dump (line 1077) | def safe_model_dump(self, *args, **kwargs) -> dict:
    method register (line 1086) | async def register(
    method delete_checkpoints (line 1106) | async def delete_checkpoints(
    method train (line 1139) | async def train(
    method train_sft (line 1198) | async def train_sft(

FILE: src/art/openai.py
  function consume_chat_completion_stream (line 16) | async def consume_chat_completion_stream(
  function consume_sync_chat_completion_stream (line 52) | def consume_sync_chat_completion_stream(
  function init_chat_completion (line 64) | def init_chat_completion(chunk: ChatCompletionChunk) -> ChatCompletion:
  function update_chat_completion (line 82) | def update_chat_completion(

FILE: src/art/pipeline_trainer/binary_prefix_tool_pipeline.py
  class PipelineConfig (line 28) | class PipelineConfig:
  function is_valid_guess (line 78) | def is_valid_guess(guess: str) -> bool:
  function shared_prefix_len (line 82) | def shared_prefix_len(guess: str, secret: str) -> int:
  function _parse_guess_args (line 91) | def _parse_guess_args(arguments: str | None) -> str | None:
  function _tool_name_and_args (line 127) | def _tool_name_and_args(tool_call: Any) -> tuple[str | None, str | None]:
  function extract_guess (line 137) | def extract_guess(choice: Any) -> tuple[str | None, str]:
  function get_model_output_dir (line 150) | def get_model_output_dir(model: art.TrainableModel) -> Path:
  function print_history_summary (line 154) | def print_history_summary(model: art.TrainableModel, tail: int = 5) -> N...
  function main (line 176) | async def main() -> None:

FILE: src/art/pipeline_trainer/state.py
  class PipelineState (line 8) | class PipelineState:

FILE: src/art/pipeline_trainer/status.py
  class StatusReporter (line 14) | class StatusReporter:
    method __init__ (line 15) | def __init__(
    method start (line 62) | def start(self, *, initial_step: int | None = None) -> None:
    method close (line 80) | def close(self) -> None:
    method flush (line 86) | def flush(self) -> None:
    method set_step (line 89) | def set_step(self, step: int) -> None:
    method log_if_due (line 93) | def log_if_due(self, *, force: bool = False) -> None:
    method note_rollout_started (line 102) | def note_rollout_started(self) -> None:
    method note_rollout_finished (line 106) | def note_rollout_finished(self, *, errored: bool) -> None:
    method note_group_enqueued (line 113) | def note_group_enqueued(self, _group: TrajectoryGroup) -> None:
    method note_group_dequeued (line 117) | def note_group_dequeued(self, _group: TrajectoryGroup) -> None:
    method note_stale (line 122) | def note_stale(self, count: int) -> None:
    method note_zero_variance_discarded (line 127) | def note_zero_variance_discarded(self, count: int) -> None:
    method note_training_start (line 132) | def note_training_start(self, group_count: int) -> None:
    method note_training_end (line 136) | def note_training_end(self) -> None:
    method note_training_batch (line 140) | def note_training_batch(
    method note_val_started (line 160) | def note_val_started(self, step: int) -> None:
    method note_val_finished (line 164) | def note_val_finished(self, step: int, reward: float | None) -> None:
    method _build_snapshot (line 171) | def _build_snapshot(self) -> dict[str, object]:
    method _format_condensed_line (line 195) | def _format_condensed_line(self) -> str:
    method _format_full_log (line 260) | def _format_full_log(self) -> str:
    method _write_log_line (line 311) | def _write_log_line(self, line: str) -> None:
    method _refresh_status (line 317) | def _refresh_status(self, *, force: bool = False) -> None:
    method _count_zero_variance_groups (line 329) | def _count_zero_variance_groups(self, batch: list[TrajectoryGroup]) ->...
    method _group_zero_variance (line 332) | def _group_zero_variance(self, group: TrajectoryGroup) -> bool:
    method _compute_group_std_dev (line 339) | def _compute_group_std_dev(self, group: TrajectoryGroup) -> float:
    method _compute_batch_avg_std_dev (line 347) | def _compute_batch_avg_std_dev(self, batch: list[TrajectoryGroup]) -> ...
    method _compute_batch_avg_reward (line 353) | def _compute_batch_avg_reward(self, batch: list[TrajectoryGroup]) -> f...
    method _update_ewa (line 359) | def _update_ewa(self, previous: float | None, new_value: float) -> float:
    method _format_count (line 366) | def _format_count(value: int) -> str:
    method _format_scaled (line 374) | def _format_scaled(value: int, scale: int, suffix: str) -> str:
    method _fmt_int (line 382) | def _fmt_int(value: object) -> str:
    method _fmt_int_compact (line 388) | def _fmt_int_compact(value: object) -> str:
    method _fmt_float (line 394) | def _fmt_float(value: object, decimals: int) -> str:
    method _fmt_float_compact (line 402) | def _fmt_float_compact(value: object, decimals: int) -> str:

FILE: src/art/pipeline_trainer/trainer.py
  function _to_async_iterator (line 23) | def _to_async_iterator(iterable: Iterable[T] | AsyncIterator[T]) -> Asyn...
  function make_group_rollout_fn (line 35) | def make_group_rollout_fn(
  class PipelineTrainer (line 57) | class PipelineTrainer(Generic[ScenarioT, ConfigT]):
    method __init__ (line 60) | def __init__(
    method train (line 159) | async def train(self, *, handle_signals: bool = True) -> None:
    method request_stop (line 256) | def request_stop(self) -> None:
    method _validate_backend_support (line 281) | def _validate_backend_support(self) -> None:
    method _skip_scenarios (line 317) | async def _skip_scenarios(
    method _get_next_scenario (line 334) | async def _get_next_scenario(self) -> ScenarioT | None:
    method _wait_for_policy (line 346) | async def _wait_for_policy(self) -> None:
    method _rollout_worker (line 355) | async def _rollout_worker(self, worker_id: int) -> None:
    method _rollout_stage (line 401) | async def _rollout_stage(self) -> None:
    method _training_stage (line 411) | async def _training_stage(self) -> None:
    method _collect_batch (line 530) | async def _collect_batch(
    method _check_all_failed (line 576) | def _check_all_failed(self, group: TrajectoryGroup) -> None:
    method _eval_stage (line 585) | async def _eval_stage(self) -> None:
    method _status_loop (line 610) | async def _status_loop(self) -> None:
    method _run_eval (line 616) | async def _run_eval(self, step: int) -> None:
    method _normalize_eval_items (line 671) | def _normalize_eval_items(
    method _apply_policy_versions (line 690) | def _apply_policy_versions(
    method _apply_scenario_metadata (line 703) | def _apply_scenario_metadata(
    method _is_group_stale (line 720) | def _is_group_stale(self, group: TrajectoryGroup, min_version: int) ->...
    method _record_zero_variance (line 726) | def _record_zero_variance(self, group: TrajectoryGroup) -> bool:
    method _trigger_collapse (line 734) | def _trigger_collapse(self) -> None:
    method _log_zero_variance_groups (line 760) | async def _log_zero_variance_groups(self, step: int) -> None:
    method _group_zero_variance (line 768) | def _group_zero_variance(group: TrajectoryGroup) -> bool:
    method _group_initial_version (line 775) | def _group_initial_version(self, group: TrajectoryGroup) -> int | None:
    method _average_steps_off_policy (line 785) | def _average_steps_off_policy(
    method _should_eval_step (line 798) | def _should_eval_step(self, step: int) -> bool:
    method _read_pipeline_state (line 805) | def _read_pipeline_state(self) -> dict[str, Any]:
    method _persist_state (line 809) | def _persist_state(self, training_step: int) -> None:
    method _is_scalar_metadata (line 819) | def _is_scalar_metadata(value: object) -> bool:
    method _put_output_group (line 822) | async def _put_output_group(self, group: TrajectoryGroup) -> float:
    method _consume_batch_rollout_timings (line 834) | def _consume_batch_rollout_timings(
    method _pop_float_metadata (line 845) | def _pop_float_metadata(group: TrajectoryGroup, key: str) -> float:

FILE: src/art/pipeline_trainer/yes_no_maybe_pipeline.py
  function build_scenarios (line 30) | def build_scenarios() -> list[dict]:
  function reward_for_answer (line 46) | def reward_for_answer(text: str) -> float:
  function eval_fn (line 54) | async def eval_fn(
  function rollout_fn (line 82) | async def rollout_fn(model, scenario, _config) -> art.TrajectoryGroup:
  function main (line 102) | async def main() -> None:

FILE: src/art/preprocessing/inputs.py
  class TrainInputs (line 11) | class TrainInputs(PackedTensors):
  function create_train_inputs (line 19) | def create_train_inputs(

FILE: src/art/preprocessing/pack.py
  class PackedTensors (line 13) | class PackedTensors(TypedDict):
  class DiskPackedTensors (line 26) | class DiskPackedTensors(TypedDict):
  function packed_tensors_from_tokenized_results (line 34) | def packed_tensors_from_tokenized_results(
  function packed_tensors_from_dir (line 177) | def packed_tensors_from_dir(**kwargs: Unpack[DiskPackedTensors]) -> Pack...
  function _add_tensor_list (line 202) | def _add_tensor_list(
  function packed_tensors_to_dir (line 225) | def packed_tensors_to_dir(tensors: PackedTensors, dir: str) -> DiskPacke...
  function _get_tensor_list_info (line 246) | def _get_tensor_list_info(
  function plot_packed_tensors (line 262) | def plot_packed_tensors(

FILE: src/art/preprocessing/tokenize.py
  class TokenizedResult (line 17) | class TokenizedResult:
    method tokens (line 33) | def tokens(self) -> list[str]:
    method without_prompt (line 36) | def without_prompt(self) -> "TokenizedResult":
  class SFTBatch (line 55) | class SFTBatch:
  function tokenize_trajectory_groups (line 71) | def tokenize_trajectory_groups(
  function tokenize_trajectory (line 154) | def tokenize_trajectory(
  function tokenize_sft_batch (line 366) | def tokenize_sft_batch(

FILE: src/art/rewards/ruler.py
  class TrajectoryScore (line 25) | class TrajectoryScore(BaseModel):
  class Response (line 35) | class Response(BaseModel):
  function _judge_provider (line 53) | def _judge_provider(judge_model: str) -> str | None:
  function _record_ruler_cost (line 63) | def _record_ruler_cost(judge_model: str, response: ModelResponse) -> None:
  function ruler (line 87) | async def ruler(
  function ruler_score_group (line 264) | async def ruler_score_group(

FILE: src/art/serverless/backend.py
  function _extract_step_from_wandb_artifact (line 29) | def _extract_step_from_wandb_artifact(artifact: "wandb.Artifact") -> int...
  function _canonicalize_upstream_metric_key (line 60) | def _canonicalize_upstream_metric_key(metric: str) -> str:
  function _canonicalize_upstream_metrics (line 70) | def _canonicalize_upstream_metrics(metrics: dict[str, float]) -> dict[st...
  class ServerlessBackend (line 78) | class ServerlessBackend(Backend):
    method __init__ (line 79) | def __init__(
    method close (line 86) | async def close(self) -> None:
    method register (line 89) | async def register(
    method delete (line 117) | async def delete(
    method _model_inference_name (line 137) | def _model_inference_name(self, model: "Model", step: int | None = Non...
    method _get_step (line 152) | async def _get_step(self, model: "Model") -> int:
    method _delete_checkpoint_files (line 162) | async def _delete_checkpoint_files(
    method _prepare_backend_for_training (line 180) | async def _prepare_backend_for_training(
    method train (line 190) | async def train(  # type: ignore[override]
    method _train_model (line 321) | async def _train_model(
    method _train_sft (line 388) | async def _train_sft(
    method _experimental_pull_model_checkpoint (line 572) | async def _experimental_pull_model_checkpoint(
    method _experimental_pull_from_s3 (line 656) | async def _experimental_pull_from_s3(
    method _experimental_push_to_s3 (line 674) | async def _experimental_push_to_s3(
    method _experimental_fork_checkpoint (line 736) | async def _experimental_fork_checkpoint(

FILE: src/art/serverless/client.py
  class Model (line 27) | class Model(BaseModel):
  class Checkpoint (line 36) | class Checkpoint(BaseModel):
  class CheckpointListParams (line 42) | class CheckpointListParams(TypedDict, total=False):
  class DeleteCheckpointsResponse (line 48) | class DeleteCheckpointsResponse(BaseModel):
  class ExperimentalTrainingConfig (line 53) | class ExperimentalTrainingConfig(TypedDict, total=False):
  class SFTTrainingConfig (line 69) | class SFTTrainingConfig(TypedDict, total=False):
  class TrainingJob (line 74) | class TrainingJob(BaseModel):
  class SFTTrainingJob (line 78) | class SFTTrainingJob(BaseModel):
  class TrainingJobEventListParams (line 82) | class TrainingJobEventListParams(TypedDict, total=False):
  class TrainingJobEvent (line 87) | class TrainingJobEvent(BaseModel):
  class Models (line 95) | class Models(AsyncAPIResource):
    method create (line 96) | async def create(
    method log (line 117) | async def log(
    method delete (line 137) | async def delete(self, *, model_id: str) -> None:
    method checkpoints (line 144) | def checkpoints(self) -> "Checkpoints":
  class Checkpoints (line 148) | class Checkpoints(AsyncAPIResource):
    method list (line 149) | def list(
    method delete (line 173) | async def delete(
  class TrainingJobs (line 183) | class TrainingJobs(AsyncAPIResource):
    method create (line 184) | async def create(
    method events (line 205) | def events(self) -> "TrainingJobEvents":
  class TrainingJobEvents (line 209) | class TrainingJobEvents(AsyncAPIResource):
    method list (line 210) | def list(
  class SFTTrainingJobs (line 233) | class SFTTrainingJobs(AsyncAPIResource):
    method create (line 234) | async def create(
    method events (line 252) | def events(self) -> "TrainingJobEvents":
  class Client (line 256) | class Client(AsyncAPIClient):
    method __init__ (line 259) | def __init__(
    method request (line 277) | async def request(
    method models (line 293) | def models(self) -> Models:
    method training_jobs (line 297) | def training_jobs(self) -> TrainingJobs:
    method sft_training_jobs (line 301) | def sft_training_jobs(self) -> SFTTrainingJobs:
    method qs (line 310) | def qs(self) -> Querystring:
    method auth_headers (line 315) | def auth_headers(self) -> dict[str, str]:
    method default_headers (line 321) | def default_headers(self) -> dict[str, str | Omit]:
    method _make_status_error (line 333) | def _make_status_error(

FILE: src/art/test/test_kl_advantage.py
  function _make_inputs (line 8) | def _make_inputs(
  function test_kl_advantage_no_effect_when_disabled (line 36) | def test_kl_advantage_no_effect_when_disabled():
  function test_kl_advantage_enabled (line 51) | def test_kl_advantage_enabled():
  function test_kl_advantage_zero_mean_penalty (line 63) | def test_kl_advantage_zero_mean_penalty():
  function test_kl_advantage_direction (line 85) | def test_kl_advantage_direction():
  function test_kl_advantage_does_not_affect_when_no_ref (line 110) | def test_kl_advantage_does_not_affect_when_no_ref():

FILE: src/art/test/test_step_skipping.py
  function test_step_skipping (line 39) | async def test_step_skipping():

FILE: src/art/tinker/backend.py
  class TinkerBackend (line 15) | class TinkerBackend(LocalBackend):
    method __init__ (line 16) | def __init__(
    method _prepare_backend_for_training (line 31) | async def _prepare_backend_for_training(
    method _get_service (line 44) | async def _get_service(self, model: TrainableModel) -> ModelService:

FILE: src/art/tinker/cookbook_v/hyperparam_utils.py
  function _list_param_shapes_from_safetensors_remote (line 17) | def _list_param_shapes_from_safetensors_remote(
  function get_lora_lr_over_full_finetune_lr (line 68) | def get_lora_lr_over_full_finetune_lr(model_name: str, lora_alpha: int =...
  function _get_hidden_size (line 77) | def _get_hidden_size(model_name: str) -> int:
  function get_lora_param_count (line 102) | def get_lora_param_count(
  function get_lr (line 156) | def get_lr(model_name: str, is_lora: bool = True) -> float:
  function get_full_finetune_param_count (line 172) | def get_full_finetune_param_count(model_name: str) -> float:
  function get_full_finetune_lr_multiplier (line 179) | def get_full_finetune_lr_multiplier(model_name: str):
  function get_lora_lr_multiplier (line 183) | def get_lora_lr_multiplier(model_name: str):

FILE: src/art/tinker/cookbook_v/image_processing_utils.py
  function get_image_processor (line 26) | def get_image_processor(model_name: str) -> ImageProcessor:
  function resize_image (line 40) | def resize_image(image: Image.Image, max_size: int) -> Image.Image:

FILE: src/art/tinker/cookbook_v/renderers/__init__.py
  function register_renderer (line 51) | def register_renderer(
  function get_registered_renderer_names (line 70) | def get_registered_renderer_names() -> list[str]:
  function is_renderer_registered (line 75) | def is_renderer_registered(name: str) -> bool:
  function unregister_renderer (line 80) | def unregister_renderer(name: str) -> bool:
  function get_renderer (line 95) | def get_renderer(

FILE: src/art/tinker/cookbook_v/renderers/base.py
  class StrictBase (line 31) | class StrictBase(pydantic.BaseModel):
    method __str__ (line 38) | def __str__(self) -> str:
  class ToolCall (line 42) | class ToolCall(StrictBase):
    class FunctionBody (line 59) | class FunctionBody(pydantic.BaseModel):
  class UnparsedToolCall (line 82) | class UnparsedToolCall(StrictBase):
  class TextPart (line 104) | class TextPart(TypedDict):
  class ImagePart (line 112) | class ImagePart(TypedDict):
  class ThinkingPart (line 121) | class ThinkingPart(TypedDict):
  class StreamingMessageHeader (line 137) | class StreamingMessageHeader:
  class StreamingTextDelta (line 148) | class StreamingTextDelta:
  class StreamingThinkingDelta (line 161) | class StreamingThinkingDelta:
  class Utf8TokenDecoder (line 187) | class Utf8TokenDecoder:
    method __post_init__ (line 207) | def __post_init__(self) -> None:
    method _is_valid_decode (line 216) | def _is_valid_decode(self, text: str) -> bool:
    method decode (line 225) | def decode(self, tokens: list[int]) -> str | None:
    method flush (line 267) | def flush(self) -> str:
    method reset (line 284) | def reset(self) -> None:
    method has_pending (line 288) | def has_pending(self) -> bool:
  class Message (line 301) | class Message(TypedDict):
  class RenderContext (line 344) | class RenderContext:
  class ToolSpec (line 371) | class ToolSpec(TypedDict):
  function ensure_text (line 400) | def ensure_text(content: Content) -> str:
  function ensure_list (line 417) | def ensure_list(content: Content) -> list[ContentPart]:
  function remove_thinking (line 424) | def remove_thinking(parts: list[ContentPart]) -> list[ContentPart]:
  function get_text_content (line 429) | def get_text_content(message: Message) -> str:
  function format_content_as_string (line 441) | def format_content_as_string(content: Content, separator: str = "\n") ->...
  function _parse_tool_call_json (line 473) | def _parse_tool_call_json(
  function parse_content_blocks (line 513) | def parse_content_blocks(
  function parse_think_blocks (line 583) | def parse_think_blocks(content: str) -> list[ContentPart] | None:
  function _tool_call_payload (line 623) | def _tool_call_payload(tool_call: ToolCall) -> dict[str, object]:
  class RenderedMessage (line 633) | class RenderedMessage:
  class TrainOnWhat (line 677) | class TrainOnWhat(StrEnum):
  function _unpickle_renderer (line 687) | def _unpickle_renderer(
  class Renderer (line 709) | class Renderer(ABC):
    method __init__ (line 739) | def __init__(self, tokenizer: Tokenizer):
    method __reduce__ (line 742) | def __reduce__(self) -> tuple:
    method has_extension_property (line 763) | def has_extension_property(self) -> bool:
    method _bos_tokens (line 782) | def _bos_tokens(self) -> list[int]:
    method get_stop_sequences (line 786) | def get_stop_sequences(self) -> list[str] | list[int]:
    method render_message (line 791) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method parse_response (line 811) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method to_openai_message (line 824) | def to_openai_message(self, message: Message) -> dict:
    method create_conversation_prefix_with_tools (line 896) | def create_conversation_prefix_with_tools(
    method _get_generation_suffix (line 917) | def _get_generation_suffix(self, role: Role, ctx: RenderContext) -> li...
    method build_generation_prompt (line 941) | def build_generation_prompt(
    method build_supervised_examples (line 1004) | def build_supervised_examples(
    method build_supervised_example (line 1028) | def build_supervised_example(
  function tokens_weights_from_strings_weights (line 1165) | def tokens_weights_from_strings_weights(
  function parse_response_for_stop_token (line 1184) | def parse_response_for_stop_token(
  class ImageProcessorProtocol (line 1211) | class ImageProcessorProtocol(Protocol):
    method get_number_of_image_patches (line 1215) | def get_number_of_image_patches(
    method get_resize_config (line 1220) | def get_resize_config(self, image_data: dict[str, Any]) -> dict[str, A...
  function image_to_chunk (line 1224) | def image_to_chunk(

FILE: src/art/tinker/cookbook_v/renderers/deepseek_v3.py
  class _DeepSeekV3BaseRenderer (line 29) | class _DeepSeekV3BaseRenderer(Renderer):
    method __init__ (line 44) | def __init__(
    method has_extension_property (line 55) | def has_extension_property(self) -> bool:
    method render_message (line 66) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method _get_special_token (line 184) | def _get_special_token(self, name: str) -> int:
    method _bos_tokens (line 192) | def _bos_tokens(self) -> list[int]:
    method _end_message_token (line 196) | def _end_message_token(self) -> int:
    method get_stop_sequences (line 199) | def get_stop_sequences(self) -> list[int]:
    method _parse_deepseek_tool_calls (line 202) | def _parse_deepseek_tool_calls(
    method parse_response (line 249) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method to_openai_message (line 285) | def to_openai_message(self, message: Message) -> dict:
    method create_conversation_prefix_with_tools (line 332) | def create_conversation_prefix_with_tools(
  class DeepSeekV3ThinkingRenderer (line 376) | class DeepSeekV3ThinkingRenderer(_DeepSeekV3BaseRenderer):
    method render_message (line 392) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method build_generation_prompt (line 426) | def build_generation_prompt(
    method parse_response (line 445) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
  class DeepSeekV3DisableThinkingRenderer (line 463) | class DeepSeekV3DisableThinkingRenderer(_DeepSeekV3BaseRenderer):
    method has_extension_property (line 477) | def has_extension_property(self) -> bool:
    method render_message (line 481) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...

FILE: src/art/tinker/cookbook_v/renderers/gpt_oss.py
  function _json_type_to_typescript (line 34) | def _json_type_to_typescript(schema: dict) -> str:
  function _json_schema_to_typescript (line 67) | def _json_schema_to_typescript(schema: dict) -> str:
  function _schema_comments (line 84) | def _schema_comments(schema: dict) -> list[str]:
  function _format_parameters_block (line 102) | def _format_parameters_block(schema: dict) -> str:
  function _format_tool_definition (line 131) | def _format_tool_definition(tool: ToolSpec) -> str:
  class GptOssRenderer (line 143) | class GptOssRenderer(Renderer):
    method __init__ (line 192) | def __init__(
    method render_message (line 210) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method _render_tool_calls (line 284) | def _render_tool_calls(self, tool_calls: list[ToolCall]) -> str:
    method _render_tool_result_message (line 304) | def _render_tool_result_message(
    method _get_system_message (line 352) | def _get_system_message(self) -> Message | None:
    method _bos_tokens (line 371) | def _bos_tokens(self) -> list[int]:
    method _warn_if_user_system_message (line 375) | def _warn_if_user_system_message(self, messages: list[Message]) -> None:
    method build_generation_prompt (line 386) | def build_generation_prompt(
    method build_supervised_example (line 399) | def build_supervised_example(
    method _return_token (line 412) | def _return_token(self) -> int:
    method _call_token (line 418) | def _call_token(self) -> int:
    method get_stop_sequences (line 423) | def get_stop_sequences(self) -> list[int]:
    method parse_response (line 428) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method to_openai_message (line 464) | def to_openai_message(self, message: Message) -> dict:
    method _parse_harmony_output (line 512) | def _parse_harmony_output(
    method _parse_harmony_messages (line 557) | def _parse_harmony_messages(self, content: str) -> list[dict[str, str ...
    method create_conversation_prefix_with_tools (line 621) | def create_conversation_prefix_with_tools(

FILE: src/art/tinker/cookbook_v/renderers/kimi_k2.py
  function _split_tool_calls_section (line 47) | def _split_tool_calls_section(content: str) -> tuple[str, str | None]:
  function _extract_tool_name (line 55) | def _extract_tool_name(tool_id: str) -> str:
  function _parse_tool_calls_section (line 64) | def _parse_tool_calls_section(
  function _longest_matching_suffix_prefix (line 101) | def _longest_matching_suffix_prefix(text: str, tag: str) -> int:
  class KimiK2StreamingParser (line 135) | class KimiK2StreamingParser:
    method __post_init__ (line 163) | def __post_init__(self) -> None:
    method feed (line 173) | def feed(self, token: int) -> Iterator[MessageDelta]:
    method _emit_deltas (line 208) | def _emit_deltas(self) -> Iterator[MessageDelta]:
    method finish (line 280) | def finish(self) -> Iterator[MessageDelta]:
    method reset (line 343) | def reset(self) -> None:
  class KimiK2Renderer (line 355) | class KimiK2Renderer(Renderer):
    method __init__ (line 372) | def __init__(self, tokenizer: Tokenizer, strip_thinking_from_history: ...
    method _ensure_system_message (line 376) | def _ensure_system_message(self, messages: list[Message]) -> list[Mess...
    method render_message (line 406) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method _encode_multipart_content (line 508) | def _encode_multipart_content(
    method build_generation_prompt (line 515) | def build_generation_prompt(
    method build_supervised_examples (line 564) | def build_supervised_examples(
    method build_supervised_example (line 615) | def build_supervised_example(
    method _end_message_token (line 712) | def _end_message_token(self) -> int:
    method get_stop_sequences (line 719) | def get_stop_sequences(self) -> list[int]:
    method parse_response (line 722) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method parse_response_streaming (line 748) | def parse_response_streaming(self, response: list[int]) -> Iterator[Me...
    method to_openai_message (line 785) | def to_openai_message(self, message: Message) -> dict:
    method create_conversation_prefix_with_tools (line 832) | def create_conversation_prefix_with_tools(

FILE: src/art/tinker/cookbook_v/renderers/kimi_k25.py
  class KimiK25Renderer (line 21) | class KimiK25Renderer(KimiK2Renderer):
    method __init__ (line 40) | def __init__(
    method _encode_multipart_content (line 51) | def _encode_multipart_content(
    method _image_prefix (line 87) | def _image_prefix(self) -> str:
    method _image_suffix (line 91) | def _image_suffix(self) -> str:
    method build_generation_prompt (line 94) | def build_generation_prompt(
    method create_conversation_prefix_with_tools (line 106) | def create_conversation_prefix_with_tools(
  class KimiK25DisableThinkingRenderer (line 134) | class KimiK25DisableThinkingRenderer(KimiK25Renderer):
    method build_generation_prompt (line 146) | def build_generation_prompt(

FILE: src/art/tinker/cookbook_v/renderers/kimi_k2_5_tool_declaration_ts.py
  class _SchemaRegistry (line 19) | class _SchemaRegistry:
    method __init__ (line 22) | def __init__(self):
    method register_definitions (line 26) | def register_definitions(self, defs: dict[str, Any]):
    method resolve_ref (line 33) | def resolve_ref(self, ref: str) -> dict[str, Any]:
  function _format_description (line 47) | def _format_description(description: str, indent: str = "") -> str:
  class _BaseType (line 53) | class _BaseType:
    method __init__ (line 57) | def __init__(
    method to_typescript_style (line 68) | def to_typescript_style(self, indent: str = "") -> str:
    method format_docstring (line 71) | def format_docstring(self, indent: str) -> str:
  class _ParameterTypeScalar (line 85) | class _ParameterTypeScalar(_BaseType):
    method __init__ (line 88) | def __init__(self, type: str, extra_props: dict[str, Any] | None = None):
    method to_typescript_style (line 101) | def to_typescript_style(self, indent: str = "") -> str:
  class _ParameterTypeObject (line 108) | class _ParameterTypeObject(_BaseType):
    method __init__ (line 112) | def __init__(
    method to_typescript_style (line 152) | def to_typescript_style(self, indent: str = "") -> str:
  class _ParameterTypeArray (line 192) | class _ParameterTypeArray(_BaseType):
    method __init__ (line 195) | def __init__(
    method to_typescript_style (line 208) | def to_typescript_style(self, indent: str = "") -> str:
  class _ParameterTypeEnum (line 225) | class _ParameterTypeEnum(_BaseType):
    method __init__ (line 229) | def __init__(self, json_schema_object: dict[str, Any]):
    method to_typescript_style (line 258) | def to_typescript_style(self, indent: str = "") -> str:
  class _ParameterTypeAnyOf (line 264) | class _ParameterTypeAnyOf(_BaseType):
    method __init__ (line 267) | def __init__(
    method to_typescript_style (line 277) | def to_typescript_style(self, indent: str = "") -> str:
  class _ParameterTypeUnion (line 281) | class _ParameterTypeUnion(_BaseType):
    method __init__ (line 284) | def __init__(self, json_schema_object: dict[str, Any]):
    method to_typescript_style (line 298) | def to_typescript_style(self, indent: str = "") -> str:
  class _ParameterTypeRef (line 302) | class _ParameterTypeRef(_BaseType):
    method __init__ (line 306) | def __init__(self, json_schema_object: dict[str, Any], registry: _Sche...
    method to_typescript_style (line 318) | def to_typescript_style(self, indent: str = "") -> str:
  class _Parameter (line 334) | class _Parameter:
    method parse_extended (line 346) | def parse_extended(cls, attributes: dict[str, Any]) -> "_Parameter":
    method to_typescript_style (line 357) | def to_typescript_style(self, indent: str = "") -> str:
  function _parse_parameter_type (line 374) | def _parse_parameter_type(
  function _openai_function_to_typescript_style (line 409) | def _openai_function_to_typescript_style(
  function encode_tools_to_typescript_style (line 461) | def encode_tools_to_typescript_style(

FILE: src/art/tinker/cookbook_v/renderers/llama3.py
  class Llama3Renderer (line 15) | class Llama3Renderer(Renderer):
    method has_extension_property (line 38) | def has_extension_property(self) -> bool:
    method render_message (line 42) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method _bos_tokens (line 58) | def _bos_tokens(self) -> list[int]:
    method _end_message_token (line 62) | def _end_message_token(self) -> int:
    method get_stop_sequences (line 66) | def get_stop_sequences(self) -> list[int]:
    method parse_response (line 69) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:

FILE: src/art/tinker/cookbook_v/renderers/qwen3.py
  function _merge_consecutive_text_parts (line 38) | def _merge_consecutive_text_parts(
  class Qwen3Renderer (line 60) | class Qwen3Renderer(Renderer):
    method __init__ (line 86) | def __init__(self, tokenizer: Tokenizer, strip_thinking_from_history: ...
    method has_extension_property (line 105) | def has_extension_property(self) -> bool:
    method _get_qwen_role_for_message (line 116) | def _get_qwen_role_for_message(self, message: Message) -> str:
    method _wrap_qwen_tool_response (line 126) | def _wrap_qwen_tool_response(self, content: str) -> str:
    method render_message (line 130) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method _end_message_token (line 189) | def _end_message_token(self) -> int:
    method get_stop_sequences (line 196) | def get_stop_sequences(self) -> list[int]:
    method parse_response (line 199) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method to_openai_message (line 232) | def to_openai_message(self, message: Message) -> dict:
    method _to_openai_tool_arguments (line 286) | def _to_openai_tool_arguments(self, arguments: str) -> str | dict:
    method create_conversation_prefix_with_tools (line 294) | def create_conversation_prefix_with_tools(
  class Qwen3DisableThinkingRenderer (line 340) | class Qwen3DisableThinkingRenderer(Qwen3Renderer):
    method render_message (line 353) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
  class Qwen3InstructRenderer (line 385) | class Qwen3InstructRenderer(Qwen3Renderer):
    method has_extension_property (line 395) | def has_extension_property(self) -> bool:
  class Qwen3VLRenderer (line 404) | class Qwen3VLRenderer(Qwen3Renderer):
    method __init__ (line 424) | def __init__(
    method _format_thinking_text (line 436) | def _format_thinking_text(self, thinking: str) -> str:
    method _assistant_header_suffix (line 440) | def _assistant_header_suffix(self, message: Message, ctx: RenderContex...
    method _preprocess_message_parts (line 444) | def _preprocess_message_parts(
    method _wrap_qwen_tool_response_chunks (line 489) | def _wrap_qwen_tool_response_chunks(
    method _format_tool_calls_chunks (line 499) | def _format_tool_calls_chunks(self, message: Message) -> list[ImagePar...
    method render_message (line 516) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
  class Qwen3VLInstructRenderer (line 574) | class Qwen3VLInstructRenderer(Qwen3VLRenderer):

FILE: src/art/tinker/cookbook_v/renderers/qwen3_5.py
  class Qwen3_5Renderer (line 45) | class Qwen3_5Renderer(Qwen3VLRenderer):
    method _get_generation_suffix (line 57) | def _get_generation_suffix(self, role: Role, ctx: RenderContext) -> li...
    method _assistant_header_suffix (line 67) | def _assistant_header_suffix(self, message: Message, ctx: RenderContex...
    method _format_thinking_text (line 81) | def _format_thinking_text(self, thinking: str) -> str:
    method _to_openai_tool_arguments (line 85) | def _to_openai_tool_arguments(self, arguments: str) -> str | dict:
    method _parse_qwen3_5_tool_call_xml (line 89) | def _parse_qwen3_5_tool_call_xml(
    method parse_response (line 139) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method _format_tool_call_xml (line 209) | def _format_tool_call_xml(self, tool_call: ToolCall) -> str:
    method _format_tool_calls_chunks (line 226) | def _format_tool_calls_chunks(self, message: Message) -> list[ImagePar...
    method create_conversation_prefix_with_tools (line 239) | def create_conversation_prefix_with_tools(
  class Qwen3_5DisableThinkingRenderer (line 294) | class Qwen3_5DisableThinkingRenderer(Qwen3_5Renderer):
    method _get_generation_suffix (line 303) | def _get_generation_suffix(self, role: Role, ctx: RenderContext) -> li...

FILE: src/art/tinker/cookbook_v/renderers/role_colon.py
  class RoleColonRenderer (line 15) | class RoleColonRenderer(Renderer):
    method has_extension_property (line 29) | def has_extension_property(self) -> bool:
    method render_message (line 33) | def render_message(self, message: Message, ctx: RenderContext) -> Rend...
    method get_stop_sequences (line 52) | def get_stop_sequences(self) -> list[str]:
    method parse_response (line 55) | def parse_response(self, response: list[int]) -> tuple[Message, bool]:
    method _bos_tokens (line 89) | def _bos_tokens(self) -> list[int]:
    method create_conversation_prefix_with_tools (line 96) | def create_conversation_prefix_with_tools(

FILE: src/art/tinker/cookbook_v/tokenizer_utils.py
  function register_tokenizer (line 27) | def register_tokenizer(
  function get_registered_tokenizer_names (line 46) | def get_registered_tokenizer_names() -> list[str]:
  function is_tokenizer_registered (line 51) | def is_tokenizer_registered(name: str) -> bool:
  function unregister_tokenizer (line 56) | def unregister_tokenizer(name: str) -> bool:
  function get_tokenizer (line 71) | def get_tokenizer(model_name: str) -> Tokenizer:
  function _get_hf_tokenizer (line 84) | def _get_hf_tokenizer(model_name: str) -> Tokenizer:

FILE: src/art/tinker/cookbook_v/utils/misc_utils.py
  function timed (line 19) | def timed(key: str, metrics: dict[str, Any]):
  function dict_mean (line 30) | def dict_mean(list_of_dicts: list[dict[str, float | int]]) -> dict[str, ...
  function all_same (line 38) | def all_same(xs: list[Any]) -> bool:
  function lookup_func (line 42) | def lookup_func(path_to_func: str, default_module: str | None = None):
  function split_list (line 58) | def split_list(lst: Sequence[T], num_splits: int) -> list[list[T]]:
  function concat_lists (line 90) | def concat_lists(list_of_lists: list[list[Any]]) -> list[Any]:
  function not_none (line 94) | def not_none(x: T | None) -> T:

FILE: src/art/tinker/prefix_cache.py
  class PrefixEntry (line 9) | class PrefixEntry:
  class PrefixCacheStats (line 15) | class PrefixCacheStats:
  class _RadixEdge (line 29) | class _RadixEdge:
    method __init__ (line 32) | def __init__(self, label: tuple[int, ...], child: _RadixNode) -> None:
  class _RadixNode (line 37) | class _RadixNode:
    method __init__ (line 40) | def __init__(
  function _common_prefix_len (line 49) | def _common_prefix_len(
  class LRUTrieCache (line 59) | class LRUTrieCache:
    method __init__ (line 62) | def __init__(self, max_entries: int = 16_384) -> None:
    method lookup (line 70) | def lookup(self, rendered_tokens: Sequence[int]) -> PrefixEntry | None:
    method insert (line 99) | def insert(self, rendered_prefix: Sequence[int], raw_prefix: Sequence[...
    method _evict (line 148) | def _evict(self) -> None:
    method _prune (line 155) | def _prune(self, node: _RadixNode) -> None:
    method snapshot_stats (line 181) | def snapshot_stats(self) -> dict[str, int | float]:
    method _node_count (line 201) | def _node_count(self) -> int:

FILE: src/art/tinker/renderers.py
  function get_renderer_name (line 1) | def get_renderer_name(base_model: str) -> str:

FILE: src/art/tinker/server.py
  class ModelList (line 41) | class ModelList(BaseModel):
  class ModelUpsert (line 46) | class ModelUpsert(BaseModel):
  class OpenAICompatibleTinkerServer (line 51) | class OpenAICompatibleTinkerServer:
    method models (line 64) | def models(self) -> dict[str, str]:
    method models (line 70) | def models(self, models: dict[str, str]) -> None:
    method start (line 75) | async def start(self) -> tuple[str, int]:
    method stop (line 101) | async def stop(self) -> None:
    method _get_request_tenant (line 109) | def _get_request_tenant(
    method _run (line 123) | async def _run(self, host: str, port: int) -> None:
    method _default_num_workers (line 271) | def _default_num_workers(self) -> int:
    method _get_tenant (line 277) | def _get_tenant(self, api_key: str) -> "OpenAICompatibleTinkerServerTe...
  class OpenAICompatibleTinkerServerSamplableModel (line 286) | class OpenAICompatibleTinkerServerSamplableModel:
    method sampling_client (line 294) | async def sampling_client(self) -> AsyncGenerator[tinker.SamplingClien...
  class OpenAICompatibleTinkerServerTenant (line 308) | class OpenAICompatibleTinkerServerTenant:
    method __init__ (line 309) | def __init__(self, api_key: str, max_concurrent_sampling_clients: int)...
    method get_samplable_model (line 321) | async def get_samplable_model(
    method _load_samplable_model (line 345) | async def _load_samplable_model(
  class OpenAICompatibleTinkerServerWorker (line 383) | class OpenAICompatibleTinkerServerWorker:
    method prompt_tokens (line 386) | async def prompt_tokens(
    method chat_completion_and_token_discrepancies (line 402) | async def chat_completion_and_token_discrepancies(
    method _get_renderer (line 486) | def _get_renderer(self, base_model: str) -> renderers.Renderer:
  function get_free_port (line 496) | def get_free_port(host: str | None = None) -> int:

FILE: src/art/tinker/service.py
  function log_timing (line 27) | def log_timing(msg: str) -> Generator[None, None, None]:
  class TinkerService (line 36) | class TinkerService:
    method start_openai_server (line 43) | async def start_openai_server(
    method vllm_engine_is_sleeping (line 55) | async def vllm_engine_is_sleeping(self) -> bool:
    method train (line 58) | async def train(
    method register_lora_for_step (line 147) | async def register_lora_for_step(self, step: int, checkpoint_dir: str)...
    method train_sft (line 164) | async def train_sft(
    method delete_checkpoints (line 172) | async def delete_checkpoints(self, steps_to_keep: list[int]) -> None:
    method _state_task (line 194) | def _state_task(self) -> asyncio.Task["TinkerState"]:
    method _get_state (line 197) | async def _get_state(self) -> "TinkerState":
    method _build_models_dict (line 233) | def _build_models_dict(self, base_model: str) -> dict[str, str]:
    method _checkpoints_path (line 248) | def _checkpoints_path(self) -> Path:
    method _get_last_checkpoint_dir (line 251) | def _get_last_checkpoint_dir(self) -> Path | None:
    method _save_checkpoint (line 260) | async def _save_checkpoint(
  function delete_checkpoint (line 283) | async def delete_checkpoint(
  class TinkerState (line 300) | class TinkerState:

FILE: src/art/tinker_native/backend.py
  function _canonicalize_upstream_metric_key (line 75) | def _canonicalize_upstream_metric_key(metric: str) -> str:
  class ModelState (line 86) | class ModelState:
  class TinkerNativeModelConfig (line 106) | class TinkerNativeModelConfig:
  class TinkerNativeBackend (line 111) | class TinkerNativeBackend(Backend):
    method __init__ (line 115) | def __init__(
    method _env_enabled (line 132) | def _env_enabled(self, env_name: str) -> bool:
    method _timestamp (line 138) | def _timestamp(self) -> str:
    method _tinker_call (line 141) | async def _tinker_call(
    method _tinker_train_call (line 162) | async def _tinker_train_call(self, label: str, awaitable: Awaitable[T]...
    method _tinker_sample_call (line 170) | async def _tinker_sample_call(self, label: str, awaitable: Awaitable[T...
    method close (line 178) | async def close(self) -> None:
    method register (line 183) | async def register(self, model: Model) -> None:
    method _prepare_backend_for_training (line 203) | async def _prepare_backend_for_training(
    method train (line 231) | async def train(  # type: ignore[override]
    method _get_step (line 350) | async def _get_step(self, model: TrainableModel) -> int:
    method _delete_checkpoint_files (line 356) | async def _delete_checkpoint_files(
    method _model_inference_name (line 363) | def _model_inference_name(self, model: Model, step: int | None = None)...
    method _run_openai_server (line 372) | async def _run_openai_server(
    method _crash_on_server_exit (line 495) | def _crash_on_server_exit(self, task: asyncio.Task[None]) -> None:
    method _wait_for_server_ready (line 506) | async def _wait_for_server_ready(
    method _build_model_state (line 527) | async def _build_model_state(self, model: TrainableModel) -> ModelState:
    method _resolve_model_config (line 613) | def _resolve_model_config(self, model: TrainableModel) -> TinkerNative...
    method _list_checkpoints (line 642) | async def _list_checkpoints(
    method _get_sampler_client (line 670) | async def _get_sampler_client(
    method _normalize_messages (line 700) | def _normalize_messages(self, messages: Iterable[Any]) -> list[dict[st...
    method _normalize_tools (line 709) | def _normalize_tools(
    method _parse_model_name (line 722) | def _parse_model_name(self, model_name: str | None) -> tuple[str, int]:
    method _format_response_model (line 745) | def _format_response_model(self, model_name: str, step: int) -> str:
    method _create_training_client_from_checkpoint (line 749) | async def _create_training_client_from_checkpoint(
    method _save_checkpoint (line 783) | async def _save_checkpoint(
    method _save_sampler_weights (line 806) | async def _save_sampler_weights(
    method _save_training_state (line 819) | async def _save_training_state(
    method _persist_model_state (line 832) | def _persist_model_state(self, model: TrainableModel, state: ModelStat...
    method _experimental_fork_checkpoint (line 840) | async def _experimental_fork_checkpoint(

FILE: src/art/tinker_native/data.py
  function create_conversation_prefix_with_tools (line 16) | def create_conversation_prefix_with_tools(
  function compute_advantages (line 23) | def compute_advantages(
  function convert_openai_messages_to_renderer_format (line 38) | def convert_openai_messages_to_renderer_format(
  function parse_completion_to_openai_message (line 112) | def parse_completion_to_openai_message(
  function _trajectory_has_choice (line 120) | def _trajectory_has_choice(trajectory: Trajectory) -> bool:
  function trajectory_groups_to_datums (line 131) | def trajectory_groups_to_datums(
  function iter_trajectory_histories (line 163) | def iter_trajectory_histories(trajectory: Trajectory) -> Iterable[History]:
  function find_last_choice (line 171) | def find_last_choice(
  function extract_logprobs_from_choice (line 181) | def extract_logprobs_from_choice(
  function history_to_datum (line 207) | def history_to_datum(
  function build_datum (line 241) | def build_datum(

FILE: src/art/trajectories.py
  class PydanticException (line 24) | class PydanticException(pydantic.BaseModel):
  class History (line 30) | class History(pydantic.BaseModel):
    method messages (line 34) | def messages(self) -> Messages:
  class Trajectory (line 38) | class Trajectory(pydantic.BaseModel):
    method log (line 50) | def log(self, message: str) -> None:
    method finish (line 53) | def finish(self) -> "Trajectory":
    method track_duration (line 59) | async def track_duration(self, metric_name: str) -> AsyncGenerator[Non...
    method __str__ (line 68) | def __str__(self) -> str:
    method messages (line 71) | def messages(self) -> Messages:
    method for_logging (line 75) | def for_logging(self) -> dict[str, Any]:
  function get_messages (line 97) | def get_messages(messages_and_choices: MessagesAndChoices) -> Messages:
  class TrajectoryGroup (line 130) | class TrajectoryGroup(pydantic.BaseModel):
    method __init__ (line 137) | def __init__(
    method __copy__ (line 181) | def __copy__(self):
    method __deepcopy__ (line 197) | def __deepcopy__(self, memo: dict[int, Any] | None = None):
    method log (line 223) | def log(self, message: str) -> None:
    method __iter__ (line 226) | def __iter__(self) -> Iterator[Trajectory]:  # type: ignore[override]
    method __len__ (line 229) | def __len__(self) -> int:
    method __new__ (line 233) | def __new__(
    method __new__ (line 244) | def __new__(
    method __new__ (line 254) | def __new__(

FILE: src/art/transformers/patches.py
  function _patched_preprocess_mask_arguments (line 14) | def _patched_preprocess_mask_arguments(
  function patch_preprocess_mask_arguments (line 36) | def patch_preprocess_mask_arguments() -> None:

FILE: src/art/types.py
  class TrainConfig (line 17) | class TrainConfig(pydantic.BaseModel):
  class TrainSFTConfig (line 22) | class TrainSFTConfig(pydantic.BaseModel):
  class TrainResult (line 36) | class TrainResult:
  class LocalTrainResult (line 49) | class LocalTrainResult(TrainResult):
  class ServerlessTrainResult (line 63) | class ServerlessTrainResult(TrainResult):

FILE: src/art/unsloth/service.py
  class SupportsLoadLora (line 52) | class SupportsLoadLora(Protocol):
    method load_lora (line 55) | def load_lora(self, lora_path: str, load_tensors: bool = True) -> LoRA...
  class _StopTrainInputs (line 58) | class _StopTrainInputs:
  function precalculate_new_logprobs (line 67) | def precalculate_new_logprobs(
  function process_train_batch (line 99) | async def process_train_batch(
  function save_checkpoint (line 158) | def save_checkpoint(
  function _get_trainer_optimizer (line 203) | def _get_trainer_optimizer(trainer: GRPOTrainer) -> Optimizer:
  class CausalLM (line 215) | class CausalLM(PreTrainedModel, GenerationMixin):
  class UnslothState (line 222) | class UnslothState:
    method offload_to_cpu (line 232) | def offload_to_cpu(self) -> None:
    method reload_to_gpu (line 279) | def reload_to_gpu(self, device: str = "cuda:0") -> None:
  class UnslothService (line 314) | class UnslothService:
    method is_dedicated (line 331) | def is_dedicated(self) -> bool:
    method _next_lora_id (line 334) | def _next_lora_id(self) -> int:
    method aclose (line 339) | async def aclose(self) -> None:
    method _start_vllm_subprocess (line 361) | async def _start_vllm_subprocess(
    method _reload_adapter (line 454) | async def _reload_adapter(self, checkpoint_path: str, step: int) -> None:
    method close (line 479) | def close(self) -> None:
    method start_openai_server (line 498) | async def start_openai_server(
    method vllm_engine_is_sleeping (line 533) | async def vllm_engine_is_sleeping(self) -> bool:
    method register_lora_for_step (line 538) | async def register_lora_for_step(self, step: int, checkpoint_dir: str)...
    method _reset_optimizer_if_mode_changed (line 567) | def _reset_optimizer_if_mode_changed(
    method train (line 588) | async def train(
    method _train_dedicated (line 607) | async def _train_dedicated(
    method _train_shared (line 668) | async def _train_shared(
    method train_sft (line 791) | async def train_sft(
    method _state (line 961) | def _state(self) -> UnslothState:
    method llm (line 1040) | def llm(self) -> asyncio.Task[AsyncLLM]:
  function do_sleep (line 1058) | def do_sleep(*, level: int) -> None:
  function do_wake_up (line 1124) | def do_wake_up() -> None:

FILE: src/art/unsloth/train.py
  class StopTrainingLoop (line 42) | class StopTrainingLoop(Exception):
  function _canonicalize_upstream_metric_key (line 46) | def _canonicalize_upstream_metric_key(metric: str) -> str:
  function _canonicalize_upstream_metrics (line 56) | def _canonicalize_upstream_metrics(metrics: dict[str, float]) -> dict[st...
  function train (line 64) | async def train(
  function get_compute_loss_fn (line 89) | def get_compute_loss_fn(trainer: "GRPOTrainer") -> Callable[..., torch.T...
  function get_log_fn (line 228) | def get_log_fn(
  function calculate_attn_bias (line 254) | def calculate_attn_bias(
  function calculate_mask (line 281) | def calculate_mask(
  function _use_adapter (line 307) | def _use_adapter(trainer: "GRPOTrainer", adapter_path: str):
  function calculate_logprobs (line 326) | def calculate_logprobs(
  function _calculate_logprobs (line 362) | def _calculate_logprobs(
  function gc_and_empty_cuda_cache (line 416) | def gc_and_empty_cuda_cache(n: int = 3) -> None:

FILE: src/art/utils/benchmark_rollout.py
  function benchmark_rollout (line 8) | async def benchmark_rollout(

FILE: src/art/utils/benchmarking/aggregate_trajectories.py
  function load_aggregated_trajectories (line 13) | async def load_aggregated_trajectories(
  function _load_history_timestamps (line 92) | def _load_history_timestamps(
  function load_latest_metrics (line 134) | async def load_latest_metrics(

FILE: src/art/utils/benchmarking/charts/percentage_comparison_bar_chart.py
  function percentage_comparison_bar_chart (line 13) | def percentage_comparison_bar_chart(

FILE: src/art/utils/benchmarking/charts/training_progress_chart.py
  function training_progress_chart (line 15) | def training_progress_chart(

FILE: src/art/utils/benchmarking/filter_model_split.py
  function filter_rename_model_split (line 12) | def filter_rename_model_split(

FILE: src/art/utils/benchmarking/load_trajectories.py
  function load_trajectories (line 27) | async def load_trajectories(

FILE: src/art/utils/benchmarking/log_constant_metrics_wandb.py
  function log_constant_metrics_wandb (line 8) | async def log_constant_metrics_wandb(

FILE: src/art/utils/benchmarking/pull_model_trajectories.py
  function pull_model_trajectories (line 7) | async def pull_model_trajectories(model: ArtModel) -> None:

FILE: src/art/utils/benchmarking/types.py
  class BenchmarkModelKey (line 1) | class BenchmarkModelKey:
    method __init__ (line 6) | def __init__(
    method __str__ (line 13) | def __str__(self):

FILE: src/art/utils/convert_moe_lora.py
  function _has_fused_moe_lora (line 23) | def _has_fused_moe_lora(tensors: dict[str, torch.Tensor]) -> bool:
  function _infer_moe_params (line 31) | def _infer_moe_params(
  function convert_fused_moe_lora (line 69) | def convert_fused_moe_lora(
  function convert_checkpoint_if_needed (line 144) | def convert_checkpoint_if_needed(checkpoint_dir: str) -> None:

FILE: src/art/utils/deployment/common.py
  class DeploymentConfig (line 15) | class DeploymentConfig(BaseModel):
  class DeploymentResult (line 21) | class DeploymentResult(BaseModel):
  function deploy_model (line 28) | async def deploy_model(

FILE: src/art/utils/deployment/legacy.py
  class LoRADeploymentProvider (line 10) | class LoRADeploymentProvider(str, Enum):
  class LoRADeploymentJob (line 17) | class LoRADeploymentJob(BaseModel):

FILE: src/art/utils/deployment/together.py
  class TogetherDeploymentConfig (line 25) | class TogetherDeploymentConfig(DeploymentConfig):
  class TogetherJobStatus (line 41) | class TogetherJobStatus(str, Enum):
  class TogetherJob (line 48) | class TogetherJob(BaseModel):
  function _init_session (line 63) | def _init_session() -> aiohttp.ClientSession:
  function _model_checkpoint_id (line 77) | def _model_checkpoint_id(model: "TrainableModel", step: int) -> str:
  function _upload_model (line 82) | async def _upload_model(
  function _convert_job_status (line 114) | def _convert_job_status(status: str, message: str | None = None) -> Toge...
  function _find_existing_job_id (line 129) | async def _find_existing_job_id(
  function _check_job_status (line 147) | async def _check_job_status(job_id: str, verbose: bool = False) -> Toget...
  function _wait_for_job (line 173) | async def _wait_for_job(job_id: str, verbose: bool = False) -> TogetherJob:
  function deploy_to_together (line 189) | async def deploy_to_together(

FILE: src/art/utils/deployment/wandb.py
  class WandbDeploymentConfig (line 14) | class WandbDeploymentConfig(DeploymentConfig):
  function deploy_wandb (line 36) | def deploy_wandb(

FILE: src/art/utils/format_message.py
  function format_message (line 8) | def format_message(message: Message) -> str:

FILE: src/art/utils/get_model_step.py
  function get_step_from_dir (line 10) | def get_step_from_dir(output_dir: str) -> int:
  function get_model_step (line 26) | def get_model_step(model: "TrainableModel", art_path: str) -> int:

FILE: src/art/utils/get_repo_root_path.py
  function get_repo_root_path (line 4) | def get_repo_root_path() -> str:

FILE: src/art/utils/group_aggregate.py
  function group_aggregate (line 8) | def group_aggregate(

FILE: src/art/utils/iterate_dataset.py
  class DatasetBatch (line 12) | class DatasetBatch(Generic[T]):
  function iterate_dataset (line 21) | def iterate_dataset(

FILE: src/art/utils/limit_concurrency.py
  function limit_concurrency (line 6) | def limit_concurrency(n: int, derive_key: Optional[Callable[..., str]] =...

FILE: src/art/utils/litellm.py
  function convert_litellm_choice_to_openai (line 20) | def convert_litellm_choice_to_openai(

FILE: src/art/utils/log_http_errors.py
  function log_http_errors (line 6) | def log_http_errors(func):

FILE: src/art/utils/logging.py
  class _C (line 5) | class _C:
  function _ts (line 19) | def _ts():
  function info (line 23) | def info(msg):
  function step (line 27) | def step(msg):
  function ok (line 31) | def ok(msg):
  function warn (line 35) | def warn(msg):
  function err (line 39) | def err(msg):
  function dim (line 43) | def dim(msg):

FILE: src/art/utils/model_config.py
  class ModelConfig (line 7) | class ModelConfig:
  function detect_chat_template_parts (line 25) | def detect_chat_template_parts(
  function get_instruction_response_parts (line 72) | def get_instruction_response_parts(

FILE: src/art/utils/old_benchmarking/calculate_step_metrics.py
  function calculate_step_std_dev (line 7) | def calculate_step_std_dev(trajectory_groups: list[TrajectoryGroup]) -> ...

FILE: src/art/utils/old_benchmarking/display_image_grid.py
  function display_image_grid (line 4) | def display_image_grid(image_paths: list[str], images_per_row: int = 2):

FILE: src/art/utils/old_benchmarking/generate_comparison_table.py
  function generate_comparison_table (line 7) | def generate_comparison_table(

FILE: src/art/utils/old_benchmarking/generate_line_graphs.py
  function generate_line_graphs (line 19) | def generate_line_graphs(

FILE: src/art/utils/old_benchmarking/load_benchmarked_models.py
  function load_benchmarked_models (line 18) | def load_benchmarked_models(

FILE: src/art/utils/old_benchmarking/types.py
  class BenchmarkedModelKey (line 4) | class BenchmarkedModelKey:
    method __init__ (line 9) | def __init__(self, model: str, split: str, step_indices: list[int] | N...
    method __str__ (line 14) | def __str__(self):
  class BenchmarkedModelStep (line 24) | class BenchmarkedModelStep:
    method __init__ (line 29) | def __init__(self, index: int, metrics: dict[str, float] | None = None):
    method __str__ (line 33) | def __str__(self):
  class BenchmarkedModel (line 37) | class BenchmarkedModel:
    method __init__ (line 41) | def __init__(
    method __str__ (line 49) | def __str__(self):

FILE: src/art/utils/output_dirs.py
  function get_default_art_path (line 7) | def get_default_art_path() -> str:
  function get_models_dir (line 12) | def get_models_dir(project_name: str, art_path: str | None = None) -> str:
  function get_model_dir (line 18) | def get_model_dir(model: Model, art_path: str | None = None) -> str:
  function get_output_dir_from_model_properties (line 24) | def get_output_dir_from_model_properties(
  function get_step_checkpoint_dir (line 32) | def get_step_checkpoint_dir(model_output_dir: str, step: int) -> str:
  function get_trajectories_dir (line 36) | def get_trajectories_dir(model_output_dir: str) -> str:
  function get_trajectories_split_dir (line 40) | def get_trajectories_split_dir(model_output_dir: str, split: str) -> str:

FILE: src/art/utils/record_provenance.py
  function record_provenance (line 9) | def record_provenance(run: wandb.Run, provenance: str) -> None:

FILE: src/art/utils/retry.py
  function retry (line 19) | def retry(

FILE: src/art/utils/s3.py
  class S3SyncError (line 23) | class S3SyncError(RuntimeError):
  function build_s3_path (line 27) | def build_s3_path(
  function build_s3_zipped_step_path (line 47) | def build_s3_zipped_step_path(
  function s3_sync (line 66) | async def s3_sync(
  function ensure_bucket_exists (line 127) | async def ensure_bucket_exists(
  function pull_model_from_s3 (line 161) | async def pull_model_from_s3(
  function push_model_to_s3 (line 223) | async def push_model_to_s3(
  function archive_and_presign_step_url (line 266) | async def archive_and_presign_step_url(

FILE: src/art/utils/s3_checkpoint_utils.py
  function get_latest_checkpoint_step_from_s3 (line 7) | async def get_latest_checkpoint_step_from_s3(
  function get_checkpoint_step_not_after_from_s3 (line 54) | async def get_checkpoint_step_not_after_from_s3(
  function migrate_s3_checkpoints_to_new_structure (line 107) | async def migrate_s3_checkpoints_to_new_structure(

FILE: src/art/utils/sft.py
  class SFTChunk (line 16) | class SFTChunk(NamedTuple):
  function _parse_jsonl_line (line 24) | def _parse_jsonl_line(line: str) -> "Trajectory":
  function get_file_row_count (line 42) | def get_file_row_count(file_path: str) -> int:
  function create_lr_schedule (line 70) | def create_lr_schedule(
  function create_sft_dataset_iterator (line 135) | def create_sft_dataset_iterator(
  function iterate_file (line 265) | def iterate_file(
  function train_sft_from_file (line 332) | async def train_sft_from_file(

FILE: src/art/utils/strip_logprobs.py
  function strip_logprobs (line 9) | def strip_logprobs(obj: Any) -> Any:
  function _strip_logprobs (line 35) | def _strip_logprobs(obj: Any) -> Any:

FILE: src/art/utils/suppress_litellm_serialization_warnings.py
  function suppress_litellm_serialization_warnings (line 4) | def suppress_litellm_serialization_warnings():

FILE: src/art/utils/trajectory_logging.py
  function _flatten_message (line 21) | def _flatten_message(msg: dict) -> dict:
  function _unflatten_message (line 46) | def _unflatten_message(msg_dict: dict) -> dict:
  function write_trajectory_groups_parquet (line 59) | def write_trajectory_groups_parquet(
  function read_trajectory_groups_parquet (line 153) | def read_trajectory_groups_parquet(path: str | Path) -> list[TrajectoryG...

FILE: src/art/utils/trajectory_migration.py
  function serialize_trajectory_groups (line 28) | def serialize_trajectory_groups(trajectory_groups: list[TrajectoryGroup]...
  function trajectory_group_to_dict (line 37) | def trajectory_group_to_dict(trajectory_group: TrajectoryGroup) -> dict[...
  function history_to_dict (line 50) | def history_to_dict(history: History) -> dict[str, Any]:
  function trajectory_to_dict (line 58) | def trajectory_to_dict(trajectory: Trajectory) -> dict[str, Any]:
  function message_or_choice_to_dict (line 79) | def message_or_choice_to_dict(message_or_choice: MessageOrChoice) -> dic...
  function deserialize_trajectory_groups (line 97) | def deserialize_trajectory_groups(serialized: str) -> list[TrajectoryGro...
  function dict_to_trajectory_group (line 110) | def dict_to_trajectory_group(d: dict[str, Any]) -> TrajectoryGroup:
  function dict_to_trajectory (line 119) | def dict_to_trajectory(d: dict[str, Any]) -> Trajectory:
  function dict_to_message_or_choice (line 132) | def dict_to_message_or_choice(d: dict[str, Any]) -> MessageOrChoice:
  class MigrationResult (line 145) | class MigrationResult:
    method compression_ratio (line 155) | def compression_ratio(self) -> float:
    method space_saved (line 162) | def space_saved(self) -> int:
    method __add__ (line 166) | def __add__(self, other: "MigrationResult") -> "MigrationResult":
  function migrate_jsonl_to_parquet (line 177) | def migrate_jsonl_to_parquet(
  function migrate_trajectories_dir (line 335) | def migrate_trajectories_dir(
  function migrate_model_dir (line 377) | def migrate_model_dir(
  function migrate_all_models (line 409) | def migrate_all_models(
  function auto_migrate_on_register (line 454) | def auto_migrate_on_register(model_dir: Path | str) -> MigrationResult:

FILE: src/art/vllm/dedicated_server.py
  function parse_args (line 16) | def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
  function main (line 35) | def main(argv: list[str] | None = None) -> None:

FILE: src/art/vllm/engine.py
  function get_llm (line 17) | async def get_llm(args: vllm.AsyncEngineArgs) -> AsyncLLM:  # ty:ignore[...
  function run_on_workers (line 48) | async def run_on_workers(
  function get_worker (line 72) | def get_worker() -> "ExtendedWorker":
  class WorkerExtension (line 77) | class WorkerExtension:
    method run (line 80) | def run(self, pickled_func: bytes, *args: Any, **kwargs: Any) -> Any:
    method time (line 89) | def time(self, name: str) -> Generator[None, None, None]:
  class ExtendedWorker (line 98) | class ExtendedWorker(Worker, WorkerExtension):

FILE: src/art/vllm/patches.py
  function subclass_chat_completion_request (line 6) | def subclass_chat_completion_request() -> None:
  function patch_listen_for_disconnect (line 22) | def patch_listen_for_disconnect() -> None:
  function patch_tool_parser_manager (line 38) | def patch_tool_parser_manager() -> None:

FILE: src/art/vllm/server.py
  function openai_server_task (line 22) | async def openai_server_task(
  function _openai_server_coroutine (line 134) | def _openai_server_coroutine(
  function get_uvicorn_logging_config (line 163) | def get_uvicorn_logging_config(path: str) -> dict[str, Any]:
  function set_vllm_log_file (line 184) | def set_vllm_log_file(path: str) -> None:

FILE: src/art/yield_trajectory.py
  function yield_trajectory (line 7) | def yield_trajectory(trajectory: Trajectory) -> None:
  function capture_yielded_trajectory (line 11) | async def capture_yielded_trajectory(coroutine: Coroutine[Any, Any, Any]...
  class YieldTrajectoryContext (line 20) | class YieldTrajectoryContext:
    method __init__ (line 21) | def __init__(self) -> None:
    method __enter__ (line 24) | def __enter__(self) -> None:
    method __exit__ (line 27) | def __exit__(self, exc_type: Any, exc_value: Any, traceback: Any) -> N...

FILE: src/mp_actors/move.py
  function _close_all_live_proxies (line 35) | def _close_all_live_proxies() -> None:
  function _register_proxy_for_atexit (line 46) | def _register_proxy_for_atexit(proxy: "Proxy") -> None:
  function move_to_child_process (line 54) | def move_to_child_process(
  function close_proxy (line 77) | def close_proxy(proxy: object) -> None:
  class Request (line 89) | class Request:
  class Response (line 98) | class Response:
  class Proxy (line 104) | class Proxy:
    method __init__ (line 105) | def __init__(
    method _process_error (line 129) | def _process_error(self) -> RuntimeError:
    method _fail_pending (line 140) | def _fail_pending(self, error: Exception) -> None:
    method _dispatch_responses (line 151) | def _dispatch_responses(self) -> None:
    method __getattr__ (line 179) | def __getattr__(self, name: str) -> Any:
    method close (line 252) | def close(self):
  function _target (line 275) | def _target(
  function _handle_requests (line 290) | async def _handle_requests(
  function _handle_request (line 303) | async def _handle_request(

FILE: src/mp_actors/traceback.py
  function streamline_tracebacks (line 10) | def streamline_tracebacks() -> Callable[[T], T]:
  function streamlined_traceback (line 33) | def streamlined_traceback() -> TracebackType | None:
  function get_child_traceback (line 41) | def get_child_traceback(

FILE: tests/integration.py
  function make_patch_source (line 46) | def make_patch_source() -> str:
  function make_variable_override_source (line 68) | def make_variable_override_source(variables: Dict[str, Any]) -> str:
  function _override_variables_in_notebook (line 82) | def _override_variables_in_notebook(nb, variables: Dict[str, Any]) -> li...
  class _NotebookPlugin (line 167) | class _NotebookPlugin:
    method __init__ (line 168) | def __init__(self, notebook_configs: list[dict]) -> None:
    method pytest_generate_tests (line 171) | def pytest_generate_tests(self, metafunc) -> None:
  function test_notebook_execution (line 176) | def test_notebook_execution(notebook_config: dict) -> None:
  function parse_indexes (line 231) | def parse_indexes(indexes_str: str, max_index: int) -> list[int]:
  function main (line 286) | def main() -> int:

FILE: tests/integration/test_live_api_cost.py
  function _require_live_test_env (line 17) | def _require_live_test_env(*required_vars: str) -> None:
  function _post_json (line 25) | def _post_json(url: str, *, headers: dict[str, str], payload: dict) -> d...
  function _cacheable_prefix (line 36) | def _cacheable_prefix(word_count: int = 1500) -> str:
  function _history_rows (line 40) | def _history_rows(history_path: Path) -> list[dict]:
  function _openai_completion (line 44) | def _openai_completion(*, api_key: str, prompt_cache_key: str, prefix: s...
  function _anthropic_message (line 64) | def _anthropic_message(*, api_key: str, prefix: str) -> dict:
  class TestLiveApiCost (line 90) | class TestLiveApiCost:
    method test_openai_gpt_4_1_cached_prompt_cost (line 92) | async def test_openai_gpt_4_1_cached_prompt_cost(self, tmp_path: Path)...
    method test_anthropic_claude_sonnet_4_6_prompt_cache_cost (line 156) | async def test_anthropic_claude_sonnet_4_6_prompt_cache_cost(

FILE: tests/integration/test_multi_checkpoint_training.py
  function get_base_model (line 33) | def get_base_model() -> str:
  function simple_rollout (line 38) | async def simple_rollout(
  function run_training_loop (line 63) | async def run_training_loop(
  function _run_inference_on_step (line 98) | async def _run_inference_on_step(
  function test_tinker_backend (line 116) | async def test_tinker_backend():
  function test_local_backend (line 145) | async def test_local_backend():
  function test_serverless_backend (line 175) | async def test_serverless_backend():

FILE: tests/integration/test_pipeline_localbackend_dedicated.py
  function get_base_model (line 24) | def get_base_model() -> str:
  function get_safe_gpu_memory_utilization (line 28) | def get_safe_gpu_memory_utilization() -> float:
  function get_dedicated_vllm_test_config (line 49) | def get_dedicated_vllm_test_config() -> art.dev.InternalModelConfig:
  function reward_for_answer (line 69) | def reward_for_answer(text: str) -> float:
  function assert_chat_logprobs (line 80) | async def assert_chat_logprobs(
  function test_pipeline_trainer_local_backend_dedicated_smoke (line 99) | async def test_pipeline_trainer_local_backend_dedicated_smoke() -> None:

FILE: tests/integration/test_provenance.py
  function simple_rollout (line 15) | async def simple_rollout(model: art.TrainableModel) -> art.Trajectory:
  function get_latest_artifact_provenance (line 40) | def get_latest_artifact_provenance(
  function main (line 49) | async def main() -> None:

FILE: tests/integration/test_push_and_fork.py
  function simple_rollout (line 31) | async def simple_rollout(
  function train_one_step (line 49) | async def train_one_step(
  function test_push_to_s3 (line 82) | async def test_push_to_s3():
  function test_fork_checkpoint_from_wandb (line 120) | async def test_fork_checkpoint_from_wandb():
  function test_push_then_fork_from_s3 (line 200) | async def test_push_then_fork_from_s3():

FILE: tests/integration/test_tinker_native_backend.py
  function get_base_model (line 16) | def get_base_model() -> str:
  function ensure_reward_variance (line 20) | def ensure_reward_variance(groups) -> None:
  function simple_rollout (line 30) | async def simple_rollout(
  function test_tinker_native_backend (line 58) | async def test_tinker_native_backend():
  function test_tinker_native_fork_checkpoint (line 122) | async def test_tinker_native_fork_checkpoint():

FILE: tests/integration/test_vllm_contract.py
  function get_base_model (line 23) | def get_base_model() -> str:
  function get_safe_gpu_memory_utilization (line 27) | def get_safe_gpu_memory_utilization() -> float:
  function get_vllm_test_config (line 45) | def get_vllm_test_config() -> art.dev.InternalModelConfig:
  function simple_rollout (line 63) | async def simple_rollout(
  function assert_chat_logprobs (line 89) | async def assert_chat_logprobs(
  function test_local_backend_vllm_contract (line 108) | async def test_local_backend_vllm_contract() -> None:

FILE: tests/test_backend_train_api.py
  function get_vllm_test_config (line 23) | def get_vllm_test_config() -> tuple[art.dev.InternalModelConfig, str | N...
  function simple_rollout (line 65) | async def simple_rollout(client, model_name: str, prompt: str) -> art.Tr...
  function main (line 88) | async def main():

FILE: tests/unit/test_auto_trajectory.py
  function test_server (line 189) | async def test_server():
  function test_auto_trajectory (line 212) | async def test_auto_trajectory(test_server: None) -> None:
  function test_litellm_auto_trajectory (line 324) | async def test_litellm_auto_trajectory(test_server: None) -> None:

FILE: tests/unit/test_benchmarking_loader.py
  function test_load_trajectories_group_columns (line 9) | async def test_load_trajectories_group_columns(tmp_path):

FILE: tests/unit/test_dedicated_config.py
  function test_shared_mode_empty_config (line 11) | def test_shared_mode_empty_config():
  function test_shared_mode_with_other_keys (line 16) | def test_shared_mode_with_other_keys():
  function test_dedicated_mode_detected (line 21) | def test_dedicated_mode_detected():
  function test_valid_shared_mode (line 26) | def test_valid_shared_mode():
  function test_valid_dedicated_two_gpus (line 30) | def test_valid_dedicated_two_gpus():
  function test_valid_dedicated_three_gpus (line 36) | def test_valid_dedicated_three_gpus():
  function test_valid_dedicated_four_gpus (line 42) | def test_valid_dedicated_four_gpus():
  function test_only_trainer_gpu_ids (line 48) | def test_only_trainer_gpu_ids():
  function test_only_inference_gpu_ids (line 53) | def test_only_inference_gpu_ids():
  function test_empty_trainer_gpu_ids (line 58) | def test_empty_trainer_gpu_ids():
  function test_empty_inference_gpu_ids (line 65) | def test_empty_inference_gpu_ids():
  function test_overlapping_gpu_ids (line 72) | def test_overlapping_gpu_ids():
  function test_multi_gpu_inference (line 79) | def test_multi_gpu_inference():
  function test_trainer_not_starting_at_zero (line 86) | def test_trainer_not_starting_at_zero():
  function test_trainer_not_contiguous (line 93) | def test_trainer_not_contiguous():
  function test_dedicated_rejects_fast_inference (line 100) | def test_dedicated_rejects_fast_inference():
  function test_dedicated_rejects_enable_sleep_mode (line 113) | def test_dedicated_rejects_enable_sleep_mode():
  function test_dedicated_allows_fast_inference_false (line 126) | def test_dedicated_allows_fast_inference_false():
  function test_get_model_config_shared_mode (line 137) | def test_get_model_config_shared_mode():
  function test_get_model_config_dedicated_mode (line 148) | def test_get_model_config_dedicated_mode():
  function test_get_model_config_dedicated_preserves_user_engine_args (line 163) | def test_get_model_config_dedicated_preserves_user_engine_args():

FILE: tests/unit/test_dedicated_server.py
  function test_parse_args_required (line 11) | def test_parse_args_required():
  function test_parse_args_with_engine_args (line 36) | def test_parse_args_with_engine_args():
  function test_parse_args_custom_host (line 56) | def test_parse_args_custom_host():
  function test_parse_args_with_server_args (line 76) | def test_parse_args_with_server_args():

FILE: tests/unit/test_frontend_logging.py
  class TestFrontendLoggingCompatibility (line 25) | class TestFrontendLoggingCompatibility:
    method sample_trajectories (line 29) | def sample_trajectories(self) -> list[Trajectory]:
    method sample_trajectory_groups (line 56) | def sample_trajectory_groups(
    method test_parquet_readable_by_read_trajectory_groups_parquet (line 72) | async def test_parquet_readable_by_read_trajectory_groups_parquet(
    method test_parquet_schema_preserved (line 98) | async def test_parquet_schema_preserved(
  class TestHistoryJsonlCompatibility (line 135) | class TestHistoryJsonlCompatibility:
    method sample_trajectory_groups (line 139) | def sample_trajectory_groups(self) -> list[TrajectoryGroup]:
    method test_history_jsonl_format (line 166) | async def test_history_jsonl_format(
    method test_history_readable_by_polars (line 191) | async def test_history_readable_by_polars(
    method test_history_appends_entries (line 212) | async def test_history_appends_entries(
  class TestPathStructure (line 238) | class TestPathStructure:
    method test_file_locations_match_localbackend (line 242) | async def test_file_locations_match_localbackend(self, tmp_path: Path):
    method test_step_numbering_format (line 272) | async def test_step_numbering_format(self, tmp_path: Path):
  class TestMetricCalculation (line 308) | class TestMetricCalculation:
    method test_metric_prefixes (line 312) | async def test_metric_prefixes(self, tmp_path: Path):
    method test_standard_metrics_present (line 359) | async def test_standard_metrics_present(self, tmp_path: Path):
    method test_group_metric_aggregation (line 398) | async def test_group_metric_aggregation(self, tmp_path: Path):
    method test_exception_rate_calculation (line 439) | async def test_exception_rate_calculation(self, tmp_path: Path):
    method test_exception_rate_counts_group_exceptions (line 472) | async def test_exception_rate_counts_group_exceptions(self, tmp_path: ...
    method test_generator_of_trajectories_is_consumed_once (line 501) | async def test_generator_of_trajectories_is_consumed_once(self, tmp_pa...
    method test_train_trajectory_metrics_default_to_train_prefix (line 531) | async def test_train_trajectory_metrics_default_to_train_prefix(
    method test_train_logs_add_default_data_metrics_from_trajectory_groups (line 569) | async def test_train_logs_add_default_data_metrics_from_trajectory_gro...
    method test_costs_are_logged_in_hierarchical_taxonomy (line 625) | async def test_costs_are_logged_in_hierarchical_taxonomy(self, tmp_pat...
    method test_cost_cumulative_persists_across_model_recreation (line 668) | async def test_cost_cumulative_persists_across_model_recreation(
    method test_metrics_builder_loads_resume_state_before_builder_use (line 707) | async def test_metrics_builder_loads_resume_state_before_builder_use(
    method test_direct_time_and_data_metrics_get_cumulative_variants (line 737) | async def test_direct_time_and_data_metrics_get_cumulative_variants(
    method test_log_without_new_builder_metrics_skips_extra_taxonomy_row (line 767) | async def test_log_without_new_builder_metrics_skips_extra_taxonomy_row(
  class TestWandbIntegration (line 805) | class TestWandbIntegration:
    method test_wandb_not_called_without_api_key (line 809) | async def test_wandb_not_called_without_api_key(self, tmp_path: Path):
    method test_should_log_wandb_logic_default (line 830) | def test_should_log_wandb_logic_default(self, tmp_path: Path):
    method test_should_log_wandb_logic_with_key (line 852) | def test_should_log_wandb_logic_with_key(self, tmp_path: Path):
    method test_should_log_wandb_logic_explicit_wandb (line 868) | def test_should_log_wandb_logic_explicit_wandb(self, tmp_path: Path):
    method test_should_log_wandb_logic_empty_list (line 883) | def test_should_log_wandb_logic_empty_list(self, tmp_path: Path):
  class TestLocalBackendAutomaticMetrics (line 900) | class TestLocalBackendAutomaticMetrics:
    method test_train_logs_automatic_wall_time_and_gpu_cost (line 902) | async def test_train_logs_automatic_wall_time_and_gpu_cost(
    method test_unknown_local_gpu_skips_cost_but_keeps_wall_time (line 952) | async def test_unknown_local_gpu_skips_cost_but_keeps_wall_time(
  class TestModelAttributes (line 989) | class TestModelAttributes:
    method test_base_path_default (line 992) | def test_base_path_default(self):
    method test_base_path_custom (line 997) | def test_base_path_custom(self):
    method test_report_metrics_default (line 1002) | def test_report_metrics_default(self):
    method test_report_metrics_custom (line 1007) | def test_report_metrics_custom(self):
  class TestTrainSFTMetricsAggregation (line 1013) | class TestTrainSFTMetricsAggregation:
    method test_train_sft_aggregates_metrics (line 1017) | async def test_train_sft_aggregates_metrics(self, tmp_path: Path):
    method test_train_sft_single_step_increment (line 1087) | async def test_train_sft_single_step_increment(self, tmp_path: Path):
    method test_train_sft_no_metrics_when_empty (line 1125) | async def test_train_sft_no_metrics_when_empty(self, tmp_path: Path):
  class TestGradientStepMetrics (line 1155) | class TestGradientStepMetrics:
    method test_model_train_logs_gradient_step_count (line 1157) | async def test_model_train_logs_gradient_step_count(self, tmp_path: Pa...
    method test_local_backend_train_returns_gradient_step_count (line 1203) | async def test_local_backend_train_returns_gradient_step_count(

FILE: tests/unit/test_metric_routing.py
  class TestMetricRoutingBaseline (line 12) | class TestMetricRoutingBaseline:
    method test_log_metrics_routes_known_sections_without_split_prefix (line 13) | def test_log_metrics_routes_known_sections_without_split_prefix(
    method test_get_wandb_run_registers_taxonomy_sections (line 43) | def test_get_wandb_run_registers_taxonomy_sections(self, tmp_path: Pat...
    method test_log_metrics_defines_nested_cost_keys_with_training_step (line 80) | def test_log_metrics_defines_nested_cost_keys_with_training_step(
    method test_update_wandb_config_seeds_wandb_init (line 128) | def test_update_wandb_config_seeds_wandb_init(self, tmp_path: Path) ->...
    method test_update_wandb_config_updates_active_run (line 159) | def test_update_wandb_config_updates_active_run(self, tmp_path: Path) ...
    method test_update_wandb_config_rejects_conflicting_values (line 198) | def test_update_wandb_config_rejects_conflicting_values(

FILE: tests/unit/test_metrics_builder.py
  class TestMetricsBuilder (line 8) | class TestMetricsBuilder:
    method test_rollup_correctness_across_depths (line 10) | async def test_rollup_correctness_across_depths(self) -> None:
    method test_cum_accumulates_for_hierarchical_sections (line 29) | async def test_cum_accumulates_for_hierarchical_sections(self) -> None:
    method test_helper_metrics_accumulate_within_a_single_step (line 61) | async def test_helper_metrics_accumulate_within_a_single_step(self) ->...
    method test_throughput_metrics_derive_from_time_and_token_cumulatives (line 82) | async def test_throughput_metrics_derive_from_time_and_token_cumulatives(
    method test_costs_all_generated_for_single_and_multiple_children (line 101) | async def test_costs_all_generated_for_single_and_multiple_children(se...
    method test_leaf_parent_conflicts_raise (line 113) | def test_leaf_parent_conflicts_raise(self) -> None:
    method test_duplicate_leaf_writes_are_summed (line 125) | async def test_duplicate_leaf_writes_are_summed(self) -> None:
    method test_cumulative_namespace_is_reserved (line 136) | def test_cumulative_namespace_is_reserved(self) -> None:
    method test_sparse_steps_omit_rollup_for_missing_costs (line 142) | async def test_sparse_steps_omit_rollup_for_missing_costs(self) -> None:
    method test_state_dict_round_trip_preserves_cumulative_state (line 157) | async def test_state_dict_round_trip_preserves_cumulative_state(self) ...
    method test_loaded_state_is_shared_with_other_cost_contexts (line 172) | async def test_loaded_state_is_shared_with_other_cost_contexts(self) -...
    method test_add_response_cost_uses_registered_model_pricing (line 188) | async def test_add_response_cost_uses_registered_model_pricing(self) -...
    method test_unique_scenario_count_tracks_exact_ids (line 211) | async def test_unique_scenario_count_tracks_exact_ids(self) -> None:
    method test_empty_flush_does_not_repeat_stale_derived_metrics (line 222) | async def test_empty_flush_does_not_repeat_stale_derived_metrics(self)...
    method test_concurrent_add_cost_calls_do_not_lose_updates (line 236) | async def test_concurrent_add_cost_calls_do_not_lose_updates(self) -> ...
    method test_contextvar_activate_and_get_active (line 250) | def test_contextvar_activate_and_get_active(self) -> None:

FILE: tests/unit/test_metrics_taxonomy.py
  function test_average_metric_samples_handles_sparse_keys (line 13) | def test_average_metric_samples_handles_sparse_keys() -> None:
  function test_build_training_summary_metrics_only_includes_data_section (line 26) | def test_build_training_summary_metrics_only_includes_data_section() -> ...
  function test_average_metric_samples_requires_invariant_gradient_step_count (line 46) | def test_average_metric_samples_requires_invariant_gradient_step_count()...
  function test_summarize_trajectory_groups_only_counts_explicit_scenario_id (line 56) | def test_summarize_trajectory_groups_only_counts_explicit_scenario_id() ...

FILE: tests/unit/test_model_openai_client_costs.py
  class _FakeUsage (line 10) | class _FakeUsage:
    method __init__ (line 11) | def __init__(self, prompt_tokens: int, completion_tokens: int) -> None:
  class _FakeResponse (line 16) | class _FakeResponse:
    method __init__ (line 17) | def __init__(
  class _FakeCompletions (line 28) | class _FakeCompletions:
    method __init__ (line 29) | def __init__(self, response: _FakeResponse) -> None:
    method create (line 32) | async def create(self, *args: Any, **kwargs: Any) -> _FakeResponse:
  function _patch_async_openai (line 36) | def _patch_async_openai(
  function _build_model (line 55) | def _build_model() -> TrainableModel:
  class TestModelOpenAIClientCosts (line 70) | class TestModelOpenAIClientCosts:
    method test_openai_client_automatically_logs_train_tinker_costs (line 72) | async def test_openai_client_automatically_logs_train_tinker_costs(
    method test_openai_client_automatically_logs_eval_tinker_costs (line 92) | async def test_openai_client_automatically_logs_eval_tinker_costs(
    method test_openai_client_does_not_log_costs_without_active_metrics_context (line 112) | async def test_openai_client_does_not_log_costs_without_active_metrics...
    method test_multiple_choices_scale_prefill_cost_once_per_sample (line 129) | async def test_multiple_choices_scale_prefill_cost_once_per_sample(
    method test_manual_cost_calculator_still_returns_tinker_metrics (line 148) | def test_manual_cost_calculator_still_returns_tinker_metrics(self) -> ...

FILE: tests/unit/test_mp_actors.py
  class DemoService (line 12) | class DemoService:
    method __init__ (line 13) | def __init__(self) -> None:
    method increment (line 16) | def increment(self) -> int:
    method aincrement (line 20) | async def aincrement(self) -> int:
    method ping (line 25) | async def ping(self) -> str:
    method ticker (line 29) | async def ticker(self, n: int = 1_000):
    method slow (line 34) | async def slow(self, delay: float = 5.0) -> str:
    method raise_error (line 38) | async def raise_error(self, message: str) -> None:
  class ExitService (line 43) | class ExitService:
    method child_exit (line 44) | async def child_exit(self) -> None:
    method ping (line 48) | async def ping(self) -> str:
  function _wait_for_count (line 53) | async def _wait_for_count(values: list[int], target: int, timeout: float...
  function test_proxy_supports_sync_async_and_attribute_access (line 61) | async def test_proxy_supports_sync_async_and_attribute_access() -> None:
  function test_child_exit_error_is_sticky_for_followup_calls (line 74) | async def test_child_exit_error_is_sticky_for_followup_calls() -> None:
  function test_async_generator_cancellation_does_not_break_future_calls (line 89) | async def test_async_generator_cancellation_does_not_break_future_calls(...
  function test_close_fails_inflight_requests_and_is_idempotent (line 111) | async def test_close_fails_inflight_requests_and_is_idempotent() -> None:
  function test_child_exceptions_are_propagated_and_proxy_recovers (line 131) | async def test_child_exceptions_are_propagated_and_proxy_recovers() -> N...

FILE: tests/unit/test_multi_checkpoint_inference.py
  class TestModelGetInferenceName (line 27) | class TestModelGetInferenceName:
    method test_get_inference_name_without_step_uses_name (line 30) | def test_get_inference_name_without_step_uses_name(self):
    method test_get_inference_name_without_step_uses_inference_model_name (line 35) | def test_get_inference_name_without_step_uses_inference_model_name(self):
    method test_get_inference_name_with_step_appends_suffix (line 44) | def test_get_inference_name_with_step_appends_suffix(self):
    method test_get_inference_name_with_step_uses_inference_model_name (line 51) | def test_get_inference_name_with_step_uses_inference_model_name(self):
    method test_get_inference_name_none_step_is_same_as_no_step (line 60) | def test_get_inference_name_none_step_is_same_as_no_step(self):
  class TestTrainableModelGetInferenceName (line 66) | class TestTrainableModelGetInferenceName:
    method test_trainable_model_get_inference_name_with_step (line 69) | def test_trainable_model_get_inference_name_with_step(self):
  class TestLitellmCompletionParams (line 80) | class TestLitellmCompletionParams:
    method test_litellm_completion_params_without_step (line 83) | def test_litellm_completion_params_without_step(self):
    method test_litellm_completion_params_with_step (line 97) | def test_litellm_completion_params_with_step(self):
    method test_litellm_completion_params_trainable_model_with_step (line 109) | def test_litellm_completion_params_trainable_model_with_step(self):
  class TestServerlessBackendModelInferenceName (line 133) | class TestServerlessBackendModelInferenceName:
    method test_model_inference_name_without_step (line 136) | def test_model_inference_name_without_step(self):
    method test_model_inference_name_with_step (line 154) | def test_model_inference_name_with_step(self):
    method test_model_inference_name_none_step_is_same_as_no_step (line 174) | def test_model_inference_name_none_step_is_same_as_no_step(self):
  class TestOpenAIServerConfigLoraName (line 198) | class TestOpenAIServerConfigLoraName:
    method test_lora_name_includes_step (line 201) | def test_lora_name_includes_step(self):
    method test_lora_name_step_zero (line 217) | def test_lora_name_step_zero(self):
    method test_served_model_name_uses_base_model_when_lora_enabled (line 232) | def test_served_model_name_uses_base_model_when_lora_enabled(self):
  class TestStepParsing (line 254) | class TestStepParsing:
    method tinker_native_backend_class (line 258) | def tinker_native_backend_class(self):
    method test_parse_step_from_model_name (line 267) | def test_parse_step_from_model_name(self, tinker_native_backend_class):
    method test_missing_step_suffix_fails_loudly (line 274) | def test_missing_step_suffix_fails_loudly(self, tinker_native_backend_...
    method test_invalid_step_suffix_fails_loudly (line 282) | def test_invalid_step_suffix_fails_loudly(self, tinker_native_backend_...
  class TestUnslothServiceMaxLoras (line 296) | class TestUnslothServiceMaxLoras:
    method unsloth_service_class (line 300) | def unsloth_service_class(self):
    method test_max_loras_default_is_2 (line 309) | def test_max_loras_default_is_2(self, unsloth_service_class):
    method test_max_loras_can_be_overridden (line 331) | def test_max_loras_can_be_overridden(self, unsloth_service_class):
  class TestPipelinedTrainingUsage (line 356) | class TestPipelinedTrainingUsage:
    method test_pipelined_training_pattern (line 359) | def test_pipelined_training_pattern(self):

FILE: tests/unit/test_pipeline_trainer_batching.py
  function _make_group (line 11) | def _make_group() -> TrajectoryGroup:
  function test_collect_batch_respects_max_batch_size (line 28) | async def test_collect_batch_respects_max_batch_size(tmp_path: Path) -> ...

FILE: tests/unit/test_pipeline_trainer_local_backend.py
  function _make_group (line 16) | def _make_group(rewards: list[float]) -> TrajectoryGroup:
  function _make_trainer (line 32) | def _make_trainer(
  function test_pipeline_trainer_preserves_backend_train_kwargs (line 54) | async def test_pipeline_trainer_preserves_backend_train_kwargs(tmp_path:...
  function test_pipeline_trainer_uses_same_train_kwargs_for_local_backend (line 92) | async def test_pipeline_trainer_uses_same_train_kwargs_for_local_backend(
  function test_local_backend_train_translates_loss_fn (line 131) | async def test_local_backend_train_translates_loss_fn(tmp_path: Path) ->...
  function test_local_backend_async_context_manager_awaits_async_cleanup (line 169) | async def test_local_backend_async_context_manager_awaits_async_cleanup(
  function test_pipeline_trainer_rejects_unsupported_local_backend_settings (line 199) | def test_pipeline_trainer_rejects_unsupported_local_backend_settings(
  function test_pipeline_trainer_rejects_shared_local_backend (line 223) | def test_pipeline_trainer_rejects_shared_local_backend(tmp_path: Path) -...
  function test_local_backend_inference_name_prefers_served_step_in_dedicated_mode (line 237) | def test_local_backend_inference_name_prefers_served_step_in_dedicated_m...

FILE: tests/unit/test_pipeline_trainer_metrics.py
  function _make_group (line 13) | def _make_group(
  function test_pipeline_trainer_logs_explicit_stale_and_zero_variance_metrics (line 32) | async def test_pipeline_trainer_logs_explicit_stale_and_zero_variance_me...

FILE: tests/unit/test_prefix_cache.py
  class TestLRUTrieCache (line 10) | class TestLRUTrieCache:
    method test_longest_prefix_match (line 11) | def test_longest_prefix_match(self) -> None:
    method test_lru_eviction (line 22) | def test_lru_eviction(self) -> None:
    method test_invalid_size (line 34) | def test_invalid_size(self) -> None:

FILE: tests/unit/test_ruler_metrics.py
  class _FakePromptTokenDetails (line 11) | class _FakePromptTokenDetails:
    method __init__ (line 12) | def __init__(self, *, cached_tokens: int = 0) -> None:
  class _FakeUsage (line 16) | class _FakeUsage:
    method __init__ (line 17) | def __init__(
  class _FakeMessage (line 35) | class _FakeMessage:
    method __init__ (line 36) | def __init__(self, content: str) -> None:
  class _FakeChoice (line 40) | class _FakeChoice:
    method __init__ (line 41) | def __init__(self, content: str) -> None:
  class _FakeResponse (line 45) | class _FakeResponse:
    method __init__ (line 46) | def __init__(
  function test_ruler_records_builder_cost_for_supported_judges (line 65) | async def test_ruler_records_builder_cost_for_supported_judges(monkeypat...
  function test_ruler_skips_cost_when_pricing_is_unavailable (line 107) | async def test_ruler_skips_cost_when_pricing_is_unavailable(monkeypatch):
  function test_ruler_records_direct_cost_for_openrouter_judges (line 145) | async def test_ruler_records_direct_cost_for_openrouter_judges(monkeypat...

FILE: tests/unit/test_sft.py
  function create_temp_jsonl (line 17) | def create_temp_jsonl(num_trajectories: int) -> Path:
  function test_iterate_file (line 32) | def test_iterate_file():
  function test_iterate_file_multiple_epochs (line 45) | def test_iterate_file_multiple_epochs():
  function test_iterate_file_with_initial_skip (line 59) | def test_iterate_file_with_initial_skip():
  function test_iterate_file_deterministic (line 73) | def test_iterate_file_deterministic():
  function test_lr_schedule_warmup_not_zero (line 89) | def test_lr_schedule_warmup_not_zero():
  function test_lr_schedule_edge_cases (line 107) | def test_lr_schedule_edge_cases():
  function test_lr_schedule_decay_methods (line 125) | def test_lr_schedule_decay_methods():
  function test_lr_schedule_no_warmup (line 148) | def test_lr_schedule_no_warmup():
  function _make_trajectories (line 158) | def _make_trajectories(n: int):
  function test_create_sft_dataset_iterator_lr_schedule_continuity (line 173) | def test_create_sft_dataset_iterator_lr_schedule_continuity():
  function test_create_sft_dataset_iterator_step_tracking (line 212) | def test_create_sft_dataset_iterator_step_tracking():
  function test_create_sft_dataset_iterator_initial_step (line 246) | def test_create_sft_dataset_iterator_initial_step():
  function test_create_sft_dataset_iterator_deterministic (line 279) | def test_create_sft_dataset_iterator_deterministic():
  function test_create_sft_dataset_iterator_empty_input (line 302) | def test_create_sft_dataset_iterator_empty_input():
  function test_create_sft_dataset_iterator_single_chunk (line 308) | def test_create_sft_dataset_iterator_single_chunk():

FILE: tests/unit/test_strip_logprobs.py
  class TestStripLogprobs (line 12) | class TestStripLogprobs:
    method test_strip_dict_with_logprobs (line 15) | def test_strip_dict_with_logprobs(self):
    method test_strip_nested_dict (line 29) | def test_strip_nested_dict(self):
    method test_strip_list_with_logprobs (line 46) | def test_strip_list_with_logprobs(self):
    method test_strip_tuple_with_logprobs (line 59) | def test_strip_tuple_with_logprobs(self):
    method test_strip_object_with_logprobs (line 73) | def test_strip_object_with_logprobs(self):
    method test_strip_mixed_nested_structure (line 89) | def test_strip_mixed_nested_structure(self):
    method test_strip_empty_structures (line 109) | def test_strip_empty_structures(self):
    method test_strip_none_and_primitives (line 115) | def test_strip_none_and_primitives(self):
    method test_no_logprobs_unchanged (line 123) | def test_no_logprobs_unchanged(self):
    method test_deepcopy_behavior (line 135) | def test_deepcopy_behavior(self):
    method test_deepcopy_failure_returns_original (line 149) | def test_deepcopy_failure_returns_original(self, caplog):
    method test_deepcopy_failure_with_recursion_error (line 175) | def test_deepcopy_failure_with_recursion_error(self, caplog):

FILE: tests/unit/test_tinker_renderers.py
  class FakeTokenizer (line 10) | class FakeTokenizer:
    method __init__ (line 15) | def __init__(self) -> None:
    method encode (line 23) | def encode(self, text: str, add_special_tokens: bool = False) -> list[...
    method decode (line 47) | def decode(self, tokens: int | list[int]) -> str:
  function _decode_model_input (line 53) | def _decode_model_input(tokenizer: FakeTokenizer, model_input: object) -...
  function _get_test_renderer (line 61) | def _get_test_renderer(name: str, tokenizer: FakeTokenizer) -> renderers...
  function test_get_renderer_name_autodetects_qwen3_5 (line 65) | def test_get_renderer_name_autodetects_qwen3_5() -> None:
  function test_qwen3_5_generation_prompt_matches_hf_suffixes (line 69) | def test_qwen3_5_generation_prompt_matches_hf_suffixes() -> None:
  function test_qwen3_5_parse_response_handles_xml_tool_calls (line 92) | def test_qwen3_5_parse_response_handles_xml_tool_calls() -> None:
  function test_qwen3_5_to_openai_message_uses_mapping_tool_arguments (line 127) | def test_qwen3_5_to_openai_message_uses_mapping_tool_arguments() -> None:
  function test_convert_openai_messages_to_renderer_format_stringifies_dict_arguments (line 157) | def test_convert_openai_messages_to_renderer_format_stringifies_dict_arg...
  function test_get_renderer_supports_kimi_k25_factory (line 191) | def test_get_renderer_supports_kimi_k25_factory() -> None:

FILE: tests/unit/test_track_api_cost.py
  class _OpenAIUsage (line 14) | class _OpenAIUsage:
    method __init__ (line 15) | def __init__(
  class _OpenAIResponse (line 35) | class _OpenAIResponse:
    method __init__ (line 36) | def __init__(
  class _AnthropicUsage (line 56) | class _AnthropicUsage:
    method __init__ (line 57) | def __init__(
  class _AnthropicResponse (line 71) | class _AnthropicResponse:
    method __init__ (line 72) | def __init__(
  class TestTrackApiCost (line 90) | class TestTrackApiCost:
    method test_compute_sample_costs_uses_tinker_leaf_names (line 91) | def test_compute_sample_costs_uses_tinker_leaf_names(self) -> None:
    method test_openai_cost_extraction_with_explicit_pricing (line 108) | async def test_openai_cost_extraction_with_explicit_pricing(self) -> N...
    method test_openai_cost_extraction_accounts_for_cached_tokens (line 131) | async def test_openai_cost_extraction_accounts_for_cached_tokens(self)...
    method test_anthropic_cost_extraction_uses_registered_model_pricing (line 159) | async def test_anthropic_cost_extraction_uses_registered_model_pricing(
    method test_anthropic_cost_extraction_accounts_for_cache_write_and_read (line 187) | async def test_anthropic_cost_extraction_accounts_for_cache_write_and_...
    method test_direct_usage_cost_is_used_before_provider_estimation (line 222) | async def test_direct_usage_cost_is_used_before_provider_estimation(se...
    method test_direct_model_extra_cost_is_used_when_usage_cost_missing (line 249) | async def test_direct_model_extra_cost_is_used_when_usage_cost_missing(
    method test_explicit_model_name_uses_global_pricing (line 278) | async def test_explicit_model_name_uses_global_pricing(
    method test_explicit_model_name_uses_registered_pricing (line 316) | async def test_explicit_model_name_uses_registered_pricing(
    method test_explicit_model_name_does_not_depend_on_response_model (line 350) | async def test_explicit_model_name_does_not_depend_on_response_model(s...
    method test_decorator_fails_fast_without_model_aware_pricing (line 377) | async def test_decorator_fails_fast_without_model_aware_pricing(self) ...
    method test_custom_extractor_takes_precedence (line 396) | async def test_custom_extractor_takes_precedence(self) -> None:
    method test_decorator_noops_without_active_builder (line 420) | async def test_decorator_noops_without_active_builder(self) -> None:
    method test_for_cost_context_routes_to_eval_and_shares_state (line 433) | async def test_for_cost_context_routes_to_eval_and_shares_state(self) ...
  class TestTrackApiCostIntegration (line 457) | class TestTrackApiCostIntegration:
    method test_model_log_emits_train_and_eval_costs (line 459) | async def test_model_log_emits_train_and_eval_costs(self, tmp_path: Pa...
    method test_model_log_keeps_pending_train_and_eval_costs_isolated (line 519) | async def test_model_log_keeps_pending_train_and_eval_costs_isolated(
    method test_pipeline_trainer_activates_train_context_for_rollouts (line 584) | async def test_pipeline_trainer_activates_train_context_for_rollouts(
    method test_pipeline_trainer_activates_eval_context_for_eval_fn (line 633) | async def test_pipeline_trainer_activates_eval_context_for_eval_fn(

FILE: tests/unit/test_trajectory_copy.py
  function sample_trajectory (line 13) | def sample_trajectory():
  function sample_trajectory_group (line 37) | def sample_trajectory_group(sample_trajectory):
  function test_shallow_copy (line 62) | def test_shallow_copy(sample_trajectory_group):
  function test_deep_copy (line 81) | def test_deep_copy(sample_trajectory_group):
  function test_deep_copy_with_exceptions (line 108) | def test_deep_copy_with_exceptions():
  function test_deep_copy_circular_reference (line 132) | def test_deep_copy_circular_reference():
  function test_deep_copy_preserves_metadata (line 158) | def test_deep_copy_preserves_metadata(sample_trajectory_group):
  function test_copy_empty_group (line 183) | def test_copy_empty_group():

FILE: tests/unit/test_trajectory_parquet.py
  function _ensure_message (line 58) | def _ensure_message(item: MessageOrChoice) -> ChatCompletionMessageParam:
  function _ensure_assistant_message (line 64) | def _ensure_assistant_message(
  function _ensure_tool_message (line 72) | def _ensure_tool_message(item: MessageOrChoice) -> ChatCompletionToolMes...
  function _ensure_user_message (line 78) | def _ensure_user_message(item: MessageOrChoice) -> ChatCompletionUserMes...
  class TestParquetRoundTrip (line 84) | class TestParquetRoundTrip:
    method test_simple_trajectory (line 87) | def test_simple_trajectory(self, tmp_path: Path):
    method test_tool_calls (line 122) | def test_tool_calls(self, tmp_path: Path):
    method test_group_level_fields_round_trip (line 182) | def test_group_level_fields_round_trip(self, tmp_path: Path):
    method test_choice_format (line 219) | def test_choice_format(self, tmp_path: Path):
    method test_unicode_content (line 267) | def test_unicode_content(self, tmp_path: Path):
    method test_multiple_trajectories (line 302) | def test_multiple_trajectories(self, tmp_path: Path):
    method test_multiple_groups_preserved (line 336) | def test_multiple_groups_preserved(self, tmp_path: Path):
    method test_empty_trajectory_group (line 412) | def test_empty_trajectory_group(self, tmp_path: Path):
    method test_nullable_fields (line 424) | def test_nullable_fields(self, tmp_path: Path):
  class TestMigration (line 453) | class TestMigration:
    method test_migrate_simple_jsonl (line 456) | def test_migrate_simple_jsonl(self, tmp_path: Path):
    method test_migrate_keeps_original_when_requested (line 505) | def test_migrate_keeps_original_when_requested(self, tmp_path: Path):
    method test_migrate_dry_run (line 515) | def test_migrate_dry_run(self, tmp_path: Path):
    method test_migrate_directory (line 526) | def test_migrate_directory(self, tmp_path: Path):
    method test_migrate_model_dir (line 579) | def test_migrate_model_dir(self, tmp_path: Path):
    method test_migrate_idempotent (line 608) | def test_migrate_idempotent(self, tmp_path: Path):
  class TestGoldenFiles (line 622) | class TestGoldenFiles:
    method fixtures_available (line 626) | def fixtures_available(self):
    method test_real_sample_migration (line 632) | def test_real_sample_migration(self, tmp_path: Path, fixtures_available):
    method test_fixture_roundtrips (line 666) | def test_fixture_roundtrips(self, tmp_path: Path, fixtures_available):
  class TestCompression (line 696) | class TestCompression:
    method test_long_content_compresses_well (line 699) | def test_long_content_compresses_well(self, tmp_path: Path):
  class TestEdgeCases (line 733) | class TestEdgeCases:
    method test_nonexistent_file (line 736) | def test_nonexistent_file(self, tmp_path: Path):
    method test_invalid_json (line 742) | def test_invalid_json(self, tmp_path: Path):
    method test_empty_file (line 751) | def test_empty_file(self, tmp_path: Path):
    method test_non_jsonl_file_skipped (line 762) | def test_non_jsonl_file_skipped(self, tmp_path: Path):

FILE: tests/unit/test_unsloth_metrics.py
  class _DummyTrainer (line 7) | class _DummyTrainer:
    method __init__ (line 8) | def __init__(self) -> None:
  function test_get_log_fn_routes_eval_metrics_to_val_namespace (line 12) | def test_get_log_fn_routes_eval_metrics_to_val_namespace() -> None:

FILE: tests/unit/test_vllm_patches_contract.py
  function test_subclass_chat_completion_request_forces_logprobs (line 13) | def test_subclass_chat_completion_request_forces_logprobs() -> None:
  function test_patch_tool_parser_manager_falls_back_to_empty_delta_message (line 32) | def test_patch_tool_parser_manager_falls_back_to_empty_delta_message() -...

FILE: tests/unit/test_yield_trajectory.py
  function test_server (line 111) | async def test_server():
  function test_yield_trajectory (line 131) | async def test_yield_trajectory(test_server: None) -> None:

Condensed preview of 341 files, each showing path, character count, and a content snippet (full structured content: 13,407K chars).

[
  {
    "path": ".agents/skills/fix-art-issues/SKILL.md",
    "chars": 3358,
    "preview": "---\nname: fix-art-issues\ndescription: >\n  Fix a GitHub issue on OpenPipe/ART and open a PR.\n  Use when the user asks to "
  },
  {
    "path": ".agents/skills/train-rl/SKILL.md",
    "chars": 15592,
    "preview": "---\nname: train-rl\ndescription: RL training reference for the ART framework. Use when the user asks to create, write, or"
  },
  {
    "path": ".agents/skills/train-sft/SKILL.md",
    "chars": 11586,
    "preview": "---\nname: train-sft\ndescription: SFT training reference for the ART framework. Use when the user asks to create, write, "
  },
  {
    "path": ".dockerignore",
    "chars": 226,
    "preview": "__pycache__/\n.art/\n# .env\n.venv/\ngrpo_trainer_lora_model/\nlogs/\nshared_cache.db\nstreaming-chat-completions/\nunsloth_comp"
  },
  {
    "path": ".github/workflows/create-draft-release.yml",
    "chars": 4063,
    "preview": "name: Create Draft Release\n\non:\n  workflow_dispatch:\n    inputs:\n      version_type:\n        description: 'Version bump "
  },
  {
    "path": ".github/workflows/package-install.yml",
    "chars": 1023,
    "preview": "name: Package Install\n\non:\n  pull_request:\n  push:\n    branches: [main]\n  workflow_dispatch:\n\npermissions:\n  contents: r"
  },
  {
    "path": ".github/workflows/prek.yml",
    "chars": 9061,
    "preview": "name: Prek\n\non:\n  pull_request:\n  push:\n    branches: [main]\n\npermissions:\n  contents: write\n\nenv:\n  CI_BASE_IMAGE: \"pyt"
  },
  {
    "path": ".github/workflows/release.yml",
    "chars": 2216,
    "preview": "name: Release\n\non:\n  pull_request:\n    types: [closed]\n    branches: [main]\n\npermissions:\n  contents: write\n  id-token: "
  },
  {
    "path": ".gitignore",
    "chars": 330,
    "preview": "__pycache__/\n.art/\n.env\n.venv/\ngrpo_trainer_lora_model/\nlogs/\nshared_cache.db\ndata/cache.db\nstreaming-chat-completions/\n"
  },
  {
    "path": ".pre-commit-config.yaml",
    "chars": 654,
    "preview": "repos:\n  - repo: https://github.com/astral-sh/ruff-pre-commit\n    rev: v0.12.1\n    hooks:\n      - id: ruff\n      - id: r"
  },
  {
    "path": ".python-version",
    "chars": 5,
    "preview": "3.11\n"
  },
  {
    "path": ".skyignore",
    "chars": 222,
    "preview": "__pycache__/\n.art/\n# .env\n.venv/\ngrpo_trainer_lora_model/\nlogs/\nshared_cache.db\nstreaming-chat-completions/\nunsloth_comp"
  },
  {
    "path": "AGENT.md",
    "chars": 1349,
    "preview": "## uv package manager by default\n\nThis project uses the `uv` package manager.\n\n- To add a dependency, run `uv add <packa"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 5668,
    "preview": "## Contributing to ART\n\nClone the repository:\n\n```bash\ngit clone https://github.com/OpenPipe/ART.git\ncd ART\n```\n\nInstall"
  },
  {
    "path": "LICENSE",
    "chars": 11013,
    "preview": "                                Apache License\n                        Version 2.0, January 2004\n                    htt"
  },
  {
    "path": "README.md",
    "chars": 14113,
    "preview": "<div align=\"center\">\n\n<a href=\"https://art.openpipe.ai\"><picture>\n<img alt=\"ART logo\" src=\"https://github.com/openpipe/a"
  },
  {
    "path": "THIRD-PARTY-NOTICES",
    "chars": 1980,
    "preview": "This project includes code from PyTorch TorchTune:\n\n- Repository: https://github.com/pytorch/torchtune\n- License: BSD 3-"
  },
  {
    "path": "dev/demo_logging.py",
    "chars": 7277,
    "preview": "#!/usr/bin/env python3\n\"\"\"Demo of all logging functionality from art.utils.logging.\"\"\"\n\nimport time\n\nfrom art.utils.logg"
  },
  {
    "path": "dev/math-vista/math-vista.ipynb",
    "chars": 6028,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"id\": \"46a6ad6d\",\n   \"metadata\": {},\n   \"output"
  },
  {
    "path": "dev/math-vista/math-vista.py",
    "chars": 4304,
    "preview": "import argparse\nimport asyncio\nimport itertools\nimport os\nimport re\nfrom typing import Iterator, TypedDict, cast\n\nimport"
  },
  {
    "path": "dev/new_models/benchmark_inference.py",
    "chars": 5337,
    "preview": "\"\"\"\nBenchmark inference performance for Qwen2.5-7B-Instruct using ART.\n\nThis script sends 5 concurrent requests with app"
  },
  {
    "path": "dev/new_models/gemma3.py",
    "chars": 1832,
    "preview": "import asyncio\nimport json\n\nfrom dotenv import load_dotenv\n\nimport art\nfrom art.local import LocalBackend\n\nload_dotenv()"
  },
  {
    "path": "dev/new_models/prompts.json",
    "chars": 1170,
    "preview": "[\"respond with 'yes', 'no', 'maybe'\", \"respond with 'maybe', 'yes', 'no'\", \"respond with 'no', 'yes', 'maybe'\", \"respond"
  },
  {
    "path": "dev/new_models/qwen3_try.ipynb",
    "chars": 4029,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": "
  },
  {
    "path": "dev/new_models/qwen3_try.py",
    "chars": 1816,
    "preview": "import asyncio\nimport json\n\nfrom dotenv import load_dotenv\n\nimport art\nfrom art.local import LocalBackend\n\nload_dotenv()"
  },
  {
    "path": "dev/profile.ipynb",
    "chars": 5975,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "dev/run_yes_no_maybe_kl_advantage.py",
    "chars": 3623,
    "preview": "\"\"\"Launch yes-no-maybe-kl-advantage training on SkyPilot (Kubernetes).\n\nUsage:\n    uv run dev/run_yes_no_maybe_kl_advant"
  },
  {
    "path": "dev/sft/dataset.jsonl",
    "chars": 52999,
    "preview": "{\"messages\":[{\"role\":\"system\",\"content\":\"\\n            You are an expert model trained to redact potentially sensitive i"
  },
  {
    "path": "dev/sft/distillation.py",
    "chars": 1984,
    "preview": "\"\"\"Distillation example: Train a small model using completions from a large model.\"\"\"\n\nimport asyncio\nimport os\n\nfrom do"
  },
  {
    "path": "dev/sft/sft-from-file.py",
    "chars": 762,
    "preview": "\"\"\"Simple SFT training script using train_sft_from_file helper.\"\"\"\n\nimport asyncio\nimport random\n\nimport art\nfrom art.lo"
  },
  {
    "path": "dev/sft/sft-warmup.py",
    "chars": 4333,
    "preview": "\"\"\"Minimal example demonstrating SFT -> RL -> SFT switching.\"\"\"\n\nimport asyncio\nimport os\nimport random\n\nfrom dotenv imp"
  },
  {
    "path": "dev/yes-no-maybe-kl-advantage.py",
    "chars": 3405,
    "preview": "\"\"\"Yes-no-maybe training with KL-penalized advantage adjustment.\n\nDemonstrates the kl_penalty_coef feature: tokens where"
  },
  {
    "path": "dev/yes-no-maybe-megatron.py",
    "chars": 2624,
    "preview": "import asyncio\nfrom itertools import permutations\nimport os\n\nfrom dotenv import load_dotenv\nimport openai\nimport torch\n\n"
  },
  {
    "path": "dev/yes-no-maybe-metrics.py",
    "chars": 8110,
    "preview": "\"\"\"Yes-no-maybe metrics demo for the LocalBackend `model.train()` path.\n\nThis keeps the same prompt family, rollout stru"
  },
  {
    "path": "dev/yes-no-maybe-vision/generate_images.py",
    "chars": 9497,
    "preview": "from __future__ import annotations\n\nfrom pathlib import Path\nfrom typing import Any, Iterable, Sequence\n\ntry:\n    from P"
  },
  {
    "path": "dev/yes-no-maybe-vision/train.ipynb",
    "chars": 3541,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": "
  },
  {
    "path": "dev/yes-no-maybe.ipynb",
    "chars": 3996,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": "
  },
  {
    "path": "dev/yes-no-maybe.py",
    "chars": 2884,
    "preview": "import asyncio\nfrom itertools import permutations\nimport os\n\nfrom dotenv import load_dotenv\nimport openai\n\nimport art\nfr"
  },
  {
    "path": "docs/.gitignore",
    "chars": 30,
    "preview": "node_modules\npackage-lock.json"
  },
  {
    "path": "docs/README.md",
    "chars": 1136,
    "preview": "# OpenPipe Documentation\n\nThis repository contains the source for the ART documentation website hosted at [https://art.o"
  },
  {
    "path": "docs/analytics.js",
    "chars": 919,
    "preview": "!(function () {\n  var reb2b = (window.reb2b = window.reb2b || []);\n  if (reb2b.invoked) return;\n  reb2b.invoked = true;\n"
  },
  {
    "path": "docs/docs.json",
    "chars": 2559,
    "preview": "{\n  \"name\": \"ART\",\n  \"description\": \"Train LLMs to be better agents using RL\",\n  \"theme\": \"mint\",\n  \"logo\": {\n    \"light"
  },
  {
    "path": "docs/experimental/gspo.mdx",
    "chars": 2234,
    "preview": "---\ntitle: \"GSPO (Group Sequence Policy Optimization)\"\ndescription: \"A stable and efficient RL algorithm for training la"
  },
  {
    "path": "docs/features/additional-histories.mdx",
    "chars": 7980,
    "preview": "---\ntitle: Additional Histories\ndescription: Learn how to use additional histories for complex agent training scenarios\n"
  },
  {
    "path": "docs/features/checkpoint-deletion.mdx",
    "chars": 4094,
    "preview": "---\ntitle: Deleting Checkpoints\ndescription: Learn how to automatically delete low-performing model checkpoints\n---\n\nTra"
  },
  {
    "path": "docs/features/checkpoint-forking.mdx",
    "chars": 4337,
    "preview": "---\ntitle: Checkpoint Forking\ndescription: Learn how to fork training from existing model checkpoints\n---\n\n<Frame captio"
  },
  {
    "path": "docs/features/mcp-rl.mdx",
    "chars": 7039,
    "preview": "---\ntitle: \"MCP•RL: Training Agents to Use MCP Servers\"\nsidebarTitle: \"MCP Training\"\ndescription: \"Learn how to train la"
  },
  {
    "path": "docs/features/tracking-metrics.mdx",
    "chars": 5907,
    "preview": "---\ntitle: \"Tracking Metrics\"\ndescription: \"See what ART logs automatically and how to add your own metrics and costs.\"\n"
  },
  {
    "path": "docs/fundamentals/art-backend.mdx",
    "chars": 5407,
    "preview": "---\ntitle: \"ART Backend\"\nicon: \"server\"\ndescription: \"Learn the underlying architecture of the ART backend\"\n---\n\nART div"
  },
  {
    "path": "docs/fundamentals/art-client.mdx",
    "chars": 8050,
    "preview": "---\ntitle: \"ART Client\"\nicon: \"laptop-code\"\ndescription: \"Integrate RL into existing codebases.\"\n---\n\nOne of ART's prima"
  },
  {
    "path": "docs/fundamentals/ruler.mdx",
    "chars": 11524,
    "preview": "---\ntitle: \"RULER\"\nicon: \"ruler\"\ndescription: \"Learn how to use RULER to automatically reward your agents.\"\n---\n\n# 📏RULE"
  },
  {
    "path": "docs/fundamentals/sft-training.mdx",
    "chars": 8936,
    "preview": "---\ntitle: \"SFT Training\"\ndescription: \"Train models using supervised fine-tuning with ART.\"\nsidebarTitle: \"SFT Training"
  },
  {
    "path": "docs/fundamentals/training-loop.mdx",
    "chars": 2711,
    "preview": "---\ntitle: \"ART Training Loop\"\ndescription: \"Learn how inference and training work within ART.\"\nsidebarTitle: \"Training "
  },
  {
    "path": "docs/getting-started/about.mdx",
    "chars": 4503,
    "preview": "---\ntitle: \"ART Docs\"\ndescription: \"Train your own multi-turn agents with **ART**, an open-source framework for LLM rein"
  },
  {
    "path": "docs/getting-started/faq.mdx",
    "chars": 5896,
    "preview": "---\ntitle: \"Frequently Asked Questions\"\nsidebarTitle: \"FAQ\"\nicon: \"block-question\"\n---\n\n<Accordion title=\"What is ART?\">"
  },
  {
    "path": "docs/getting-started/installation-setup.mdx",
    "chars": 2205,
    "preview": "---\ntitle: \"Installation + Setup\"\nsidebarTitle: \"Installation + Setup\"\nicon: \"gear\"\n---\n\n### Installing ART\n\nThe ART cli"
  },
  {
    "path": "docs/getting-started/notebooks.mdx",
    "chars": 5011,
    "preview": "---\ntitle: \"ART Notebooks\"\nsidebarTitle: \"Notebooks\"\ndescription: \"Use ART to train agents for many different tasks.\"\nic"
  },
  {
    "path": "docs/getting-started/quick-start.mdx",
    "chars": 2108,
    "preview": "---\ntitle: \"Quick Start\"\ndescription: \"Get started with ART in a few quick steps.\"\nicon: \"forward\"\n---\n\nIn this Quick St"
  },
  {
    "path": "docs/integrations/langgraph-integration.mdx",
    "chars": 18847,
    "preview": "---\ntitle: \"🦜🔗 LangGraph\"\ndescription: \"Build and train sophisticated AI agents using LangGraph with ART's reinforcement"
  },
  {
    "path": "docs/integrations/openenv-integration.mdx",
    "chars": 3333,
    "preview": "---\ntitle: \"🌍 OpenEnv\"\ndescription: \"Train AI agents in isolated execution environments using OpenEnv with ART's reinfor"
  },
  {
    "path": "docs/package.json",
    "chars": 390,
    "preview": "{\n  \"name\": \"docs\",\n  \"version\": \"1.0.0\",\n  \"description\": \"\",\n  \"main\": \"index.js\",\n  \"scripts\": {\n    \"dev\": \"mintlify"
  },
  {
    "path": "docs/proposals/backend-first-training-api.md",
    "chars": 11486,
    "preview": "# Proposal: Backend-First Training API\n\n## Summary\n\nReplace the current `model.train(trajectory_groups, config)` API wit"
  },
  {
    "path": "docs/resources/glossary.mdx",
    "chars": 2049,
    "preview": "---\ntitle: \"Glossary\"\ndescription: \"Terms and definitions used in the ART docs.\"\nicon: \"circle-info\"\n---\n\n## Additional "
  },
  {
    "path": "docs/resources/models.mdx",
    "chars": 2901,
    "preview": "---\ntitle: \"Supported Models\"\nsidebarTitle: \"Models\"\ndescription: \"Train open source models on ART.\"\nicon: \"robot\"\n---\n\n"
  },
  {
    "path": "docs/style.css",
    "chars": 1264,
    "preview": "/* Styles for tables wrapped in a .full-width container */\n.full-width table th,\n.full-width table td {\n  vertical-align"
  },
  {
    "path": "docs/tutorials/open-deep-research.mdx",
    "chars": 6816,
    "preview": "---\ntitle: \"Open Deep Research Tutorial\"\nsidebarTitle: \"Open Deep Research\"\ndescription: \"Train a deep research agent to"
  },
  {
    "path": "docs/tutorials/summarizer.mdx",
    "chars": 5316,
    "preview": "---\ntitle: \"Summarizer Tutorial\"\nsidebarTitle: \"Summarizer\"\ndescription: \"Train a summarizer model to outperform Sonnet "
  },
  {
    "path": "examples/2048/display_benchmarks.ipynb",
    "chars": 251021,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "examples/2048/generate_benchmarks.py",
    "chars": 1834,
    "preview": "import asyncio\nimport os\nimport random\n\nfrom dotenv import load_dotenv\nfrom rollout import rollout\n\nimport art\nfrom art."
  },
  {
    "path": "examples/2048/rollout.py",
    "chars": 4253,
    "preview": "import asyncio\nimport math\nimport os\n\nfrom dotenv import load_dotenv\nimport openai\nimport requests\nfrom utils import (\n "
  },
  {
    "path": "examples/2048/train.py",
    "chars": 2052,
    "preview": "import asyncio\nimport random\n\nfrom dotenv import load_dotenv\nfrom rollout import rollout\n\nimport art\nfrom art.local impo"
  },
  {
    "path": "examples/2048/utils.py",
    "chars": 5940,
    "preview": "from enum import Enum\nimport random\nimport string\nfrom typing import TypedDict\nimport xml.etree.ElementTree as ET\n\nfrom "
  },
  {
    "path": "examples/benchmarking_comparison_models.py",
    "chars": 7147,
    "preview": "import asyncio\nimport os\nimport re\n\nfrom pydantic import BaseModel\n\nimport art\nfrom art.local import LocalBackend\nfrom a"
  },
  {
    "path": "examples/codenames/codenames_words.json",
    "chars": 5284,
    "preview": "[\n    \"africa\",\n    \"agent\",\n    \"air\",\n    \"alien\",\n    \"alps\",\n    \"amazon\",\n    \"ambulance\",\n    \"america\",\n    \"ange"
  },
  {
    "path": "examples/codenames/dictionary.json",
    "chars": 8128837,
    "preview": "[\n    \"2\",\n    \"1080\",\n    \"&c\",\n    \"10-point\",\n    \"10th\",\n    \"11-point\",\n    \"12-point\",\n    \"16-point\",\n    \"18-poi"
  },
  {
    "path": "examples/data/greentext/prompts.txt",
    "chars": 58094,
    "preview": "Write a greentext about the most embarrassing moment in an online game\nTell a greentext story about accidentally destroy"
  },
  {
    "path": "examples/hn_title_generator/reference_grpo_trainer.py",
    "chars": 14331,
    "preview": "import asyncio\nimport os\nfrom typing import Tuple\n\nfrom datasets import Dataset\nfrom dotenv import load_dotenv\nimport nu"
  },
  {
    "path": "examples/hn_title_generator/train.py",
    "chars": 12020,
    "preview": "import asyncio\nfrom datetime import datetime\nimport os\nfrom typing import Any, Dict, Iterable, List\n\nfrom datasets impor"
  },
  {
    "path": "examples/hn_title_generator/utils.py",
    "chars": 5110,
    "preview": "from datetime import datetime\nimport math\nimport os\nfrom typing import Dict, Optional\n\nfrom datasets import Dataset, loa"
  },
  {
    "path": "examples/just-the-facts/.gitignore",
    "chars": 11,
    "preview": "*.egg-info/"
  },
  {
    "path": "examples/just-the-facts/README.md",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/just-the-facts/just_the_facts/__init__.py",
    "chars": 25,
    "preview": "# Just the Facts package\n"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/checks.py",
    "chars": 6008,
    "preview": "import json\nimport os\n\nfrom dotenv import load_dotenv\nfrom openai import AsyncOpenAI\nimport tenacity\n\nload_dotenv()\n\n\ncl"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/display_benchmarks.ipynb",
    "chars": 602721,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\":"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/experiments.py",
    "chars": 1856,
    "preview": "from pydantic import BaseModel\n\nimport art\n\n\nclass JustTheFactsConfig(BaseModel):\n    learning_rate: float = 1e-6\n    nu"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/find_articles.py",
    "chars": 589,
    "preview": "import random\n\nimport feedparser\n\nfeeds = {\n    \"NBC News Top Stories\": \"http://feeds.nbcnews.com/feeds/topstories\",\n   "
  },
  {
    "path": "examples/just-the-facts/just_the_facts/generate_benchmarks.py",
    "chars": 2637,
    "preview": "import asyncio\nimport copy\nimport os\nimport random\n\nfrom dotenv import load_dotenv\n\nimport art\nfrom art.local import Loc"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/rollout.py",
    "chars": 3358,
    "preview": "import asyncio\nimport os\n\nfrom dotenv import load_dotenv\nfrom openai import AsyncOpenAI\nfrom pydantic import BaseModel\ni"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/scenarios.py",
    "chars": 10922,
    "preview": "from just_the_facts.rollout import FactsScenario\n\ntrain_urls = [\n    \"https://www.nbcnews.com/select/shopping/best-duste"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/train.py",
    "chars": 2638,
    "preview": "import asyncio\nimport os\n\nfrom dotenv import load_dotenv\nfrom rollout import rollout\nimport weave\n\nimport art\nfrom art.u"
  },
  {
    "path": "examples/just-the-facts/just_the_facts/utils.py",
    "chars": 3882,
    "preview": "import asyncio\n\nimport aiohttp\nfrom newspaper import Article\n\ncached_articles = {}\n\n\nasync def scrape_article(url: str) "
  },
  {
    "path": "examples/just-the-facts/main.py",
    "chars": 92,
    "preview": "def main():\n    print(\"Hello from just-the-facts!\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "examples/just-the-facts/pyproject.toml",
    "chars": 898,
    "preview": "[project]\nname = \"just-the-facts\"\nversion = \"0.1.0\"\ndescription = \"Add your description here\"\nreadme = \"README.md\"\nrequi"
  },
  {
    "path": "examples/just-the-facts/test_scraper.py",
    "chars": 803,
    "preview": "#!/usr/bin/env python3\n\nimport asyncio\n\nfrom just_the_facts.scenarios import train_urls, val_urls\nfrom just_the_facts.ut"
  },
  {
    "path": "examples/mcp-rl/README.md",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/mcp-rl/all_experiments.py",
    "chars": 2708,
    "preview": "from pydantic import BaseModel\n\nimport art\n\n\nclass McpPolicyConfig(BaseModel):\n    max_turns: int = 5\n    max_tokens: in"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/__init__.py",
    "chars": 185,
    "preview": "\"\"\"ART MCP package.\"\"\"\n\nfrom .mcp_server import AlphaMcpServer, McpServer\nfrom .rollout import McpScenario, rollout\n\n__a"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/benchmarks/display_benchmarks/mcp_alphavantage.ipynb",
    "chars": 248327,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"dat"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/benchmarks/display_benchmarks/mcp_balldontlie.ipynb",
    "chars": 231903,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\":"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/benchmarks/generate_benchmarks.py",
    "chars": 7008,
    "preview": "import argparse\nimport asyncio\nimport json\nimport os\nimport random\nfrom typing import List\n\nfrom dotenv import load_dote"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/checks.py",
    "chars": 3836,
    "preview": "\"\"\"Task completion checking using LLM evaluation.\"\"\"\n\nimport json\nimport os\n\nfrom dotenv import load_dotenv\nfrom openai "
  },
  {
    "path": "examples/mcp-rl/mcp_rl/mcp_server.py",
    "chars": 10811,
    "preview": "\"\"\"MCP Server wrapper class for integration with ART.\"\"\"\n\nfrom abc import ABC, abstractmethod\nimport json\nimport ssl\nfro"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/rollout.py",
    "chars": 10958,
    "preview": "\"\"\"MCP agent rollout implementation.\n\nThis module provides a rollout function for running MCP agents with scenarios.\nBas"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/scenario_generator.py",
    "chars": 9534,
    "preview": "\"\"\"Generate scenarios for MCP agent evaluation using OpenAI's o3 model.\"\"\"\n\nimport argparse\nimport asyncio\nimport import"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/train.py",
    "chars": 7210,
    "preview": "\"\"\"Training example for MCP agent using rollout with AlphaMcpServer in scenarios.\"\"\"\n\nimport argparse\nimport asyncio\nimp"
  },
  {
    "path": "examples/mcp-rl/mcp_rl/utils.py",
    "chars": 752,
    "preview": "from mcp import types\n\n\ndef get_content_text(result: types.CallToolResult) -> str:\n    # Extract text content from MCP r"
  },
  {
    "path": "examples/mcp-rl/pyproject.toml",
    "chars": 587,
    "preview": "[project]\nname = \"art-mcp\"\nversion = \"0.1.0\"\ndescription = \"Add your description here\"\nreadme = \"README.md\"\nrequires-pyt"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/README.md",
    "chars": 1054,
    "preview": "# MCP AlphaVantage Python Server\n\nA Python implementation of the MCP server for Alpha Vantage financial data API.\n\n## Fe"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/__init__.py",
    "chars": 37,
    "preview": "\"\"\"MCP AlphaVantage Python Server\"\"\"\n"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/scenarios/train.jsonl",
    "chars": 3191,
    "preview": "{\"task\": \"Calculate the daily Simple Moving Average (SMA) for Apple (AAPL) using the closing price, and generate a summa"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/scenarios/val.jsonl",
    "chars": 1673,
    "preview": "{\"task\": \"Calculate the 60-minute SMA for IBM and generate a detailed trend analysis including a summary of the work don"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/scenarios.jsonl",
    "chars": 3501,
    "preview": "{\"task\": \"Get the current stock price for Apple (AAPL).\", \"difficulty\": 1}\n{\"task\": \"Retrieve daily time series data for"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/server.py",
    "chars": 13709,
    "preview": "import asyncio\nfrom typing import Any, Dict, Optional\n\nimport aiohttp\nimport click\nfrom mcp.server.lowlevel import Serve"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_alphavantage/server_params.py",
    "chars": 371,
    "preview": "import os\n\nfrom dotenv import load_dotenv\nfrom mcp import StdioServerParameters\n\nload_dotenv()\n\nserver_params = StdioSer"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_balldontlie/README.md",
    "chars": 1516,
    "preview": "# Balldontlie NBA Stats MCP Server\n\nAn MCP server that provides access to NBA statistics through the Balldontlie API, fo"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_balldontlie/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_balldontlie/scenarios/train.jsonl",
    "chars": 3227,
    "preview": "{\"task\": \"Get detailed team information using a specific team ID, analyze their win-loss records for the season, and com"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_balldontlie/scenarios/val.jsonl",
    "chars": 1469,
    "preview": "{\"task\": \"Lookup details for players with the first name 'LeBron' across all teams, then compile a thorough report compa"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_balldontlie/server.py",
    "chars": 11642,
    "preview": "import asyncio\nfrom typing import Any, Dict, Optional\n\nimport aiohttp\nimport click\nfrom mcp.server.lowlevel import Serve"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_balldontlie/server_params.py",
    "chars": 363,
    "preview": "import os\n\nfrom dotenv import load_dotenv\nfrom mcp import StdioServerParameters\n\nload_dotenv()\n\nserver_params = StdioSer"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_googlemaps/README.md",
    "chars": 2717,
    "preview": "# MCP Google Maps Python Server\n\nA Python implementation of the MCP server for Google Maps APIs, providing access to geo"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_googlemaps/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_googlemaps/pyproject.toml",
    "chars": 477,
    "preview": "[project]\nname = \"mcp-googlemaps\"\nversion = \"0.1.0\"\ndescription = \"Google Maps MCP Server - Provides access to Google Ma"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_googlemaps/server.py",
    "chars": 18553,
    "preview": "import asyncio\nfrom typing import Any, Dict, Optional\n\nimport aiohttp\nimport click\nfrom mcp.server.lowlevel import Serve"
  },
  {
    "path": "examples/mcp-rl/servers/python/mcp_googlemaps/server_params.py",
    "chars": 362,
    "preview": "import os\n\nfrom dotenv import load_dotenv\nfrom mcp import StdioServerParameters\n\nload_dotenv()\n\nserver_params = StdioSer"
  },
  {
    "path": "examples/mcp-rl/test_scenario_generation.py",
    "chars": 11398,
    "preview": "#!/usr/bin/env python3\n\"\"\"Test scenario generation functionality.\"\"\"\n\nimport asyncio\nimport os\nfrom typing import List\n\n"
  },
  {
    "path": "examples/openenv_echo.py",
    "chars": 3327,
    "preview": "# /// script\n# requires-python = \">=3.10\"\n# dependencies = [\n#     \"openenv-core==0.1.13\",\n#     \"openpipe-art==0.5.1\",\n"
  },
  {
    "path": "examples/prisoners-dilemma.ipynb",
    "chars": 8047,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": "
  },
  {
    "path": "examples/rock-paper-tool-use.ipynb",
    "chars": 7511,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "examples/roflbot/.gitignore",
    "chars": 24,
    "preview": "*.db\n/data/\n.env\n.venv/\n"
  },
  {
    "path": "examples/temporal_clue/temporal-clue-7b-async.ipynb",
    "chars": 6011,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n "
  },
  {
    "path": "examples/temporal_clue/temporal-clue-7b.ipynb",
    "chars": 4910,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": "
  },
  {
    "path": "examples/temporal_clue/temporal-clue.py",
    "chars": 3475,
    "preview": "import asyncio\nimport json\nimport os\nimport random\nimport re\nfrom typing import TypedDict\n\nfrom dotenv import load_doten"
  },
  {
    "path": "examples/tic_tac_toe/display-benchmarks.ipynb",
    "chars": 261543,
    "preview": "{\n \"cells\": [\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\""
  },
  {
    "path": "examples/tic_tac_toe/game_utils.py",
    "chars": 3301,
    "preview": "import random\nfrom typing import Literal, TypedDict\nimport xml.etree.ElementTree as ET\n\n\nclass TicTacToeGame(TypedDict):"
  },
  {
    "path": "examples/tic_tac_toe/rollout.py",
    "chars": 3168,
    "preview": "import math\nimport os\nimport time\n\nfrom dotenv import load_dotenv\nfrom game_utils import (\n    apply_agent_move,\n    che"
  },
  {
    "path": "examples/tic_tac_toe/tic-tac-toe.py",
    "chars": 4233,
    "preview": "import asyncio\nimport os\nimport random\n\nfrom dotenv import load_dotenv\nfrom rollout import TicTacToeScenario, rollout\nim"
  },
  {
    "path": "examples/tic_tac_toe_self_play/deploy_step.py",
    "chars": 1963,
    "preview": "import argparse\nimport asyncio\nimport os\n\nfrom rollout import ModelConfig, TicTacToeScenario, rollout\nfrom train import "
  },
  {
    "path": "examples/tic_tac_toe_self_play/game_utils.py",
    "chars": 3082,
    "preview": "from typing import Literal, TypedDict\nimport xml.etree.ElementTree as ET\n\n\nclass TicTacToeGame(TypedDict):\n    board: li"
  },
  {
    "path": "examples/tic_tac_toe_self_play/gather_trajectory_groups_by_index.py",
    "chars": 1746,
    "preview": "import asyncio\nfrom typing import Awaitable, Iterable\n\nimport tqdm\n\nimport art\nfrom art.gather import GatherContext, set"
  },
  {
    "path": "examples/tic_tac_toe_self_play/rollout.py",
    "chars": 7214,
    "preview": "import math\nimport os\nimport time\n\nfrom dotenv import load_dotenv\nfrom game_utils import (\n    TicTacToeGame,\n    apply_"
  },
  {
    "path": "examples/tic_tac_toe_self_play/train.py",
    "chars": 3808,
    "preview": "import asyncio\nimport os\nimport random\n\nfrom dotenv import load_dotenv\nfrom game_utils import possible_moves\nfrom gather"
  },
  {
    "path": "examples/tic_tac_toe_self_play/train_o4_mini.py",
    "chars": 4058,
    "preview": "import asyncio\nimport os\nimport random\n\nfrom dotenv import load_dotenv\nfrom game_utils import possible_moves\nfrom gather"
  },
  {
    "path": "licenses/GPL-3.0.txt",
    "chars": 35128,
    "preview": "GNU GENERAL PUBLIC LICENSE\n                       Version 3, 29 June 2007\n\n Copyright (C) 2007 Free Software Foundation,"
  },
  {
    "path": "licenses/LGPL-3.0.txt",
    "chars": 7632,
    "preview": "GNU LESSER GENERAL PUBLIC LICENSE\n                       Version 3, 29 June 2007\n\n Copyright (C) 2007 Free Software Foun"
  },
  {
    "path": "pyproject.toml",
    "chars": 5326,
    "preview": "[project]\nname = \"openpipe-art\"\nversion = \"0.5.17\"\ndescription = \"The OpenPipe Agent Reinforcement Training (ART) librar"
  },
  {
    "path": "requirements/backend.vcs.txt",
    "chars": 458,
    "preview": "# Pinned backend dependencies that must come from VCS (not allowed in PyPI metadata).\n# Install with:\n#   uv pip install"
  },
  {
    "path": "scripts/bump_version.py",
    "chars": 2436,
    "preview": "#!/usr/bin/env python3\n\"\"\"\nVersion bumping script for ART releases.\n\nUsage:\n    python scripts/bump_version.py patch  # "
  },
  {
    "path": "scripts/ci/build_and_push_uv_cache.sh",
    "chars": 11506,
    "preview": "#!/usr/bin/env bash\nset -euo pipefail\n\nSCRIPT_DIR=\"$(cd \"$(dirname \"${BASH_SOURCE[0]}\")\" && pwd)\"\nREPO_ROOT=\"$(cd \"${SCR"
  },
  {
    "path": "scripts/ci/compute_uv_fingerprint.py",
    "chars": 2608,
    "preview": "#!/usr/bin/env python3\n\"\"\"Compute a stable fingerprint for the prek CI uv-cache contract.\"\"\"\n\nfrom __future__ import ann"
  },
  {
    "path": "scripts/deploy-model.py",
    "chars": 3364,
    "preview": "import argparse\nimport asyncio\nimport os\n\nfrom dotenv import load_dotenv\n\nimport art\nfrom art.utils.deployment import To"
  },
  {
    "path": "scripts/kill-gpu-processes.sh",
    "chars": 88,
    "preview": "#!/bin/bash\nnvidia-smi --query-compute-apps=pid --format=csv,noheader | xargs -r kill -9"
  },
  {
    "path": "scripts/migrate-s3-checkpoints.py",
    "chars": 1794,
    "preview": "#!/usr/bin/env python3\n\"\"\"\nScript to migrate model checkpoints in S3 from old to new structure.\n\nOld structure: s3://buc"
  },
  {
    "path": "scripts/publish.sh",
    "chars": 390,
    "preview": "#!/bin/bash\nset -e\n\n# Load the .env file\nset -o allexport\nsource .env\n\n# Check if PYPI_ART_TOKEN is set\nif [[ -z \"${PYPI"
  },
  {
    "path": "scripts/setup.sh",
    "chars": 1931,
    "preview": "#!/bin/bash\n\n# Load environment variables from .env file if it exists\nif [ -f .env ]; then\n    # Read .env file line by "
  },
  {
    "path": "skypilot-config.yaml",
    "chars": 17302,
    "preview": "# Title: Task YAML — SkyPilot documentation\n\n# URL Source: https://docs.skypilot.co/en/latest/reference/yaml-spec.html\n\n"
  },
  {
    "path": "src/art/__init__.py",
    "chars": 2684,
    "preview": "import os\n\nfrom dotenv import load_dotenv\n\nload_dotenv()\n\nif os.getenv(\"SUPPRESS_LITELLM_SERIALIZATION_WARNINGS\", \"1\") ="
  },
  {
    "path": "src/art/api_costs.py",
    "chars": 18149,
    "preview": "from __future__ import annotations\n\nfrom collections.abc import Awaitable, Callable, Mapping\nfrom dataclasses import dat"
  },
  {
    "path": "src/art/auto_trajectory.py",
    "chars": 6656,
    "preview": "import contextvars\nimport json\nimport logging\nfrom typing import Any, AsyncIterator, Coroutine, Iterator, Literal, overl"
  },
  {
    "path": "src/art/backend.py",
    "chars": 1809,
    "preview": "from typing import TYPE_CHECKING, Any, AsyncIterator, Iterable, Protocol, TypeAlias\n\nfrom . import dev\nfrom .trajectorie"
  },
  {
    "path": "src/art/batches.py",
    "chars": 1889,
    "preview": "import asyncio\nfrom typing import AsyncIterator, Awaitable, Iterable\n\nfrom tqdm import auto as tqdm\n\nfrom .gather import"
  },
  {
    "path": "src/art/cli.py",
    "chars": 13476,
    "preview": "import json\nfrom pathlib import Path\nimport socket\nfrom typing import Any, AsyncIterator\n\nfrom dotenv import load_dotenv"
  },
  {
    "path": "src/art/costs.py",
    "chars": 5255,
    "preview": "\"\"\"Cost utilities for ART training and Tinker inference.\"\"\"\n\nfrom __future__ import annotations\n\nfrom dataclasses import"
  },
  {
    "path": "src/art/dev/__init__.py",
    "chars": 750,
    "preview": "from .engine import EngineArgs\nfrom .model import (\n    InitArgs,\n    InternalModelConfig,\n    PeftArgs,\n    TinkerArgs,"
  },
  {
    "path": "src/art/dev/engine.py",
    "chars": 4489,
    "preview": "from typing import Any, Literal, Tuple\n\nfrom typing_extensions import TypedDict\n\n\nclass EngineArgs(TypedDict, total=Fals"
  },
  {
    "path": "src/art/dev/get_model_config.py",
    "chars": 2688,
    "preview": "from .engine import EngineArgs\nfrom .model import InitArgs, InternalModelConfig, PeftArgs, TrainerArgs\nfrom .validate im"
  },
  {
    "path": "src/art/dev/model.py",
    "chars": 10265,
    "preview": "from enum import Enum\n\nfrom typing_extensions import Required, TypedDict\n\nfrom .engine import EngineArgs\n\n\n# Vendored fr"
  },
  {
    "path": "src/art/dev/openai_server.py",
    "chars": 6527,
    "preview": "from typing import Literal\n\nfrom typing_extensions import TypedDict\n\nfrom .engine import EngineArgs\n\n\ndef get_openai_ser"
  },
  {
    "path": "src/art/dev/train.py",
    "chars": 1274,
    "preview": "from typing import Literal\n\nfrom typing_extensions import TypedDict\n\n\nclass TrainConfig(TypedDict, total=False):\n    adv"
  },
  {
    "path": "src/art/dev/validate.py",
    "chars": 2391,
    "preview": "\"\"\"Validation functions for model configuration.\"\"\"\n\nfrom .model import InternalModelConfig\n\n\ndef is_dedicated_mode(conf"
  },
  {
    "path": "src/art/errors.py",
    "chars": 1800,
    "preview": "\"\"\"\nThis file contains errors that are returned by LocalBackend. They are normal exceptions\nwith status_code and detail "
  },
  {
    "path": "src/art/gather.py",
    "chars": 8145,
    "preview": "import asyncio\nfrom collections import Counter\nimport contextlib\nimport contextvars\nfrom dataclasses import dataclass, f"
  },
  {
    "path": "src/art/guided_completion.py",
    "chars": 3058,
    "preview": "from copy import deepcopy\nimport json\nfrom typing import Iterable, List, Literal, Tuple\n\nfrom openai.types.chat.chat_com"
  },
  {
    "path": "src/art/langgraph/__init__.py",
    "chars": 102,
    "preview": "from .llm_wrapper import init_chat_model, wrap_rollout\n\n__all__ = [\"wrap_rollout\", \"init_chat_model\"]\n"
  },
  {
    "path": "src/art/langgraph/llm_wrapper.py",
    "chars": 7559,
    "preview": "\"\"\"LLM wrapper with logging functionality.\"\"\"\n\nimport asyncio\nimport contextvars\nimport json\nimport os\nfrom typing impor"
  },
  {
    "path": "src/art/langgraph/logging.py",
    "chars": 810,
    "preview": "import os\nimport pickle\n\n\nclass FileLogger:\n    def __init__(self, filepath):\n        self.text_path = filepath\n        "
  },
  {
    "path": "src/art/langgraph/message_utils.py",
    "chars": 3963,
    "preview": "import json\nfrom typing import List, Union\n\nfrom langchain_core.messages import (\n    AIMessage,\n    BaseMessage,\n    Fu"
  },
  {
    "path": "src/art/local/__init__.py",
    "chars": 62,
    "preview": "from .backend import LocalBackend\n\n__all__ = [\"LocalBackend\"]\n"
  },
  {
    "path": "src/art/local/backend.py",
    "chars": 55896,
    "preview": "import asyncio\nimport json\nimport logging\nimport math\nimport os\nimport shutil\nimport socket\nimport subprocess\nimport tim"
  },
  {
    "path": "src/art/local/checkpoints.py",
    "chars": 2367,
    "preview": "import os\nimport shutil\n\nfrom art.utils.get_model_step import get_step_from_dir\n\n\ndef delete_checkpoints(output_dir: str"
  },
  {
    "path": "src/art/local/service.py",
    "chars": 1232,
    "preview": "from typing import AsyncIterator, Protocol, runtime_checkable\n\nfrom .. import dev, types\nfrom ..preprocessing.pack impor"
  },
  {
    "path": "src/art/loss.py",
    "chars": 5577,
    "preview": "from typing import TYPE_CHECKING\n\nfrom pydantic import BaseModel, ConfigDict\nimport torch\n\nfrom art.utils.group_aggregat"
  },
  {
    "path": "src/art/mcp/__init__.py",
    "chars": 422,
    "preview": "\"\"\"MCP utilities for Agent Reinforcement Training.\"\"\"\n\nfrom .default_tools import complete_task_tool\nfrom .generate_scen"
  },
  {
    "path": "src/art/mcp/default_tools.py",
    "chars": 380,
    "preview": "from art.mcp.types import MCPTool\n\ncomplete_task_tool = MCPTool(\n    name=\"complete_task\",\n    description=\"Complete a t"
  },
  {
    "path": "src/art/mcp/generate_scenarios.py",
    "chars": 8453,
    "preview": "\"\"\"Scenario generation for MCP tools.\"\"\"\n\nimport json\nimport time\nfrom typing import Any, Dict, List, Optional\n\nimport o"
  },
  {
    "path": "src/art/mcp/types.py",
    "chars": 7072,
    "preview": "from collections import Counter\nfrom dataclasses import dataclass\nimport json\nimport random\nfrom typing import Any, Dict"
  },
  {
    "path": "src/art/megatron/__init__.py",
    "chars": 68,
    "preview": "from .backend import MegatronBackend\n\n__all__ = [\"MegatronBackend\"]\n"
  },
  {
    "path": "src/art/megatron/backend.py",
    "chars": 1406,
    "preview": "from mp_actors import move_to_child_process\n\nfrom ..local.backend import LocalBackend\nfrom ..local.service import ModelS"
  },
  {
    "path": "src/art/megatron/flex_attention.py",
    "chars": 6967,
    "preview": "\"\"\"Flex attention plumbing for ART's Megatron backend.\"\"\"\n\nimport math\nfrom typing import Any, ClassVar, cast\n\nfrom mega"
  },
  {
    "path": "src/art/megatron/lora.py",
    "chars": 18306,
    "preview": "import math\nfrom typing import Sequence\n\nfrom megatron.bridge.models.gpt_provider import GPTModelProvider\nfrom megatron."
  },
  {
    "path": "src/art/megatron/offload.py",
    "chars": 5088,
    "preview": "from dataclasses import dataclass, field\nimport gc\nfrom typing import Any, Sequence\n\nimport torch\n\n\n@dataclass\nclass Off"
  },
  {
    "path": "src/art/megatron/provider.py",
    "chars": 2482,
    "preview": "import copy\nfrom functools import partial\nimport inspect\nfrom typing import Callable\n\nfrom megatron.bridge import AutoBr"
  },
  {
    "path": "src/art/megatron/service.py",
    "chars": 12853,
    "preview": "import asyncio\nfrom dataclasses import asdict, dataclass\nimport datetime\nfrom functools import cached_property\nimport js"
  },
  {
    "path": "src/art/megatron/setup.sh",
    "chars": 618,
    "preview": "#!/usr/bin/env bash\nset -euo pipefail\n\nexport CUDA_HOME=\"${CUDA_HOME:-/usr/local/cuda-12.8}\"\nexport TORCH_CUDA_ARCH_LIST"
  },
  {
    "path": "src/art/megatron/train.py",
    "chars": 12596,
    "preview": "# isort: off\nimport os\n\n\ndef _set_cache_dir(env_var: str, default_path: str) -> None:\n    if not os.environ.get(env_var)"
  },
  {
    "path": "src/art/metrics.py",
    "chars": 14989,
    "preview": "from __future__ import annotations\n\nimport asyncio\nfrom contextlib import contextmanager\nfrom contextvars import Context"
  },
  {
    "path": "src/art/metrics_taxonomy.py",
    "chars": 3916,
    "preview": "from __future__ import annotations\n\nfrom dataclasses import dataclass\nfrom typing import Any, Iterable\n\nfrom .trajectori"
  },
  {
    "path": "src/art/model.py",
    "chars": 46006,
    "preview": "import asyncio\nfrom contextvars import Token\nfrom datetime import datetime\nimport json\nimport os\nimport time\nfrom typing"
  },
  {
    "path": "src/art/openai.py",
    "chars": 6449,
    "preview": "from typing import Any, Callable\n\nfrom openai import AsyncStream, Stream\nfrom openai.types.chat.chat_completion import C"
  },
  {
    "path": "src/art/pipeline_trainer/__init__.py",
    "chars": 320,
    "preview": "from .status import StatusReporter\nfrom .trainer import PipelineTrainer, make_group_rollout_fn\nfrom .types import EvalFn"
  }
]

// ... and 141 more files (download for full content)

About this extraction

This page contains source code from the OpenPipe/ART GitHub repository, extracted and formatted as plain text. The extraction covers 341 files (25.0 MB), approximately 3.0M tokens, and a symbol index with 1696 extracted functions, classes, methods, constants, and types.
