Repository: openai/gpt-oss
Branch: main
Commit: 599476783c6f
Files: 155
Total size: 1.3 MB
Directory structure:
gitextract_586tz2l0/
├── .github/
│ ├── CODEOWNERS
│ ├── ISSUE_TEMPLATE/
│ │ └── config.yml
│ └── workflows/
│ └── CI.yml
├── .gitignore
├── CMakeLists.txt
├── LICENSE
├── MANIFEST.in
├── README.md
├── USAGE_POLICY
├── _build/
│ └── gpt_oss_build_backend/
│ ├── __init__.py
│ └── backend.py
├── awesome-gpt-oss.md
├── compatibility-test/
│ ├── .gitignore
│ ├── README.md
│ ├── analysis.ts
│ ├── cases.jsonl
│ ├── index.ts
│ ├── package.json
│ ├── providers.ts
│ ├── runCase.ts
│ └── tools.ts
├── examples/
│ ├── agents-sdk-js/
│ │ ├── index.ts
│ │ └── package.json
│ ├── agents-sdk-python/
│ │ ├── example.py
│ │ └── pyproject.toml
│ ├── gradio/
│ │ └── gradio_chat.py
│ ├── reinforcement-fine-tuning.ipynb
│ └── streamlit/
│ └── streamlit_chat.py
├── gpt-oss-mcp-server/
│ ├── README.md
│ ├── browser_server.py
│ ├── build-system-prompt.py
│ ├── pyproject.toml
│ ├── python_server.py
│ └── reference-system-prompt.py
├── gpt_oss/
│ ├── __init__.py
│ ├── chat.py
│ ├── evals/
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── __main__.py
│ │ ├── abcd_grader.py
│ │ ├── aime_eval.py
│ │ ├── basic_eval.py
│ │ ├── chat_completions_sampler.py
│ │ ├── gpqa_eval.py
│ │ ├── healthbench_eval.py
│ │ ├── report.py
│ │ ├── responses_sampler.py
│ │ └── types.py
│ ├── generate.py
│ ├── metal/
│ │ ├── CMakeLists.txt
│ │ ├── __init__.py
│ │ ├── benchmark/
│ │ │ ├── end-to-end-threadgroup.cc
│ │ │ ├── end-to-end.cc
│ │ │ ├── f32-bf16w-rmsnorm.cc
│ │ │ ├── f32-random.cc
│ │ │ ├── mf4-f32-convert.cc
│ │ │ └── u32-random.cc
│ │ ├── examples/
│ │ │ ├── chat.py
│ │ │ └── generate.py
│ │ ├── include/
│ │ │ ├── gpt-oss/
│ │ │ │ ├── functions.h
│ │ │ │ ├── macros.h
│ │ │ │ └── types.h
│ │ │ └── gpt-oss.h
│ │ ├── python/
│ │ │ ├── context.c
│ │ │ ├── model.c
│ │ │ ├── module.c
│ │ │ ├── module.h
│ │ │ └── tokenizer.c
│ │ ├── scripts/
│ │ │ └── create-local-model.py
│ │ ├── source/
│ │ │ ├── accumulate.metal
│ │ │ ├── context.c
│ │ │ ├── convert.metal
│ │ │ ├── embeddings.metal
│ │ │ ├── expert_routing_metadata.metal
│ │ │ ├── gather_and_accumulate.metal
│ │ │ ├── generate.c
│ │ │ ├── include/
│ │ │ │ └── internal/
│ │ │ │ ├── datatype.h
│ │ │ │ ├── datatype.hpp
│ │ │ │ ├── kernel-args.h
│ │ │ │ ├── log.h
│ │ │ │ ├── macros.h
│ │ │ │ ├── math.h
│ │ │ │ ├── metal-kernels.h
│ │ │ │ ├── metal.h
│ │ │ │ ├── metal.hpp
│ │ │ │ ├── model.h
│ │ │ │ ├── rng.h
│ │ │ │ ├── rng.hpp
│ │ │ │ ├── storage.h
│ │ │ │ └── uuid.h
│ │ │ ├── log.c
│ │ │ ├── matmul.metal
│ │ │ ├── metal-kernels.c
│ │ │ ├── metal.m
│ │ │ ├── model.c
│ │ │ ├── moematmul.metal
│ │ │ ├── random.metal
│ │ │ ├── rmsnorm.metal
│ │ │ ├── rope.metal
│ │ │ ├── sample.metal
│ │ │ ├── scatter.metal
│ │ │ ├── sdpa.metal
│ │ │ ├── tokenizer.c
│ │ │ └── topk.metal
│ │ └── test/
│ │ ├── bf16-f32-embeddings.cc
│ │ ├── embeddings-kernel-tester.hpp
│ │ ├── f32-bf16w-matmul.cc
│ │ ├── f32-bf16w-rmsnorm.cc
│ │ ├── f32-random.cc
│ │ ├── f32-rope.cc
│ │ ├── fill-random-kernel-tester.hpp
│ │ ├── matmul-kernel-tester.hpp
│ │ ├── mf4-f32-convert.cc
│ │ ├── rmsnorm-kernel-tester.hpp
│ │ ├── rope-kernel-tester.hpp
│ │ └── u32-random.cc
│ ├── responses_api/
│ │ ├── __init__.py
│ │ ├── api_server.py
│ │ ├── events.py
│ │ ├── inference/
│ │ │ ├── __init__.py
│ │ │ ├── metal.py
│ │ │ ├── ollama.py
│ │ │ ├── stub.py
│ │ │ ├── transformers.py
│ │ │ ├── triton.py
│ │ │ └── vllm.py
│ │ ├── serve.py
│ │ ├── types.py
│ │ └── utils.py
│ ├── tokenizer.py
│ ├── tools/
│ │ ├── __init__.py
│ │ ├── apply_patch.md
│ │ ├── apply_patch.py
│ │ ├── python_docker/
│ │ │ └── docker_tool.py
│ │ ├── simple_browser/
│ │ │ ├── __init__.py
│ │ │ ├── backend.py
│ │ │ ├── page_contents.py
│ │ │ └── simple_browser_tool.py
│ │ └── tool.py
│ ├── torch/
│ │ ├── __init__.py
│ │ ├── model.py
│ │ ├── utils.py
│ │ └── weights.py
│ ├── triton/
│ │ ├── __init__.py
│ │ ├── attention.py
│ │ ├── model.py
│ │ └── moe.py
│ └── vllm/
│ └── token_generator.py
├── pyproject.toml
├── tests/
│ ├── conftest.py
│ ├── gpt_oss/
│ │ └── tools/
│ │ └── simple_browser/
│ │ └── test_backend.py
│ ├── test_api_endpoints.py
│ └── test_responses_api.py
└── tests-data/
├── basic-event-stream.txt
└── web-search-event-stream.txt
================================================
FILE CONTENTS
================================================
================================================
FILE: .github/CODEOWNERS
================================================
@openai/developer-experience
dkundel-openai
Maratyszcza
scott-oai
volsgd
================================================
FILE: .github/ISSUE_TEMPLATE/config.yml
================================================
blank_issues_enabled: false
contact_links:
  - name: 🐛 Model Issues
    url: https://huggingface.co/openai/gpt-oss-120b/discussions
    about: For general questions about the models, please use the Community feature on Hugging Face.
  - name: 💡 General Feedback
    url: https://openai.com/open-models
    about: Suggest new features on our feature request page.
================================================
FILE: .github/workflows/CI.yml
================================================
name: CI
on:
  release:
    types: [published]
  push:
    tags:
      - "v*"
  workflow_dispatch:
# Minimal repo-level permissions; job-level permissions override where needed.
permissions:
  contents: read
  id-token: write
jobs:
  publish:
    name: Build & Publish to PyPI (Trusted Publishing)
    runs-on: ubuntu-latest
    # Run in the GitHub environment named "release" so you can gate it with approvals.
    environment: release
    # Extra permissions required for pypa action to do OIDC exchange:
    permissions:
      contents: read
      id-token: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.12"
      - name: Install build tools
        run: |
          python -m pip install --upgrade pip setuptools wheel build
      - name: Install uv (if needed)
        run: |
          python -m pip install --upgrade uv || true
      - name: Build package with uv
        run: |
          pwd
          ls -la
          uv build
      - name: Inspect dist folder
        run: |
          ls -la dist || ls -la build || echo "no dist/ or build/ — check uv output"
      - name: Publish to PyPI using Trusted Publishing
        # Note: No pypi_token / username / password provided — Trusted Publishing via OIDC is used.
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          attestations: true # optional (default for Trusted Publishing) - set to false to disable
================================================
FILE: .gitignore
================================================
build
_skbuild
tmp*
__pycache__
*.egg*
node_modules/
*.log
================================================
FILE: CMakeLists.txt
================================================
cmake_minimum_required(VERSION 3.26)
project(gpt_oss LANGUAGES C CXX)
# If not defined externally, auto-detect
if(NOT DEFINED GPTOSS_BUILD_METAL)
  if(APPLE AND CMAKE_SYSTEM_PROCESSOR MATCHES "arm64")
    message(STATUS "Apple Silicon detected → enabling GPTOSS_BUILD_METAL")
    set(GPTOSS_BUILD_METAL ON)
  else()
    message(STATUS "Non-Apple Silicon → disabling GPTOSS_BUILD_METAL")
    set(GPTOSS_BUILD_METAL OFF)
  endif()
else()
  message(STATUS "GPTOSS_BUILD_METAL manually set to: ${GPTOSS_BUILD_METAL}")
endif()
# Now declare it as a cache variable (respects user-provided value)
set(GPTOSS_BUILD_METAL "${GPTOSS_BUILD_METAL}" CACHE BOOL "Enable Metal backend")
if(GPTOSS_BUILD_METAL)
  enable_language(OBJC)
  add_subdirectory(gpt_oss/metal)
endif()
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: MANIFEST.in
================================================
recursive-include _build *
================================================
FILE: README.md
================================================
<img alt="gpt-oss-120" src="./docs/gpt-oss.svg">
<p align="center">
<a href="https://gpt-oss.com"><strong>Try gpt-oss</strong></a> ·
<a href="https://cookbook.openai.com/topic/gpt-oss"><strong>Guides</strong></a> ·
<a href="https://arxiv.org/abs/2508.10925"><strong>Model card</strong></a> ·
<a href="https://openai.com/index/introducing-gpt-oss/"><strong>OpenAI blog</strong></a>
</p>
<p align="center">
<strong>Download <a href="https://huggingface.co/openai/gpt-oss-120b">gpt-oss-120b</a> and <a href="https://huggingface.co/openai/gpt-oss-20b">gpt-oss-20b</a> on Hugging Face</strong>
</p>
<br>
Welcome to the gpt-oss series, [OpenAI's open-weight models](https://openai.com/open-models/) designed for powerful reasoning, agentic tasks, and versatile developer use cases.
We're releasing two flavors of these open models:
- `gpt-oss-120b` — for production, general purpose, high reasoning use cases that fit into a single 80GB GPU (like NVIDIA H100 or AMD MI300X) (117B parameters with 5.1B active parameters)
- `gpt-oss-20b` — for lower latency, and local or specialized use cases (21B parameters with 3.6B active parameters)
Both models were trained using our [harmony response format][harmony] and should only be used with this format; otherwise, they will not work correctly.
## Table of Contents
- [Highlights](#highlights)
- [Inference examples](#inference-examples)
- [About this repository](#about-this-repository)
- [Setup](#setup)
- [Download the model](#download-the-model)
- [Reference PyTorch implementation](#reference-pytorch-implementation)
- [Reference Triton implementation (single GPU)](#reference-triton-implementation-single-gpu)
- [Reference Metal implementation](#reference-metal-implementation)
- [Harmony format & tools](#harmony-format--tools)
- [Clients](#clients)
- [Tools](#tools)
- [Other details](#other-details)
- [Contributing](#contributing)
### Highlights
- **Permissive Apache 2.0 license:** Build freely without copyleft restrictions or patent risk—ideal for experimentation, customization, and commercial deployment.
- **Configurable reasoning effort:** Easily adjust the reasoning effort (low, medium, high) based on your specific use case and latency needs.
- **Full chain-of-thought:** Provides complete access to the model's reasoning process, facilitating easier debugging and greater trust in outputs. This information is not intended to be shown to end users.
- **Fine-tunable:** Fully customize models to your specific use case through parameter fine-tuning.
- **Agentic capabilities:** Use the models' native capabilities for function calling, [web browsing](#browser), [Python code execution](#python), and Structured Outputs.
- **MXFP4 quantization:** The models were post-trained with MXFP4 quantization of the MoE weights, making `gpt-oss-120b` run on a single 80GB GPU (like NVIDIA H100 or AMD MI300X) and the `gpt-oss-20b` model run within 16GB of memory. All evals were performed with the same MXFP4 quantization.
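
To make the MXFP4 highlight more concrete, here is a minimal sketch of how one MXFP4 block could be decoded and what it costs per weight. It assumes the OCP Microscaling layout (32 four-bit E2M1 values sharing one 8-bit power-of-two scale). The function names are illustrative and are not taken from this repository, whose kernels may pack the data differently.

```python
# Magnitudes representable by FP4 (E2M1), indexed by the low three bits of a code.
FP4_E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_fp4(code: int) -> float:
    """Decode one 4-bit code: bit 3 is the sign, bits 0-2 select the magnitude."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * FP4_E2M1_MAGNITUDES[code & 0x7]


def dequantize_block(codes: list[int], scale_exponent: int) -> list[float]:
    """Apply the shared E8M0 scale, 2 ** (scale_exponent - 127), to every value."""
    scale = 2.0 ** (scale_exponent - 127)
    return [decode_fp4(code) * scale for code in codes]


def bits_per_weight(block_size: int = 32) -> float:
    """Four bits per value plus one 8-bit scale amortized over the block."""
    return (4 * block_size + 8) / block_size


if __name__ == "__main__":
    print(dequantize_block([0x1, 0x9, 0x7], scale_exponent=128))
    print(bits_per_weight())
```

Under these assumptions a quantized weight costs 4.25 bits instead of the 16 bits of BF16, which is the main reason the MoE weights fit in the memory budgets quoted above.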
### Inference examples
#### Transformers
You can use `gpt-oss-120b` and `gpt-oss-20b` with the Transformers library. If you use Transformers' chat template, it will automatically apply the [harmony response format][harmony]. If you use `model.generate` directly, you need to apply the harmony format manually using the chat template or use our [`openai-harmony`][harmony] package.
```python
from transformers import pipeline
import torch
model_id = "openai/gpt-oss-120b"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)
messages = [
    {"role": "user", "content": "Explain quantum mechanics clearly and concisely."},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
```
[Learn more about how to use gpt-oss with Transformers.](https://cookbook.openai.com/articles/gpt-oss/run-transformers)
#### vLLM
vLLM recommends using [`uv`](https://docs.astral.sh/uv/) for Python dependency management. You can use vLLM to spin up an OpenAI-compatible web server. The following command will automatically download the model and start the server.
```bash
uv pip install --pre vllm==0.10.1+gptoss \
--extra-index-url https://wheels.vllm.ai/gpt-oss/ \
--extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
--index-strategy unsafe-best-match
vllm serve openai/gpt-oss-20b
```
[Learn more about how to use gpt-oss with vLLM.](https://cookbook.openai.com/articles/gpt-oss/run-vllm)
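
Once the server is running, any OpenAI-compatible client can talk to it. The sketch below builds a chat completion request with only the standard library. It assumes the server listens on `http://localhost:8000/v1` (vLLM's usual default; adjust it if you passed a different host or port), and the helper names `build_chat_request` and `send` are illustrative.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    model: str = "openai/gpt-oss-20b",
    base_url: str = "http://localhost:8000/v1",  # assumed default address
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    request = build_chat_request("Explain MXFP4 in one sentence.")
    # Only inspect the request here; call send(request) when the server is up.
    print(request.full_url)
```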
Offline serving code:
- Run this code after installing the libraries described above, plus `openai-harmony`:
- `uv pip install openai-harmony`
```python
# source .oss/bin/activate
import json
import os
# Disable the FlashInfer sampler (set before vLLM is imported).
os.environ["VLLM_USE_FLASHINFER_SAMPLER"] = "0"
from openai_harmony import (
    HarmonyEncodingName,
    load_harmony_encoding,
    Conversation,
    Message,
    Role,
    SystemContent,
    DeveloperContent,
)
from vllm import LLM, SamplingParams
# --- 1) Render the prefill with Harmony ---
encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
convo = Conversation.from_messages(
    [
        Message.from_role_and_content(Role.SYSTEM, SystemContent.new()),
        Message.from_role_and_content(
            Role.DEVELOPER,
            DeveloperContent.new().with_instructions("Always respond in riddles"),
        ),
        Message.from_role_and_content(Role.USER, "What is the weather like in SF?"),
    ]
)
prefill_ids = encoding.render_conversation_for_completion(convo, Role.ASSISTANT)
# Harmony stop tokens (pass to sampler so they won't be included in output)
stop_token_ids = encoding.stop_tokens_for_assistant_actions()
# --- 2) Run vLLM with prefill ---
llm = LLM(
    model="openai/gpt-oss-20b",
    trust_remote_code=True,
    gpu_memory_utilization=0.95,
    max_num_batched_tokens=4096,
    max_model_len=5000,
    tensor_parallel_size=1,
)
sampling = SamplingParams(
    max_tokens=128,
    temperature=1,
    stop_token_ids=stop_token_ids,
)
outputs = llm.generate(
    prompt_token_ids=[prefill_ids],  # batch of size 1
    sampling_params=sampling,
)
# vLLM gives you both text and token IDs
gen = outputs[0].outputs[0]
text = gen.text
output_tokens = gen.token_ids  # <-- these are the completion token IDs (no prefill)
# --- 3) Parse the completion token IDs back into structured Harmony messages ---
entries = encoding.parse_messages_from_completion_tokens(output_tokens, Role.ASSISTANT)
# 'entries' is a sequence of structured conversation entries (assistant messages, tool calls, etc.).
for message in entries:
    print(json.dumps(message.to_dict()))
```
#### PyTorch / Triton / Metal
These implementations are largely reference implementations for educational purposes and are not expected to be run in production.
[Learn more below.](#reference-pytorch-implementation)
#### Ollama
If you are trying to run `gpt-oss` on consumer hardware, you can use Ollama by running the following commands after [installing Ollama](https://ollama.com/download).
```bash
# gpt-oss-20b
ollama pull gpt-oss:20b
ollama run gpt-oss:20b
# gpt-oss-120b
ollama pull gpt-oss:120b
ollama run gpt-oss:120b
```
[Learn more about how to use gpt-oss with Ollama.](https://cookbook.openai.com/articles/gpt-oss/run-locally-ollama)
#### LM Studio
If you are using [LM Studio](https://lmstudio.ai/), you can use the following commands to download the models.
```bash
# gpt-oss-20b
lms get openai/gpt-oss-20b
# gpt-oss-120b
lms get openai/gpt-oss-120b
```
Check out our [awesome list](./awesome-gpt-oss.md) for a broader collection of gpt-oss resources and inference partners.
## About this repository
This repository provides a collection of reference implementations:
- **Inference:**
- [`torch`](#reference-pytorch-implementation) — a non-optimized [PyTorch](https://pytorch.org/) implementation for educational purposes only. Requires at least 4× H100 GPUs due to lack of optimization.
- [`triton`](#reference-triton-implementation-single-gpu) — a more optimized implementation using [PyTorch](https://pytorch.org/) & [Triton](https://github.com/triton-lang/triton), including CUDA graphs and basic caching
- [`metal`](#reference-metal-implementation) — a Metal-specific implementation for running the models on Apple Silicon hardware
- **Tools:**
- [`browser`](#browser) — a reference implementation of the browser tool the models were trained on
- [`python`](#python) — a stateless reference implementation of the python tool the models were trained on
- **Client examples:**
- [`chat`](#terminal-chat) — a basic terminal chat application that uses the PyTorch or Triton implementations for inference along with the python and browser tools
- [`responses_api`](#responses-api) — an example Responses API compatible server that implements the browser tool along with other Responses-compatible functionality
## Setup
### Requirements
- Python 3.12
- On macOS: Install the Xcode CLI tools --> `xcode-select --install`
- On Linux: These reference implementations require CUDA
- On Windows: These reference implementations have not been tested on Windows. Try using solutions like Ollama if you are trying to run the model locally.
### Installation
If you want to try any of the code, you can install it directly from [PyPI](https://pypi.org/project/gpt-oss/):
```shell
# if you just need the tools
pip install gpt-oss
# if you want to try the torch implementation
pip install gpt-oss[torch]
# if you want to try the triton implementation
pip install gpt-oss[triton]
```
If you want to modify the code or try the metal implementation, set the project up locally:
```shell
git clone https://github.com/openai/gpt-oss.git
cd gpt-oss
GPTOSS_BUILD_METAL=1 pip install -e ".[metal]"
```
## Download the model
You can download the model weights from the [Hugging Face Hub](https://huggingface.co/collections/openai/gpt-oss-68911959590a1634ba11c7a4) using the Hugging Face CLI:
```shell
# gpt-oss-120b
hf download openai/gpt-oss-120b --include "original/*" --local-dir gpt-oss-120b/
# gpt-oss-20b
hf download openai/gpt-oss-20b --include "original/*" --local-dir gpt-oss-20b/
```
## Reference PyTorch implementation
We include an inefficient reference PyTorch implementation in [gpt_oss/torch/model.py](gpt_oss/torch/model.py). This code uses basic PyTorch operators to show the exact model architecture, with a small addition of supporting tensor parallelism in MoE so that the larger model can run with this code (e.g., on 4xH100 or 2xH200). In this implementation, we upcast all weights to BF16 and run the model in BF16.
To run the reference implementation, install the dependencies:
```shell
pip install -e ".[torch]"
```
And then run:
```shell
# On 4xH100:
torchrun --nproc-per-node=4 -m gpt_oss.generate gpt-oss-120b/original/
```
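
The hardware requirement follows from simple arithmetic on the figures quoted above. The sketch below is a rough estimate of weight memory only. It assumes 2 bytes per BF16 parameter, 80 GB per H100, and 141 GB per H200 (hardware figures that are not stated in this README), and it ignores activations, the KV cache, and any weights that are replicated across ranks.

```python
import math

BYTES_PER_BF16_PARAM = 2  # BF16 stores each parameter in 16 bits


def weight_gb(num_params: float, bytes_per_param: float = BYTES_PER_BF16_PARAM) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def min_gpus(num_params: float, gpu_memory_gb: float) -> int:
    """Lower bound on the GPU count if the weights were split perfectly evenly."""
    return math.ceil(weight_gb(num_params) / gpu_memory_gb)


if __name__ == "__main__":
    params_120b = 117e9  # total parameter count quoted for gpt-oss-120b
    print(f"BF16 weights: {weight_gb(params_120b):.0f} GB")
    print(f"H100 (80 GB) lower bound: {min_gpus(params_120b, 80)}")
    print(f"H200 (141 GB) lower bound: {min_gpus(params_120b, 141)}")
```

The lower bound of three H100s leaves almost no room for anything besides the weights, which is consistent with the 4xH100 setup used in the command above, while two H200s clear the bound with some headroom.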
## Reference Triton implementation (single GPU)
We also include an optimized reference implementation that uses [an optimized Triton MoE kernel](https://github.com/triton-lang/triton/tree/main/python/triton_kernels/triton_kernels) with MXFP4 support, plus optimizations to the attention code that reduce memory cost. Running it requires the nightly versions of triton and torch, which are installed as part of the setup. With this implementation, `gpt-oss-120b` can run on a single 80GB GPU.
To install the reference Triton implementation, run:
```shell
# You need to install triton from source to use the triton implementation
git clone https://github.com/triton-lang/triton
cd triton/
pip install -r python/requirements.txt
pip install -e . --verbose --no-build-isolation
pip install -e python/triton_kernels
# Install the gpt-oss triton implementation
pip install -e ".[triton]"
```
And then run:
```shell
# On 1xH100
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
python -m gpt_oss.generate --backend triton gpt-oss-120b/original/
```
If you encounter `torch.OutOfMemoryError`, make sure to turn on the expandable allocator to avoid crashes when loading weights from the checkpoint.
## Reference Metal implementation
Additionally we are providing a reference implementation for Metal to run on Apple Silicon. This implementation is not production-ready but is accurate to the PyTorch implementation.
The implementation will get automatically compiled when running the `.[metal]` installation on an Apple Silicon device:
```shell
GPTOSS_BUILD_METAL=1 pip install -e ".[metal]"
```
To perform inference, you'll first need to convert the SafeTensors weights from Hugging Face into the right format using:
```shell
python gpt_oss/metal/scripts/create-local-model.py -s <model_dir> -d <output_file>
```
Or download the pre-converted weights:
```shell
hf download openai/gpt-oss-120b --include "metal/*" --local-dir gpt-oss-120b/metal/
hf download openai/gpt-oss-20b --include "metal/*" --local-dir gpt-oss-20b/metal/
```
To test it you can run:
```shell
python gpt_oss/metal/examples/generate.py gpt-oss-20b/metal/model.bin -p "why did the chicken cross the road?"
```
## Harmony format & tools
Along with the model, we are also releasing a new chat format library `harmony` to interact with the model. Check [this guide](https://cookbook.openai.com/articles/openai-harmony) for more info about harmony.
We also include two system tools for the model: browsing and python container. Check [gpt_oss/tools](gpt_oss/tools) for the tool implementation.
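
For intuition only, the text layout of a harmony conversation can be approximated in a few lines. This is a simplified sketch, not the `openai-harmony` implementation: it assumes each message is framed as `<|start|>{role}<|message|>{content}<|end|>` and that a completion prompt ends with an open assistant header. The real library additionally produces token IDs, structured system and developer content, and channels, so use it for actual inference.

```python
def render_message(role: str, content: str) -> str:
    """Frame a single message with the assumed harmony delimiters."""
    return f"<|start|>{role}<|message|>{content}<|end|>"


def render_for_completion(messages: list[tuple[str, str]]) -> str:
    """Concatenate the messages and leave an open assistant turn for the model."""
    rendered = "".join(render_message(role, content) for role, content in messages)
    return rendered + "<|start|>assistant"


if __name__ == "__main__":
    conversation = [
        ("developer", "Always respond in riddles"),
        ("user", "What is the weather like in SF?"),
    ]
    print(render_for_completion(conversation))
```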
## Clients
### Terminal Chat
The terminal chat application is a basic example of how to use the harmony format together with the PyTorch, Triton, and vLLM implementations. It also exposes both the python and browser tool as optional tools that can be used.
```bash
usage: python -m gpt_oss.chat [-h] [-r REASONING_EFFORT] [-a] [-b] [--show-browser-results] [-p] [--developer-message DEVELOPER_MESSAGE] [-c CONTEXT] [--raw] [--backend {triton,torch,vllm}] FILE
Chat example
positional arguments:
FILE Path to the SafeTensors checkpoint
options:
-h, --help show this help message and exit
-r REASONING_EFFORT, --reasoning-effort REASONING_EFFORT
Reasoning effort (default: low)
-a, --apply-patch Make apply_patch tool available to the model (default: False)
-b, --browser Use browser tool (default: False)
--show-browser-results
Show browser results (default: False)
-p, --python Use python tool (default: False)
--developer-message DEVELOPER_MESSAGE
Developer message (default: )
-c CONTEXT, --context CONTEXT
Max context length (default: 8192)
--raw Raw mode (does not render Harmony encoding) (default: False)
--backend {triton,torch,vllm}
Inference backend (default: triton)
```
> [!NOTE]
> The torch and triton implementations require the original checkpoint under `gpt-oss-120b/original/` and `gpt-oss-20b/original/` respectively, while vLLM uses the Hugging Face converted checkpoint in the `gpt-oss-120b/` and `gpt-oss-20b/` root directories.
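
The checkpoint layout described in the note can be captured in a small helper. This is a minimal sketch: `checkpoint_dir` is a hypothetical function, not part of `gpt_oss`, and it assumes the weights were downloaded into a directory named after the model.

```python
from pathlib import Path

# Backends that, per the note above, read the original checkpoint format.
ORIGINAL_FORMAT_BACKENDS = {"torch", "triton"}


def checkpoint_dir(model_dir: str, backend: str) -> Path:
    """Return the directory to pass as FILE for the given backend (hypothetical helper)."""
    root = Path(model_dir)
    if backend in ORIGINAL_FORMAT_BACKENDS:
        return root / "original"
    if backend == "vllm":
        # vLLM reads the Hugging Face converted checkpoint from the root directory.
        return root
    raise ValueError(f"unknown backend: {backend!r}")


print(checkpoint_dir("gpt-oss-20b", "triton").as_posix())  # gpt-oss-20b/original
print(checkpoint_dir("gpt-oss-20b", "vllm").as_posix())    # gpt-oss-20b
```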
### Responses API
We also include an example Responses API server. This server does not implement every feature and event of the Responses API but should be compatible with most of the basic use cases and serve as inspiration for anyone building their own server. Some of our inference partners are also offering their own Responses API.
You can start this server with the following inference backends:
- `triton` — uses the triton implementation
- `metal` — uses the metal implementation on Apple Silicon only
- `ollama` — uses the Ollama /api/generate API as an inference solution
- `vllm` — uses your installed vllm version to perform inference
- `transformers` — uses your installed transformers version to perform local inference
```bash
usage: python -m gpt_oss.responses_api.serve [-h] [--checkpoint FILE] [--port PORT] [--inference-backend BACKEND]
Responses API server
options:
-h, --help show this help message and exit
--checkpoint FILE Path to the SafeTensors checkpoint
--port PORT Port to run the server on
--inference-backend BACKEND Inference backend to use
```
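Once the server is running, a client can send it a plain HTTP request. The sketch below only builds the request and does not send it. The base URL `http://localhost:8000`, the `/v1/responses` path, and the minimal `input`-only payload are assumptions for illustration; check them against the `--port` you chose and the server code before relying on them.

```python
import json
import urllib.request


def build_responses_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for a Responses-style endpoint (path and payload are assumed)."""
    payload = {"input": prompt}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_responses_request("http://localhost:8000", "Hello!")
print(request.get_method(), request.full_url)

# To send it once the server is up:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```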
### Codex
We support [codex](https://github.com/openai/codex) as a client for gpt-oss. To run the 20b version, add the following to `~/.codex/config.toml`:
```toml
disable_response_storage = true
show_reasoning_content = true
[model_providers.local]
name = "local"
base_url = "http://localhost:11434/v1"
[profiles.oss]
model = "gpt-oss:20b"
model_provider = "local"
```
This will work with any Chat Completions API-compatible server listening on port 11434, such as Ollama. Start the server and point codex to the oss model:
```shell
ollama run gpt-oss:20b
codex -p oss
```
## Tools
### Browser
> [!WARNING]
> This implementation is purely for educational purposes and should not be used in production. You should implement your own equivalent of the [`YouComBackend`](gpt_oss/tools/simple_browser/backend.py) class with your own browsing environment. Currently we have available `YouComBackend` and `ExaBackend`.
Both gpt-oss models were trained with the capability to browse using the `browser` tool that exposes the following three methods:
- `search` to search for key phrases
- `open` to open a particular page
- `find` to look for contents on a page
#### Usage
To enable the browser tool, you'll have to place the definition into the `system` message of your harmony formatted prompt. You can either use the `with_browser_tool()` method if your tool implements the full interface or modify the definition using `with_tools()`. For example:
```python
import datetime
from gpt_oss.tools.simple_browser import SimpleBrowserTool
from gpt_oss.tools.simple_browser.backend import YouComBackend
from openai_harmony import SystemContent, Message, Conversation, Role, load_harmony_encoding, HarmonyEncodingName
encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
# Depending on the browser backend you choose, the corresponding environment variable must be set.
# The You.com backend requires the YDC_API_KEY environment variable,
# while for Exa you might need the EXA_API_KEY environment variable.
backend = YouComBackend(
    source="web",
)
# backend = ExaBackend(
#     source="web",
# )
browser_tool = SimpleBrowserTool(backend=backend)
# create a basic system prompt
system_message_content = SystemContent.new().with_conversation_start_date(
    datetime.datetime.now().strftime("%Y-%m-%d")
)
# if you want to use the browser tool
if use_browser_tool:
    # enables the tool
    system_message_content = system_message_content.with_tools(browser_tool.tool_config)
    # alternatively you could use the following if your tool is not stateless
    system_message_content = system_message_content.with_browser_tool()
# construct the system message
system_message = Message.from_role_and_content(Role.SYSTEM, system_message_content)
# create the overall prompt
messages = [system_message, Message.from_role_and_content(Role.USER, "What's the weather in SF?")]
conversation = Conversation.from_messages(messages)
# convert to tokens
token_ids = encoding.render_conversation_for_completion(conversation, Role.ASSISTANT)
# perform inference
# ...
# parse the output
messages = encoding.parse_messages_from_completion_tokens(output_tokens, Role.ASSISTANT)
last_message = messages[-1]
if last_message.recipient.startswith("browser"):
    # perform browser call
    response_messages = await browser_tool.process(last_message)
    # extend the current messages and run inference again
    messages.extend(response_messages)
```
#### Details
To control the context window size, this tool uses a scrollable window of text that the model can interact with. For example, it might fetch the first 50 lines of a page and then scroll to the next 20 lines after that. The model has also been trained to cite content from this tool in its answers.
To improve performance the tool caches requests so that the model can revisit a different part of a page without having to reload the page. For that reason you should create a new browser instance for every request.
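The windowing and caching behavior described above can be illustrated with a small stand-in. This is a minimal sketch, not the `SimpleBrowserTool` implementation: `PageWindow`, its `view` method, and the `fetch` callback are hypothetical names, and the line counts are only example values.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PageWindow:
    """Hypothetical scrollable view over page text with a per-instance cache."""

    fetch: Callable[[str], str]
    cache: dict[str, list[str]] = field(default_factory=dict)

    def view(self, url: str, start: int = 0, num_lines: int = 50) -> list[str]:
        # Fetch the page only on first access; later scrolls reuse the cached lines.
        if url not in self.cache:
            self.cache[url] = self.fetch(url).splitlines()
        return self.cache[url][start:start + num_lines]


calls = []


def fake_fetch(url: str) -> str:
    # Stand-in for a real page load; records each call so cache hits are visible.
    calls.append(url)
    return "\n".join(f"line {i}" for i in range(100))


window = PageWindow(fetch=fake_fetch)
first = window.view("https://example.com", start=0, num_lines=50)
more = window.view("https://example.com", start=50, num_lines=20)
print(len(first), len(more), len(calls))  # 50 20 1
```

Because the cache lives on the instance, creating a fresh `PageWindow` per request starts from an empty cache, which mirrors the recommendation above.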
### Python
The model was trained to use a python tool to perform calculations and other actions as part of its chain-of-thought. During training, the model used a stateful tool, which makes running tools between CoT loops easier. This reference implementation, however, uses a stateless mode. As a result, the `PythonTool` defines its own tool description to override the definition in [`openai-harmony`][harmony].
> [!WARNING]
> This implementation runs in a permissive Docker container which could be problematic in cases like prompt injections. It's serving as an example and you should consider implementing your own container restrictions in production.
#### Usage
To enable the python tool, you'll have to place the definition into the `system` message of your harmony formatted prompt. You can either use the `with_python()` method if your tool implements the full interface or modify the definition using `with_tools()`. For example:
```python
import datetime
from gpt_oss.tools.python_docker.docker_tool import PythonTool
from openai_harmony import SystemContent, Message, Conversation, Role, load_harmony_encoding, HarmonyEncodingName
encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
python_tool = PythonTool()
# create a basic system prompt
system_message_content = SystemContent.new().with_conversation_start_date(
    datetime.datetime.now().strftime("%Y-%m-%d")
)
# if you want to use the python tool
if use_python_tool:
    # enables the tool making sure that the prompt gets set with the stateless tool description
    system_message_content = system_message_content.with_tools(python_tool.tool_config)
    # alternatively you could use the following if your tool is not stateless
    system_message_content = system_message_content.with_python()
# construct the system message
system_message = Message.from_role_and_content(Role.SYSTEM, system_message_content)
# create the overall prompt
messages = [system_message, Message.from_role_and_content(Role.USER, "What's the square root of 9001?")]
conversation = Conversation.from_messages(messages)
# convert to tokens
token_ids = encoding.render_conversation_for_completion(conversation, Role.ASSISTANT)
# perform inference
# ...
# parse the output
messages = encoding.parse_messages_from_completion_tokens(output_tokens, Role.ASSISTANT)
last_message = messages[-1]
if last_message.recipient == "python":
    # perform python call
    response_messages = await python_tool.process(last_message)
    # extend the current messages and run inference again
    messages.extend(response_messages)
```
### Apply Patch
`apply_patch` can be used to create, update or delete files locally.
## Other details
### Precision format
We released the models with native quantization support. Specifically, we use [MXFP4](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf) for the linear projection weights in the MoE layer. We store the MoE tensor in two parts:
- `tensor.blocks` stores the actual fp4 values. We pack every two values in one `uint8` value.
- `tensor.scales` stores the block scale. The block scaling is done along the last dimension for all MXFP4 tensors.
All other tensors will be in BF16. We also recommend using BF16 as the activation precision for the model.
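To make the two-part layout concrete, here is a minimal sketch of decoding one packed block in plain Python. It rests on assumptions that should be checked against the weight-loading code: the low nibble of each `uint8` is taken to hold the first value, the scale byte is taken to be a power-of-two exponent with a bias of 127, and `unpack_block` is a hypothetical helper name. In the MX specification a block covers 32 elements, which would be 16 packed bytes per scale.

```python
import math

# The 16 fp4 (E2M1) code points: 1 sign bit, 2 exponent bits, 1 mantissa bit.
FP4_VALUES = [
    0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
    -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0,
]


def unpack_block(blocks: bytes, scale_byte: int) -> list[float]:
    """Decode packed fp4 pairs that share one block scale (nibble order and bias are assumed)."""
    exponent = scale_byte - 127  # assumed bias for the power-of-two scale
    values = []
    for byte in blocks:
        low, high = byte & 0x0F, byte >> 4
        # ldexp(x, e) computes x * 2**e exactly for these small values.
        values.append(math.ldexp(FP4_VALUES[low], exponent))
        values.append(math.ldexp(FP4_VALUES[high], exponent))
    return values


print(unpack_block(bytes([0x21, 0xF7]), 128))  # [1.0, 2.0, 12.0, -12.0]
```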
### Recommended Sampling Parameters
We recommend sampling with `temperature=1.0` and `top_p=1.0`.
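With these values, neither temperature scaling nor nucleus (top-p) truncation changes the model's output distribution, so tokens are drawn from the plain softmax of the logits. The sketch below illustrates that with a generic temperature and top-p implementation; `sampling_distribution` is a hypothetical helper and is not taken from any of the inference backends in this repository.

```python
import math


def sampling_distribution(logits, temperature=1.0, top_p=1.0):
    """Return token probabilities after temperature scaling and top-p truncation."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    if top_p >= 1.0:
        return probs  # nothing is truncated

    # Keep the smallest set of most likely tokens whose cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    kept_mass = sum(probs[i] for i in kept)
    return [probs[i] / kept_mass if i in kept else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.0]
print(sampling_distribution(logits))             # plain softmax of the logits
print(sampling_distribution(logits, top_p=0.5))  # [1.0, 0.0, 0.0]
```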
## Contributing
The reference implementations in this repository are meant as a starting point and inspiration. Outside of bug fixes we do not intend to accept new feature contributions. If you build implementations based on this code such as new tool implementations you are welcome to contribute them to the [`awesome-gpt-oss.md`](./awesome-gpt-oss.md) file.
[harmony]: https://github.com/openai/harmony
## Citation
```bibtex
@misc{openai2025gptoss120bgptoss20bmodel,
title={gpt-oss-120b & gpt-oss-20b Model Card},
author={OpenAI},
year={2025},
eprint={2508.10925},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2508.10925},
}
```
================================================
FILE: USAGE_POLICY
================================================
We aim for our tools to be used safely, responsibly, and democratically, while maximizing your control over how you use them. By using OpenAI gpt-oss-120b and gpt-oss-20b, you agree to comply with all applicable law.
================================================
FILE: _build/gpt_oss_build_backend/__init__.py
================================================
"""In-tree PEP 517 backend package for gpt-oss."""
================================================
FILE: _build/gpt_oss_build_backend/backend.py
================================================
"""
Build backend for gpt-oss that supports two modes:
1) Default (pure wheel for PyPI)
- Delegates to setuptools.build_meta.
- Produces a py3-none-any wheel so PyPI accepts it (no linux_x86_64 tag).
2) Optional Metal/C extension build (local only)
- If the environment variable GPTOSS_BUILD_METAL is set to a truthy value
(1/true/on/yes), delegates to scikit_build_core.build.
- Dynamically injects build requirements (scikit-build-core, cmake, ninja,
pybind11) only for this mode.
Why this is needed
- PyPI rejects Linux wheels tagged linux_x86_64; manylinux/musllinux is required
for binary wheels. We ship a pure wheel by default, but still allow developers
to build/install the native Metal backend locally when needed.
Typical usage
- Publish pure wheel: `python -m build` (do not set GPTOSS_BUILD_METAL).
- Local Metal dev: `GPTOSS_BUILD_METAL=1 pip install -e ".[metal]"`.
- CI: keep GPTOSS_BUILD_METAL unset for releases; set it in internal jobs that
exercise the extension.
Notes
- The base package remains importable without the extension. The Metal backend
is only used when `gpt_oss.metal` is explicitly imported.
- This file is discovered via `backend-path = ["_build"]` and
`build-backend = "gpt_oss_build_backend.backend"` in pyproject.toml.
"""
import os
from importlib import import_module
from typing import Any, Mapping, Sequence
TRUE_VALUES = {"1", "true", "TRUE", "on", "ON", "yes", "YES"}
def _use_metal_backend() -> bool:
    return str(os.environ.get("GPTOSS_BUILD_METAL", "")).strip() in TRUE_VALUES
def _setuptools_backend():
    from setuptools import build_meta as _bm  # type: ignore
    return _bm
def _scikit_build_backend():
    return import_module("scikit_build_core.build")
def _backend():
    return _scikit_build_backend() if _use_metal_backend() else _setuptools_backend()
# Required PEP 517 hooks
def build_wheel(
    wheel_directory: str,
    config_settings: Mapping[str, Any] | None = None,
    metadata_directory: str | None = None,
) -> str:
    return _backend().build_wheel(wheel_directory, config_settings, metadata_directory)
def build_sdist(
    sdist_directory: str, config_settings: Mapping[str, Any] | None = None
) -> str:
    return _backend().build_sdist(sdist_directory, config_settings)
def prepare_metadata_for_build_wheel(
    metadata_directory: str, config_settings: Mapping[str, Any] | None = None
) -> str:
    # Fallback if backend doesn't implement it
    be = _backend()
    fn = getattr(be, "prepare_metadata_for_build_wheel", None)
    if fn is None:
        # The selected backend lacks this optional hook, so fall back to the setuptools implementation.
        return _setuptools_backend().prepare_metadata_for_build_wheel(
            metadata_directory, config_settings
        )
    return fn(metadata_directory, config_settings)
# Optional hooks
def build_editable(
    editable_directory: str, config_settings: Mapping[str, Any] | None = None, metadata_directory: str | None = None
) -> str:
    be = _backend()
    fn = getattr(be, "build_editable", None)
    if fn is None:
        # setuptools implements build_editable; if not available, raise the standard error
        raise RuntimeError("Editable installs not supported by the selected backend")
    # Forward metadata_directory so a previously prepared metadata directory is not dropped
    return fn(editable_directory, config_settings, metadata_directory)
def get_requires_for_build_wheel(
    config_settings: Mapping[str, Any] | None = None,
) -> Sequence[str]:
    if _use_metal_backend():
        # Add dynamic build requirements only when building the Metal backend
        return [
            "scikit-build-core>=0.10",
            "pybind11>=2.12",
            "cmake>=3.26",
            "ninja",
        ]
    # setuptools usually returns []
    return list(_setuptools_backend().get_requires_for_build_wheel(config_settings))
def get_requires_for_build_sdist(
    config_settings: Mapping[str, Any] | None = None,
) -> Sequence[str]:
    # No special requirements for SDist
    be = _backend()
    fn = getattr(be, "get_requires_for_build_sdist", None)
    if fn is None:
        return []
    return list(fn(config_settings))
def get_requires_for_build_editable(
    config_settings: Mapping[str, Any] | None = None,
) -> Sequence[str]:
    if _use_metal_backend():
        return [
            "scikit-build-core>=0.10",
            "pybind11>=2.12",
            "cmake>=3.26",
            "ninja",
        ]
    be = _setuptools_backend()
    fn = getattr(be, "get_requires_for_build_editable", None)
    if fn is None:
        return []
    return list(fn(config_settings))
================================================
FILE: awesome-gpt-oss.md
================================================

# Awesome gpt-oss
This is a list of guides and resources to help you get started with the gpt-oss models.
- [Inference](#inference)
  - [Local](#local)
  - [Server](#server)
  - [Cloud](#cloud)
- [Examples & Tutorials](#examples--tutorials)
- [Tools](#tools)
- [Training](#training)
## Inference
### Local
- Ollama
  - [How to run gpt-oss locally with Ollama](https://cookbook.openai.com/articles/gpt-oss/run-locally-ollama)
  - [Ollama & gpt-oss launch blog](https://ollama.com/blog/gpt-oss)
  - [Check out the models on Ollama](https://ollama.com/library/gpt-oss)
- LM Studio
  - [LM Studio & gpt-oss launch blog](https://lmstudio.ai/blog/gpt-oss)
  - [Use gpt-oss-20b with LM Studio](https://lmstudio.ai/models/openai/gpt-oss-20b)
  - [Use gpt-oss-120b with LM Studio](https://lmstudio.ai/models/openai/gpt-oss-120b)
- Hugging Face & Transformers
  - [How to run gpt-oss with Transformers](https://cookbook.openai.com/articles/gpt-oss/run-transformers)
  - [Hugging Face & gpt-oss launch blog](https://huggingface.co/blog/welcome-openai-gpt-oss)
  - [Collection of Hugging Face examples](https://github.com/huggingface/gpt-oss-recipes)
- NVIDIA
  - [gpt-oss on RTX](https://blogs.nvidia.com/blog/rtx-ai-garage-openai-oss)
- AMD
  - [Running gpt-oss models on AMD Ryzen AI Processors and Radeon Graphics Cards](https://www.amd.com/en/blogs/2025/how-to-run-openai-gpt-oss-20b-120b-models-on-amd-ryzen-ai-radeon.html)
  - [Running gpt-oss on STX Halo and Radeon dGPUs using Lemonade](https://lemonade-server.ai/news/gpt-oss.html)
- llama.cpp
  - [Running gpt-oss with llama.cpp](https://github.com/ggml-org/llama.cpp/discussions/15396)
  - [Running gpt-oss with Unsloth GGUFs](https://docs.unsloth.ai/new/gpt-oss-how-to-run-and-fine-tune#run-gpt-oss-20b)
### Server
- vLLM
  - [How to run gpt-oss with vLLM](https://cookbook.openai.com/articles/gpt-oss/run-vllm)
  - [vLLM & gpt-oss recipes](https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html)
- NVIDIA
  - [Optimizing gpt-oss with NVIDIA TensorRT-LLM](https://cookbook.openai.com/articles/run-nvidia)
  - [Deploying gpt-oss on TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md)
- AMD
  - [Running the Latest Open Models from OpenAI on AMD AI Hardware](https://rocm.blogs.amd.com/ecosystems-and-partners/openai-day-0/README.html)
### Cloud
- Groq
  - [Groq & gpt-oss launch blog](https://groq.com/blog/day-zero-support-for-openai-open-models)
  - [gpt-oss-120b model on the GroqCloud Playground](https://console.groq.com/playground?model=openai/gpt-oss-120b)
  - [gpt-oss-20b model on the GroqCloud Playground](https://console.groq.com/playground?model=openai/gpt-oss-20b)
  - [gpt-oss with built-in web search on GroqCloud](https://console.groq.com/docs/browser-search)
  - [gpt-oss with built-in code execution on GroqCloud](https://console.groq.com/docs/code-execution)
  - [Responses API on Groq](https://console.groq.com/docs/responses-api)
- NVIDIA
  - [NVIDIA launch blog post](https://blogs.nvidia.com/blog/openai-gpt-oss/)
  - [NVIDIA & gpt-oss developer launch blog post](https://developer.nvidia.com/blog/delivering-1-5-m-tps-inference-on-nvidia-gb200-nvl72-nvidia-accelerates-openai-gpt-oss-models-from-cloud-to-edge/)
  - Use [gpt-oss-120b](https://build.nvidia.com/openai/gpt-oss-120b) and [gpt-oss-20b](https://build.nvidia.com/openai/gpt-oss-20b) on NVIDIA's Cloud
- Cloudflare
  - [Cloudflare & gpt-oss launch blog post](https://blog.cloudflare.com/openai-gpt-oss-on-workers-ai)
  - [gpt-oss-120b on Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/models/gpt-oss-120b)
  - [gpt-oss-20b on Cloudflare Workers AI](https://developers.cloudflare.com/workers-ai/models/gpt-oss-20b)
- AMD
  - [gpt-oss-120B on AMD MI300X](https://huggingface.co/spaces/amd/gpt-oss-120b-chatbot)
- AWS
  - Deploy via Tensorfuse: [Deploy gpt-oss for both 20b and 120b models on AWS EKS](https://tensorfuse.io/docs/guides/modality/text/openai_oss)
  - [AWS launch blog post](https://aws.amazon.com/blogs/aws/openai-open-weight-models-now-available-on-aws/)
- Google Colab
  - [gpt-oss-20b inference notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/GPT_OSS_MXFP4_(20B)-Inference.ipynb)
## Examples & Tutorials
- [OpenAI harmony response format](https://cookbook.openai.com/articles/openai-harmony)
## Tools
- [Example `python` tool for gpt-oss](./gpt_oss/tools/python_docker/)
- [Example `browser` tool for gpt-oss](./gpt_oss/tools/simple_browser/)
## Training
- [Hugging Face TRL examples](https://github.com/huggingface/gpt-oss-recipes)
- [LlamaFactory examples](https://llamafactory.readthedocs.io/en/latest/advanced/best_practice/gpt-oss.html)
- [Unsloth examples](https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune)
### Reinforcement Learning
- [Auto solving the 2048 game](https://github.com/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb)
## Contributing
Feel free to open a PR to add your own guides and resources on how to run gpt-oss. We will try to review it and add it here.
================================================
FILE: compatibility-test/.gitignore
================================================
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*
# Diagnostic reports (https://nodejs.org/api/report.html)
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json
# Runtime data
pids
*.pid
*.seed
*.pid.lock
# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov
# Coverage directory used by tools like istanbul
coverage
*.lcov
# nyc test coverage
.nyc_output
# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt
# Bower dependency directory (https://bower.io/)
bower_components
# node-waf configuration
.lock-wscript
# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release
# Dependency directories
node_modules/
jspm_packages/
# Snowpack dependency directory (https://snowpack.dev/)
web_modules/
# TypeScript cache
*.tsbuildinfo
# Optional npm cache directory
.npm
# Optional eslint cache
.eslintcache
# Optional stylelint cache
.stylelintcache
# Optional REPL history
.node_repl_history
# Output of 'npm pack'
*.tgz
# Yarn Integrity file
.yarn-integrity
# dotenv environment variable files
.env
.env.*
!.env.example
# parcel-bundler cache (https://parceljs.org/)
.cache
.parcel-cache
# Next.js build output
.next
out
# Nuxt.js build / generate output
.nuxt
dist
# Gatsby files
.cache/
# Comment in the public line in if your project uses Gatsby and not Next.js
# https://nextjs.org/blog/next-9-1#public-directory-support
# public
# vuepress build output
.vuepress/dist
# vuepress v2.x temp and cache directory
.temp
.cache
# Sveltekit cache directory
.svelte-kit/
# vitepress build output
**/.vitepress/dist
# vitepress cache directory
**/.vitepress/cache
# Docusaurus cache and generated files
.docusaurus
# Serverless directories
.serverless/
# FuseBox cache
.fusebox/
# DynamoDB Local files
.dynamodb/
# Firebase cache directory
.firebase/
# TernJS port file
.tern-port
# Stores VSCode versions used for testing VSCode extensions
.vscode-test
# yarn v3
.pnp.*
.yarn/*
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/sdks
!.yarn/versions
# Vite logs files
vite.config.js.timestamp-*
vite.config.ts.timestamp-*
rollout_*.jsonl
analysis_*.json
================================================
FILE: compatibility-test/README.md
================================================
# API Compatibility Test
This script uses the Agents SDK in TypeScript and the underlying OpenAI client to verify both the shape of the API calls and whether the API performs tool calling.
## What it tests
1.
## How to run
0. Run `npm install` in this directory.
1. Update `providers.ts` to create an entry for the API to test. Change `vllm` to the provider name of your choice. Use `chat` for Chat Completions tests and `responses` for Responses API tests.
2. Run an initial quick test to make sure things work. This runs only one test:
```
npm start -- --provider <name> -n 1 -k 1
```
3. Run the full test (runs each test 5 times to test consistency)
```
npm start -- --provider <name> -k 5
```
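Repeating each test is what lets the analysis step report two consistency metrics, pass@k (at least one of the first k attempts succeeded) and pass^k (all of the first k attempts succeeded), as computed in `analysis.ts`. The sketch below restates that calculation in Python for illustration only; `pass_metrics` and the sample data are hypothetical, and a missing attempt is treated as a failure.

```python
def pass_metrics(attempts_by_task: dict[str, list[bool]], k: int) -> tuple[float, float]:
    """Return (pass@k, pass^k) over the first k attempts of every task."""
    total = len(attempts_by_task)
    if total == 0:
        return 0.0, 0.0
    # pass@k: tasks with at least one success among the first k attempts.
    any_success = sum(any(a[:k]) for a in attempts_by_task.values())
    # pass^k: tasks where all k attempts exist and every one succeeded.
    all_success = sum(len(a) >= k and all(a[:k]) for a in attempts_by_task.values())
    return any_success / total, all_success / total


results = {
    "get_weather": [True, True],
    "query_database": [False, True],
    "generate_chart": [False, False],
}
pass_at_k, pass_hat_k = pass_metrics(results, k=2)
print(f"pass@2={pass_at_k:.3f} pass^2={pass_hat_k:.3f}")  # pass@2=0.667 pass^2=0.333
```

A provider that is correct but inconsistent shows a high pass@k and a noticeably lower pass^k, which is why the full run uses several tries per case.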
## Considerations
1. The tests will fail if the API shape does not match the expected behavior.
2. Events in the chat API are currently not tested.
3. If schema validation succeeds but the tool arguments are wrong, the test still passes. Wrong arguments are more likely a prompt engineering or validator issue than an API issue, since the call still matched the expected schema.
================================================
FILE: compatibility-test/analysis.ts
================================================
export function analyze(caseResults: any[], tries: number) {
// Group results by unique task: test_case + apiType
type TaskKey = string;
const taskKeyFor = (r: any): TaskKey =>
`${r.test_case}::${r.result?.apiType}`;
const successesByTask: Map<TaskKey, Map<number, boolean>> = new Map();
// Count wrong-input tool calls (schema correct but incorrect arguments)
let wrongInputToolCalls = 0;
// Count invalid response shapes per API type
const totalByApiType: Record<string, number> = {};
const invalidByApiType: Record<string, number> = {};
for (const r of caseResults) {
if (!r?.result || typeof r.result.apiType !== "string") continue;
// Parse attempt index from run_id `${i}_${k}` safely
let attemptIndex: number | undefined;
if (typeof r.run_id === "string") {
const parts = r.run_id.split("_");
const k = Number(parts[1]);
if (Number.isFinite(k)) attemptIndex = k;
}
const key = taskKeyFor(r);
if (!successesByTask.has(key)) successesByTask.set(key, new Map());
if (attemptIndex != null) {
successesByTask.get(key)!.set(attemptIndex, Boolean(r.success));
}
const d = r.result.toolCallingDetails ?? {};
const calledToolAtLeastOnce = Boolean(d.calledToolAtLeastOnce);
const calledToolWithRightSchema = Boolean(d.calledToolWithRightSchema);
const calledToolWithRightArguments = Boolean(
d.calledToolWithRightArguments
);
if (
calledToolAtLeastOnce &&
calledToolWithRightSchema &&
!calledToolWithRightArguments
) {
wrongInputToolCalls++;
}
// Track invalid/total per apiType for response shape
const apiType = r.result.apiType as string;
totalByApiType[apiType] = (totalByApiType[apiType] ?? 0) + 1;
const isValidResponse = r.result.validResponse === true;
if (!isValidResponse) {
invalidByApiType[apiType] = (invalidByApiType[apiType] ?? 0) + 1;
}
}
const totalTasks = successesByTask.size;
// Compute pass@k and pass^k for k = 1..tries
const passAtKByK: number[] = [];
const passHatKByK: number[] = [];
for (let k = 1; k <= tries; k++) {
let tasksSuccessfulK = 0; // any success in first k attempts
let tasksAllSuccessfulK = 0; // all success in first k attempts
for (const [, attemptsMap] of successesByTask) {
let anySuccess = false;
let allSuccess = true;
for (let i = 0; i < k; i++) {
const v = attemptsMap.get(i) === true;
anySuccess = anySuccess || v;
if (!v) allSuccess = false;
}
if (anySuccess) tasksSuccessfulK++;
if (allSuccess) tasksAllSuccessfulK++;
}
const passAtK = totalTasks > 0 ? tasksSuccessfulK / totalTasks : 0;
const passHatK = totalTasks > 0 ? tasksAllSuccessfulK / totalTasks : 0;
passAtKByK.push(passAtK);
passHatKByK.push(passHatK);
}
// Convenience: final k=tries values
const passAtK = passAtKByK[tries - 1] ?? 0;
const passHatK = passHatKByK[tries - 1] ?? 0;
return {
totalTasks,
passAtKByK,
passHatKByK,
passAtK,
passHatK,
wrongInputToolCalls,
// New stats for invalid response shapes per API
invalidByApiType,
totalByApiType,
};
}
export function printAnalysis(
stats: ReturnType<typeof analyze>,
caseResults: any[],
provider: string,
selectedLines: string[],
tries: number,
skipped: number,
analysisFile: string
) {
const formatPerK = (arr: number[]) =>
Array.from({ length: tries }, (_, i) => {
const v = arr[i] ?? 0;
return `${i + 1}=${v.toFixed(3)}`;
}).join(", ");
console.log("Summary:");
console.log(` Provider: ${provider}`);
console.log(` Total input cases: ${selectedLines.length}`);
console.log(` Tries: ${tries}`);
console.log(` Total tasks: ${stats.totalTasks}`);
console.log(` Total runs: ${caseResults.length}`);
// Conditionally print invalid response shape stats per API type
if ((stats.totalByApiType["responses"] ?? 0) > 0) {
const bad = stats.invalidByApiType["responses"] ?? 0;
const tot = stats.totalByApiType["responses"] ?? 0;
console.log(` Invalid Responses API responses: ${bad} (out of ${tot})`);
}
if ((stats.totalByApiType["chat"] ?? 0) > 0) {
const bad = stats.invalidByApiType["chat"] ?? 0;
const tot = stats.totalByApiType["chat"] ?? 0;
console.log(
` Invalid Chat Completions API responses: ${bad} (out of ${tot})`
);
}
console.log(` pass@k (k=1..${tries}): ${formatPerK(stats.passAtKByK)}`);
console.log(` pass^k (k=1..${tries}): ${formatPerK(stats.passHatKByK)}`);
console.log(` pass@k (k=${tries}): ${stats.passAtK.toFixed(3)}`);
console.log(` pass^k (k=${tries}): ${stats.passHatK.toFixed(3)}`);
console.log(` Wrong-input tool calls: ${stats.wrongInputToolCalls}`);
console.log(` Invalid cases.jsonl lines: ${skipped}`);
console.log(` Analysis written to ${analysisFile}`);
}
================================================
FILE: compatibility-test/cases.jsonl
================================================
{"tool_name":"get_system_health","input":"Hey, quick check: is everything up and running?","expected_arguments":"{}"}
{"tool_name":"get_system_health","input":"Status report please.","expected_arguments":"{}"}
{"tool_name":"get_system_health","input":"Can you confirm the LLM health before we start?","expected_arguments":"{}"}
{"tool_name":"get_system_health","input":"Need a health snapshot.","expected_arguments":"{}"}
{"tool_name":"get_system_health","input":"Hi, what's the current system health?","expected_arguments":"{}"}
{"tool_name":"markdown_to_html","input":"Convert this markdown to HTML:\n\n# Title\n\nSome *italic* text.","expected_arguments":"{\"markdown\":\"# Title\\n\\nSome *italic* text.\"}"}
{"tool_name":"markdown_to_html","input":"Hey, could you turn `## Docs` into HTML?","expected_arguments":"{\"markdown\":\"## Docs\"}"}
{"tool_name":"markdown_to_html","input":"Please render the following markdown:\n\n- item 1\n- item 2","expected_arguments":"{\"markdown\":\"- item 1\\n- item 2\"}"}
{"tool_name":"markdown_to_html","input":"I have `**bold**` markdown; give me HTML.","expected_arguments":"{\"markdown\":\"**bold**\"}"}
{"tool_name":"markdown_to_html","input":"Markdown to HTML: > quote","expected_arguments":"{\"markdown\":\"> quote\"}"}
{"tool_name":"detect_language","input":"Hey, what language is this: 'Buenos días, ¿cómo estás?'","expected_arguments":"{\"text\":\"Buenos días, ¿cómo estás?\"}"}
{"tool_name":"detect_language","input":"Identify the language: \"Guten Morgen\"","expected_arguments":"{\"text\":\"Guten Morgen\"}"}
{"tool_name":"detect_language","input":"Language detection needed: こんにちは、お元気ですか?","expected_arguments":"{\"text\":\"こんにちは、お元気ですか?\"}"}
{"tool_name":"detect_language","input":"Detect language for: 'Привет, как дела?'","expected_arguments":"{\"text\":\"Привет, как дела?\"}"}
{"tool_name":"detect_language","input":"What language is 'Bonjour tout le monde'?","expected_arguments":"{\"text\":\"Bonjour tout le monde\"}"}
{"tool_name":"generate_chart","input":"Plot a simple line chart for these points: (1,2),(2,4),(3,9).","expected_arguments":"{\"data\":[[1,2],[2,4],[3,9]],\"chart_type\":\"line\"}"}
{"tool_name":"generate_chart","input":"Hey, can I get a bar chart of my sales: 10, 20, 30 across Q1–Q3?","expected_arguments":"{\"data\":[[1,10],[2,20],[3,30]],\"chart_type\":\"bar\",\"title\":\"Quarterly Sales\"}"}
{"tool_name":"generate_chart","input":"Make a scatter chart titled 'Experiment' with x label Time and y label Value for data [ [0,1], [1,1.5], [2,2.2] ].","expected_arguments":"{\"data\":[[0,1],[1,1.5],[2,2.2]],\"chart_type\":\"scatter\",\"title\":\"Experiment\",\"x_label\":\"Time\",\"y_label\":\"Value\"}"}
{"tool_name":"generate_chart","input":"Create a line chart of temperatures 70,72,68,65 over 4 days, label x as 'Day'.","expected_arguments":"{\"data\":[[1,70],[2,72],[3,68],[4,65]],\"chart_type\":\"line\",\"x_label\":\"Day\"}"}
{"tool_name":"generate_chart","input":"Visualize visits per day with a bar chart; numbers: 100,150,120.","expected_arguments":"{\"data\":[[1,100],[2,150],[3,120]],\"chart_type\":\"bar\",\"title\":\"Daily Visits\",\"y_label\":\"Visitors\"}"}
{"tool_name":"query_database","input":"Give me the ids and emails from users table, limit 5.","expected_arguments":"{\"table\":\"users\",\"columns\":[\"id\",\"email\"],\"limit\":5}"}
{"tool_name":"query_database","input":"Hey, fetch order_id and amount from orders where status is 'shipped'.","expected_arguments":"{\"table\":\"orders\",\"columns\":[\"order_id\",\"amount\"],\"filters\":\"status = 'shipped'\"}"}
{"tool_name":"query_database","input":"Retrieve name and price from products ordered by price descending, top 10 please.","expected_arguments":"{\"table\":\"products\",\"columns\":[\"name\",\"price\"],\"limit\":10,\"order_by\":\"price DESC\"}"}
{"tool_name":"query_database","input":"I need the first 3 log entries from audit_log table.","expected_arguments":"{\"table\":\"audit_log\",\"columns\":[\"id\",\"timestamp\",\"action\"],\"limit\":3}"}
{"tool_name":"query_database","input":"Query the customers table for name, city where city = 'Berlin'.","expected_arguments":"{\"table\":\"customers\",\"columns\":[\"name\",\"city\"],\"filters\":\"city = 'Berlin'\"}"}
{"tool_name":"get_weather","input":"What's the weather in San Francisco right now?","expected_arguments":"{\"location\":\"San Francisco\"}"}
{"tool_name":"get_weather","input":"Weather for Tokyo, please.","expected_arguments":"{\"location\":\"Tokyo\"}"}
{"tool_name":"get_weather","input":"Get me the current weather for 10001.","expected_arguments":"{\"location\":\"10001\"}"}
{"tool_name":"get_weather","input":"How's the weather in Paris today?","expected_arguments":"{\"location\":\"Paris\"}"}
{"tool_name":"get_weather","input":"Check the weather for Sydney.","expected_arguments":"{\"location\":\"Sydney\"}"}
================================================
FILE: compatibility-test/index.ts
================================================
import { parseArgs } from "node:util";
import { createWriteStream } from "node:fs";
import { readFile, writeFile } from "node:fs/promises";
import path from "node:path";
import process from "node:process";
import { runCase, RunCaseSummary } from "./runCase";
import { Listr, ListrTaskWrapper } from "listr2";
import { analyze, printAnalysis } from "./analysis";
function formatTimestamp(d: Date): string {
const pad = (n: number) => String(n).padStart(2, "0");
const yyyy = d.getFullYear();
const mm = pad(d.getMonth() + 1);
const dd = pad(d.getDate());
const hh = pad(d.getHours());
const mi = pad(d.getMinutes());
const ss = pad(d.getSeconds());
return `${yyyy}${mm}${dd}_${hh}${mi}${ss}`;
}
async function main() {
const args = parseArgs({
options: {
cases: { type: "string", short: "c", default: "cases.jsonl" },
provider: { type: "string", short: "p", default: "openai" },
streaming: { type: "boolean", short: "s", default: false },
maxTurns: { type: "string", short: "t", default: "10" },
n: { type: "string", short: "n" },
// No short alias here: "-s" is already taken by --streaming above
strict: { type: "boolean", default: false },
tries: { type: "string", short: "k", default: "1" },
},
});
const casesPathArg = args.values.cases;
const provider = args.values.provider as string;
const streaming = Boolean(args.values.streaming);
const maxTurns = Number(args.values.maxTurns ?? 10);
const nRaw = args.values.n as string | undefined;
const triesRaw = args.values.tries as string | undefined;
const tries = triesRaw != null ? Number(triesRaw) : 1;
const limit = nRaw != null ? Number(nRaw) : undefined;
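if (limit != null && (!Number.isInteger(limit) || limit <= 0)) {
console.error("--n must be a positive integer");
process.exitCode = 1;
return;
}
if (!Number.isInteger(tries) || tries <= 0) {
console.error("--tries must be a positive integer");
process.exitCode = 1;
return;
}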
if (!casesPathArg) {
console.error("--cases is required (path to JSONL file)");
process.exitCode = 1;
return;
}
const casesPath = path.isAbsolute(casesPathArg)
? casesPathArg
: path.join(process.cwd(), casesPathArg);
const timestamp = formatTimestamp(new Date());
const defaultFilename = `rollout_${provider}_${timestamp}.jsonl`;
const outputFile = path.join(process.cwd(), defaultFilename);
const analysisFile = path.join(
process.cwd(),
`analysis_${provider}_${timestamp}.json`
);
let fileContent: string;
try {
fileContent = await readFile(casesPath, "utf8");
} catch (err: any) {
console.error(
`Failed to read cases file at ${casesPath}: ${err?.message ?? err}`
);
process.exitCode = 1;
return;
}
const lines = fileContent
.split(/\r?\n/)
.map((l) => l.trim())
.filter((l) => l.length > 0);
const selectedLines =
typeof limit === "number" ? lines.slice(0, limit) : lines;
const out = createWriteStream(outputFile, { flags: "w", encoding: "utf8" });
const writeLine = (obj: any) =>
new Promise<void>((resolve, reject) => {
const str = JSON.stringify(obj) + "\n";
out.write(str, (err) => (err ? reject(err) : resolve()));
});
// Accumulators for post-run analysis
let skipped = 0; // invalid JSON lines
const caseResults: Array<{
run_id: string;
success: boolean;
provider: string;
test_case: number;
tool_name: string;
input: string;
result: RunCaseSummary;
}> = [];
async function processIndex(
i: number,
k: number,
task: ListrTaskWrapper<any, any, any>
) {
const line = selectedLines[i];
let caseObj: any;
try {
caseObj = JSON.parse(line);
} catch (err: any) {
console.error(
`Skipping invalid JSON on line ${i + 1}: ${err?.message ?? err}`
);
skipped++;
return;
}
try {
const summaries = await runCase(provider, caseObj, {
maxTurns,
streaming,
strict: args.values.strict,
});
for (const summary of summaries) {
const record = {
run_id: `${i}_${k}`,
success: summary.success,
provider,
test_case: i,
tool_name: caseObj.tool_name,
input: caseObj.input,
result: summary,
};
task.output = `Case ${i} (attempt ${k + 1}): ${
summary.success ? "Success" : "Failed"
} ${summary.toolCallingDetails.warning || ""}`;
caseResults.push(record);
await writeLine(record);
}
} catch (err: any) {
const record = {
provider,
test_case: i,
tool_name: caseObj?.tool_name,
input: caseObj?.input,
expected_arguments: caseObj?.expected_arguments,
instructions: caseObj?.instructions,
error: String(err?.message ?? err),
};
await writeLine(record);
task.output = `Case ${i} failed: ${err?.message ?? err}`;
}
}
const listr = new Listr<{
output: string;
}>(
selectedLines.flatMap((line, index) => {
return Array.from({ length: tries }, (_, attempt) => ({
title: `Processing case ${index} (attempt ${attempt + 1})`,
task: async (_, task) => {
await processIndex(index, attempt, task);
},
rendererOptions: { persistentOutput: true },
}));
}),
{
concurrent: 5,
}
);
await listr.run();
await new Promise((resolve) => out.end(resolve));
console.log(`Results written to ${outputFile}`);
const stats = analyze(caseResults, tries);
await writeFile(analysisFile, JSON.stringify(stats, null, 2), "utf8");
printAnalysis(
stats,
caseResults,
provider,
selectedLines,
tries,
skipped,
analysisFile
);
}
main().catch((err) => {
console.error(err);
process.exitCode = 1;
});
================================================
FILE: compatibility-test/package.json
================================================
{
"type": "module",
"dependencies": {
"@openai/agents": "^0.0.15",
"ajv": "^8.17.1",
"listr2": "^9.0.1"
},
"scripts": {
"start": "tsx index.ts"
}
}
================================================
FILE: compatibility-test/providers.ts
================================================
export const PROVIDERS = {
vllm: {
apiBaseUrl: "http://localhost:8000/v1",
apiKey: "vllm",
apiType: ["responses", "chat"], // choose from responses, chat, or both
modelName: "openai/gpt-oss-120b",
providerDetails: {
// add any provider-specific details here. These will be passed as part of every request
// for example to fix the provider for openrouter, you can do:
// provider: {
// only: ["example"],
// },
},
},
};
================================================
FILE: compatibility-test/runCase.ts
================================================
import {
Agent,
Runner,
OpenAIResponsesModel,
OpenAIChatCompletionsModel,
RunResult,
StreamedRunResult,
FunctionTool,
setTracingDisabled,
} from "@openai/agents";
import { Ajv } from "ajv";
import { OpenAI } from "openai";
import { PROVIDERS } from "./providers";
import { TOOLS_MAP } from "./tools";
setTracingDisabled(true);
const ajv = new Ajv();
export type Case = {
tool_name: string;
input: string;
expected_arguments: string;
instructions?: string;
};
// Summary shape for each apiType
export type RunCaseSummary = {
apiType: string;
success: boolean;
validResponse: boolean;
validEvents?: boolean;
details: Record<string, any>;
history: any[];
successToolCall: boolean;
toolCallingDetails: Record<string, any>;
};
export async function runCase(
provider: string,
caseData: Case,
{
maxTurns,
streaming,
strict,
}: { maxTurns: number; streaming: boolean; strict: boolean }
): Promise<RunCaseSummary[]> {
const config = PROVIDERS[provider];
if (!config) {
throw new Error(
`Provider ${provider} not found. Valid providers are: ${Object.keys(
PROVIDERS
).join(", ")}`
);
}
const agent = new Agent({
name: caseData.tool_name,
instructions: caseData.instructions,
tools: [TOOLS_MAP[caseData.tool_name]],
});
const client = new OpenAI({
apiKey: config.apiKey,
baseURL: config.apiBaseUrl,
});
const summaries: RunCaseSummary[] = [];
for (const apiType of config.apiType) {
const runner = new Runner({
model:
apiType === "responses"
? new OpenAIResponsesModel(client, config.modelName)
: new OpenAIChatCompletionsModel(client, config.modelName),
modelSettings: {
providerData: config.providerDetails ?? {},
},
});
let result: RunResult<any, any> | StreamedRunResult<any, any>;
let streamedEvents: any[] | undefined = undefined;
if (streaming) {
result = await runner.run(agent, caseData.input, {
stream: streaming,
maxTurns: maxTurns,
});
if (result instanceof StreamedRunResult) {
// Collect streaming events if applicable
streamedEvents = [];
for await (const event of result) {
if (event.type === "raw_model_stream_event") {
if (event.data.type === "model") {
streamedEvents.push(event.data.event);
}
}
}
await result.completed;
}
} else {
result = await runner.run(agent, caseData.input, {
maxTurns: maxTurns,
});
}
const { success: successToolCall, details: toolCallingDetails } =
testToolCall(apiType, caseData, result, strict);
const { validResponse, details } = testOutputData(
apiType,
result.rawResponses,
streaming
);
const { validEvents, details: eventsDetails } = streaming
? testEvents(apiType, streamedEvents)
: { validEvents: true, details: {} };
let success = successToolCall && validResponse;
if (streaming) {
success = success && validEvents;
}
const summary: RunCaseSummary = {
apiType,
success,
validResponse,
validEvents,
details: {
...details,
...eventsDetails,
},
history: result?.rawResponses.map((entry) => entry.providerData) ?? [],
successToolCall,
toolCallingDetails,
};
summaries.push(summary);
}
return summaries;
}
function testToolCall(apiType, caseData, result, strict) {
let details: Record<string, boolean | string> = {};
result.newItems.forEach((item) => {
// for this test for now we only care if the tool is called at least once
if (details.calledToolAtLeastOnce) {
return;
}
const isToolCall = item.type === "tool_call_item";
if (isToolCall) {
if (item.rawItem.type === "function_call") {
if (item.rawItem.name === caseData.tool_name) {
const validate = ajv.compile(
(TOOLS_MAP[caseData.tool_name] as FunctionTool).parameters
);
// Parse once and reuse the result for both the schema and the argument checks
const parsedArguments = JSON.parse(item.rawItem.arguments);
details.calledToolWithRightSchema = validate(parsedArguments);
details.calledToolAtLeastOnce = true;
if (details.calledToolWithRightSchema) {
const expectedArguments = JSON.parse(caseData.expected_arguments);
details.calledToolWithRightArguments = deepEqual(
parsedArguments,
expectedArguments
);
if (!details.calledToolWithRightArguments) {
// Schema-valid but different arguments only fail the case in strict mode
details.warning = `Tool call with wrong arguments but correct schema. Check logs for full details. Not failing this test. Parsed: ${JSON.stringify(
parsedArguments
)} Expected: ${JSON.stringify(expectedArguments)}`;
details.actualArguments = parsedArguments;
details.expectedArguments = expectedArguments;
}
}
}
}
}
});
return {
success:
!!details.calledToolAtLeastOnce &&
!!details.calledToolWithRightSchema &&
(!strict || !!details.calledToolWithRightArguments),
details,
};
}
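// A minimal sketch of the pass/fail rule that testToolCall returns, pulled out as
// a pure function so the effect of --strict is easy to see. The names
// ToolCallDetails and toolCallSucceeded are hypothetical and are not used
// elsewhere in this repo.
type ToolCallDetails = {
  calledToolAtLeastOnce?: boolean;
  calledToolWithRightSchema?: boolean;
  calledToolWithRightArguments?: boolean;
};

function toolCallSucceeded(details: ToolCallDetails, strict: boolean): boolean {
  // The tool must be called and its arguments must satisfy the JSON schema.
  const calledWithValidSchema =
    !!details.calledToolAtLeastOnce && !!details.calledToolWithRightSchema;
  // Only strict mode additionally requires an exact match with expected_arguments.
  return (
    calledWithValidSchema &&
    (!strict || !!details.calledToolWithRightArguments)
  );
}
// Usage: a schema-valid call with unexpected arguments passes by default and
// fails only when strict is true.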
function testEvents(apiType, events) {
// Ideally we would replay all events, reconstruct the final response from them,
// and compare it against the final response in the response.completed event.
// For now we only check that certain events are present.
let details: Record<string, boolean> = {};
let validEvents: boolean = false;
// runCase collects the unwrapped provider events (event.data.event). Accept the
// wrapped raw_model_stream_event shape as well so both inputs are handled.
const rawEvents = (events ?? []).map((event) =>
event?.type === "raw_model_stream_event" && event.data?.type === "model"
? event.data.event
: event
);
if (apiType === "chat") {
let hasReasoningDeltas = false;
for (const event of rawEvents) {
// Some chunks (for example a trailing usage chunk) may carry no choices
const reasoning = event?.choices?.[0]?.delta?.reasoning;
hasReasoningDeltas =
hasReasoningDeltas ||
(typeof reasoning === "string" && reasoning.length > 0);
}
details.hasReasoningDeltas = hasReasoningDeltas;
validEvents = hasReasoningDeltas;
}
if (apiType === "responses") {
let hasReasoningDeltaEvents = false;
let hasReasoningDoneEvents = false;
for (const event of rawEvents) {
if (event?.type === "response.reasoning_text.delta") {
hasReasoningDeltaEvents = true;
}
if (event?.type === "response.reasoning_text.done") {
hasReasoningDoneEvents = true;
}
}
details.hasReasoningDeltaEvents = hasReasoningDeltaEvents;
details.hasReasoningDoneEvents = hasReasoningDoneEvents;
validEvents =
details.hasReasoningDeltaEvents && details.hasReasoningDoneEvents;
}
return {
validEvents,
details,
};
}
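// A minimal sketch of the stricter check that the comment in testEvents describes:
// rebuild the reasoning text from the delta events and compare it with the text
// carried by the matching done event. Assumptions, not confirmed by this repo:
// delta events expose `delta`, done events expose `text`, and both carry
// `item_id`. The names ReasoningStreamEvent and reasoningDeltasMatchDone are
// hypothetical.
type ReasoningStreamEvent = {
  type: string;
  item_id?: string;
  delta?: string;
  text?: string;
};

function reasoningDeltasMatchDone(events: ReasoningStreamEvent[]): boolean {
  // Accumulated reasoning text, keyed by the output item it belongs to.
  const accumulated = new Map<string, string>();
  let sawDone = false;
  for (const event of events) {
    const id = event.item_id ?? "";
    if (event.type === "response.reasoning_text.delta") {
      accumulated.set(id, (accumulated.get(id) ?? "") + (event.delta ?? ""));
    } else if (event.type === "response.reasoning_text.done") {
      sawDone = true;
      // Every done event must agree with the deltas streamed before it.
      if ((accumulated.get(id) ?? "") !== (event.text ?? "")) return false;
    }
  }
  // A stream without any done event cannot be verified, so treat it as a failure.
  return sawDone;
}
// Usage: pass the unwrapped provider events collected while streaming, for
// example the streamedEvents array that runCase builds.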
function testOutputData(apiType, rawResponses, streaming) {
let details: Record<string, boolean> = {};
let validResponse: boolean = false;
if (apiType === "chat") {
for (const response of rawResponses) {
if (streaming && !response.providerData) {
// with Chat Completions we don't have a final response object that's native so we skip this test
return {
validResponse: true,
details: {
skippedBecauseStreaming: true,
},
};
}
// this is the actual HTTP response from the provider
// Since it's not guaranteed that every response has a reasoning field, we check if it's present
// at least once across all responses
const data = response.providerData;
const message = data.choices[0].message;
if (message.role === "assistant" && !message.refusal) {
details.hasReasoningField =
details.hasReasoningField ||
("reasoning" in message && typeof message.reasoning === "string");
details.hasReasoningContentField =
details.hasReasoningContentField ||
("reasoning_content" in message &&
typeof message.reasoning_content === "string");
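// Check the current message directly: an earlier response may have set
// hasReasoningField even if this message carries no reasoning string
validResponse =
validResponse ||
(typeof message.reasoning === "string" && message.reasoning.length > 0);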
}
}
} else if (apiType === "responses") {
// this is the actual HTTP response from the provider
const data = rawResponses[0].providerData;
for (const item of data.output) {
// Since it's not guaranteed that every response has a reasoning field, we check if it's present
// at least once across all responses
if (item.type === "reasoning") {
// Guard against providers that omit `content` so the check fails instead of throwing
const content = Array.isArray(item.content) ? item.content : [];
details.hasReasoningContentArray = Array.isArray(item.content);
details.hasReasoningContentArrayLength = content.length > 0;
details.hasReasoningContentArrayItemType = content.every(
(entry) => entry.type === "reasoning_text"
);
details.hasReasoningContentArrayItemText = content.every(
(entry) => typeof entry.text === "string" && entry.text.length > 0
);
validResponse =
details.hasReasoningContentArray &&
details.hasReasoningContentArrayLength &&
details.hasReasoningContentArrayItemType &&
details.hasReasoningContentArrayItemText;
}
}
}
return {
validResponse,
details,
};
}
function deepEqual(a: any, b: any): boolean {
if (a === b) return true;
if (typeof a !== typeof b) return false;
if (a && b && typeof a === "object") {
if (Array.isArray(a) !== Array.isArray(b)) return false;
if (Array.isArray(a)) {
if (a.length !== b.length) return false;
for (let i = 0; i < a.length; i++) {
if (!deepEqual(a[i], b[i])) return false;
}
return true;
} else {
const aKeys = Object.keys(a);
const bKeys = Object.keys(b);
if (aKeys.length !== bKeys.length) return false;
for (const key of aKeys) {
if (!Object.prototype.hasOwnProperty.call(b, key)) return false;
if (!deepEqual(a[key], b[key])) return false;
}
return true;
}
}
return false;
}
================================================
FILE: compatibility-test/tools.ts
================================================
import { Tool, tool } from "@openai/agents";
function convertToTool(toolData: any) {
return tool({
name: toolData.name,
description: toolData.description,
parameters: toolData.parameters,
execute: async (parameters) => {
return toolData.output;
},
strict: false,
});
}
export const TOOLS = [
{
type: "function",
name: "get_weather",
description: "Get the weather for a given location",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "The location to get the weather for",
},
},
required: ["location"],
additionalProperties: false,
},
output: '{"weather":"sunny"}',
},
{
type: "function",
name: "get_system_health",
description:
"Returns the current health status of the LLM runtime—use before critical operations to verify the service is live.",
parameters: { type: "object", properties: {} },
output: '{"status":"ok","uptime_seconds":372045}',
},
{
type: "function",
name: "markdown_to_html",
description:
"Converts a Markdown string to sanitized HTML—use when you need browser-renderable output.",
parameters: {
type: "object",
properties: {
markdown: { type: "string", description: "Raw Markdown content" },
},
required: ["markdown"],
additionalProperties: false,
},
output: '{"html":"<h1>Hello World</h1><p>This is <em>great</em>.</p>"}',
},
{
type: "function",
name: "detect_language",
description:
"Identifies the ISO language code of the supplied text—use for routing text to language-specific models.",
parameters: {
type: "object",
properties: {
text: {
type: "string",
description: "Text whose language should be detected",
},
},
required: ["text"],
additionalProperties: false,
},
output: '{"language":"de","confidence":0.98}',
},
{
type: "function",
name: "generate_chart",
description:
"Creates a base64-encoded PNG chart from tabular data—use for quick visualizations inside chat.",
parameters: {
type: "object",
properties: {
data: {
type: "array",
items: { type: "array", items: { type: "number" } },
description: "2-D numeric data matrix",
},
chart_type: {
type: "string",
enum: ["line", "bar", "scatter"],
description: "Type of chart to generate",
},
title: {
type: "string",
description: "Chart title",
default: "",
},
x_label: {
type: "string",
description: "Label for the x-axis",
default: "",
},
y_label: {
type: "string",
description: "Label for the y-axis",
default: "",
},
},
required: ["data", "chart_type"],
additionalProperties: false,
},
output: '{"image_png_base64":"iVBORw0KGgoAAAANSUhEUgAA..."}',
},
{
type: "function",
name: "query_database",
description:
"Runs a parameterized SQL SELECT on the internal analytics DB—use for lightweight data look-ups.",
parameters: {
type: "object",
properties: {
table: { type: "string", description: "Table name to query" },
columns: {
type: "array",
items: { type: "string" },
description: "Columns to return",
},
filters: {
type: "string",
description: "SQL WHERE clause without the word WHERE",
default: "",
},
limit: {
type: "integer",
minimum: 1,
maximum: 10000,
description: "Max rows to return",
default: 100,
},
order_by: {
type: "string",
description: "Column to order by (optional)",
default: "",
},
},
required: ["table", "columns"],
additionalProperties: false,
},
output:
'{"rows":[{"id":1,"email":"user@example.com"},{"id":2,"email":"foo@bar.com"}],"row_count":2}',
},
];
export const TOOLS_MAP = TOOLS.reduce((acc, tool) => {
acc[tool.name] = convertToTool(tool);
return acc;
}, {} as Record<string, Tool>);
================================================
FILE: examples/agents-sdk-js/index.ts
================================================
import { OpenAI } from "openai";
import {
Agent,
run,
setDefaultOpenAIClient,
setOpenAIAPI,
setTracingDisabled,
tool,
MCPServerStdio,
} from "@openai/agents";
import { z } from "zod";
import path from "node:path";
import process from "node:process";
import { styleText } from "node:util";
import { createInterface } from "node:readline/promises";
async function prompt(question: string) {
const rl = createInterface({
input: process.stdin,
output: process.stdout,
});
const answer = await rl.question(question);
rl.close();
return answer;
}
const openai = new OpenAI({
apiKey: "local",
baseURL: "http://localhost:11434/v1",
});
const samplesDir = path.join(process.cwd());
const mcpServer = new MCPServerStdio({
name: "Filesystem MCP Server, via npx",
fullCommand: `npx -y @modelcontextprotocol/server-filesystem ${samplesDir}`,
});
await mcpServer.connect();
setTracingDisabled(true);
setDefaultOpenAIClient(openai);
setOpenAIAPI("chat_completions");
const searchTool = tool({
name: "get_current_weather",
description: "Get the current weather in a given location",
parameters: z.object({
location: z.string(),
}),
execute: async ({ location }) => {
return `The weather in ${location} is sunny.`;
},
});
const agent = new Agent({
name: "My Agent",
instructions: "You are a helpful assistant.",
tools: [searchTool],
model: "gpt-oss:20b-test",
mcpServers: [mcpServer],
});
const input = await prompt("> ");
const result = await run(agent, input, {
stream: true,
});
for await (const event of result) {
if (event.type === "raw_model_stream_event" && event.data.type === "model") {
// Some chunks (for example a trailing usage chunk) may not carry any choices
const delta = event.data.event.choices?.[0]?.delta;
if (delta?.content) {
process.stdout.write(delta.content);
} else if (delta?.reasoning) {
process.stdout.write(delta.reasoning);
}
} else if (
event.type === "run_item_stream_event" &&
event.item.type === "tool_call_item" &&
event.item.rawItem.type === "function_call"
) {
console.log(
`\nCalling ${event.item.rawItem.name} with: ${event.item.rawItem.arguments}`
);
}
}
console.log("\n");
await result.completed;
await mcpServer.close();
================================================
FILE: examples/agents-sdk-js/package.json
================================================
{
"type": "module",
"name": "agents-sdk",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"start": "tsx index.ts",
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"dependencies": {
"@openai/agents": "^0.0.14",
"tsx": "^4.20.3",
"typescript": "^5.8.3",
"zod": "^3.25.67"
}
}
================================================
FILE: examples/agents-sdk-python/example.py
================================================
import asyncio
from pathlib import Path
import shutil
from openai import AsyncOpenAI
from agents import (
    Agent,
    ItemHelpers,
    Runner,
    set_default_openai_api,
    set_default_openai_client,
    set_tracing_disabled,
    function_tool,
)
from agents.mcp import MCPServerStdio
async def prompt_user(question: str) -> str:
    """Async input prompt function"""
    # Run the blocking input() call in a worker thread so the event loop stays free
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, input, question)
async def main():
    # Set up OpenAI client for local server (e.g., Ollama)
    openai_client = AsyncOpenAI(
        api_key="local",
        base_url="http://localhost:11434/v1",
    )
    # Get current working directory
    samples_dir = str(Path.cwd())
    # Create MCP server for filesystem operations
    mcp_server = MCPServerStdio(
        name="Filesystem MCP Server, via npx",
        params={
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                samples_dir,
            ],
        },
    )
    # Connect to MCP server
    await mcp_server.connect()
    try:
        # Configure agents SDK
        set_tracing_disabled(True)
        set_default_openai_client(openai_client)
        set_default_openai_api("chat_completions")
        # Define weather tool
        @function_tool
        async def get_weather(location: str) -> str:
            return f"The weather in {location} is sunny."
        # Create agent
        agent = Agent(
            name="My Agent",
            instructions="You are a helpful assistant.",
            tools=[get_weather],
            model="gpt-oss:20b-test",
            mcp_servers=[mcp_server],
        )
        # Get user input
        user_input = await prompt_user("> ")
        # Run agent with streaming
        result = Runner.run_streamed(agent, user_input)
        # Process streaming results
        async for event in result.stream_events():
            if event.type == "raw_response_event":
                continue
            elif event.type == "agent_updated_stream_event":
                print(f"Agent updated: {event.new_agent.name}")
            elif event.type == "run_item_stream_event":
                if event.item.type == "tool_call_item":
                    print("-- Tool was called")
                elif event.item.type == "tool_call_output_item":
                    print(f"-- Tool output: {event.item.output}")
                elif event.item.type == "message_output_item":
                    print(
                        f"-- Message output:\n {ItemHelpers.text_message_output(event.item)}"
                    )
                else:
                    pass
        print("=== Run complete ===")
    finally:
        # Shut down the MCP server subprocess, mirroring mcpServer.close() in the JS example
        await mcp_server.cleanup()
if __name__ == "__main__":
    if not shutil.which("npx"):
        raise RuntimeError(
            "npx is not installed. Please install it with `npm install -g npx`."
        )
    asyncio.run(main())
================================================
FILE: examples/agents-sdk-python/pyproject.toml
================================================
[project]
name = "agents-sdk-python"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
"openai-agents>=0.2.4",
]
================================================
FILE: examples/gradio/gradio_chat.py
================================================
import json
import requests
import gradio as gr
DEFAULT_FUNCTION_PROPERTIES = """
{
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
        }
    },
    "required": ["location"]
}
""".strip()
def chat_with_model(message, history, model_choice, instructions, effort, use_functions,
                    function_name, function_description, function_parameters,
                    use_browser_search, temperature, max_output_tokens, debug_mode):
    # This function is a generator, so every UI update must be yielded;
    # a value passed to `return` inside a generator never reaches Gradio.
    if not message.strip():
        yield history, ""
        return
    # Append user message and empty assistant placeholder (idiomatic Gradio pattern)
    history = history + [[message, ""]]
    # Build messages list from history (excluding the empty assistant placeholder)
    messages = []
    # Convert history to messages format (excluding the last empty assistant message)
    for user_msg, assistant_msg in history[:-1]:
        if user_msg:
            messages.append({
                "type": "message",
                "role": "user",
                "content": [{"type": "input_text", "text": user_msg}]
            })
        if assistant_msg:
            messages.append({
                "type": "message",
                "role": "assistant",
                "content": [{"type": "output_text", "text": assistant_msg}]
            })
    # Add current user message
    messages.append({
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": message}]
    })
    # Prepare tools
    tools = []
    if use_functions:
        try:
            tools.append({
                "type": "function",
                "name": function_name,
                "description": function_description,
                "parameters": json.loads(function_parameters),
            })
        except json.JSONDecodeError:
            # Invalid JSON in the parameters box: send the request without the function tool
            pass
    if use_browser_search:
        tools.append({"type": "browser_search"})
    # Get URL based on model (matching streamlit logic)
    options = ["large", "small"]
    URL = ("http://localhost:8081/v1/responses" if model_choice == options[1]
           else "http://localhost:8000/v1/responses")
    try:
        response = requests.post(
            URL,
            json={
                "input": messages,
                "stream": True,
                "instructions": instructions,
                "reasoning": {"effort": effort},
                "metadata": {"__debug": debug_mode},
                "tools": tools,
                "temperature": temperature,
                "max_output_tokens": max_output_tokens,
            },
            stream=True,
        )
        full_content = ""
        text_delta = ""
        current_output_index = 0
        in_reasoning = False
        # Parse the server-sent event stream line by line
        for line in response.iter_lines(decode_unicode=True):
            if not line or not line.startswith("data:"):
                continue
            data_str = line[len("data:"):].strip()
            if not data_str:
                continue
            try:
                data = json.loads(data_str)
            except Exception:
                continue
            event_type = data.get("type", "")
            output_index = data.get("output_index", 0)
            if event_type == "response.output_item.added":
                current_output_index = output_index
                output_type = data.get("item", {}).get("type", "message")
                text_delta = ""
                if output_type == "reasoning":
                    if not in_reasoning:
                        full_content += "🤔 **Thinking...**\n"
                        in_reasoning = True
                elif output_type == "message":
                    if in_reasoning:
                        full_content += "\n\n"
                        in_reasoning = False
            elif event_type == "response.reasoning_text.delta":
                delta = data.get("delta", "")
                full_content += delta
                # Update last assistant message (idiomatic Gradio pattern)
                history[-1][1] = full_content
                yield history, ""
            elif event_type == "response.output_text.delta":
                delta = data.get("delta", "")
                full_content += delta
                # Update last assistant message (idiomatic Gradio pattern)
                history[-1][1] = full_content
                yield history, ""
            elif event_type == "response.output_item.done":
                item = data.get("item", {})
                if item.get("type") == "function_call":
                    function_call_text = f"\n\n🔨 Called `{item.get('name')}`\n**Arguments**\n```json\n{item.get('arguments', '')}\n```"
                    full_content += function_call_text
                    # Update last assistant message (idiomatic Gradio pattern)
                    history[-1][1] = full_content
                    yield history, ""
                elif item.get("type") == "web_search_call":
                    web_search_text = f"\n\n🌐 **Web Search**\n```json\n{json.dumps(item.get('action', {}), indent=2)}\n```\n✅ Done"
                    full_content += web_search_text
                    # Update last assistant message (idiomatic Gradio pattern)
                    history[-1][1] = full_content
                    yield history, ""
            elif event_type == "response.completed":
                response_data = data.get("response", {})
                if debug_mode:
                    debug_info = response_data.get("metadata", {}).get("__debug", "")
                    if debug_info:
                        full_content += f"\n\n**Debug**\n```\n{debug_info}\n```"
                        # Update last assistant message (idiomatic Gradio pattern)
                        history[-1][1] = full_content
                        yield history, ""
                break
        # Yield final history and empty string to clear textbox
        yield history, ""
    except Exception as e:
        error_message = f"❌ Error: {str(e)}"
        history[-1][1] = error_message
        # Yield (not return) so the error actually shows up in the chat window
        yield history, ""
# Create the Gradio interface
with gr.Blocks(title="💬 Chatbot") as demo:
    gr.Markdown("# 💬 Chatbot")
    with gr.Row():
        with gr.Column(scale=3):
            chatbot = gr.Chatbot(height=500)
            with gr.Row():
                msg = gr.Textbox(placeholder="Type a message...", scale=4, show_label=False)
                send_btn = gr.Button("Send", scale=1)
            clear_btn = gr.Button("Clear Chat")
        with gr.Column(scale=1):
            model_choice = gr.Radio(["large", "small"], value="small", label="Model")
            instructions = gr.Textbox(
                label="Instructions",
                value="You are a helpful assistant that can answer questions and help with tasks.",
                lines=3
            )
            effort = gr.Radio(["low", "medium", "high"], value="medium", label="Reasoning effort")
            gr.Markdown("#### Functions")
            use_functions = gr.Checkbox(label="Use functions", value=False)
            with gr.Column(visible=False) as function_group:
                function_name = gr.Textbox(label="Function name", value="get_weather")
                function_description = gr.Textbox(
                    label="Function description",
                    value="Get the weather for a given city"
                )
                function_parameters = gr.Textbox(
                    label="Function parameters",
                    value=DEFAULT_FUNCTION_PROPERTIES,
                    lines=6
                )
            # Conditional browser search (matching Streamlit logic)
            # In Streamlit: if "show_browser" in st.query_params:
            # For Gradio, we'll always show it (simplified)
            gr.Markdown("#### Built-in Tools")
            use_browser_search = gr.Checkbox(label="Use browser search", value=False)
            temperature = gr.Slider(0.0, 1.0, value=1.0, step=0.01, label="Temperature")
            max_output_tokens = gr.Slider(1000, 20000, value=1024, step=100, label="Max output tokens")
            debug_mode = gr.Checkbox(label="Debug mode", value=False)
    # Event handlers
    def toggle_function_group(use_funcs):
        return gr.update(visible=use_funcs)
    use_functions.change(toggle_function_group, use_functions, function_group)
    # Chat functionality
    inputs = [msg, chatbot, model_choice, instructions, effort, use_functions,
              function_name, function_description, function_parameters,
              use_browser_search, temperature, max_output_tokens, debug_mode]
    msg.submit(chat_with_model, inputs, [chatbot, msg])
    send_btn.click(chat_with_model, inputs, [chatbot, msg])
    clear_btn.click(lambda: [], outputs=chatbot)
if __name__ == "__main__":
    demo.launch()
================================================
FILE: examples/reinforcement-fine-tuning.ipynb
================================================
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/openai/gpt-oss/blob/main/examples/reinforcement-fine-tuning.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Free Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hzPgFeIkZn9q"
},
"source": [
"# Make gpt-oss play games with Reinforcement Learning\n",
"\n",
"This notebook demonstrates how to make `gpt-oss` play the 2048 game autonomously by using reinforcement learning (RL).\n",
"\n",
"We will train `gpt-oss-20b` using [Unsloth](https://github.com/unslothai/unsloth) to develop a strategy for playing 2048. The strategy will run until the game ends, and the model will be rewarded or penalized based on whether it wins or loses.\n",
"\n",
"<img src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/2048_win.png/500px-2048_win.png\" width=300 />"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "31KIMLJLnHET"
},
"source": [
"# Installation\n",
"To run `gpt-oss-20b` RL on a free Google Colab instance, we’ll use the GRPO algorithm along with [Unsloth](https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning), an open-source tool that reduces VRAM usage and speeds up training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "CGoDZwcunHEU"
},
"outputs": [],
"source": [
"%%capture\n",
"!pip install --upgrade -qqq uv\n",
"try: import numpy; get_numpy = f\"numpy=={numpy.__version__}\"\n",
"except: get_numpy = \"numpy\"\n",
"!uv pip install -qqq \\\n",
" \"torch>=2.8.0\" \"triton>=3.4.0\" {get_numpy} torchvision bitsandbytes \"transformers==4.56.2\" \\\n",
" \"unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo\" \\\n",
" \"unsloth[base] @ git+https://github.com/unslothai/unsloth\" \\\n",
" git+https://github.com/triton-lang/triton.git@05b2c186c1b6c9a08375389d5efe9cb4c401c075#subdirectory=python/triton_kernels\n",
"!uv pip install --upgrade --no-deps transformers==4.56.2 tokenizers\n",
"!uv pip install --no-deps trl==0.22.2"
]
},
{
"cell_type": "markdown",
"source": [
"We'll load gpt-oss-20b and set some parameters:\n",
"* `max_seq_length = 768` The maximum context length of the model. Increasing it uses more memory; 768 was the largest value we found to fit on a free 15GB Tesla T4 machine\n",
"* `lora_rank = 4` The LoRA rank. A larger rank gives the RL process more capacity to learn, but training is slower and uses more memory\n",
"* `load_in_4bit = True` Uses 4-bit quantization to reduce memory usage by 75% with only a small loss in accuracy. Loading in 16-bit (`load_in_4bit = False`) will be faster but needs an 80GB GPU (H100, B200)\n",
"* `offload_embedding = True` Unsloth optimization which moves the embedding to CPU RAM, reducing VRAM usage by 1GB"
],
"metadata": {
"id": "CcLYwLyQLADE"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 575,
"referenced_widgets": [
"abe2b0a2913d4633943f44333ae799f8",
"2c40c6b846924200b29616a590af1672",
"749e8407a901483c8b513a2fb71596c8",
"7baca79d720c40b5a923b9717e28c982",
"68ea891644ca4753a8e1bf278ff47e84",
"06ab9eaa6f0f48c4b68cff1ca4b9f2fa",
"d98c2b1e979b4929891a8ee0c11f55df",
"ef01b874478b4bb497d31d2f8dd6145a",
"d50ea8cded9848ffa18be1ae6a2559df",
"ffabf89ecd9d48a5a3fc2a1c855ce080",
"614c5332c7d045109102a329e7f69dfd",
"caf742160db041a1b6c2cfdf78f2dc9a",
"34a9e38b0b454a69a067d1ddadec7626",
"263b7dc0b3fd465fac89b9266b19d526",
"5b7af68130f04a63ad3efa3d9f602ebe",
"2a6aa92676c74509b58373ca604c5b3b",
"9c4d6839934b4b13952a850d2084d498",
"c6a1decbc0e7421db622033214913cb9",
"147743757c804b85af2ef194f5f84e6a",
"2820e352ab004e818949acc31eb3888d",
"80fa3aef5e2040d9904c6b87b7214ca0",
"0f99489932aa409b94ba34764aff19b0",
"6ab4e5676ad84807a126fffa99f7a0d4",
"e61ef80398444c13bf7cd20ef21a5057",
"5ebe7b4e4ed24c53b783ee46377c682d",
"e0fdef0087bc4a91a11932a2d933c001",
"596c2a62a635469eb74233ce00586a6f",
"da4324e287e64e5ba98fc110693066df",
"8c7c6bb04a3f4a1494b34529f95a195c",
"51aaa109480d4ae6bd419aea689d22ee",
"acf4e50a248342f68d26daef21baa419",
"7d3379cbd27a4218a9d84c5a12f3bb88",
"7841bc90b6a74120ab3e603c76332a01",
"3f9b801b52da4eb79f730d87bea5c338",
"b66c6ded549d4db8a2e5ea8e5016615c",
"43da5073c3ad4e98a3ade17a0bb3b93d",
"40365e2c9fef49148e4c93592d458afc",
"7e9d5212fc7844f286e14b70cbf0bc7a",
"77d34c0f1de548b4872208a063bb5017",
"bf96e8666c224c26b0a01451d08e907a",
"4513a73fa95b41b5b6edadc9143ba9c1",
"792d75a7d18945e7972826ac5b2ac386",
"2a6f43b64d164636a2d9708f0190f21b",
"65c62d2198e64ee4a9e6547c2733135a",
"219ca32ab51e4b4385b2c1026a78503a",
"6c2ccfe3363b40b58fc26ea164d4ead4",
"07f0420c4dfa477caccd7ae96551c2e4",
"1c96edb2f7c948b9968b1239982af942",
"d93be4994f104b6e99d89a9e73cd6abd",
"4da21f53bf7f4e2d8132eb43e6ecc739",
"735f70fac43449e3974de1b783d56d33",
"ad75f887a140416abfca615b2fc3c385",
"dee02a37a6f44f168546ee0077dc20d1",
"ee23056662ad4b719b65005d776e0e72",
"87765ca0996b403dbe29deef48d548bf",
"8db5e86577744ff1a39c8e198eee5dd3",
"4b9b3fe8dc764eedb9e18f166fe2f548",
"cca95e973bc445d3811335debf7c446e",
"e507a46b4c754d9a8aede2aac0d203bc",
"751a46fbb8e24efabfb381a85c90fbe8",
"87a808c4d4f54f719adcd29de7206e1b",
"5f0b2a0e1953406b88af2c884904e2da",
"2fa84865e9f14c1491402ef81517b4bd",
"245590db7d374515a428ff4abbd25588",
"e2973e6c02834a7c9f2f6ce5755f35f0",
"48741bbdeccb459aa4eea9c61339764b",
"1183d3f2ad3c4fb0af1d925b5f9e3efe",
"9cc51d8029eb4217bc37daa918649692",
"41f13d2f023e405180689e03bc2c32a1",
"247484c0bf5945bcb4627b48928366c8",
"14c0f20a9ab341ee966fe77815099ff0",
"a219f3b89a34443abe612846676f9356",
"152d7bf2a74f400db3d3ecaa719ef8d1",
"36676899a61f4be4b631f6271f6ecec9",
"77ecad9f150c430fa85f5833d97c42df",
"cef064f1c55f41bf957fc4623260fdb4",
"37cbe8800af04a42a0355922969b6393",
"f8dacdab001d4db0b6b3776ac7d3634a",
"5a59fb5f7acf4213847c985e66c9ee3c",
"ae6d42fb84fc4984af1d4430acdcd3c9",
"02d120e49f2c4f95a6090b1d8d521767",
"8f1e6c36b84c4115a671dcb9ade41c8b",
"81a728910a2341a785a6f252bbb371f7",
"69a8d50f11244ba688c183d14d2395ec",
"350f29f737534bfba4258bc31ec274a2",
"9beac0680e3049dfafcb6ec185fd2265",
"dbf5ed93dac646ed979fa7a8c569dfe3",
"4db5ee5b7b674abba75fbce264e6dfa3",
"0c0c96eeac664f339aa4511bf47087e2",
"18451e19df5449b1853b5e13dacd19c5",
"d864d29d02c54ecfaedd7b866a6df8c2",
"7875163297284832a35aca84cbb105ce",
"d42d8228ea1247a1a81bb99b18c4640c",
"bcda4c9a48e943a6a0ef812fcd64a6db",
"61e491b843c347b6b2a9948de7caf01d",
"dee07d33b8de4c3b847fcff670e68102",
"b07acf871a0a46f1889bfb439d13752b",
"ba94310dc12a4a258205b14901ad3f94",
"a93210a691414502ba3c2dff03ffb4ce",
"fd2fe9ef6da64f72ab29d481d1739f5e",
"dbfeea8ee2374b8c8fa70431c35f281f",
"84d27c45065e426badbfcfcdc8ff16b6",
"fa9ea0d3234e41689c827485d0360885",
"4cb119127b404f46a53012c62d004e28",
"d9020a2a2c8440db81d2cfdf0289b667",
"04d39c4dda9f4a1bb01b8d6320032372",
"4d67b10ec7794170addb4e968e20f170",
"55ac5c2a82ee48fe988e1e4f26c168b0",
"9a079a30b4ae4bbc80122faf83e0ad59",
"acda8e7582934fecbbf854e66e23f698",
"4fbc4cfe529d471ba85f3ae8e53b28d6",
"a0d0fedc5bec4f5b943fddf9a954fbdf",
"cab602573c6940919f93e59fe6f4838d",
"51b8f4ce40f94ac39cf44d98f1522ec7",
"32d6af64f2464cfb965671f2692b4e15",
"e1e77d98b01f4376a6c075975c27571e",
"6a47e60b10a6481b94aee021c8dbc7ba",
"5657a84bf4b74710b2de1a54f9236e39",
"7bd5d1beeb0e49e293d9f6b91bb6d7fb",
"60ceb890b5644493a8886d91b9dac461",
"40138ff29073407abb95f793509fc320",
"0ac4d8e674804ad6bdc5f2d62f2e0d33",
"7bfcd9acf29646db8b6123708d1ffe27",
"5e88d6515f16475fb72d7c153422b591",
"5e5b77dd649547f896ab306fccc94a4e",
"a843fa23e6c94fb486bff8764574fdc5",
"fd0ac7ed3d3146ec85913f4e05c4a2f6",
"77204d81ff8f4ee585361a503fa647dc",
"923653dfe90e475a9efa44baf98ba9a0",
"62600092f8cc43f493b86b0169f67be1",
"59e46bbe96df4b88ad31c09096ce0e0a",
"8f5c7b88a2cc4b5abb0814c814833349"
]
},
"id": "DkIvEkIIkEyB",
"outputId": "2f85e1d0-8810-4b41-b683-0c33578d991c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n",
"🦥 Unsloth Zoo will now patch everything to make training faster!\n",
"==((====))== Unsloth 2025.10.1: Fast Gpt_Oss patching. Transformers: 4.56.2.\n",
" \\\\ /| Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.\n",
"O^O/ \\_/ \\ Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0\n",
"\\ / Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]\n",
" \"-____-\" Free license: http://github.com/unslothai/unsloth\n",
"Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n",
"Unsloth: Using float16 precision for gpt_oss won't work! Using float32.\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "abe2b0a2913d4633943f44333ae799f8",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model.safetensors.index.json: 0.00B [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "caf742160db041a1b6c2cfdf78f2dc9a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Fetching 4 files: 0%| | 0/4 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "6ab4e5676ad84807a126fffa99f7a0d4",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00001-of-00004.safetensors: 0%| | 0.00/4.00G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3f9b801b52da4eb79f730d87bea5c338",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00004-of-00004.safetensors: 0%| | 0.00/1.16G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "219ca32ab51e4b4385b2c1026a78503a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00002-of-00004.safetensors: 0%| | 0.00/4.00G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "8db5e86577744ff1a39c8e198eee5dd3",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"model-00003-of-00004.safetensors: 0%| | 0.00/3.37G [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "1183d3f2ad3c4fb0af1d925b5f9e3efe",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "f8dacdab001d4db0b6b3776ac7d3634a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"generation_config.json: 0%| | 0.00/165 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unsloth: Offloading embeddings to RAM to save 1.08 GB.\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0c0c96eeac664f339aa4511bf47087e2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"tokenizer_config.json: 0.00B [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "fd2fe9ef6da64f72ab29d481d1739f5e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"tokenizer.json: 0%| | 0.00/27.9M [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "4fbc4cfe529d471ba85f3ae8e53b28d6",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"special_tokens_map.json: 0%| | 0.00/446 [00:00<?, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0ac4d8e674804ad6bdc5f2d62f2e0d33",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"chat_template.jinja: 0.00B [00:00, ?B/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from unsloth import FastLanguageModel\n",
"import torch\n",
"max_seq_length = 768 # Can increase for longer RL output\n",
"lora_rank = 4 # Larger rank = smarter, but slower\n",
"model, tokenizer = FastLanguageModel.from_pretrained(\n",
" model_name = \"unsloth/gpt-oss-20b\", # unsloth/gpt-oss-20b-BF16 for H100s\n",
" max_seq_length = max_seq_length,\n",
" load_in_4bit = True, # False for LoRA 16bit. Choose False on H100s\n",
" offload_embedding = True, # Reduces VRAM by 1GB\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TfeUs-lQJDSq"
},
"source": [
"To do efficient RL, we will use LoRA, which adds only 1 to 5% of extra weights to the model for fine-tuning. This cuts memory usage by 60% while retaining most of the accuracy. Read Unsloth's [gpt-oss RL Guide](https://docs.unsloth.ai/new/gpt-oss-reinforcement-learning) for more details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8rGa-o3HJCo1",
"outputId": "6dc27dbf-0c60-4996-8e97-932aab7c14fb"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unsloth: Making `model.base_model.model.model` require gradients\n"
]
}
],
"source": [
"model = FastLanguageModel.get_peft_model(\n",
" model,\n",
" r = lora_rank, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n",
" target_modules = [\n",
" \"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n",
" \"gate_proj\", \"up_proj\", \"down_proj\",\n",
" ],\n",
" lora_alpha = lora_rank*2, # *2 speeds up training\n",
" use_gradient_checkpointing = \"unsloth\", # Reduces memory usage\n",
" random_state = 3407,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "N0QnO9_YJBOI"
},
"source": [
"# 2048 game\n",
"\n",
"We used GPT-5 to create a variant of the 2048 game. It should output the current game board state, and allow us to advance the game board state with 1 action (up, down, left, right)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"cellView": "form",
"id": "D9CI4jtgL5mw"
},
"outputs": [],
"source": [
"#@title (Collapsible) 2048 Game Implementation\n",
"from dataclasses import dataclass, field\n",
"from typing import List, Tuple, Optional\n",
"import random\n",
"import copy\n",
"\n",
"def _compress_and_merge_row_left(row: List[int]) -> Tuple[List[int], int, bool]:\n",
"    n = len(row)\n",
"    tiles = [x for x in row if x != 0]\n",
"    gained = 0\n",
"    i = 0\n",
"    merged = []\n",
"    while i < len(tiles):\n",
"        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:\n",
"            v = tiles[i] * 2\n",
"            gained += v\n",
"            merged.append(v)\n",
"            i += 2\n",
"        else:\n",
"            merged.append(tiles[i])\n",
"            i += 1\n",
"    merged += [0] * (n - len(merged))\n",
"    changed = merged != row\n",
"    return merged, gained, changed\n",
"\n",
"def _move_left(board: List[List[int]]) -> Tuple[List[List[int]], int, bool]:\n",
"    changed_any = False\n",
"    total_gain = 0\n",
"    new_board = []\n",
"    for row in board:\n",
"        new_row, gained, changed = _compress_and_merge_row_left(row)\n",
"        new_board.append(new_row)\n",
"        total_gain += gained\n",
"        changed_any = changed_any or changed\n",
"    return new_board, total_gain, changed_any\n",
"\n",
"def _move_right(board: List[List[int]]) -> Tuple[List[List[int]], int, bool]:\n",
"    changed_any = False\n",
"    total_gain = 0\n",
"    new_board = []\n",
"    for row in board:\n",
"        rev = list(reversed(row))\n",
"        new_rev, gained, changed = _compress_and_merge_row_left(rev)\n",
"        new_row = list(reversed(new_rev))\n",
"        new_board.append(new_row)\n",
"        total_gain += gained\n",
"        changed_any = changed_any or changed\n",
"    return new_board, total_gain, changed_any\n",
"\n",
"def _transpose(board: List[List[int]]) -> List[List[int]]:\n",
"    return [list(row) for row in zip(*board)]\n",
"\n",
"def _move_up(board: List[List[int]]) -> Tuple[List[List[int]], int, bool]:\n",
"    t = _transpose(board)\n",
"    moved, gain, changed = _move_left(t)\n",
"    return _transpose(moved), gain, changed\n",
"\n",
"def _move_down(board: List[List[int]]) -> Tuple[List[List[int]], int, bool]:\n",
"    t = _transpose(board)\n",
"    moved, gain, changed = _move_right(t)\n",
"    return _transpose(moved), gain, changed\n",
"\n",
"def _empty_cells(board: List[List[int]]) -> List[Tuple[int, int]]:\n",
"    size = len(board)\n",
"    return [(r, c) for r in range(size) for c in range(size) if board[r][c] == 0]\n",
"\n",
"def _can_move(board: List[List[int]]) -> bool:\n",
"    if _empty_cells(board):\n",
"        return True\n",
"    size = len(board)\n",
"    for r in range(size):\n",
"        for c in range(size - 1):\n",
"            if board[r][c] == board[r][c + 1]:\n",
"                return True\n",
"    for r in range(size - 1):\n",
"        for c in range(size):\n",
"            if board[r][c] == board[r + 1][c]:\n",
"                return True\n",
"    return False\n",
"\n",
"@dataclass\n",
"class GameBoard:\n",
"    size: int\n",
"    seed: Optional[int] = None\n",
"    target: int = 2048\n",
"    probability_fours: float = 0.10  # originally spawns (4) 10% of the time!\n",
"    _rng: random.Random = field(init=False, repr=False)\n",
"    _board: List[List[int]] = field(init=False, repr=False)\n",
"    _score: int = field(default=0, init=False, repr=False)\n",
"    _state: str = field(default=\"ongoing\", init=False, repr=False)\n",
"\n",
"    def __post_init__(self):\n",
"        if self.size < 2:\n",
"            raise ValueError(\"Board size must be at least 2.\")\n",
"        self._rng = random.Random(self.seed)\n",
"        self._board = [[0 for _ in range(self.size)] for _ in range(self.size)]\n",
"        self._add_random_tile()\n",
"        self._add_random_tile()\n",
"        self._update_state_after_change()\n",
"\n",
"    class _BoardView:\n",
"        def __init__(self, game: \"GameBoard\"):\n",
"            self._game = game\n",
"        def __iter__(self):\n",
"            return iter(self._game._board)\n",
"        def __len__(self):\n",
"            return len(self._game._board)\n",
"        def __getitem__(self, idx):\n",
"            return self._game._board[idx]\n",
"        def __repr__(self) -> str:\n",
"            return repr(self._game._board)\n",
"        __str__ = __repr__\n",
"        def do_action(self, key: str) -> None:\n",
"            self._game.do_action(key)\n",
"        def state(self) -> str:\n",
"            return self._game.state()\n",
"        def pretty(self, colors: bool = True, border: bool = True, dot_for_zero: bool = True) -> str:\n",
"            return self._game._render_pretty(colors=colors, border=border, dot_for_zero=dot_for_zero)\n",
"\n",
"    def board(self) -> \"_BoardView\":\n",
"        return GameBoard._BoardView(self)\n",
"    def state(self) -> str:\n",
"        return self._state\n",
"    def score(self) -> int:\n",
"        return self._score\n",
"    def do_action(self, key: str) -> None:\n",
"        if self._state != \"ongoing\":\n",
"            return\n",
"        if not isinstance(key, str) or len(key) == 0:\n",
"            self._state = \"failed\"\n",
"            return\n",
"        k = key.strip().lower()\n",
"        if k == \"q\":\n",
"            self._state = \"failed\"\n",
"            return\n",
"        move_map = {\"a\": _move_left, \"d\": _move_right, \"w\": _move_up, \"s\": _move_down}\n",
"        if k not in move_map:\n",
"            self._state = \"failed\"\n",
"            return\n",
"        mover = move_map[k]\n",
"        new_board, gain, changed = mover(self._board)\n",
"        if changed:\n",
"            self._board = new_board\n",
"            self._score += gain\n",
"            self._add_random_tile()\n",
"        self._update_state_after_change()\n",
"    def _add_random_tile(self) -> bool:\n",
"        empties = _empty_cells(self._board)\n",
"        if not empties:\n",
"            return False\n",
"        r, c = self._rng.choice(empties)\n",
"        self._board[r][c] = 4 if self._rng.random() < self.probability_fours else 2\n",
"        return True\n",
"    def _update_state_after_change(self) -> None:\n",
"        if any(self.target in row for row in self._board):\n",
"            self._state = \"success\"\n",
"            return\n",
"        if not _can_move(self._board):\n",
"            self._state = \"failed\"\n",
"            return\n",
"        self._state = \"ongoing\"\n",
"    def _render_pretty(self, colors: bool = True, border: bool = True, dot_for_zero: bool = True) -> str:\n",
"        \"\"\"\n",
"        Pretty-print the board with colors that scale from 0 up to self.target.\n",
"        Uses ANSI 256-color codes (works in most terminals). Set colors=False to disable.\n",
"        \"\"\"\n",
"        import math\n",
"\n",
"        b = self._board\n",
"        mx = max((max(row) for row in b), default=0)\n",
"        cell_w = max(3, len(str(mx)))\n",
"\n",
"        RESET = \"\\x1b[0m\"\n",
"\n",
"        # A smooth-ish gradient from cool → warm\n",
"        # (blue/cyan/green → yellow/orange/red). Tweak or expand as you like.\n",
"        GRAD = [33, 39, 45, 51, 50, 49, 48, 47, 46, 82, 118, 154, 190, 226, 220, 214, 208, 202, 196]\n",
"        ZERO_FG = 239  # dim gray\n",
"\n",
"        def color_code(v: int) -> str:\n",
"            if not colors:\n",
"                return \"\"\n",
"            if v == 0:\n",
"                return f\"\\x1b[38;5;{ZERO_FG}m\"\n",
"            # Normalize by exponent relative to target: r in [0,1]\n",
"            t = max(2, self.target)  # safety; avoid log2(1)\n",
"            # Guard: if v is not a power of two or is <1, handle gracefully\n",
"            try:\n",
"                r = max(0.0, min(1.0, math.log2(v) / math.log2(t)))\n",
"            except ValueError:\n",
"                r = 0.0\n",
"            idx = int(round(r * (len(GRAD) - 1)))\n",
"            return f\"\\x1b[38;5;{GRAD[idx]}m\"\n",
"\n",
"        def fmt(v: int) -> str:\n",
"            s = \".\" if (v == 0 and dot_for_zero) else str(v)\n",
"            s = s.rjust(cell_w)\n",
"            return color_code(v) + s + (RESET if colors else \"\")\n",
"\n",
"        def hline(left: str, mid: str, right: str) -> str:\n",
"            return left + mid.join(\"─\" * cell_w for _ in range(self.size)) + right\n",
"\n",
"        rows = []\n",
"        if border:\n",
"            rows.append(hline(\"┌\", \"┬\", \"┐\"))\n",
"        for r in range(self.size):\n",
"            content = \"│\".join(fmt(v) for v in b[r])\n",
"            rows.append((\"│\" + content + \"│\") if border else content)\n",
"            if border:\n",
"                rows.append(hline(\"└\" if r == self.size - 1 else \"├\",\n",
"                                  \"┴\" if r == self.size - 1 else \"┼\",\n",
"                                  \"┘\" if r == self.size - 1 else \"┤\"))\n",
"        return \"\\n\".join(rows)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4BcaLniVKLpa"
},
"source": [
"For example, let's create a 5x5 board and set the target to 8 instead of 2048.\n",
"\n",
"**[NOTE]** 2048 originally spawns a (4) 10% of the time! We can disable this for harder games. See [Wikipedia page](https://en.wikipedia.org/wiki/2048_(video_game)) for more details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-M8kGaFRJ2ic",
"outputId": "fad6c36b-cb16-490f-ad4f-6bf998dd24ab"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;48m 2\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\n",
"└───┴───┴───┴───┴───┘ ongoing\n"
]
}
],
"source": [
"game = GameBoard(size = 5, seed = 42, target = 8, probability_fours = 0.10)\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zclUeNxosv4k",
"outputId": "ad099448-d1f2-4471-cbc1-f463293e06ba"
},
"outputs": [
{
"data": {
"text/plain": [
"GameBoard(size=5, seed=42, target=8, probability_fours=0.1)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"game"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "envzrXmjKRff"
},
"source": [
"We'll use WASD for the action space:\n",
"\n",
"```\n",
" W\n",
"A S D\n",
"```\n",
"Also `game.state()` will say `success` if we succeeded in getting the target!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "b-gSgthFI_wq",
"outputId": "68af4e66-80c8-4fa0-c7f3-e9ba22923494"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┴───┴───┘ ongoing\n"
]
}
],
"source": [
"game.do_action(\"A\")\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "lUDdHKAxvZf8",
"outputId": "38692fcc-bfa9-47b3-82f8-09bee2842d38"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;48m 2\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┴───┴───┘ ongoing\n"
]
}
],
"source": [
"game.do_action(\"W\")\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "wkTHxvvUvcmO",
"outputId": "f9447b03-b0eb-443e-e139-607f231c76fe"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┴───┴───┘ ongoing\n"
]
}
],
"source": [
"game.do_action(\"D\")\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XO8vlL-4vd-K",
"outputId": "a6f786bf-39d5-4a23-d79b-17ea9e94272c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┴───┴───┘ ongoing\n"
]
}
],
"source": [
"game.do_action(\"W\")\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "MEa2ngmrvfNm",
"outputId": "c27d9fca-55a0-42c4-dae5-bf8e402d7295"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;196m 8\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\n",
"├───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┴───┴───┘ success\n"
]
}
],
"source": [
"game.do_action(\"D\")\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gGL1X29Fy4n5"
},
"source": [
"If we take an action that's not part of the action space, the game state becomes `failed`, and the game will not accept any more actions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "VZeIHbqoy7yn",
"outputId": "11d15a8f-f09d-4833-8ef7-3bad0510e618"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"┌───┬───┬───┐\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;190m 4\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;48m 2\u001b[0m│\n",
"├───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┘ failed\n"
]
}
],
"source": [
"game = GameBoard(size = 3, seed = 42, target = 8, probability_fours = 0.10)\n",
"game.do_action(\"AA\") # Not in WASD\n",
"game.do_action(\"W\") # Doesn't do anything\n",
"game.do_action(\"A\") # Doesn't do anything\n",
"print(game.board().pretty(), game.state())"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VR6czU96cpxf"
},
"source": [
"# RL Environment Setup\n",
"\n",
"We'll set up a function that accepts a strategy, which emits an action within `WASD`, and checks the game state.\n",
"\n",
"We'll also add a timer to execute the strategy for at most 2 seconds, otherwise it might never terminate!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "tdgjnf-8z_kr"
},
"outputs": [],
"source": [
"from typing import Callable\n",
"from unsloth import execute_with_time_limit\n",
"\n",
"def _execute_strategy(strategy : Callable, game : GameBoard):\n",
"    assert callable(strategy)\n",
"\n",
"    steps = 0\n",
"    while game.state() == \"ongoing\":\n",
"        action = strategy(list(game.board()))\n",
"        steps += 1\n",
"        if type(action) is not str:\n",
"            return steps, \"failed\"\n",
"        game.do_action(action)\n",
"    return steps, game.state()\n",
"\n",
"@execute_with_time_limit(2)\n",
"def execute_strategy(strategy : Callable, game : GameBoard):\n",
"    return _execute_strategy(strategy, game)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ywh0HizI9ayE"
},
"source": [
"Let's make a generic strategy to just hit `W`. We should expect this generic strategy to fail:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "5bkhqoZc0IO8",
"outputId": "149e18be-dae2-4382-817a-620e7b40ebde"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Timed out with error = Timed out after 2s\n"
]
}
],
"source": [
"def always_move_up(board):\n",
"    return \"W\"\n",
"\n",
"game = GameBoard(size = 8, seed = 42, target = 2048, probability_fours = 0.10)\n",
"try:\n",
"    execute_strategy(always_move_up, game)\n",
"except TimeoutError as e:\n",
"    print(f\"Timed out with error = {str(e)}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dkuHVdB09sgf"
},
"source": [
"To allow longer strategies for gpt-oss-20b Reinforcement Learning, we shall allow a 5 second timer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "SK-LfzsA9wbW"
},
"outputs": [],
"source": [
"@execute_with_time_limit(5)\n",
"def execute_strategy(strategy : Callable, game : GameBoard):\n",
" return _execute_strategy(strategy, game)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "tRhLV_bZMYxy"
},
"source": [
"# Code Execution\n",
"\n",
"To execute and create a new Python function, we first have to check if the function does not call other global variables or cheat. This is called `countering reward hacking` since we don't want the function to cheat.\n",
"\n",
"For example the below piece of code is fine, since it only imports Python level functions. We use `check_python_modules`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zz80kvg6M4BG",
"outputId": "f13fdc0d-ddb3-4c4a-cf65-805dfb31dddd"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Only Python imports? True\n",
"{'stdlib': ['math', 'typing'], 'non_stdlib': [], 'relative_imports': 0}\n"
]
}
],
"source": [
"from unsloth import check_python_modules\n",
"\n",
"sample = \"\"\"\n",
"def strategy(board):\n",
" import math\n",
" from typing import Callable\n",
" return \"W\"\n",
"\"\"\"\n",
"ok, info = check_python_modules(sample)\n",
"print(\"Only Python imports?\", ok)\n",
"print(info)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bZzVWgKQ-VIg"
},
"source": [
"For the below piece of code, since we import `numpy`, we should not allow the execution:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Z89Jw1KB-Ux7",
"outputId": "1a4cc701-1677-44b9-d44e-3f3f6dfed8d2"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Only Python imports? False\n",
"{'stdlib': [], 'non_stdlib': ['numpy'], 'relative_imports': 0}\n"
]
}
],
"source": [
"sample = \"\"\"\n",
"def strategy(board):\n",
" from numpy import matmul\n",
" return \"W\"\n",
"\"\"\"\n",
"ok, info = check_python_modules(sample)\n",
"print(\"Only Python imports?\", ok)\n",
"print(info)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SDSrjOTLVyQm"
},
"source": [
"We also disallow global variable access. We'll use Unsloth's `create_locked_down_function` function\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GcmYAmohVqw2",
"outputId": "bbfcbbb5-8063-42fe-b349-964554317ab8"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"name 'np' is not defined\n"
]
}
],
"source": [
"from unsloth import create_locked_down_function\n",
"function = \"\"\"\n",
"def import_numpy():\n",
" np.matmul\n",
" print(\"Success\")\n",
"\"\"\"\n",
"f = create_locked_down_function(function)\n",
"try:\n",
" f()\n",
"except Exception as e:\n",
" print(str(e))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "5tJKwLUgZsRq",
"outputId": "13588c11-6685-4627-b2d4-445bff9799c8"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"60\n"
]
}
],
"source": [
"from unsloth import create_locked_down_function\n",
"function = \"\"\"\n",
"def add(a, b):\n",
" def adder(a):\n",
" return a + b\n",
" return adder(b) + b\n",
"\"\"\"\n",
"f = create_locked_down_function(function)\n",
"try:\n",
" print(f(10, 20))\n",
"except Exception as e:\n",
" print(str(e))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8CzwCyXIPK04"
},
"source": [
"# Data & RL task setup\n",
"\n",
"We now have to create a prompt to tell the model to create a strategy for the 2048 game. You can customize this to some other task for another RL task."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "B-2RRE4HMrQO",
"outputId": "332255d7-1e6a-4cb4-9ede-c8a2f01378fe"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Create a new short 2048 strategy using only native Python code.\n",
"You are given a list of list of numbers for the current board state.\n",
"Output one action for \"W\", \"A\", \"S\", \"D\" on what is the optimal next step.\n",
"Output your new short function in backticks using the format below:\n",
"```python\n",
"def strategy(board):\n",
" return \"W\" # Example\n",
"```\n",
"All helper functions should be inside def strategy. Only output the short function `strategy`.\n"
]
}
],
"source": [
"prompt = \"\"\"\n",
"Create a new short 2048 strategy using only native Python code.\n",
"You are given a list of list of numbers for the current board state.\n",
"Output one action for \"W\", \"A\", \"S\", \"D\" on what is the optimal next step.\n",
"Output your new short function in backticks using the format below:\n",
"```python\n",
"def strategy(board):\n",
" return \"W\" # Example\n",
"```\n",
"All helper functions should be inside def strategy. Only output the short function `strategy`.\n",
"\"\"\".strip()\n",
"print(prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MIdudFUodN4i"
},
"source": [
"First, let's prompt gpt-oss without RL and see how it goes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "9HJxrS76h3Ds",
"outputId": "fcfe2220-fc72-4af7-f8a1-f243311b0156"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.\n",
"Knowledge cutoff: 2024-06\n",
"Current date: 2025-10-05\n",
"\n",
"Reasoning: low\n",
"\n",
"# Valid channels: analysis, commentary, final. Channel must be included for every message.\n",
"Calls to these tools must go to the commentary channel: 'functions'.<|end|><|start|>user<|message|>Create a new short 2048 strategy using only native Python code.\n",
"You are given a list of list of numbers for the current board state.\n",
"Output one action for \"W\", \"A\", \"S\", \"D\" on what is the optimal next step.\n",
"Output your new short function in backticks using the format below:\n",
"```python\n",
"def strategy(board):\n",
" return \"W\" # Example\n",
"```\n",
"All helper functions should be inside def strategy. Only output the short function `strategy`.<|end|><|start|>assistant<|channel|>analysis<|message|>We need simple heuristic: return random move or based on empty spaces. Provide function.<|end|><|start|>assistant<|channel|>final<|message|>```python\n",
"def strategy(board):\n",
" # Count empty cells and choose move that reduces board density\n",
" empty_counts = {\n",
" \"W\": sum(row.count(0) for row in board[1:]),\n",
" \"S\": sum(row.count(0) for row in board[:-1]),\n",
" \"A\": sum(col.count(0) for col in zip(*board)[1:]),\n",
" \"D\": sum(col.count(0) for col in zip(*board)[:-1]),\n",
" }\n",
" # Prefer the direction with the most empty cells\n",
" return max(empty_counts, key=empty_counts.get)\n",
"```<|return|>\n"
]
}
],
"source": [
"text = tokenizer.apply_chat_template(\n",
" [{\"role\": \"user\", \"content\": prompt}],\n",
" tokenize = False,\n",
" add_generation_prompt = True,\n",
" reasoning_effort = \"low\",\n",
")\n",
"\n",
"from transformers import TextStreamer\n",
"_ = model.generate(\n",
" **tokenizer(text, return_tensors = \"pt\").to(\"cuda\"),\n",
" temperature = 1.0,\n",
" max_new_tokens = 512,\n",
" streamer = TextStreamer(tokenizer, skip_prompt = False),\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "iknaWZNudTNq"
},
"source": [
"# Reward functions\n",
"\n",
"We now design a `extract_function` function which simply extracts the function wrapped in 3 back ticks.\n",
"\n",
"And 3 reward functions:\n",
"\n",
"1. `function_works` which rewards the model if the strategy is a valid Python function.\n",
"2. `no_cheating` which checks if the function imported other modules, and if it did, we penalize it.\n",
"3. `strategy_succeeds` which checks if the game strategy actually succeeds in attaining 2048 after running the auto-generated strategy."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "8JJGXKdJ-Zl_",
"outputId": "80fd8078-1621-4c64-a906-5204b444addd"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"def strategy(board):\n",
" return \"W\" # Example\n"
]
}
],
"source": [
"def extract_function(text):\n",
" if text.count(\"```\") >= 2:\n",
" first = text.find(\"```\") + 3\n",
" second = text.find(\"```\", first)\n",
" fx = text[first : second].strip()\n",
" fx = fx.removeprefix(\"python\\n\")\n",
" fx = fx[fx.find(\"def\"):]\n",
" if fx.startswith(\"def strategy(board):\"): return fx\n",
" return None\n",
"print(extract_function(prompt))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KLXEcf_HSJlI"
},
"source": [
"Below is our `function_works` reward function which uses Python's `exec` but guarded by not allowing leakage of local and global variables. We can also use `check_python_modules` first to check if there are errors before even executing the function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "h3-B0IIsS56S",
"outputId": "f3e174fa-2fbf-400b-ec7d-87590be3ef68"
},
"outputs": [
{
"data": {
"text/plain": [
"(False,\n",
" {'error': \"SyntaxError: expected '(' (<unknown>, line 1)\",\n",
" 'stdlib': [],\n",
" 'non_stdlib': [],\n",
" 'relative_imports': 0})"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ok, info = check_python_modules(\"def a\")\n",
"ok, info"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "qgFNXORy-lpO"
},
"outputs": [],
"source": [
"def function_works(completions, **kwargs):\n",
" scores = []\n",
" for completion in completions:\n",
" score = 0\n",
" response = completion[0][\"content\"]\n",
" function = extract_function(response)\n",
" if function is not None:\n",
" ok, info = check_python_modules(function)\n",
" if function is None or \"error\" in info:\n",
" score = -2.0\n",
" else:\n",
" try:\n",
" new_strategy = create_locked_down_function(function)\n",
" score = 1.0\n",
" except:\n",
" score = -0.5\n",
" scores.append(score)\n",
" return scores"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Gf69i2WT-m4K"
},
"source": [
"`no_cheating` checks if the function cheated since it might have imported Numpy or other functions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cUfHzCVx-nGK"
},
"outputs": [],
"source": [
"def no_cheating(completions, **kwargs):\n",
" scores = []\n",
" for completion in completions:\n",
" score = 0\n",
" response = completion[0][\"content\"]\n",
" function = extract_function(response)\n",
" if function is not None:\n",
" ok, info = check_python_modules(function)\n",
" scores.append(1.0 if ok else -20.0) # Penalize heavily!\n",
" else:\n",
" scores.append(-1.0) # Failed creating function\n",
" return scores"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "slnqWG3FTror"
},
"source": [
"Next `strategy_succeeds` checks if the strategy actually allows the game to terminate. Imagine if the strategy simply returned \"W\" which would fail after a time limit of 10 seconds.\n",
"\n",
"We also add a global `PRINTER` to print out the strategy and board state."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sNi129lYTpZ2"
},
"outputs": [],
"source": [
"import numpy as np\n",
"global PRINTER\n",
"PRINTER = 0\n",
"def strategy_succeeds(completions, **kwargs):\n",
" global PRINTER\n",
" scores = []\n",
" # Generate a random game board with seed\n",
" seed = np.random.randint(10000)\n",
" for completion in completions:\n",
" printed = False\n",
" score = 0\n",
" response = completion[0][\"content\"]\n",
" function = extract_function(response)\n",
" if PRINTER % 5 == 0:\n",
" printed = True\n",
" print(function)\n",
" PRINTER += 1\n",
" if function is not None:\n",
" ok, info = check_python_modules(function)\n",
" if function is None or \"error\" in info:\n",
" scores.append(0)\n",
" continue\n",
" try:\n",
" new_strategy = create_locked_down_function(function)\n",
" except:\n",
" scores.append(0)\n",
" continue\n",
" try:\n",
" game = GameBoard(size = 6, seed = seed, target = 2048, probability_fours = 0.10)\n",
" steps, game_state = execute_strategy(new_strategy, game)\n",
" print(f\"Steps = {steps} State = {game_state}\")\n",
" if printed is False:\n",
" print(function)\n",
" print(game.board().pretty())\n",
" if game_state == \"success\":\n",
" scores.append(20.0) # Success - massively reward!\n",
" else:\n",
" scores.append(2.0) # Failed but function works!\n",
" except TimeoutError as e:\n",
" print(\"Timeout\")\n",
" scores.append(-1.0) # Failed with timeout\n",
" except Exception as e:\n",
" print(f\"Exception = {str(e)}\")\n",
" scores.append(-3.0) # Failed\n",
" return scores"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "TCpSxtvSeAG_"
},
"source": [
"We'll now create the dataset which includes a replica of our prompt. Remember to add a reasoning effort of low! You can choose high reasoning mode, but this'll only work on more memory GPUs like H100s."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Ldf6SjLHVPRv",
"outputId": "589f7523-9835-49b5-c477-4e1d8b0744ff"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"181\n"
]
},
{
"data": {
"text/plain": [
"{'prompt': [{'content': 'Create a new short 2048 strategy using only native Python code.\\nYou are given a list of list of numbers for the current board state.\\nOutput one action for \"W\", \"A\", \"S\", \"D\" on what is the optimal next step.\\nOutput your new short function in backticks using the format below:\\n```python\\ndef strategy(board):\\n return \"W\" # Example\\n```\\nAll helper functions should be inside def strategy. Only output the short function `strategy`.',\n",
" 'role': 'user'}],\n",
" 'answer': 0,\n",
" 'reasoning_effort': 'low'}"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from datasets import Dataset\n",
"dataset = Dataset.from_list([{\"prompt\" : [{\"role\": \"user\", \"content\": prompt.strip()}], \"answer\" : 0, \"reasoning_effort\": \"low\"}]*1000)\n",
"maximum_length = len(tokenizer.apply_chat_template([{\"role\": \"user\", \"content\": prompt.strip()}], add_generation_prompt = True))\n",
"print(maximum_length)\n",
"dataset[0]"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9-IOMhVg-2AM"
},
"source": [
"<a name=\"Train\"></a>\n",
"### Train the model\n",
"\n",
"Now set up GRPO Trainer and all configurations! We also support GSPO, GAPO, Dr GRPO and more! Go the Unsloth [Reinforcement Learning Docs](https://docs.unsloth.ai/get-started/reinforcement-learning-rl-guide) for more options."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "ptqkXK2D4d6p",
"outputId": "2061b833-5b98-4a2b-e7f5-4bc4652d8300"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unsloth: We now expect `per_device_train_batch_size` to be a multiple of `num_generations`.\n",
"We will change the batch size of 1 to the `num_generations` of 2\n"
]
}
],
"source": [
"max_prompt_length = maximum_length + 1 # + 1 just in case!\n",
"max_completion_length = max_seq_length - max_prompt_length\n",
"\n",
"from trl import GRPOConfig, GRPOTrainer\n",
"training_args = GRPOConfig(\n",
" temperature = 1.0,\n",
" learning_rate = 5e-5,\n",
" weight_decay = 0.01,\n",
" warmup_ratio = 0.1,\n",
" lr_scheduler_type = \"linear\",\n",
" optim = \"adamw_8bit\",\n",
" logging_steps = 1,\n",
" per_device_train_batch_size = 1,\n",
" gradient_accumulation_steps = 1, # Increase to 4 for smoother training\n",
" num_generations = 2, # Decrease if out of memory\n",
" max_prompt_length = max_prompt_length,\n",
" max_completion_length = max_completion_length,\n",
" # num_train_epochs = 1, # Set to 1 for a full training run\n",
" max_steps = 1000,\n",
" save_steps = 100,\n",
" report_to = \"none\", # Can use Weights & Biases, TrackIO\n",
" output_dir = \"outputs\",\n",
"\n",
" # For optional training + evaluation\n",
" # fp16_full_eval = True,\n",
" # per_device_eval_batch_size = 4,\n",
" # eval_accumulation_steps = 1,\n",
" # eval_strategy = \"steps\",\n",
" # eval_steps = 1,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "r9Mv8UZO5hz-"
},
"source": [
"And let's run the trainer! If you scroll up, you'll see a table of rewards. The goal is to see the `reward` column increase!\n",
"\n",
"You might have to wait 150 to 200 steps for any action. You'll probably get 0 reward for the first 100 steps. Please be patient!\n",
"\n",
"| Step | Training Loss | reward | reward_std | completion_length | kl |\n",
"|------|---------------|-----------|------------|-------------------|----------|\n",
"| 1 | 0.000000 | 0.125000 | 0.000000 | 200.000000 | 0.000000 |\n",
"| 2 | 0.000000 | 0.072375 | 0.248112 | 200.000000 | 0.000000 |\n",
"| 3 | 0.000000 | -0.079000 | 0.163776 | 182.500000 | 0.000005 |\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vzOuSVCL_GA9",
"outputId": "349f907c-cc67-4890-e131-397694679634"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unsloth: Switching to float32 training since model cannot work with float16\n"
]
}
],
"source": [
"# For optional training + evaluation\n",
"# new_dataset = dataset.train_test_split(test_size = 0.01)\n",
"\n",
"trainer = GRPOTrainer(\n",
" model = model,\n",
" processing_class = tokenizer,\n",
" reward_funcs = [\n",
" function_works,\n",
" no_cheating,\n",
" strategy_succeeds,\n",
" ],\n",
" args = training_args,\n",
" train_dataset = dataset,\n",
"\n",
" # For optional training + evaluation\n",
" # train_dataset = new_dataset[\"train\"],\n",
" # eval_dataset = new_dataset[\"test\"],\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fQhtuwP4cf34"
},
"source": [
"And let's train the model!\n",
"\n",
"**NOTE** A T4 free GPU might take 5 minutes for one generation sadly since it's an old GPU - A100 or H100 will be much faster!"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
},
"id": "VGRxPdSCcfC3",
"outputId": "f8bb720c-6d69-4f43-d9d1-a404842d2dff"
},
"outputs": [
{
"metadata": {
"tags": null
},
"name": "stderr",
"output_type": "stream",
"text": [
"The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': 199998, 'pad_token_id': 200017}.\n",
"==((====))== Unsloth - 2x faster free finetuning | Num GPUs used = 2\n",
" \\\\ /| Num examples = 1,000 | Num Epochs = 1 | Total steps = 1,000\n",
"O^O/ \\_/ \\ Batch size per device = 2 | Gradient accumulation steps = 1\n",
"\\ / Data Parallel GPUs = 1 | Total batch size (2 x 1 x 1) = 2\n",
" \"-____-\" Trainable parameters = 1,990,656 of 20,916,747,840 (0.01% trained)\n",
"`generation_config` default values have been modified to match model-specific defaults: {'max_length': 131072}. If this is not desired, please set these values explicitly.\n"
]
},
{
"metadata": {
"tags": null
},
"name": "stdout",
"output_type": "stream",
"text": [
"None\n",
"Steps = 1 State = failed\n",
"def strategy(board):\n",
" # simple heuristic: prefer right or down, then left, then up\n",
" for move in \"R D L U\".split():\n",
" pass\n",
"┌───┬───┬───┬───┬───┬───┐\n",
"│\u001b[38;5;45m 2\u001b[0m│\u001b[38;5;45m 2\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"├───┼───┼───┼───┼───┼───┤\n",
"│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\u001b[38;5;239m .\u001b[0m│\n",
"└───┴───┴───┴───┴───┴───┘\n"
]
},
{
"data": {
"text/html": [
"\n",
" <div>\n",
" \n",
" <progress value='86' max='1000' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
" [ 86/1000 8:06:01 < 88:08:29, 0.00 it/s, Epoch 0.09/1]\n",
" </div>\n",
" <table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: left;\">\n",
" <th>Step</th>\n",
" <th>Training Loss</th>\n",
" <th>reward</th>\n",
" <th>reward_std</th>\n",
" <th>completions / mean_length</th>\n",
" <th>completions / min_length</th>\n",
" <th>completions / max_length</th>\n",
" <th>completions / clipped_ratio</th>\n",
" <th>completions / mean_terminated_length</th>\n",
" <th>completions / min_terminated_length</th>\n",
" <th>completions / max_terminated_length</th>\n",
" <th>kl</th>\n",
" <th>rewards / function_works / mean</th>\n",
" <th>rewards / function_works / std</th>\n",
" <th>rewards / no_cheating / mean</th>\n",
" <th>rewards / no_cheating / std</th>\n",
" <th>rewards / strategy_succeeds / mean</th>\n",
" <th>rewards / strategy_succeeds / std</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <td>1</td>\n",
" <td>0.000000</td>\n",
" <td>0.500000</td>\n",
" <td>4.949748</td>\n",
" <td>329.000000</td>\n",
" <td>72.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>72.000000</td>\n",
" <td>72.000000</td>\n",
" <td>72.000000</td>\n",
" <td>0.002197</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>1.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>2</td>\n",
" <td>0.000000</td>\n",
" <td>0.500000</td>\n",
" <td>4.949748</td>\n",
" <td>550.500000</td>\n",
" <td>515.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>515.000000</td>\n",
" <td>515.000000</td>\n",
" <td>515.000000</td>\n",
" <td>0.000298</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>1.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>3</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" <td>538.000000</td>\n",
" <td>490.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>490.000000</td>\n",
" <td>490.000000</td>\n",
" <td>490.000000</td>\n",
" <td>0.000276</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-1.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>4</td>\n",
" <td>0.000000</td>\n",
" <td>2.500000</td>\n",
" <td>2.121320</td>\n",
" <td>325.000000</td>\n",
" <td>120.000000</td>\n",
" <td>530.000000</td>\n",
" <td>0.000000</td>\n",
" <td>325.000000</td>\n",
" <td>120.000000</td>\n",
" <td>530.000000</td>\n",
" <td>0.000568</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>5</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" <td>437.000000</td>\n",
" <td>288.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>288.000000</td>\n",
" <td>288.000000</td>\n",
" <td>288.000000</td>\n",
" <td>0.001381</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-1.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>6</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>308.500000</td>\n",
" <td>301.000000</td>\n",
" <td>316.000000</td>\n",
" <td>0.000000</td>\n",
" <td>308.500000</td>\n",
" <td>301.000000</td>\n",
" <td>316.000000</td>\n",
" <td>0.000826</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-3.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>7</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>519.000000</td>\n",
" <td>452.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>452.000000</td>\n",
" <td>452.000000</td>\n",
" <td>452.000000</td>\n",
" <td>0.000223</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>8</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>333.500000</td>\n",
" <td>81.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>81.000000</td>\n",
" <td>81.000000</td>\n",
" <td>81.000000</td>\n",
" <td>0.001181</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>9</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>568.500000</td>\n",
" <td>551.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>551.000000</td>\n",
" <td>551.000000</td>\n",
" <td>551.000000</td>\n",
" <td>0.000281</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>10</td>\n",
" <td>0.000000</td>\n",
" <td>-3.000000</td>\n",
" <td>0.000000</td>\n",
" <td>586.000000</td>\n",
" <td>586.000000</td>\n",
" <td>586.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000153</td>\n",
" <td>-2.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>11</td>\n",
" <td>0.000000</td>\n",
" <td>2.500000</td>\n",
" <td>2.121320</td>\n",
" <td>330.000000</td>\n",
" <td>264.000000</td>\n",
" <td>396.000000</td>\n",
" <td>0.000000</td>\n",
" <td>330.000000</td>\n",
" <td>264.000000</td>\n",
" <td>396.000000</td>\n",
" <td>0.004015</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>0.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>12</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>374.500000</td>\n",
" <td>360.000000</td>\n",
" <td>389.000000</td>\n",
" <td>0.000000</td>\n",
" <td>374.500000</td>\n",
" <td>360.000000</td>\n",
" <td>389.000000</td>\n",
" <td>0.000245</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>13</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>520.500000</td>\n",
" <td>455.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>455.000000</td>\n",
" <td>455.000000</td>\n",
" <td>455.000000</td>\n",
" <td>0.000915</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>14</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>406.500000</td>\n",
" <td>227.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>227.000000</td>\n",
" <td>227.000000</td>\n",
" <td>227.000000</td>\n",
" <td>0.007664</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>15</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>348.500000</td>\n",
" <td>302.000000</td>\n",
" <td>395.000000</td>\n",
" <td>0.000000</td>\n",
" <td>348.500000</td>\n",
" <td>302.000000</td>\n",
" <td>395.000000</td>\n",
" <td>0.002411</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>16</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>408.000000</td>\n",
" <td>379.000000</td>\n",
" <td>437.000000</td>\n",
" <td>0.000000</td>\n",
" <td>408.000000</td>\n",
" <td>379.000000</td>\n",
" <td>437.000000</td>\n",
" <td>0.002496</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>17</td>\n",
" <td>0.000000</td>\n",
" <td>-12.500000</td>\n",
" <td>13.435029</td>\n",
" <td>493.000000</td>\n",
" <td>400.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>400.000000</td>\n",
" <td>400.000000</td>\n",
" <td>400.000000</td>\n",
" <td>0.009901</td>\n",
" <td>-2.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-10.500000</td>\n",
" <td>13.435029</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>18</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>413.000000</td>\n",
" <td>260.000000</td>\n",
" <td>566.000000</td>\n",
" <td>0.000000</td>\n",
" <td>413.000000</td>\n",
" <td>260.000000</td>\n",
" <td>566.000000</td>\n",
" <td>0.021275</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>19</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>487.500000</td>\n",
" <td>389.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>389.000000</td>\n",
" <td>389.000000</td>\n",
" <td>389.000000</td>\n",
" <td>0.019204</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>20</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" <td>586.000000</td>\n",
" <td>586.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>586.000000</td>\n",
" <td>586.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.001022</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-1.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>21</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>397.500000</td>\n",
" <td>276.000000</td>\n",
" <td>519.000000</td>\n",
" <td>0.000000</td>\n",
" <td>397.500000</td>\n",
" <td>276.000000</td>\n",
" <td>519.000000</td>\n",
" <td>0.027686</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>22</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>499.500000</td>\n",
" <td>486.000000</td>\n",
" <td>513.000000</td>\n",
" <td>0.000000</td>\n",
" <td>499.500000</td>\n",
" <td>486.000000</td>\n",
" <td>513.000000</td>\n",
" <td>0.007218</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>23</td>\n",
" <td>0.000000</td>\n",
" <td>-1.250000</td>\n",
" <td>2.474874</td>\n",
" <td>575.500000</td>\n",
" <td>565.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>565.000000</td>\n",
" <td>565.000000</td>\n",
" <td>565.000000</td>\n",
" <td>0.005928</td>\n",
" <td>-1.250000</td>\n",
" <td>1.060660</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>0.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>24</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" <td>563.500000</td>\n",
" <td>541.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>541.000000</td>\n",
" <td>541.000000</td>\n",
" <td>541.000000</td>\n",
" <td>0.008769</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-1.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>25</td>\n",
" <td>0.000100</td>\n",
" <td>-1.000000</td>\n",
" <td>2.828427</td>\n",
" <td>444.500000</td>\n",
" <td>303.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>303.000000</td>\n",
" <td>0.084963</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-0.500000</td>\n",
" <td>0.707107</td>\n",
" </tr>\n",
" <tr>\n",
" <td>26</td>\n",
" <td>0.000100</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" <td>419.000000</td>\n",
" <td>252.000000</td>\n",
" <td>586.000000</td>\n",
" <td>0.500000</td>\n",
" <td>252.000000</td>\n",
" <td>252.000000</td>\n",
" <td>252.000000</td>\n",
" <td>0.114125</td>\n",
" <td>-0.500000</td>\n",
" <td>2.121320</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>-1.500000</td>\n",
" <td>2.121320</td>\n",
" </tr>\n",
" <tr>\n",
" <td>27</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>339.500000</td>\n",
" <td>321.000000</td>\n",
" <td>358.000000</td>\n",
" <td>0.000000</td>\n",
" <td>339.500000</td>\n",
" <td>321.000000</td>\n",
" <td>358.000000</td>\n",
" <td>0.033457</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-1.000000</td>\n",
" <td>0.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <td>28</td>\n",
" <td>0.000100</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>372.500000</td>\n",
" <td>311.000000</td>\n",
" <td>434.000000</td>\n",
" <td>0.000000</td>\n",
" <td>372.500000</td>\n",
" <td>311.000000</td>\n",
" <td>434.000000</td>\n",
" <td>0.081829</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>29</td>\n",
" <td>0.000100</td>\n",
" <td>0.000000</td>\n",
" <td>1.414214</td>\n",
" <td>387.500000</td>\n",
" <td>336.000000</td>\n",
" <td>439.000000</td>\n",
" <td>0.000000</td>\n",
" <td>387.500000</td>\n",
" <td>336.000000</td>\n",
" <td>439.000000</td>\n",
" <td>0.100017</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>1.000000</td>\n",
" <td>0.000000</td>\n",
" <td>-2.000000</td>\n",
" <td>1.414214</td>\n",
" </tr>\n",
" <tr>\n",
" <td>30</td>\n",
" <td>0.000100</
SYMBOL INDEX (1070 symbols across 100 files)
FILE: _build/gpt_oss_build_backend/backend.py
function _use_metal_backend (line 39) | def _use_metal_backend() -> bool:
function _setuptools_backend (line 43) | def _setuptools_backend():
function _scikit_build_backend (line 49) | def _scikit_build_backend():
function _backend (line 53) | def _backend():
function build_wheel (line 59) | def build_wheel(
function build_sdist (line 67) | def build_sdist(
function prepare_metadata_for_build_wheel (line 73) | def prepare_metadata_for_build_wheel(
function build_editable (line 89) | def build_editable(
function get_requires_for_build_wheel (line 100) | def get_requires_for_build_wheel(
function get_requires_for_build_sdist (line 115) | def get_requires_for_build_sdist(
function get_requires_for_build_editable (line 126) | def get_requires_for_build_editable(
FILE: compatibility-test/analysis.ts
function analyze (line 1) | function analyze(caseResults: any[], tries: number) {
function printAnalysis (line 101) | function printAnalysis(
FILE: compatibility-test/index.ts
function formatTimestamp (line 10) | function formatTimestamp(d: Date): string {
function main (line 21) | async function main() {
FILE: compatibility-test/providers.ts
constant PROVIDERS (line 1) | const PROVIDERS = {
FILE: compatibility-test/runCase.ts
type Case (line 20) | type Case = {
type RunCaseSummary (line 28) | type RunCaseSummary = {
function runCase (line 39) | async function runCase(
function testToolCall (line 143) | function testToolCall(apiType, caseData, result, strict) {
function testEvents (line 193) | function testEvents(apiType, events) {
function testOutputData (line 241) | function testOutputData(apiType, rawResponses, streaming) {
function deepEqual (line 308) | function deepEqual(a: any, b: any): boolean {
FILE: compatibility-test/tools.ts
function convertToTool (line 3) | function convertToTool(toolData: any) {
constant TOOLS (line 15) | const TOOLS = [
constant TOOLS_MAP (line 153) | const TOOLS_MAP = TOOLS.reduce((acc, tool) => {
FILE: examples/agents-sdk-js/index.ts
function prompt (line 17) | async function prompt(question: string) {
FILE: examples/agents-sdk-python/example.py
function prompt_user (line 18) | async def prompt_user(question: str) -> str:
function main (line 24) | async def main():
FILE: examples/gradio/gradio_chat.py
function chat_with_model (line 18) | def chat_with_model(message, history, model_choice, instructions, effort...
function toggle_function_group (line 231) | def toggle_function_group(use_funcs):
FILE: examples/streamlit/streamlit_chat.py
function trigger_fake_tool (line 91) | def trigger_fake_tool(container):
function run (line 105) | def run(container):
FILE: gpt-oss-mcp-server/browser_server.py
class AppContext (line 12) | class AppContext:
method create_or_get_browser (line 15) | def create_or_get_browser(self, session_id: str) -> SimpleBrowserTool:
method remove_browser (line 27) | def remove_browser(self, session_id: str) -> None:
function app_lifespan (line 32) | async def app_lifespan(_server: FastMCP) -> AsyncIterator[AppContext]:
function search (line 58) | async def search(ctx: Context,
function open_link (line 84) | async def open_link(ctx: Context,
function find_pattern (line 112) | async def find_pattern(ctx: Context, pattern: str, cursor: int = -1) -> ...
FILE: gpt-oss-mcp-server/build-system-prompt.py
function list_server_and_tools (line 24) | async def list_server_and_tools(server_url: str):
function trim_schema (line 32) | def trim_schema(schema: dict) -> dict:
function post_process_tools_description (line 55) | def post_process_tools_description(
FILE: gpt-oss-mcp-server/python_server.py
function python (line 26) | async def python(code: str) -> str:
FILE: gpt_oss/chat.py
function get_user_input (line 49) | def get_user_input():
function main (line 61) | def main(args):
FILE: gpt_oss/evals/__main__.py
function main (line 17) | def main():
FILE: gpt_oss/evals/abcd_grader.py
function extract_abcd (line 81) | def extract_abcd(text: str) -> str | None:
function main (line 104) | def main():
FILE: gpt_oss/evals/aime_eval.py
function format_aime_question (line 17) | def format_aime_question(row):
function extract_boxed_text (line 20) | def extract_boxed_text(text):
function normalize_number (line 34) | def normalize_number(s):
class AIME25Eval (line 40) | class AIME25Eval(Eval):
method __init__ (line 41) | def __init__(
method __call__ (line 66) | def __call__(self, sampler: SamplerBase) -> EvalResult:
FILE: gpt_oss/evals/basic_eval.py
class BasicEval (line 8) | class BasicEval(Eval):
method __init__ (line 9) | def __init__(self,):
method __call__ (line 15) | def __call__(self, sampler: SamplerBase) -> EvalResult:
FILE: gpt_oss/evals/chat_completions_sampler.py
class ChatCompletionsSampler (line 17) | class ChatCompletionsSampler(SamplerBase):
method __init__ (line 20) | def __init__(
method _pack_message (line 39) | def _pack_message(self, role: str, content: Any) -> dict[str, Any]:
method __call__ (line 42) | def __call__(self, message_list: MessageList) -> SamplerResponse:
FILE: gpt_oss/evals/gpqa_eval.py
function format_multichoice_question (line 28) | def format_multichoice_question(row):
class GPQAEval (line 32) | class GPQAEval(Eval):
method __init__ (line 33) | def __init__(
method __call__ (line 60) | def __call__(self, sampler: SamplerBase) -> EvalResult:
FILE: gpt_oss/evals/healthbench_eval.py
function parse_json_to_dict (line 99) | def parse_json_to_dict(json_string: str) -> dict:
class RubricItem (line 110) | class RubricItem:
method __init__ (line 111) | def __init__(self, criterion: str, points: float, tags: list[str]):
method __str__ (line 116) | def __str__(self):
method to_dict (line 119) | def to_dict(self):
method from_dict (line 127) | def from_dict(cls, d: dict):
function calculate_score (line 135) | def calculate_score(
function get_usage_dict (line 156) | def get_usage_dict(response_usage) -> dict[str, int | None]:
function _compute_clipped_stats (line 194) | def _compute_clipped_stats(
function _aggregate_get_clipped_mean (line 213) | def _aggregate_get_clipped_mean(
class HealthBenchEval (line 246) | class HealthBenchEval(Eval):
method __init__ (line 247) | def __init__(
method grade_sample (line 338) | def grade_sample(
method __call__ (line 428) | def __call__(self, sampler: SamplerBase) -> EvalResult:
function main (line 500) | def main():
function physician_completions_main (line 535) | def physician_completions_main(
FILE: gpt_oss/evals/report.py
function _compute_stat (line 27) | def _compute_stat(values: list, stat: str):
function aggregate_results (line 46) | def aggregate_results(
function map_with_progress (line 82) | def map_with_progress(
function message_to_html (line 118) | def message_to_html(message: Message) -> str:
function make_report (line 199) | def make_report(eval_result: EvalResult) -> str:
FILE: gpt_oss/evals/responses_sampler.py
class ResponsesSampler (line 10) | class ResponsesSampler(SamplerBase):
method __init__ (line 15) | def __init__(
method _pack_message (line 34) | def _pack_message(self, role: str, content: Any) -> dict[str, Any]:
method __call__ (line 37) | def __call__(self, message_list: MessageList) -> SamplerResponse:
FILE: gpt_oss/evals/types.py
class SamplerResponse (line 10) | class SamplerResponse:
class SamplerBase (line 18) | class SamplerBase:
method __call__ (line 24) | def __call__(
class EvalResult (line 32) | class EvalResult:
class SingleEvalResult (line 45) | class SingleEvalResult:
class Eval (line 59) | class Eval:
method __call__ (line 64) | def __call__(self, sampler: SamplerBase) -> EvalResult:
FILE: gpt_oss/generate.py
function main (line 11) | def main(args):
FILE: gpt_oss/metal/benchmark/end-to-end-threadgroup.cc
function attn_qkv_tgsize (line 19) | static void attn_qkv_tgsize(benchmark::State& state, const char* env_var...
function AttnQKVThreadgroupSizeArguments (line 85) | static void AttnQKVThreadgroupSizeArguments(benchmark::internal::Benchma...
function attn_out_tgsize (line 102) | static void attn_out_tgsize(benchmark::State& state, const char* env_var...
function AttnOutThreadgroupSizeArguments (line 168) | static void AttnOutThreadgroupSizeArguments(benchmark::internal::Benchma...
function mlp_gate_tgsize (line 185) | static void mlp_gate_tgsize(benchmark::State& state, const char* env_var...
function MlpGateThreadgroupSizeArguments (line 251) | static void MlpGateThreadgroupSizeArguments(benchmark::internal::Benchma...
function mlp_swiglu_tgsize (line 268) | static void mlp_swiglu_tgsize(benchmark::State& state, const char* env_v...
function MlpSwigluThreadgroupSizeArguments (line 334) | static void MlpSwigluThreadgroupSizeArguments(benchmark::internal::Bench...
function mlp_out_tgsize (line 351) | static void mlp_out_tgsize(benchmark::State& state, const char* env_var_...
function MlpOutThreadgroupSizeArguments (line 417) | static void MlpOutThreadgroupSizeArguments(benchmark::internal::Benchmar...
function mlp_acc_tgsize (line 434) | static void mlp_acc_tgsize(benchmark::State& state, const char* env_var_...
function MlpAccThreadgroupSizeArguments (line 500) | static void MlpAccThreadgroupSizeArguments(benchmark::internal::Benchmar...
function unembedding_tgsize (line 512) | static void unembedding_tgsize(benchmark::State& state, const char* env_...
function UnembeddingThreadgroupSizeArguments (line 578) | static void UnembeddingThreadgroupSizeArguments(benchmark::internal::Ben...
FILE: gpt_oss/metal/benchmark/end-to-end.cc
function end2end_decode (line 18) | static void end2end_decode(benchmark::State& state, const char* env_var_...
function end2end_prefill (line 82) | static void end2end_prefill(benchmark::State& state,
FILE: gpt_oss/metal/benchmark/f32-bf16w-rmsnorm.cc
function f32_bf16w_rnsnorm (line 16) | static void f32_bf16w_rnsnorm(benchmark::State& state) {
FILE: gpt_oss/metal/benchmark/f32-random.cc
function f32_fill_random (line 10) | static void f32_fill_random(benchmark::State& state) {
FILE: gpt_oss/metal/benchmark/mf4-f32-convert.cc
function mf4_f32_convert (line 13) | static void mf4_f32_convert(benchmark::State& state) {
FILE: gpt_oss/metal/benchmark/u32-random.cc
function u32_fill_random (line 10) | static void u32_fill_random(benchmark::State& state) {
FILE: gpt_oss/metal/examples/chat.py
function main (line 38) | def main(args):
FILE: gpt_oss/metal/examples/generate.py
function main (line 16) | def main(args):
FILE: gpt_oss/metal/include/gpt-oss/functions.h
type gptoss_status (line 22) | enum gptoss_status
type gptoss_status (line 35) | enum gptoss_status
type gptoss_status (line 48) | enum gptoss_status
type gptoss_status (line 59) | enum gptoss_status
type gptoss_status (line 69) | enum gptoss_status
type gptoss_status (line 82) | enum gptoss_status
type gptoss_special_token (line 84) | enum gptoss_special_token
type gptoss_status (line 96) | enum gptoss_status
type gptoss_status (line 109) | enum gptoss_status
type gptoss_status (line 122) | enum gptoss_status
type gptoss_status (line 139) | enum gptoss_status
type gptoss_status (line 152) | enum gptoss_status
type gptoss_status (line 162) | enum gptoss_status
type gptoss_status (line 180) | enum gptoss_status
type gptoss_status (line 195) | enum gptoss_status
type gptoss_status (line 208) | enum gptoss_status
type gptoss_status (line 225) | enum gptoss_status
type gptoss_status (line 241) | enum gptoss_status
type gptoss_status (line 256) | enum gptoss_status
type gptoss_status (line 268) | enum gptoss_status
type gptoss_status (line 278) | enum gptoss_status
type gptoss_status (line 291) | enum gptoss_status
type gptoss_status (line 306) | enum gptoss_status
type gptoss_status (line 316) | enum gptoss_status
type gptoss_status (line 328) | enum gptoss_status
type gptoss_status (line 339) | enum gptoss_status
type gptoss_status (line 351) | enum gptoss_status
type gptoss_status (line 363) | enum gptoss_status
type gptoss_status (line 375) | enum gptoss_status
type gptoss_status (line 386) | enum gptoss_status
type gptoss_status (line 396) | enum gptoss_status
FILE: gpt_oss/metal/include/gpt-oss/types.h
type gptoss_status (line 6) | enum gptoss_status {
type gptoss_special_token (line 18) | enum gptoss_special_token {
type gptoss_model (line 39) | struct gptoss_model
type gptoss_tokenizer (line 41) | struct gptoss_tokenizer
type gptoss_context (line 51) | struct gptoss_context
type gptoss_sampler (line 62) | struct gptoss_sampler
FILE: gpt_oss/metal/python/context.c
function PyGPTOSSContext_init (line 8) | static int PyGPTOSSContext_init(PyGPTOSSContext* self, PyObject* args, P...
function PyGPTOSSContext_dealloc (line 47) | static void PyGPTOSSContext_dealloc(PyGPTOSSContext* self) {
function PyObject (line 53) | static PyObject* PyGPTOSSContext_copy(PyGPTOSSContext *self) {
function PyObject (line 64) | static PyObject* PyGPTOSSContext_append(PyGPTOSSContext* self, PyObject*...
function PyObject (line 116) | static PyObject* PyGPTOSSContext_process(PyGPTOSSContext* self) {
function PyObject (line 126) | static PyObject* PyGPTOSSContext_sample(PyGPTOSSContext* self, PyObject*...
function PyObject (line 177) | static PyObject* PyGPTOSSContext_reset(PyGPTOSSContext* self) {
function PyObject (line 196) | static PyObject* PyGPTOSSContext_get_num_tokens(PyGPTOSSContext* self, v...
function PyObject (line 207) | static PyObject* PyGPTOSSContext_get_max_tokens(PyGPTOSSContext* self, v...
function PyObject (line 218) | static PyObject* PyGPTOSSContext_get_tokens(PyGPTOSSContext* self, void*...
FILE: gpt_oss/metal/python/model.c
function PyGPTOSSModel_init (line 8) | static int PyGPTOSSModel_init(PyGPTOSSModel* self, PyObject* args, PyObj...
function PyGPTOSSModel_dealloc (line 23) | static void PyGPTOSSModel_dealloc(PyGPTOSSModel* self) {
function PyObject (line 29) | static PyObject* PyGPTOSSModel_copy(PyGPTOSSModel* self) {
function PyObject (line 45) | static PyObject *PyGPTOSSModel_get_max_context_length(PyGPTOSSModel* sel...
function PyObject (line 56) | static PyObject *PyGPTOSSModel_get_tokenizer(PyGPTOSSModel* self, void* ...
FILE: gpt_oss/metal/python/module.c
function PyMODINIT_FUNC (line 18) | PyMODINIT_FUNC PyInit__metal(void) {
FILE: gpt_oss/metal/python/module.h
type PyGPTOSSModel (line 5) | typedef struct {
type PyGPTOSSTokenizer (line 10) | typedef struct {
type PyGPTOSSContext (line 15) | typedef struct {
FILE: gpt_oss/metal/python/tokenizer.c
function PyObject (line 7) | static PyObject* PyGPTOSSTokenizer_new(PyTypeObject* subtype, PyObject* ...
function PyGPTOSSTokenizer_dealloc (line 30) | static void PyGPTOSSTokenizer_dealloc(PyGPTOSSTokenizer* self) {
function PyObject (line 36) | static PyObject* PyGPTOSSTokenizer_copy(PyGPTOSSTokenizer* self) {
function PyObject (line 47) | static PyObject* PyGPTOSSTokenizer_encode_special_token(PyGPTOSSTokenize...
function PyObject (line 95) | static PyObject* PyGPTOSSTokenizer_decode(PyGPTOSSTokenizer* self, PyObj...
function PyObject (line 121) | static PyObject* PyGPTOSSTokenizer_get_num_text_tokens(PyGPTOSSTokenizer...
function PyObject (line 132) | static PyObject* PyGPTOSSTokenizer_get_num_special_tokens(PyGPTOSSTokeni...
function PyObject (line 143) | static PyObject* PyGPTOSSTokenizer_get_num_tokens(PyGPTOSSTokenizer* sel...
FILE: gpt_oss/metal/scripts/create-local-model.py
function write_file_header (line 83) | def write_file_header(f):
function write_tokenizer_header (line 86) | def write_tokenizer_header(f,
function write_model_header (line 97) | def write_model_header(f,
function write_padding (line 136) | def write_padding(out_file, alignment_multiple=16384):
function write_embedding_weight (line 144) | def write_embedding_weight(out_file, weight):
function write_rmsnorm_gain (line 151) | def write_rmsnorm_gain(out_file, gain):
function write_attn_sink (line 158) | def write_attn_sink(out_file, sink):
function write_linear_weight (line 165) | def write_linear_weight(out_file, *args):
function main (line 172) | def main(args):
FILE: gpt_oss/metal/source/context.c
function gptoss_context_create (line 19) | enum gptoss_status GPTOSS_ABI gptoss_context_create(
function gptoss_context_get_num_tokens (line 163) | enum gptoss_status GPTOSS_ABI gptoss_context_get_num_tokens(
function gptoss_context_get_max_tokens (line 171) | enum gptoss_status GPTOSS_ABI gptoss_context_get_max_tokens(
function gptoss_context_get_tokens (line 179) | enum gptoss_status GPTOSS_ABI gptoss_context_get_tokens(
function process_tokens (line 199) | static enum gptoss_status process_tokens(
function gptoss_context_append_chars (line 774) | enum gptoss_status GPTOSS_ABI gptoss_context_append_chars(
function gptoss_context_append_tokens (line 834) | enum gptoss_status GPTOSS_ABI gptoss_context_append_tokens(
function gptoss_context_process (line 885) | enum gptoss_status GPTOSS_ABI gptoss_context_process(
function gptoss_context_sample (line 929) | enum gptoss_status GPTOSS_ABI gptoss_context_sample(
function gptoss_context_reset (line 1054) | enum gptoss_status GPTOSS_ABI gptoss_context_reset(
function gptoss_context_retain (line 1065) | enum gptoss_status GPTOSS_ABI gptoss_context_retain(
function gptoss_context_release (line 1072) | enum gptoss_status GPTOSS_ABI gptoss_context_release(
FILE: gpt_oss/metal/source/generate.c
type options (line 32) | struct options {
function mach_timestamp_diff_to_seconds (line 41) | static inline double mach_timestamp_diff_to_seconds(uint64_t start_times...
function mach_timestamp_diff_to_microseconds (line 50) | static inline uint64_t mach_timestamp_diff_to_microseconds(uint64_t star...
function print_usage (line 60) | static void print_usage(const char* program_name) {
function parse_options (line 64) | struct options parse_options(int argc, char** argv) {
function print_profile (line 162) | static void print_profile() {
function ctrl_c_handler (line 183) | static void ctrl_c_handler(int signum) {
function main (line 188) | int main(int argc, char *argv[]) {
FILE: gpt_oss/metal/source/include/internal/datatype.h
type gptoss_bfloat16 (line 8) | typedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {
type gptoss_float16 (line 14) | typedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {
type gptoss_float8ue8m0 (line 20) | typedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {
type gptoss_float8e5m2 (line 26) | typedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {
type gptoss_float8e4m3 (line 32) | typedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {
type gptoss_float4e2m1x2 (line 38) | typedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {
FILE: gpt_oss/metal/source/include/internal/datatype.hpp
type gptoss (line 8) | namespace gptoss {
FILE: gpt_oss/metal/source/include/internal/kernel-args.h
type gptoss_expert_prediction (line 38) | struct gptoss_expert_prediction {
type gptoss_control (line 43) | struct gptoss_control {
type gptoss_topk_args (line 47) | struct gptoss_topk_args {
type gptoss_sdpa_args (line 51) | struct gptoss_sdpa_args {
type gptoss_u32_fill_random_args (line 58) | struct gptoss_u32_fill_random_args {
type gptoss_f32_fill_random_args (line 65) | struct gptoss_f32_fill_random_args {
type gptoss_accumulate_args (line 74) | struct gptoss_accumulate_args {
type gptoss_convert_args (line 80) | struct gptoss_convert_args {
type gptoss_embeddings_args (line 85) | struct gptoss_embeddings_args {
type gptoss_rmsnorm_args (line 89) | struct gptoss_rmsnorm_args {
type gptoss_matmul_args (line 95) | struct gptoss_matmul_args {
type gptoss_dense_matmul_args (line 101) | struct gptoss_dense_matmul_args {
type gptoss_dense_matmul_qkv_args (line 108) | struct gptoss_dense_matmul_qkv_args {
type gptoss_scatter_args (line 116) | struct gptoss_scatter_args {
type gptoss_moe_dense_matmul_swiglu_args (line 122) | struct gptoss_moe_dense_matmul_swiglu_args {
type gptoss_moe_dense_matmul_args (line 131) | struct gptoss_moe_dense_matmul_args {
type gptoss_expert_routing_metadata_args (line 139) | struct gptoss_expert_routing_metadata_args {
type gptoss_gather_args (line 144) | struct gptoss_gather_args {
type gptoss_unembedding_args (line 150) | struct gptoss_unembedding_args {
type gptoss_moe_matmul_swiglu_args (line 156) | struct gptoss_moe_matmul_swiglu_args {
type gptoss_moe_matmul_args (line 166) | struct gptoss_moe_matmul_args {
type gptoss_rope_args (line 175) | struct gptoss_rope_args {
type gptoss_qkv_args (line 186) | struct gptoss_qkv_args {
type gptoss_softmax_args (line 198) | struct gptoss_softmax_args {
type gptoss_sample_args (line 205) | struct gptoss_sample_args {
FILE: gpt_oss/metal/source/include/internal/log.h
function gptoss_log (line 8) | __attribute__((__format__(__printf__, 1, 2)))
FILE: gpt_oss/metal/source/include/internal/math.h
function math_ceil_div (line 7) | inline static size_t math_ceil_div(size_t numer, size_t denom) {
function math_max (line 11) | inline static size_t math_max(size_t a, size_t b) {
function math_min (line 15) | inline static size_t math_min(size_t a, size_t b) {
function math_sub_sat (line 19) | inline static size_t math_sub_sat(size_t a, size_t b) {
function math_round_down_po2 (line 23) | static size_t math_round_down_po2(size_t number, size_t multiple) {
function math_round_up_po2 (line 30) | static size_t math_round_up_po2(size_t number, size_t multiple) {
FILE: gpt_oss/metal/source/include/internal/metal-kernels.h
type gptoss_status (line 20) | enum gptoss_status
type gptoss_metal_command_buffer (line 21) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 22) | struct gptoss_metal_function
type gptoss_metal_buffer (line 25) | struct gptoss_metal_buffer
type gptoss_status (line 31) | enum gptoss_status
type gptoss_metal_command_buffer (line 32) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 33) | struct gptoss_metal_function
type gptoss_metal_buffer (line 36) | struct gptoss_metal_buffer
type gptoss_status (line 44) | enum gptoss_status
type gptoss_metal_command_buffer (line 45) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 46) | struct gptoss_metal_function
type gptoss_metal_buffer (line 49) | struct gptoss_metal_buffer
type gptoss_status (line 57) | enum gptoss_status
type gptoss_metal_command_buffer (line 58) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 59) | struct gptoss_metal_function
type gptoss_metal_buffer (line 62) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 63) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 64) | struct gptoss_metal_buffer
type gptoss_status (line 67) | enum gptoss_status
type gptoss_metal_command_buffer (line 68) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 69) | struct gptoss_metal_function
type gptoss_metal_buffer (line 71) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 73) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 75) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 77) | struct gptoss_metal_buffer
type gptoss_status (line 82) | enum gptoss_status
type gptoss_metal_command_buffer (line 83) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 84) | struct gptoss_metal_function
type gptoss_metal_buffer (line 85) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 87) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 89) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 91) | struct gptoss_metal_buffer
type gptoss_status (line 97) | enum gptoss_status
type gptoss_metal_command_buffer (line 98) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 99) | struct gptoss_metal_function
type gptoss_metal_buffer (line 101) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 103) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 105) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 107) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 109) | struct gptoss_metal_buffer
type gptoss_status (line 115) | enum gptoss_status
type gptoss_metal_command_buffer (line 116) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 117) | struct gptoss_metal_function
type gptoss_metal_buffer (line 119) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 121) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 123) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 125) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 127) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 129) | struct gptoss_metal_buffer
type gptoss_status (line 144) | enum gptoss_status
type gptoss_metal_command_buffer (line 145) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 146) | struct gptoss_metal_function
type gptoss_metal_buffer (line 148) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 150) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 152) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 154) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 156) | struct gptoss_metal_buffer
type gptoss_status (line 162) | enum gptoss_status
type gptoss_metal_command_buffer (line 164) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 165) | struct gptoss_metal_function
type gptoss_metal_buffer (line 166) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 168) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 170) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 172) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 174) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 176) | struct gptoss_metal_buffer
type gptoss_status (line 184) | enum gptoss_status
type gptoss_metal_command_buffer (line 186) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 187) | struct gptoss_metal_function
type gptoss_metal_buffer (line 188) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 190) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 192) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 194) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 196) | struct gptoss_metal_buffer
type gptoss_status (line 202) | enum gptoss_status
type gptoss_metal_command_buffer (line 204) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 205) | struct gptoss_metal_function
type gptoss_metal_buffer (line 206) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 208) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 210) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 212) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 214) | struct gptoss_metal_buffer
type gptoss_status (line 220) | enum gptoss_status
type gptoss_metal_command_buffer (line 221) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 222) | struct gptoss_metal_function
type gptoss_metal_buffer (line 225) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 227) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 229) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 231) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 233) | struct gptoss_metal_buffer
type gptoss_status (line 239) | enum gptoss_status
type gptoss_metal_command_buffer (line 240) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 241) | struct gptoss_metal_function
type gptoss_metal_buffer (line 243) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 245) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 247) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 249) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 251) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 253) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 255) | struct gptoss_metal_buffer
type gptoss_status (line 264) | enum gptoss_status
type gptoss_metal_command_buffer (line 265) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 266) | struct gptoss_metal_function
type gptoss_metal_buffer (line 268) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 270) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 272) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 274) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 276) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 278) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 280) | struct gptoss_metal_buffer
type gptoss_status (line 288) | enum gptoss_status
type gptoss_metal_command_buffer (line 289) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 290) | struct gptoss_metal_function
type gptoss_metal_buffer (line 292) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 294) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 296) | struct gptoss_metal_buffer
type gptoss_status (line 310) | enum gptoss_status
type gptoss_metal_command_buffer (line 311) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 312) | struct gptoss_metal_function
type gptoss_metal_buffer (line 315) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 317) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 319) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 321) | struct gptoss_metal_buffer
type gptoss_status (line 327) | enum gptoss_status
type gptoss_metal_command_buffer (line 328) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 329) | struct gptoss_metal_function
type gptoss_metal_buffer (line 330) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 332) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 334) | struct gptoss_metal_buffer
type gptoss_status (line 340) | enum gptoss_status
type gptoss_metal_command_buffer (line 341) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 342) | struct gptoss_metal_function
type gptoss_metal_buffer (line 343) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 345) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 347) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 349) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 351) | struct gptoss_metal_buffer
type gptoss_status (line 357) | enum gptoss_status
type gptoss_metal_command_buffer (line 359) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 360) | struct gptoss_metal_function
type gptoss_metal_buffer (line 361) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 363) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 365) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 367) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 369) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 371) | struct gptoss_metal_buffer
type gptoss_status (line 380) | enum gptoss_status
type gptoss_metal_command_buffer (line 381) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 382) | struct gptoss_metal_function
type gptoss_metal_buffer (line 383) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 385) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 387) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 389) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 391) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 393) | struct gptoss_metal_buffer
type gptoss_status (line 401) | enum gptoss_status
type gptoss_metal_command_buffer (line 402) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 403) | struct gptoss_metal_function
type gptoss_metal_buffer (line 404) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 406) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 408) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 410) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 412) | struct gptoss_metal_buffer
type gptoss_status (line 418) | enum gptoss_status
type gptoss_metal_command_buffer (line 419) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 420) | struct gptoss_metal_function
type gptoss_metal_buffer (line 421) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 423) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 425) | struct gptoss_metal_buffer
type gptoss_status (line 431) | enum gptoss_status
type gptoss_metal_command_buffer (line 432) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 433) | struct gptoss_metal_function
type gptoss_metal_buffer (line 434) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 436) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 438) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 440) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 442) | struct gptoss_metal_buffer
type gptoss_status (line 452) | enum gptoss_status
type gptoss_metal_command_buffer (line 453) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 454) | struct gptoss_metal_function
type gptoss_metal_buffer (line 457) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 459) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 461) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 463) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 465) | struct gptoss_metal_buffer
type gptoss_status (line 473) | enum gptoss_status
type gptoss_metal_command_buffer (line 474) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 475) | struct gptoss_metal_function
type gptoss_metal_buffer (line 477) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 479) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 481) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 483) | struct gptoss_metal_buffer
FILE: gpt_oss/metal/source/include/internal/metal.h
type gptoss_metal_device (line 11) | struct gptoss_metal_device {
type gptoss_status (line 21) | enum gptoss_status
type gptoss_metal_device (line 22) | struct gptoss_metal_device
type gptoss_status (line 24) | enum gptoss_status
type gptoss_metal_device (line 25) | struct gptoss_metal_device
type gptoss_metal_library (line 28) | struct gptoss_metal_library {
type gptoss_status (line 32) | enum gptoss_status
type gptoss_metal_device (line 33) | struct gptoss_metal_device
type gptoss_metal_library (line 34) | struct gptoss_metal_library
type gptoss_status (line 36) | enum gptoss_status
type gptoss_metal_library (line 37) | struct gptoss_metal_library
type gptoss_metal_function (line 39) | struct gptoss_metal_function {
type gptoss_status (line 47) | enum gptoss_status
type gptoss_metal_library (line 48) | struct gptoss_metal_library
type gptoss_metal_function (line 50) | struct gptoss_metal_function
type gptoss_status (line 52) | enum gptoss_status
type gptoss_metal_function (line 53) | struct gptoss_metal_function
type gptoss_metal_buffer (line 55) | struct gptoss_metal_buffer {
type gptoss_status (line 61) | enum gptoss_status
type gptoss_metal_device (line 62) | struct gptoss_metal_device
type gptoss_metal_buffer (line 65) | struct gptoss_metal_buffer
type gptoss_status (line 67) | enum gptoss_status
type gptoss_metal_device (line 68) | struct gptoss_metal_device
type gptoss_metal_buffer (line 71) | struct gptoss_metal_buffer
type gptoss_status (line 73) | enum gptoss_status
type gptoss_metal_buffer (line 74) | struct gptoss_metal_buffer
type gptoss_metal_command_queue (line 76) | struct gptoss_metal_command_queue {
type gptoss_status (line 80) | enum gptoss_status
type gptoss_metal_device (line 81) | struct gptoss_metal_device
type gptoss_metal_command_queue (line 82) | struct gptoss_metal_command_queue
type gptoss_status (line 84) | enum gptoss_status
type gptoss_metal_command_queue (line 85) | struct gptoss_metal_command_queue
type gptoss_metal_command_buffer (line 87) | struct gptoss_metal_command_buffer {
type gptoss_status (line 91) | enum gptoss_status
type gptoss_metal_command_queue (line 92) | struct gptoss_metal_command_queue
type gptoss_metal_command_buffer (line 93) | struct gptoss_metal_command_buffer
type gptoss_status (line 95) | enum gptoss_status
type gptoss_metal_command_buffer (line 96) | struct gptoss_metal_command_buffer
type gptoss_metal_buffer (line 97) | struct gptoss_metal_buffer
type gptoss_status (line 102) | enum gptoss_status
type gptoss_metal_command_buffer (line 103) | struct gptoss_metal_command_buffer
type gptoss_metal_buffer (line 104) | struct gptoss_metal_buffer
type gptoss_metal_buffer (line 106) | struct gptoss_metal_buffer
type gptoss_status (line 110) | enum gptoss_status
type gptoss_metal_command_buffer (line 111) | struct gptoss_metal_command_buffer
type gptoss_metal_function (line 112) | struct gptoss_metal_function
type gptoss_metal_buffer (line 122) | struct gptoss_metal_buffer
type gptoss_status (line 126) | enum gptoss_status
type gptoss_metal_command_buffer (line 127) | struct gptoss_metal_command_buffer
type gptoss_status (line 129) | enum gptoss_status
type gptoss_metal_command_buffer (line 130) | struct gptoss_metal_command_buffer
type gptoss_status (line 133) | enum gptoss_status
type gptoss_metal_command_buffer (line 134) | struct gptoss_metal_command_buffer
FILE: gpt_oss/metal/source/include/internal/metal.hpp
type gptoss (line 14) | namespace gptoss {
function Check (line 16) | inline void Check(gptoss_status s, const char* what) {
function round_up (line 22) | inline std::size_t round_up(std::size_t p, std::size_t q) {
type metal (line 31) | namespace metal {
class Device (line 33) | class Device {
method Device (line 35) | inline Device() {
method Device (line 43) | Device(const Device&) = delete;
method Device (line 44) | Device& operator=(const Device&) = delete;
method Device (line 46) | inline Device(Device&& other) noexcept {
method Device (line 51) | inline Device& operator=(Device&& other) noexcept {
method gptoss_metal_device (line 60) | inline const gptoss_metal_device* handle() const noexcept { return...
method max_buffer_size (line 62) | inline size_t max_buffer_size() const noexcept { return device_.ma...
method max_threadgroup_memory (line 63) | inline size_t max_threadgroup_memory() const noexcept { return dev...
method max_threadgroup_threads_x (line 64) | inline size_t max_threadgroup_threads_x() const noexcept { return ...
method max_threadgroup_threads_y (line 65) | inline size_t max_threadgroup_threads_y() const noexcept { return ...
method max_threadgroup_threads_z (line 66) | inline size_t max_threadgroup_threads_z() const noexcept { return ...
class Library (line 72) | class Library {
method Library (line 74) | inline explicit Library(const Device& dev) {
method Library (line 83) | Library(const Library&) = delete;
method Library (line 84) | Library& operator=(const Library&) = delete;
method Library (line 86) | inline Library(Library&& other) noexcept {
method Library (line 91) | inline Library& operator=(Library&& other) noexcept {
method gptoss_metal_library (line 100) | inline const gptoss_metal_library* handle() const noexcept {
class Function (line 108) | class Function {
method Function (line 110) | inline Function(const Library& library, const char* name) {
method Function (line 119) | Function(const Function&) = delete;
method Function (line 120) | Function& operator=(const Function&) = delete;
method Function (line 122) | inline Function(Function&& other) noexcept {
method Function (line 127) | inline Function& operator=(Function&& other) noexcept {
method gptoss_metal_function (line 136) | inline const gptoss_metal_function* handle() const noexcept { retu...
method max_threadgroup_threads (line 138) | inline size_t max_threadgroup_threads() const noexcept { return fu...
method simdgroup_threads (line 139) | inline size_t simdgroup_threads() const noexcept { return function...
method static_threadgroup_memory (line 140) | inline size_t static_threadgroup_memory() const noexcept { return ...
class Buffer (line 146) | class Buffer {
method Buffer (line 148) | inline Buffer(const Device& dev, size_t size, const void* data = n...
method Buffer (line 156) | Buffer(const Buffer&) = delete;
method Buffer (line 157) | Buffer& operator=(const Buffer&) = delete;
method Buffer (line 159) | inline Buffer(Buffer&& other) noexcept {
method Buffer (line 164) | inline Buffer& operator=(Buffer&& other) noexcept {
method size (line 173) | inline size_t size() const noexcept { return buffer_.size; }
method gptoss_metal_buffer (line 176) | inline const gptoss_metal_buffer* handle() const noexcept { return...
class CommandQueue (line 182) | class CommandQueue {
method CommandQueue (line 184) | inline explicit CommandQueue(const Device& dev) {
method CommandQueue (line 193) | CommandQueue(const CommandQueue&) = delete;
method CommandQueue (line 194) | CommandQueue& operator=(const CommandQueue&) = delete;
method CommandQueue (line 196) | inline CommandQueue(CommandQueue&& other) noexcept {
method CommandQueue (line 201) | inline CommandQueue& operator=(CommandQueue&& other) noexcept {
method gptoss_metal_command_queue (line 210) | inline const gptoss_metal_command_queue* handle() const noexcept {
class CommandBuffer (line 218) | class CommandBuffer {
method CommandBuffer (line 220) | inline explicit CommandBuffer(const CommandQueue& command_queue) {
method CommandBuffer (line 228) | CommandBuffer(const CommandBuffer&) = delete;
method CommandBuffer (line 229) | CommandBuffer& operator=(const CommandBuffer&) = delete;
method CommandBuffer (line 231) | inline CommandBuffer(CommandBuffer&& other) noexcept {
method CommandBuffer (line 236) | inline CommandBuffer& operator=(CommandBuffer&& other) noexcept {
method encode_launch_kernel (line 245) | inline void encode_launch_kernel(const Function& function,
method encode_launch_f32_fill_random (line 267) | inline void encode_launch_f32_fill_random(const Function& f32_fill...
method encode_launch_bf16_fill_random (line 287) | inline void encode_launch_bf16_fill_random(const Function& bf16_fi...
method encode_launch_u32_fill_random (line 307) | inline void encode_launch_u32_fill_random(const Function& u32_fill...
method commit (line 325) | inline void commit() {
method wait_completion (line 329) | inline double wait_completion() {
method gptoss_metal_command_buffer (line 335) | inline const gptoss_metal_command_buffer* handle() const noexcept ...
FILE: gpt_oss/metal/source/include/internal/model.h
type gptoss_tokenizer (line 13) | struct gptoss_tokenizer {
type gptoss_model (line 32) | struct gptoss_model {
type gptoss_context (line 134) | struct gptoss_context {
FILE: gpt_oss/metal/source/include/internal/rng.h
function rng_squares32 (line 5) | inline static uint32_t rng_squares32(uint64_t offset, uint64_t seed) {
FILE: gpt_oss/metal/source/include/internal/rng.hpp
type gptoss (line 5) | namespace gptoss {
type rng (line 7) | namespace rng {
function squares32 (line 9) | inline static std::uint32_t squares32(std::uint64_t offset, std::uin...
FILE: gpt_oss/metal/source/include/internal/storage.h
type gptoss_file_header (line 6) | struct gptoss_file_header {
type gptoss_gptoss_model_header (line 11) | struct gptoss_gptoss_model_header {
type gptoss_tiktoken_tokenizer_header (line 31) | struct gptoss_tiktoken_tokenizer_header {
FILE: gpt_oss/metal/source/include/internal/uuid.h
type gptoss_uuid (line 10) | struct GPTOSS_DENSELY_PACKED_STRUCTURE gptoss_uuid {
type gptoss_uuid (line 13) | struct gptoss_uuid
function gptoss_is_gptoss_model_uuid (line 21) | static inline bool gptoss_is_gptoss_model_uuid(const struct gptoss_uuid*...
function gptoss_is_applegpu_layout_uuid (line 28) | static inline bool gptoss_is_applegpu_layout_uuid(const struct gptoss_uu...
function gptoss_is_tiktoken_tokenizer_uuid (line 35) | static inline bool gptoss_is_tiktoken_tokenizer_uuid(const struct gptoss...
function gptoss_special_token_decode_uuid (line 42) | static inline enum gptoss_special_token gptoss_special_token_decode_uuid...
FILE: gpt_oss/metal/source/log.c
function gptoss_format_log (line 12) | void gptoss_format_log(const char* format, va_list args) {
FILE: gpt_oss/metal/source/metal-kernels.c
function gptoss_metal_command_buffer_encode_launch_u32_fill_random (line 13) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_u32_fill_ra...
function gptoss_metal_command_buffer_encode_launch_f32_fill_random (line 53) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_fill_ra...
function gptoss_metal_command_buffer_encode_launch_bf16_fill_random (line 101) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_bf16_fill_r...
function gptoss_metal_command_buffer_encode_launch_mf4_f32_convert (line 149) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_mf4_f32_con...
function gptoss_metal_command_buffer_encode_launch_bf16_f32_embeddings (line 190) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_bf16_f32_em...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_rmsnorm (line 235) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_r...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_matmul (line 284) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_m...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_matmul_qkv (line 344) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_m...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_matmul_add (line 439) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_m...
function _gptoss_metal_command_buffer_encode_launch_f32_bf16w_dense_matmul_impl (line 499) | enum gptoss_status _gptoss_metal_command_buffer_encode_launch_f32_bf16w_...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_dense_matmul_qkv (line 592) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_d...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_dense_matmul_attn_output (line 685) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_d...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_dense_matmul_mlp_gate (line 709) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_d...
function gptoss_metal_command_buffer_encode_launch_f32_bf16w_unembedding (line 734) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_bf16w_u...
function gptoss_metal_command_buffer_encode_launch_f32_mf4w_moe_matmul_swiglu (line 792) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_mf4w_mo...
function gptoss_metal_command_buffer_encode_launch_f32_mf4w_moe_matmul (line 868) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_mf4w_mo...
function gptoss_metal_command_buffer_encode_launch_f32_rope (line 942) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_rope(
function gptoss_metal_command_buffer_encode_launch_expert_routing_metadata (line 1002) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_expert_rout...
function gptoss_metal_command_buffer_encode_launch_f32_scatter (line 1034) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_scatter(
function gptoss_metal_command_buffer_encode_launch_f32_gather_and_accumulate_e4 (line 1087) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_gather_...
function gptoss_metal_command_buffer_encode_launch_f32_mf4w_moe_dense_matmul_swiglu (line 1140) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_mf4w_mo...
function gptoss_metal_command_buffer_encode_launch_f32_mf4w_moe_dense_matmul (line 1236) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_mf4w_mo...
function gptoss_metal_command_buffer_encode_launch_f32_accumulate (line 1329) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_accumul...
function gptoss_metal_command_buffer_encode_launch_f32_topk (line 1381) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_topk(
function gptoss_metal_command_buffer_encode_launch_f32_sdpa (line 1419) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_sdpa(
function gptoss_metal_command_buffer_encode_launch_f32_softmax (line 1478) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_softmax(
function gptoss_metal_command_buffer_encode_launch_f32_sample (line 1528) | enum gptoss_status gptoss_metal_command_buffer_encode_launch_f32_sample(
FILE: gpt_oss/metal/source/model.c
function round_up_to_page_size (line 27) | static size_t round_up_to_page_size(size_t bytes) {
function round_down_to_page_size (line 36) | static size_t round_down_to_page_size(size_t bytes) {
function read_fd (line 41) | static enum gptoss_status read_fd(int fd, void* data, size_t size, const...
function prefetch_fd (line 61) | static void prefetch_fd(int fd, size_t offset, size_t size, const char* ...
function gptoss_model_create_from_file (line 80) | enum gptoss_status GPTOSS_ABI gptoss_model_create_from_file(
function gptoss_model_get_tokenizer (line 496) | enum gptoss_status GPTOSS_ABI gptoss_model_get_tokenizer(
function gptoss_model_get_max_context_length (line 506) | enum gptoss_status GPTOSS_ABI gptoss_model_get_max_context_length(
function gptoss_model_retain (line 514) | enum gptoss_status GPTOSS_ABI gptoss_model_retain(
function gptoss_model_release (line 521) | enum gptoss_status GPTOSS_ABI gptoss_model_release(
FILE: gpt_oss/metal/source/tokenizer.c
function gptoss_tokenizer_get_special_token_id (line 17) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_get_special_token_id(
function gptoss_tokenizer_get_num_text_tokens (line 35) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_get_num_text_tokens(
function gptoss_tokenizer_get_num_special_tokens (line 43) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_get_num_special_tokens(
function gptoss_tokenizer_get_num_tokens (line 51) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_get_num_tokens(
function gptoss_tokenizer_decode (line 59) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_decode(
function gptoss_tokenizer_retain (line 83) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_retain(
function gptoss_tokenizer_release (line 90) | enum gptoss_status GPTOSS_ABI gptoss_tokenizer_release(
FILE: gpt_oss/metal/test/bf16-f32-embeddings.cc
function TEST (line 13) | TEST(BF16_F32_EMBEDDINGS, single_token_single_tile) {
function TEST (line 20) | TEST(BF16_F32_EMBEDDINGS, single_token_multi_tile) {
function TEST (line 27) | TEST(BF16_F32_EMBEDDINGS, multiple_tokens) {
FILE: gpt_oss/metal/test/embeddings-kernel-tester.hpp
type gptoss (line 13) | namespace gptoss {
class EmbeddingsKernelTester (line 15) | class EmbeddingsKernelTester {
method EmbeddingsKernelTester (line 17) | EmbeddingsKernelTester() { }
method EmbeddingsKernelTester (line 19) | EmbeddingsKernelTester(const EmbeddingsKernelTester&) = delete;
method EmbeddingsKernelTester (line 20) | EmbeddingsKernelTester(EmbeddingsKernelTester&&) = delete;
method EmbeddingsKernelTester (line 21) | EmbeddingsKernelTester& operator=(const EmbeddingsKernelTester&) = d...
method EmbeddingsKernelTester (line 22) | EmbeddingsKernelTester& operator=(EmbeddingsKernelTester&&) = delete;
method EmbeddingsKernelTester (line 24) | [[nodiscard]]
method num_channels (line 30) | std::uint32_t num_channels() const {
method EmbeddingsKernelTester (line 34) | [[nodiscard]]
method num_tokens (line 40) | std::uint32_t num_tokens() const {
method vocabulary_size (line 44) | std::uint32_t vocabulary_size() const {
method EmbeddingsKernelTester (line 48) | [[nodiscard]]
method threadgroup_size (line 54) | std::size_t threadgroup_size() const {
method Validate (line 58) | void Validate() const {
method TestBF16_F32 (line 65) | void TestBF16_F32() const {
FILE: gpt_oss/metal/test/f32-bf16w-matmul.cc
function TEST (line 13) | TEST(F32_BF16W_MATMUL, single_simdgroup_single_iteration) {
function TEST (line 21) | TEST(F32_BF16W_MATMUL, single_simdgroup_multiple_iteration) {
function TEST (line 29) | TEST(F32_BF16W_MATMUL, single_threadgroup) {
function TEST (line 39) | TEST(F32_BF16W_MATMUL, multiple_threadgroups) {
function TEST (line 50) | TEST(F32_BF16W_MATMUL, multiple_tokens) {
function TEST (line 62) | TEST(F32_BF16W_DENSE_MATMUL_QKV, seq_len_1024) {
function TEST (line 71) | TEST(F32_BF16W_DENSE_MATMUL_ATTN_OUTPUT, seq_len_1024) {
function TEST (line 80) | TEST(F32_BF16W_DENSE_MATMUL_MLP_GATE, seq_len_1024) {
FILE: gpt_oss/metal/test/f32-bf16w-rmsnorm.cc
function TEST (line 13) | TEST(F32_BF16W_RMSNORM, single_iteration) {
function TEST (line 19) | TEST(F32_BF16W_RMSNORM, multiple_iterations) {
function TEST (line 25) | TEST(F32_BF16W_RMSNORM, partial_iteration) {
function TEST (line 31) | TEST(F32_BF16W_RMSNORM, multiple_tokens) {
FILE: gpt_oss/metal/test/f32-random.cc
function TEST (line 21) | TEST(F32_FILL_RANDOM, single_threadgroup_single_iteration) {
function TEST (line 55) | TEST(F32_FILL_RANDOM, single_threadgroup_multiple_iterations) {
function TEST (line 90) | TEST(F32_FILL_RANDOM, multiple_threadgroups_multiple_iterations) {
function TEST (line 126) | TEST(F32_FILL_RANDOM, excessive_threadgroups) {
function TEST (line 160) | TEST(F32_FILL_RANDOM, nonuniform_range) {
function TEST (line 196) | TEST(F32_FILL_RANDOM, partial_range) {
FILE: gpt_oss/metal/test/f32-rope.cc
function TEST (line 16) | TEST(F32_ROPE, single_simdgroup) {
function TEST (line 27) | TEST(F32_ROPE, single_threadgroup) {
function TEST (line 41) | TEST(F32_ROPE, multiple_threadgroups) {
function TEST (line 56) | TEST(F32_ROPE, multiple_tokens) {
FILE: gpt_oss/metal/test/fill-random-kernel-tester.hpp
type gptoss (line 14) | namespace gptoss {
class FillRandomKernelTester (line 16) | class FillRandomKernelTester {
method FillRandomKernelTester (line 18) | FillRandomKernelTester() { }
method FillRandomKernelTester (line 20) | FillRandomKernelTester(const FillRandomKernelTester&) = delete;
method FillRandomKernelTester (line 21) | FillRandomKernelTester(FillRandomKernelTester&&) = delete;
method FillRandomKernelTester (line 22) | FillRandomKernelTester& operator=(const FillRandomKernelTester&) = d...
method FillRandomKernelTester (line 23) | FillRandomKernelTester& operator=(FillRandomKernelTester&&) = delete;
method FillRandomKernelTester (line 25) | [[nodiscard]]
method num_elements (line 31) | std::uint32_t num_elements() const {
method FillRandomKernelTester (line 35) | [[nodiscard]]
method threadgroup_size (line 41) | std::size_t threadgroup_size() const {
method FillRandomKernelTester (line 45) | [[nodiscard]]
method max_threadgroups (line 51) | std::size_t max_threadgroups() const {
method Validate (line 55) | void Validate() const {
method TestU32 (line 61) | void TestU32() const {
FILE: gpt_oss/metal/test/matmul-kernel-tester.hpp
type gptoss (line 13) | namespace gptoss {
function IsNearAbsRel (line 16) | ::testing::AssertionResult
class MatMulKernelTester (line 46) | class MatMulKernelTester {
method MatMulKernelTester (line 48) | MatMulKernelTester() { }
method MatMulKernelTester (line 50) | MatMulKernelTester(const MatMulKernelTester&) = delete;
method MatMulKernelTester (line 51) | MatMulKernelTester(MatMulKernelTester&&) = delete;
method MatMulKernelTester (line 52) | MatMulKernelTester& operator=(const MatMulKernelTester&) = delete;
method MatMulKernelTester (line 53) | MatMulKernelTester& operator=(MatMulKernelTester&&) = delete;
method MatMulKernelTester (line 55) | [[nodiscard]]
method num_rows (line 61) | std::uint32_t num_rows() const {
method MatMulKernelTester (line 65) | [[nodiscard]]
method num_cols (line 71) | std::uint32_t num_cols() const {
method MatMulKernelTester (line 75) | [[nodiscard]]
method num_tokens (line 81) | std::uint32_t num_tokens() const {
method MatMulKernelTester (line 85) | [[nodiscard]]
method threadgroup_size (line 91) | std::size_t threadgroup_size() const {
method Validate (line 95) | void Validate(std::uint32_t vec_size) const {
type MatMulKernelType (line 103) | enum class MatMulKernelType {
method TestF32_BF16W (line 110) | void TestF32_BF16W(MatMulKernelType kernel_type = MatMulKernelType::...
FILE: gpt_oss/metal/test/mf4-f32-convert.cc
function TEST (line 20) | TEST(MF4_F32_CONVERT, single_threadgroup_single_iteration) {
function TEST (line 78) | TEST(MF4_F32_CONVERT, multiple_threadgroups_multiple_iterations) {
FILE: gpt_oss/metal/test/rmsnorm-kernel-tester.hpp
type gptoss (line 14) | namespace gptoss {
class RMSNormKernelTester (line 16) | class RMSNormKernelTester {
method RMSNormKernelTester (line 18) | RMSNormKernelTester() { }
method RMSNormKernelTester (line 20) | RMSNormKernelTester(const RMSNormKernelTester&) = delete;
method RMSNormKernelTester (line 21) | RMSNormKernelTester(RMSNormKernelTester&&) = delete;
method RMSNormKernelTester (line 22) | RMSNormKernelTester& operator=(const RMSNormKernelTester&) = delete;
method RMSNormKernelTester (line 23) | RMSNormKernelTester& operator=(RMSNormKernelTester&&) = delete;
method RMSNormKernelTester (line 25) | [[nodiscard]]
method num_channels (line 31) | std::uint32_t num_channels() const {
method RMSNormKernelTester (line 35) | [[nodiscard]]
method num_tokens (line 41) | std::uint32_t num_tokens() const {
method RMSNormKernelTester (line 45) | [[nodiscard]]
method epsilon (line 51) | float epsilon() const {
method Validate (line 55) | void Validate() const {
method TestF32_BF16W (line 61) | void TestF32_BF16W() const {
FILE: gpt_oss/metal/test/rope-kernel-tester.hpp
type gptoss (line 14) | namespace gptoss {
class RoPEKernelTester (line 16) | class RoPEKernelTester {
method RoPEKernelTester (line 18) | RoPEKernelTester() { }
method RoPEKernelTester (line 20) | RoPEKernelTester(const RoPEKernelTester&) = delete;
method RoPEKernelTester (line 21) | RoPEKernelTester(RoPEKernelTester&&) = delete;
method RoPEKernelTester (line 22) | RoPEKernelTester& operator=(const RoPEKernelTester&) = delete;
method RoPEKernelTester (line 23) | RoPEKernelTester& operator=(RoPEKernelTester&&) = delete;
method RoPEKernelTester (line 25) | [[nodiscard]]
method threadgroup_size (line 31) | std::size_t threadgroup_size() const {
method RoPEKernelTester (line 35) | [[nodiscard]]
method head_dim (line 41) | std::uint32_t head_dim() const {
method RoPEKernelTester (line 45) | [[nodiscard]]
method num_q_heads (line 51) | std::uint32_t num_q_heads() const {
method RoPEKernelTester (line 55) | [[nodiscard]]
method num_kv_heads (line 61) | std::uint32_t num_kv_heads() const {
method num_qk_heads (line 65) | std::uint32_t num_qk_heads() const {
method num_qkv_heads (line 69) | std::uint32_t num_qkv_heads() const {
method RoPEKernelTester (line 73) | [[nodiscard]]
method num_tokens (line 79) | std::uint32_t num_tokens() const {
method RoPEKernelTester (line 83) | [[nodiscard]]
method token_offset (line 89) | std::uint32_t token_offset() const {
method RoPEKernelTester (line 93) | [[nodiscard]]
method frequency_base (line 99) | float frequency_base() const {
method Validate (line 103) | void Validate() const {
method TestF32 (line 110) | void TestF32() const {
FILE: gpt_oss/metal/test/u32-random.cc
function TEST (line 13) | TEST(U32_FILL_RANDOM, single_threadgroup_single_iteration) {
function TEST (line 21) | TEST(U32_FILL_RANDOM, single_threadgroup_multiple_iterations) {
function TEST (line 31) | TEST(U32_FILL_RANDOM, multiple_threadgroups_multiple_iterations) {
function TEST (line 42) | TEST(U32_FILL_RANDOM, excessive_threadgroups) {
function TEST (line 50) | TEST(U32_FILL_RANDOM, nonuniform_range) {
function TEST (line 61) | TEST(U32_FILL_RANDOM, partial_range) {
FILE: gpt_oss/responses_api/api_server.py
function get_reasoning_effort (line 74) | def get_reasoning_effort(
function is_not_builtin_tool (line 88) | def is_not_builtin_tool(
function create_api_server (line 100) | def create_api_server(
FILE: gpt_oss/responses_api/events.py
class ResponseEvent (line 21) | class ResponseEvent(BaseModel):
class ResponseCreatedEvent (line 25) | class ResponseCreatedEvent(ResponseEvent):
class ResponseCompletedEvent (line 30) | class ResponseCompletedEvent(ResponseEvent):
class ResponseOutputTextDelta (line 35) | class ResponseOutputTextDelta(ResponseEvent):
class ResponseReasoningSummaryTextDelta (line 44) | class ResponseReasoningSummaryTextDelta(ResponseEvent):
class ResponseReasoningTextDelta (line 54) | class ResponseReasoningTextDelta(ResponseEvent):
class ResponseReasoningTextDone (line 62) | class ResponseReasoningTextDone(ResponseEvent):
class ResponseOutputItemAdded (line 70) | class ResponseOutputItemAdded(ResponseEvent):
class ResponseOutputItemDone (line 82) | class ResponseOutputItemDone(ResponseEvent):
class ResponseInProgressEvent (line 94) | class ResponseInProgressEvent(ResponseEvent):
class ResponseContentPartAdded (line 99) | class ResponseContentPartAdded(ResponseEvent):
class ResponseOutputTextDone (line 107) | class ResponseOutputTextDone(ResponseEvent):
class ResponseContentPartDone (line 116) | class ResponseContentPartDone(ResponseEvent):
class ResponseOutputTextAnnotationAdded (line 124) | class ResponseOutputTextAnnotationAdded(ResponseEvent):
class ResponseWebSearchCallInProgress (line 135) | class ResponseWebSearchCallInProgress(ResponseEvent):
class ResponseWebSearchCallSearching (line 143) | class ResponseWebSearchCallSearching(ResponseEvent):
class ResponseWebSearchCallCompleted (line 151) | class ResponseWebSearchCallCompleted(ResponseEvent):
class ResponseCodeInterpreterCallInProgress (line 159) | class ResponseCodeInterpreterCallInProgress(ResponseEvent):
class ResponseCodeInterpreterCallInterpreting (line 167) | class ResponseCodeInterpreterCallInterpreting(ResponseEvent):
class ResponseCodeInterpreterCallCodeDelta (line 175) | class ResponseCodeInterpreterCallCodeDelta(ResponseEvent):
class ResponseCodeInterpreterCallCodeDone (line 187) | class ResponseCodeInterpreterCallCodeDone(ResponseEvent):
class ResponseCodeInterpreterCallCompleted (line 199) | class ResponseCodeInterpreterCallCompleted(ResponseEvent):
FILE: gpt_oss/responses_api/inference/metal.py
function setup_model (line 12) | def setup_model(checkpoint: str) -> Callable[[list[int], float], int]:
FILE: gpt_oss/responses_api/inference/ollama.py
function lcp (line 33) | def lcp(cache: list[int], inp: list[int]) -> list[int]:
function _now (line 41) | def _now():
function _touch_progress (line 45) | def _touch_progress():
function _reset_stream_state (line 50) | def _reset_stream_state():
function setup_model (line 60) | def setup_model(checkpoint: str) -> Callable[[list[int], float, bool], i...
FILE: gpt_oss/responses_api/inference/stub.py
function stub_infer_next_token (line 130) | def stub_infer_next_token(
function setup_model (line 141) | def setup_model(_checkpoint: str) -> Callable[[list[int], float], int]:
FILE: gpt_oss/responses_api/inference/transformers.py
function load_model (line 17) | def load_model(checkpoint: str):
function get_infer_next_token (line 31) | def get_infer_next_token(model: PreTrainedModel):
function setup_model (line 53) | def setup_model(checkpoint: str) -> Callable[[List[int], float, bool], i...
FILE: gpt_oss/responses_api/inference/triton.py
function load_model (line 20) | def load_model(checkpoint: str):
function get_infer_next_token (line 34) | def get_infer_next_token(model, device):
function setup_model (line 99) | def setup_model(checkpoint: str) -> Callable[[list[int], float], int]:
FILE: gpt_oss/responses_api/inference/vllm.py
function load_model (line 16) | def load_model(checkpoint: str):
function get_infer_next_token (line 32) | def get_infer_next_token(llm: LLM):
function setup_model (line 81) | def setup_model(checkpoint: str) -> Callable[[List[int], float, bool], i...
FILE: gpt_oss/responses_api/types.py
class UrlCitation (line 12) | class UrlCitation(BaseModel):
class TextContentItem (line 20) | class TextContentItem(BaseModel):
class SummaryTextContentItem (line 27) | class SummaryTextContentItem(BaseModel):
class ReasoningTextContentItem (line 33) | class ReasoningTextContentItem(BaseModel):
class ReasoningItem (line 38) | class ReasoningItem(BaseModel):
class Item (line 45) | class Item(BaseModel):
class FunctionCallItem (line 53) | class FunctionCallItem(BaseModel):
class FunctionCallOutputItem (line 62) | class FunctionCallOutputItem(BaseModel):
class WebSearchActionSearch (line 68) | class WebSearchActionSearch(BaseModel):
class WebSearchActionOpenPage (line 73) | class WebSearchActionOpenPage(BaseModel):
class WebSearchActionFind (line 78) | class WebSearchActionFind(BaseModel):
class WebSearchCallItem (line 84) | class WebSearchCallItem(BaseModel):
class CodeInterpreterOutputLogs (line 91) | class CodeInterpreterOutputLogs(BaseModel):
class CodeInterpreterOutputImage (line 96) | class CodeInterpreterOutputImage(BaseModel):
class CodeInterpreterCallItem (line 101) | class CodeInterpreterCallItem(BaseModel):
class Error (line 118) | class Error(BaseModel):
class IncompleteDetails (line 123) | class IncompleteDetails(BaseModel):
class Usage (line 127) | class Usage(BaseModel):
class FunctionToolDefinition (line 133) | class FunctionToolDefinition(BaseModel):
class BrowserToolConfig (line 141) | class BrowserToolConfig(BaseModel):
class CodeInterpreterToolConfig (line 146) | class CodeInterpreterToolConfig(BaseModel):
class ReasoningConfig (line 150) | class ReasoningConfig(BaseModel):
class ResponsesRequest (line 154) | class ResponsesRequest(BaseModel):
class ResponseObject (line 187) | class ResponseObject(BaseModel):
FILE: gpt_oss/responses_api/utils.py
function stub_infer_next_token (line 129) | def stub_infer_next_token(tokens: list[int], temperature: float = 0.0) -...
FILE: gpt_oss/tokenizer.py
function get_tokenizer (line 3) | def get_tokenizer():
FILE: gpt_oss/tools/apply_patch.py
class ActionType (line 28) | class ActionType(str, Enum):
class FileChange (line 35) | class FileChange:
class Commit (line 43) | class Commit:
class DiffError (line 50) | class DiffError(ValueError):
class Chunk (line 58) | class Chunk:
class PatchAction (line 65) | class PatchAction:
class Patch (line 73) | class Patch:
class Parser (line 81) | class Parser:
method _cur_line (line 89) | def _cur_line(self) -> str:
method _norm (line 95) | def _norm(line: str) -> str:
method is_done (line 100) | def is_done(self, prefixes: Optional[Tuple[str, ...]] = None) -> bool:
method startswith (line 111) | def startswith(self, prefix: Union[str, Tuple[str, ...]]) -> bool:
method read_str (line 114) | def read_str(self, prefix: str) -> str:
method read_line (line 127) | def read_line(self) -> str:
method parse (line 134) | def parse(self) -> None:
method _parse_update_file (line 177) | def _parse_update_file(self, text: str) -> PatchAction:
method _parse_add_file (line 231) | def _parse_add_file(self) -> PatchAction:
function find_context_core (line 246) | def find_context_core(
function find_context (line 268) | def find_context(
function peek_next_section (line 280) | def peek_next_section(
function _get_updated_file (line 362) | def _get_updated_file(text: str, action: PatchAction, path: str) -> str:
function patch_to_commit (line 389) | def patch_to_commit(patch: Patch, orig: Dict[str, str]) -> Commit:
function text_to_patch (line 416) | def text_to_patch(text: str, orig: Dict[str, str]) -> Tuple[Patch, int]:
function identify_files_needed (line 430) | def identify_files_needed(text: str) -> List[str]:
function identify_files_added (line 443) | def identify_files_added(text: str) -> List[str]:
function load_files (line 455) | def load_files(paths: List[str], open_fn: Callable[[str], str]) -> Dict[...
function apply_commit (line 459) | def apply_commit(
function open_file (line 480) | def open_file(path: str) -> str:
function write_file (line 485) | def write_file(path: str, content: str) -> None:
function remove_file (line 492) | def remove_file(path: str) -> None:
function apply_patch (line 497) | def apply_patch(
function main (line 513) | def main() -> None:
FILE: gpt_oss/tools/python_docker/docker_tool.py
function call_python_script (line 41) | def call_python_script(script: str) -> str:
function call_python_script_with_uv (line 82) | def call_python_script_with_uv(script: str) -> str:
class LocalJupyterSession (line 101) | class LocalJupyterSession:
method __init__ (line 104) | def __init__(
method execute (line 144) | def execute(self, code: str, *, timeout: float | None = None) -> str:
method close (line 233) | def close(self) -> None:
method __del__ (line 241) | def __del__(self) -> None: # pragma: no cover - best-effort cleanup
class PythonTool (line 244) | class PythonTool(Tool):
method __init__ (line 245) | def __init__(
method get_tool_name (line 280) | def get_tool_name(cls) -> str:
method name (line 284) | def name(self) -> str:
method instruction (line 288) | def instruction(self) -> str:
method tool_config (line 303) | def tool_config(self) -> ToolNamespaceConfig:
method _make_response (line 308) | def _make_response(
method make_response (line 316) | def make_response(
method _process (line 337) | async def _process(self, message: Message) -> AsyncIterator[Message]:
method close (line 365) | def close(self) -> None:
method __del__ (line 369) | def __del__(self) -> None: # pragma: no cover - best-effort cleanup
FILE: gpt_oss/tools/simple_browser/backend.py
class BackendError (line 44) | class BackendError(Exception):
function with_retries (line 52) | def with_retries(
function maybe_truncate (line 74) | def maybe_truncate(text: str, num_chars: int = 1024) -> str:
class Backend (line 81) | class Backend:
method search (line 85) | async def search(
method fetch (line 94) | async def fetch(self, url: str, session: ClientSession) -> PageContents:
method _post (line 97) | async def _post(self, session: ClientSession, endpoint: str, payload: ...
method _get (line 109) | async def _get(self, session: ClientSession, endpoint: str, params: di...
class ExaBackend (line 123) | class ExaBackend(Backend):
method _get_api_key (line 134) | def _get_api_key(self) -> str:
method search (line 141) | async def search(
method fetch (line 171) | async def fetch(self, url: str, session: ClientSession) -> PageContents:
class YouComBackend (line 192) | class YouComBackend(Backend):
method _get_api_key (line 199) | def _get_api_key(self) -> str:
method search (line 206) | async def search(
method fetch (line 244) | async def fetch(self, url: str, session: ClientSession) -> PageContents:
FILE: gpt_oss/tools/simple_browser/page_contents.py
class Extract (line 33) | class Extract(pydantic.BaseModel): # A search result snippet or a quota...
class FetchResult (line 40) | class FetchResult(pydantic.BaseModel):
class PageContents (line 51) | class PageContents(pydantic.BaseModel):
class Tokens (line 61) | class Tokens:
function get_domain (line 66) | def get_domain(url: str) -> str:
function multiple_replace (line 75) | def multiple_replace(text: str, replacements: dict[str, str]) -> str:
function mark_lines (line 82) | def mark_lines(text: str) -> str:
function _tiktoken_vocabulary_lengths (line 93) | def _tiktoken_vocabulary_lengths(enc_name: str) -> list[int]:
function warmup_caches (line 99) | def warmup_caches(enc_names: list[str]) -> None:
function _replace_special_chars (line 105) | def _replace_special_chars(text: str) -> str:
function merge_whitespace (line 118) | def merge_whitespace(text: str) -> str:
function arxiv_to_ar5iv (line 125) | def arxiv_to_ar5iv(url: str) -> str:
function _clean_links (line 130) | def _clean_links(root: lxml.html.HtmlElement, cur_url: str) -> dict[str,...
function _get_text (line 167) | def _get_text(node: lxml.html.HtmlElement) -> str:
function _remove_node (line 172) | def _remove_node(node: lxml.html.HtmlElement) -> None:
function _escape_md (line 177) | def _escape_md(text: str) -> str:
function _escape_md_section (line 181) | def _escape_md_section(text: str, snob: bool = False) -> str:
function html_to_text (line 185) | def html_to_text(html: str) -> str:
function _remove_math (line 209) | def _remove_math(root: lxml.html.HtmlElement) -> None:
function remove_unicode_smp (line 215) | def remove_unicode_smp(text: str) -> str:
function replace_node_with_text (line 224) | def replace_node_with_text(node: lxml.html.HtmlElement, text: str) -> None:
function replace_images (line 236) | def replace_images(
function process_html (line 253) | def process_html(
FILE: gpt_oss/tools/simple_browser/simple_browser_tool.py
class ToolUsageError (line 56) | class ToolUsageError(Exception):
function function_the_model_can_call (line 60) | def function_the_model_can_call(
function _tiktoken_vocabulary_lengths (line 78) | def _tiktoken_vocabulary_lengths(enc_name: str) -> list[int]:
class Tokens (line 90) | class Tokens:
function max_chars_per_token (line 96) | def max_chars_per_token(enc_name: str) -> int:
function get_tokens (line 102) | def get_tokens(text: str, enc_name: str) -> Tokens:
function get_end_loc (line 113) | def get_end_loc(
function get_page_metadata (line 143) | def get_page_metadata(
function join_lines (line 154) | def join_lines(
function wrap_lines (line 163) | def wrap_lines(text: str, width: int = 80) -> list[str]:
function strip_links (line 178) | def strip_links(text: str) -> str:
function maybe_get_function_args (line 185) | def maybe_get_function_args(
function run_find_in_page (line 208) | async def run_find_in_page(
function handle_errors (line 258) | def handle_errors(
class SimpleBrowserState (line 277) | class SimpleBrowserState(pydantic.BaseModel):
method current_cursor (line 284) | def current_cursor(self) -> int:
method add_page (line 287) | def add_page(self, page: PageContents) -> None:
method get_page (line 291) | def get_page(self, cursor: int = -1) -> PageContents:
method get_page_by_url (line 309) | def get_page_by_url(self, url: str) -> PageContents | None:
method pop_page_stack (line 314) | def pop_page_stack(self) -> None:
class SimpleBrowserTool (line 319) | class SimpleBrowserTool(Tool):
method __init__ (line 320) | def __init__(
method get_tool_state (line 340) | def get_tool_state(self) -> dict[str, Any]:
method get_tool_name (line 344) | def get_tool_name(cls) -> str:
method name (line 348) | def name(self) -> str:
method tool_config (line 352) | def tool_config(self) -> ToolNamespaceConfig:
method instruction (line 364) | def instruction(self) -> str:
method _render_browsing_display (line 367) | def _render_browsing_display(
method _make_response (line 381) | def _make_response(
method show_page (line 399) | async def show_page(self, loc: int = 0, num_lines: int = -1) -> Message:
method show_page_safely (line 422) | async def show_page_safely(self, loc: int = 0, num_lines: int = -1) ->...
method _open_url (line 429) | async def _open_url(self, url: str, direct_url_open: bool) -> PageCont...
method make_error_message (line 448) | def make_error_message(self, error: Exception) -> Message:
method search (line 456) | async def search(
method open (line 481) | async def open(
method find (line 538) | async def find(self, pattern: str, cursor: int = -1) -> AsyncIterator[...
method make_response (line 552) | def make_response(
method process_arguments (line 575) | def process_arguments(self, message: Message) -> dict[str, Any]:
method _process (line 590) | async def _process(self, message: Message) -> AsyncIterator[Message]:
method normalize_citations (line 620) | def normalize_citations(self, old_content: str, hide_partial_citations...
FILE: gpt_oss/tools/tool.py
function _maybe_update_inplace_and_validate_channel (line 13) | def _maybe_update_inplace_and_validate_channel(
class Tool (line 28) | class Tool(ABC):
method name (line 39) | def name(self) -> str:
method output_channel_should_match_input_channel (line 46) | def output_channel_should_match_input_channel(self) -> bool:
method process (line 52) | async def process(self, message: Message) -> AsyncIterator[Message]:
method _process (line 70) | async def _process(self, message: Message) -> AsyncIterator[Message]:
method instruction (line 78) | def instruction(self) -> str:
method instruction_dict (line 85) | def instruction_dict(self) -> dict[str, str]:
method error_message (line 88) | def error_message(
FILE: gpt_oss/torch/model.py
class ModelConfig (line 13) | class ModelConfig:
class RMSNorm (line 32) | class RMSNorm(torch.nn.Module):
method __init__ (line 33) | def __init__(
method forward (line 43) | def forward(self, x: torch.Tensor) -> torch.Tensor:
function _apply_rotary_emb (line 50) | def _apply_rotary_emb(
class RotaryEmbedding (line 63) | class RotaryEmbedding(torch.nn.Module):
method __init__ (line 64) | def __init__(
method _compute_concentration_and_inv_freq (line 85) | def _compute_concentration_and_inv_freq(self) -> torch.Tensor:
method _compute_cos_sin (line 125) | def _compute_cos_sin(self, num_tokens: int):
method forward (line 133) | def forward(
function sdpa (line 153) | def sdpa(Q, K, V, S, sm_scale, sliding_window=0):
class AttentionBlock (line 176) | class AttentionBlock(torch.nn.Module):
method __init__ (line 177) | def __init__(
method forward (line 217) | def forward(self, x: torch.Tensor) -> torch.Tensor:
function swiglu (line 249) | def swiglu(x, alpha: float = 1.702, limit: float = 7.0):
class MLPBlock (line 259) | class MLPBlock(torch.nn.Module):
method __init__ (line 260) | def __init__(
method forward (line 312) | def forward(self, x: torch.Tensor) -> torch.Tensor:
class TransformerBlock (line 339) | class TransformerBlock(torch.nn.Module):
method __init__ (line 340) | def __init__(
method forward (line 351) | def forward(self, x: torch.Tensor) -> torch.Tensor:
class Transformer (line 357) | class Transformer(torch.nn.Module):
method __init__ (line 358) | def __init__(
method forward (line 382) | def forward(self, x: torch.Tensor) -> torch.Tensor:
method from_checkpoint (line 391) | def from_checkpoint(
class TokenGenerator (line 444) | class TokenGenerator:
method __init__ (line 446) | def __init__(self, checkpoint: str, device: torch.device):
method generate (line 451) | def generate(self,
FILE: gpt_oss/torch/utils.py
function suppress_output (line 6) | def suppress_output(rank):
function init_distributed (line 21) | def init_distributed() -> torch.device:
FILE: gpt_oss/torch/weights.py
class Checkpoint (line 28) | class Checkpoint:
method __init__ (line 29) | def __init__(self, path: str, device: torch.device):
method get (line 52) | def get(self, name: str) -> torch.Tensor:
method _get_tensor (line 61) | def _get_tensor(self, name: str) -> str:
method _get_mxfp4_tensor (line 68) | def _get_mxfp4_tensor(
method _get_mxfp4_tensor_copy (line 119) | def _get_mxfp4_tensor_copy(self, blocks_name: str, scales_name: str, d...
FILE: gpt_oss/triton/attention.py
function _attn_fwd (line 19) | def _attn_fwd(
class _attention (line 104) | class _attention(torch.autograd.Function):
method forward (line 106) | def forward(ctx, q, k, v, sinks, sm_scale, bandwidth, start_q):
function attention_ref (line 165) | def attention_ref(
function test_eq (line 215) | def test_eq(batch_size, num_queries, num_keys, num_key_value_heads, num_...
FILE: gpt_oss/triton/model.py
class RotaryEmbedding (line 14) | class RotaryEmbedding(torch.nn.Module):
method __init__ (line 15) | def __init__(
method _compute_concentration_and_inv_freq (line 39) | def _compute_concentration_and_inv_freq(self) -> torch.Tensor:
method _compute_cos_sin (line 79) | def _compute_cos_sin(self, start: int, num_tokens: int):
method _rotate (line 88) | def _rotate(
method forward (line 102) | def forward(
class Cache (line 121) | class Cache:
method __init__ (line 122) | def __init__(self, batch_size, n_ctx, n_kv_heads, d_head=64, device: t...
method reset (line 127) | def reset(self):
method repeat_interleave (line 132) | def repeat_interleave(self, n):
method truncate (line 137) | def truncate(self, n_ctx):
method extend (line 147) | def extend(self, k, v):
class AttentionBlock (line 157) | class AttentionBlock(torch.nn.Module):
method __init__ (line 158) | def __init__(
method forward (line 200) | def forward(self, x: torch.Tensor, cache: Cache | None = None) -> torc...
class MLPBlock (line 273) | class MLPBlock(torch.nn.Module):
method __init__ (line 274) | def __init__(
method forward (line 342) | def forward(self, x: torch.Tensor) -> torch.Tensor:
class TransformerBlock (line 364) | class TransformerBlock(torch.nn.Module):
method __init__ (line 365) | def __init__(
method forward (line 376) | def forward(self, x: torch.Tensor, cache: Cache | None = None) -> torc...
class Transformer (line 382) | class Transformer(torch.nn.Module):
method __init__ (line 383) | def __init__(
method forward (line 408) | def forward(self, x: torch.Tensor, caches: list[Cache] | None = None) ...
method from_checkpoint (line 422) | def from_checkpoint(
class TokenGenerator (line 470) | class TokenGenerator:
method __init__ (line 472) | def __init__(self, checkpoint: str, context: int, device: torch.device):
method generate (line 485) | def generate(self,
FILE: gpt_oss/triton/moe.py
function quantize_mx4 (line 16) | def quantize_mx4(w):
function swiglu (line 23) | def swiglu(x, alpha: float = 1.702, limit: float = 7.0, interleaved: boo...
function moe (line 34) | def moe(x, wg, w1, w1_mx, w2, w2_mx, bg, b1, b2, experts_per_token=4, nu...
FILE: gpt_oss/vllm/token_generator.py
class TokenGenerator (line 4) | class TokenGenerator:
method __init__ (line 5) | def __init__(self, model_path: str, tensor_parallel_size: int = 1):
method generate (line 13) | def generate(self,
FILE: tests/conftest.py
function harmony_encoding (line 18) | def harmony_encoding():
function mock_infer_token (line 23) | def mock_infer_token(harmony_encoding):
function api_client (line 39) | def api_client(harmony_encoding, mock_infer_token) -> Generator[TestClie...
function sample_request_data (line 49) | def sample_request_data():
function mock_browser_tool (line 61) | def mock_browser_tool():
function mock_python_tool (line 70) | def mock_python_tool():
function reset_test_environment (line 81) | def reset_test_environment():
function performance_timer (line 97) | def performance_timer():
FILE: tests/gpt_oss/tools/simple_browser/test_backend.py
class MockAiohttpResponse (line 8) | class MockAiohttpResponse:
method __init__ (line 11) | def __init__(self, json: dict, status: int):
method json (line 15) | async def json(self):
method __aexit__ (line 18) | async def __aexit__(self, exc_type, exc, tb):
method __aenter__ (line 21) | async def __aenter__(self):
function mock_os_environ_get (line 24) | def mock_os_environ_get(name: str, default: Any = "test_api_key"):
function test_youcom_backend (line 28) | def test_youcom_backend():
function test_youcom_backend_search (line 34) | async def test_youcom_backend_search(mock_session_get):
function test_youcom_backend_fetch (line 57) | async def test_youcom_backend_fetch(mock_session_get):
FILE: tests/test_api_endpoints.py
class TestResponsesEndpoint (line 8) | class TestResponsesEndpoint:
method test_basic_response_creation (line 10) | def test_basic_response_creation(self, api_client, sample_request_data):
method test_response_with_high_reasoning (line 18) | def test_response_with_high_reasoning(self, api_client, sample_request...
method test_response_with_medium_reasoning (line 26) | def test_response_with_medium_reasoning(self, api_client, sample_reque...
method test_response_with_invalid_model (line 34) | def test_response_with_invalid_model(self, api_client, sample_request_...
method test_response_with_empty_input (line 40) | def test_response_with_empty_input(self, api_client, sample_request_da...
method test_response_with_tools (line 45) | def test_response_with_tools(self, api_client, sample_request_data):
method test_response_with_custom_temperature (line 54) | def test_response_with_custom_temperature(self, api_client, sample_req...
method test_streaming_response (line 62) | def test_streaming_response(self, api_client, sample_request_data):
class TestResponsesWithSession (line 75) | class TestResponsesWithSession:
method test_response_with_session_id (line 77) | def test_response_with_session_id(self, api_client, sample_request_data):
method test_response_continuation (line 95) | def test_response_continuation(self, api_client, sample_request_data):
class TestErrorHandling (line 112) | class TestErrorHandling:
method test_missing_required_fields (line 114) | def test_missing_required_fields(self, api_client):
method test_invalid_reasoning_effort (line 119) | def test_invalid_reasoning_effort(self, api_client, sample_request_data):
method test_malformed_json (line 125) | def test_malformed_json(self, api_client):
method test_extremely_long_input (line 133) | def test_extremely_long_input(self, api_client, sample_request_data):
class TestToolIntegration (line 140) | class TestToolIntegration:
method test_browser_search_tool (line 142) | def test_browser_search_tool(self, api_client, sample_request_data):
method test_function_tool_integration (line 151) | def test_function_tool_integration(self, api_client, sample_request_da...
method test_multiple_tools (line 163) | def test_multiple_tools(self, api_client, sample_request_data):
class TestPerformance (line 179) | class TestPerformance:
method test_response_time_under_threshold (line 181) | def test_response_time_under_threshold(self, api_client, sample_reques...
method test_multiple_sequential_requests (line 190) | def test_multiple_sequential_requests(self, api_client, sample_request...
class TestUsageTracking (line 199) | class TestUsageTracking:
method test_usage_object_structure (line 201) | def test_usage_object_structure(self, api_client, sample_request_data):
method test_usage_increases_with_longer_input (line 219) | def test_usage_increases_with_longer_input(self, api_client, sample_re...
FILE: tests/test_responses_api.py
function stub_infer_next_token (line 21) | def stub_infer_next_token(
function test_client (line 33) | def test_client():
function test_health_check (line 39) | def test_health_check(test_client):
Condensed preview: 155 files, each showing path, character count, and a content snippet (full structured content: 1,497K chars).
[
{
"path": ".github/CODEOWNERS",
"chars": 73,
"preview": "@openai/developer-experience\ndkundel-openai\nMaratyszcza\nscott-oai\nvolsgd\n"
},
{
"path": ".github/ISSUE_TEMPLATE/config.yml",
"chars": 363,
"preview": "blank_issues_enabled: false\ncontact_links:\n - name: 🐛 Model Issues\n url: https://huggingface.co/openai/gpt-oss-120b/"
},
{
"path": ".github/workflows/CI.yml",
"chars": 1550,
"preview": "name: CI\n\non:\n release:\n types: [published]\n push:\n tags:\n - \"v*\"\n workflow_dispatch:\n\n# Minimal repo-leve"
},
{
"path": ".gitignore",
"chars": 58,
"preview": "build\n_skbuild\ntmp*\n__pycache__\n*.egg*\nnode_modules/\n*.log"
},
{
"path": "CMakeLists.txt",
"chars": 764,
"preview": "cmake_minimum_required(VERSION 3.26)\nproject(gpt_oss LANGUAGES C CXX)\n\n# If not defined externally, auto-detect\nif(NOT D"
},
{
"path": "LICENSE",
"chars": 11358,
"preview": "\n Apache License\n Version 2.0, January 2004\n "
},
{
"path": "MANIFEST.in",
"chars": 27,
"preview": "recursive-include _build * "
},
{
"path": "README.md",
"chars": 24295,
"preview": "<img alt=\"gpt-oss-120\" src=\"./docs/gpt-oss.svg\">\n<p align=\"center\">\n <a href=\"https://gpt-oss.com\"><strong>Try gpt-oss<"
},
{
"path": "USAGE_POLICY",
"chars": 216,
"preview": "We aim for our tools to be used safely, responsibly, and democratically, while maximizing your control over how you use "
},
{
"path": "_build/gpt_oss_build_backend/__init__.py",
"chars": 51,
"preview": "\"\"\"In-tree PEP 517 backend package for gpt-oss.\"\"\" "
},
{
"path": "_build/gpt_oss_build_backend/backend.py",
"chars": 4572,
"preview": "\"\"\"\nBuild backend for gpt-oss that supports two modes:\n\n1) Default (pure wheel for PyPI)\n - Delegates to setuptools.bu"
},
{
"path": "awesome-gpt-oss.md",
"chars": 5157,
"preview": "\n\n# Awesome gpt-oss\n\nThis is a list of guides and resources to help you get started with t"
},
{
"path": "compatibility-test/.gitignore",
"chars": 2184,
"preview": "# Logs\nlogs\n*.log\nnpm-debug.log*\nyarn-debug.log*\nyarn-error.log*\nlerna-debug.log*\n\n# Diagnostic reports (https://nodejs."
},
{
"path": "compatibility-test/README.md",
"chars": 1086,
"preview": "# API Compatibility Test\n\nThis script uses the Agents SDK in TypeScript and the underlying OpenAI client to verify the s"
},
{
"path": "compatibility-test/analysis.ts",
"chars": 4912,
"preview": "export function analyze(caseResults: any[], tries: number) {\n // Group results by unique task: test_case + apiType\n ty"
},
{
"path": "compatibility-test/cases.jsonl",
"chars": 4847,
"preview": "{\"tool_name\":\"get_system_health\",\"input\":\"Hey, quick check: is everything up and running?\",\"expected_arguments\":\"{}\"}\n{\""
},
{
"path": "compatibility-test/index.ts",
"chars": 5692,
"preview": "import { parseArgs } from \"node:util\";\nimport { createWriteStream } from \"node:fs\";\nimport { readFile, writeFile } from "
},
{
"path": "compatibility-test/package.json",
"chars": 174,
"preview": "{\n \"type\": \"module\",\n \"dependencies\": {\n \"@openai/agents\": \"^0.0.15\",\n \"ajv\": \"^8.17.1\",\n \"listr2\": \"^9.0.1\"\n"
},
{
"path": "compatibility-test/providers.ts",
"chars": 480,
"preview": "export const PROVIDERS = {\n vllm: {\n apiBaseUrl: \"http://localhost:8000/v1\",\n apiKey: \"vllm\",\n apiType: [\"resp"
},
{
"path": "compatibility-test/runCase.ts",
"chars": 10069,
"preview": "import {\n Agent,\n Runner,\n OpenAIResponsesModel,\n OpenAIChatCompletionsModel,\n RunResult,\n StreamedRunResult,\n Fu"
},
{
"path": "compatibility-test/tools.ts",
"chars": 4344,
"preview": "import { Tool, tool } from \"@openai/agents\";\n\nfunction convertToTool(toolData: any) {\n return tool({\n name: toolData"
},
{
"path": "examples/agents-sdk-js/index.ts",
"chars": 2268,
"preview": "import { OpenAI } from \"openai\";\nimport {\n Agent,\n run,\n setDefaultOpenAIClient,\n setOpenAIAPI,\n setTracingDisabled"
},
{
"path": "examples/agents-sdk-js/package.json",
"chars": 403,
"preview": "{\n \"type\": \"module\",\n \"name\": \"agents-sdk\",\n \"version\": \"1.0.0\",\n \"main\": \"index.js\",\n \"scripts\": {\n \"start\": \"t"
},
{
"path": "examples/agents-sdk-python/example.py",
"chars": 2807,
"preview": "import asyncio\nfrom pathlib import Path\nimport shutil\n\nfrom openai import AsyncOpenAI\nfrom agents import (\n Agent,\n "
},
{
"path": "examples/agents-sdk-python/pyproject.toml",
"chars": 192,
"preview": "[project]\nname = \"agents-sdk-python\"\nversion = \"0.1.0\"\ndescription = \"Add your description here\"\nreadme = \"README.md\"\nre"
},
{
"path": "examples/gradio/gradio_chat.py",
"chars": 9410,
"preview": "import json\nimport requests\nimport gradio as gr\n\nDEFAULT_FUNCTION_PROPERTIES = \"\"\"\n{\n \"type\": \"object\",\n \"properti"
},
{
"path": "examples/reinforcement-fine-tuning.ipynb",
"chars": 383148,
"preview": "{\n \"cells\": [\n {\n \"cell_type\": \"markdown\",\n \"metadata\": {\n \"id\": \"view-in-github\",\n \"colab_t"
},
{
"path": "examples/streamlit/streamlit_chat.py",
"chars": 14032,
"preview": "import json\n\nimport requests\nimport streamlit as st\n\nDEFAULT_FUNCTION_PROPERTIES = \"\"\"\n{\n \"type\": \"object\",\n \"prop"
},
{
"path": "gpt-oss-mcp-server/README.md",
"chars": 1277,
"preview": "# MCP Servers for gpt-oss reference tools\n\nThis directory contains MCP servers for the reference tools in the [gpt-oss]("
},
{
"path": "gpt-oss-mcp-server/browser_server.py",
"chars": 4660,
"preview": "import os\nfrom collections.abc import AsyncIterator\nfrom contextlib import asynccontextmanager\nfrom dataclasses import d"
},
{
"path": "gpt-oss-mcp-server/build-system-prompt.py",
"chars": 3887,
"preview": "import datetime\nimport asyncio\n\nfrom gpt_oss.tokenizer import get_tokenizer\n\nfrom openai_harmony import (\n Conversati"
},
{
"path": "gpt-oss-mcp-server/pyproject.toml",
"chars": 142,
"preview": "[project]\nname = \"gpt-oss-mcp-server\"\nversion = \"0.1.0\"\nrequires-python = \">=3.10\"\ndependencies = [\n \"mcp[cli]>=1.12."
},
{
"path": "gpt-oss-mcp-server/python_server.py",
"chars": 1713,
"preview": "from mcp.server.fastmcp import FastMCP\nfrom gpt_oss.tools.python_docker.docker_tool import PythonTool\nfrom openai_harmon"
},
{
"path": "gpt-oss-mcp-server/reference-system-prompt.py",
"chars": 1542,
"preview": "import datetime\n\nfrom gpt_oss.tools.simple_browser import SimpleBrowserTool\nfrom gpt_oss.tools.simple_browser.backend im"
},
{
"path": "gpt_oss/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/chat.py",
"chars": 13857,
"preview": "\"\"\"\nHarmony chat with tools\n\"\"\"\n\nimport atexit\nimport argparse\nimport asyncio\nimport datetime\nimport os\nfrom pathlib imp"
},
{
"path": "gpt_oss/evals/README.md",
"chars": 235,
"preview": "# `gpt_oss.evals`\n\nThis module is a reincarnation of [simple-evals](https://github.com/openai/simple-evals) adapted for "
},
{
"path": "gpt_oss/evals/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/evals/__main__.py",
"chars": 7553,
"preview": "import argparse\nimport json\nfrom datetime import datetime\n\nfrom . import report\nfrom .basic_eval import BasicEval\nfrom ."
},
{
"path": "gpt_oss/evals/abcd_grader.py",
"chars": 4519,
"preview": "import re\nimport sys\n\n\n_PATTERNS = [\n # 0)\"**Answer:** A\" or \"*Answers* – B\", i.e. markdown‐wrapped \"Answer(s)\" with "
},
{
"path": "gpt_oss/evals/aime_eval.py",
"chars": 3913,
"preview": "\"\"\"\nAIME 2025: https://huggingface.co/datasets/opencompass/AIME2025\n\"\"\"\nimport random\nimport re\nimport pandas\nfrom . imp"
},
{
"path": "gpt_oss/evals/basic_eval.py",
"chars": 1472,
"preview": "\"\"\"\nBasic eval\n\"\"\"\nfrom . import report\n\nfrom .types import Eval, EvalResult, SamplerBase, SingleEvalResult\n\nclass Basic"
},
{
"path": "gpt_oss/evals/chat_completions_sampler.py",
"chars": 3605,
"preview": "import time\nfrom typing import Any\n\nimport openai\nfrom openai import OpenAI\n\nfrom .types import MessageList, SamplerBase"
},
{
"path": "gpt_oss/evals/gpqa_eval.py",
"chars": 4487,
"preview": "\"\"\"\nGPQA: A Graduate-Level Google-Proof Q&A Benchmark\nDavid Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Ric"
},
{
"path": "gpt_oss/evals/healthbench_eval.py",
"chars": 24446,
"preview": "\"\"\"\nThis script evaluates the performance of a model on the HealthBench dataset.\n\nTo run HealthBench, HealthBench Consen"
},
{
"path": "gpt_oss/evals/report.py",
"chars": 5471,
"preview": "import os\nfrom collections import defaultdict\nfrom multiprocessing.pool import ThreadPool\nfrom typing import Any, Callab"
},
{
"path": "gpt_oss/evals/responses_sampler.py",
"chars": 3173,
"preview": "import time\nfrom typing import Any\n\nimport openai\nfrom openai import OpenAI\n\nfrom .types import MessageList, SamplerBase"
},
{
"path": "gpt_oss/evals/types.py",
"chars": 1583,
"preview": "from dataclasses import dataclass, field\nfrom typing import Any, Literal, overload\n\nMessage = dict[str, Any] # keys rol"
},
{
"path": "gpt_oss/generate.py",
"chars": 3081,
"preview": "# Model parallel inference\n# Note: This script is for demonstration purposes only. It is not designed for production use"
},
{
"path": "gpt_oss/metal/CMakeLists.txt",
"chars": 10976,
"preview": "cmake_minimum_required(VERSION 3.24)\nproject(GPTOSS\n VERSION 1.0\n DESCRIPTION \"Local GPT-OSS inference\"\n LANGUA"
},
{
"path": "gpt_oss/metal/__init__.py",
"chars": 228,
"preview": "from importlib import import_module as _im\n\n# Load the compiled extension (gpt_oss.metal._metal)\n_ext = _im(f\"{__name__}"
},
{
"path": "gpt_oss/metal/benchmark/end-to-end-threadgroup.cc",
"chars": 26375,
"preview": "#include <gpt-oss.h>\n#include <internal/model.h>\n\n#include <array>\n#include <cstdint>\n#include <cstddef>\n#include <forma"
},
{
"path": "gpt_oss/metal/benchmark/end-to-end.cc",
"chars": 8833,
"preview": "#include <gpt-oss.h>\n#include <internal/model.h>\n\n#include <array>\n#include <cstddef>\n#include <cstdint>\n#include <forma"
},
{
"path": "gpt_oss/metal/benchmark/f32-bf16w-rmsnorm.cc",
"chars": 3776,
"preview": "#include <gpt-oss.h>\n#include <internal/datatype.h>\n#include <internal/metal.hpp>\n#include <internal/metal-kernels.h>\n\n#"
},
{
"path": "gpt_oss/metal/benchmark/f32-random.cc",
"chars": 1951,
"preview": "#include <gpt-oss.h>\n#include <internal/metal.hpp>\n#include <internal/metal-kernels.h>\n\n#include <benchmark/benchmark.h>"
},
{
"path": "gpt_oss/metal/benchmark/mf4-f32-convert.cc",
"chars": 2346,
"preview": "#include <gpt-oss.h>\n#include <internal/datatype.h>\n#include <internal/metal.hpp>\n#include <internal/metal-kernels.h>\n\n#"
},
{
"path": "gpt_oss/metal/benchmark/u32-random.cc",
"chars": 1884,
"preview": "#include <gpt-oss.h>\n#include <internal/metal.hpp>\n#include <internal/metal-kernels.h>\n\n#include <benchmark/benchmark.h>"
},
{
"path": "gpt_oss/metal/examples/chat.py",
"chars": 3525,
"preview": "#!/usr/bin/env python\n\nimport argparse\nimport sys\n\nfrom datetime import date\nfrom gpt_oss.metal import Context, Model\n\n\n"
},
{
"path": "gpt_oss/metal/examples/generate.py",
"chars": 1128,
"preview": "#!/usr/bin/env python\n\nimport argparse\nimport sys\n\nfrom gpt_oss.metal import Context, Model\n\n\nparser = argparse.Argument"
},
{
"path": "gpt_oss/metal/include/gpt-oss/functions.h",
"chars": 16079,
"preview": "#pragma once\n\n#include <stddef.h>\n#include <stdint.h>\n\n#include <gpt-oss/macros.h>\n#include <gpt-oss/types.h>\n\n#ifdef __"
},
{
"path": "gpt_oss/metal/include/gpt-oss/macros.h",
"chars": 78,
"preview": "#pragma once\n\n#ifndef GPTOSS_ABI\n #define GPTOSS_ABI\n#endif // GPTOSS_ABI\n"
},
{
"path": "gpt_oss/metal/include/gpt-oss/types.h",
"chars": 1732,
"preview": "#pragma once\n\n/*\n * Status codes returned by GPT-OSS API functions.\n */\nenum gptoss_status {\n gptoss_status_success ="
},
{
"path": "gpt_oss/metal/include/gpt-oss.h",
"chars": 100,
"preview": "#pragma once\n\n#include <gpt-oss/macros.h>\n#include <gpt-oss/types.h>\n#include <gpt-oss/functions.h>\n"
},
{
"path": "gpt_oss/metal/python/context.c",
"chars": 9240,
"preview": "#include <Python.h>\n\n#include <gpt-oss.h>\n\n#include \"module.h\"\n\n\nstatic int PyGPTOSSContext_init(PyGPTOSSContext* self, "
},
{
"path": "gpt_oss/metal/python/model.c",
"chars": 2723,
"preview": "#include <Python.h>\n\n#include <gpt-oss.h>\n\n#include \"module.h\"\n\n\nstatic int PyGPTOSSModel_init(PyGPTOSSModel* self, PyOb"
},
{
"path": "gpt_oss/metal/python/module.c",
"chars": 1447,
"preview": "#include <Python.h>\n\n#include \"module.h\"\n\n\nstatic PyMethodDef module_methods[] = {\n {NULL, NULL, 0, NULL}\n};\n\nstatic "
},
{
"path": "gpt_oss/metal/python/module.h",
"chars": 421,
"preview": "#include <Python.h>\n\n#include <gpt-oss.h>\n\ntypedef struct {\n PyObject_HEAD\n gptoss_model_t handle;\n} PyGPTOSSModel"
},
{
"path": "gpt_oss/metal/python/tokenizer.c",
"chars": 6907,
"preview": "#include <Python.h>\n\n#include <gpt-oss.h>\n\n#include \"module.h\"\n\nstatic PyObject* PyGPTOSSTokenizer_new(PyTypeObject* sub"
},
{
"path": "gpt_oss/metal/scripts/create-local-model.py",
"chars": 14560,
"preview": "import argparse\nimport os\nimport math\nimport sys\nimport json\nimport itertools\nimport struct\nfrom uuid import UUID\n\nimpor"
},
{
"path": "gpt_oss/metal/source/accumulate.metal",
"chars": 2450,
"preview": "#include <metal_integer>\n#include <metal_math>\n\n#include <internal/kernel-args.h>\n\n#pragma METAL fp math_mode(safe)\n#pra"
},
{
"path": "gpt_oss/metal/source/context.c",
"chars": 52299,
"preview": "#include <assert.h>\n#include <float.h>\n#include <inttypes.h>\n#include <stdbool.h>\n#include <stdint.h>\n#include <stdlib.h"
},
{
"path": "gpt_oss/metal/source/convert.metal",
"chars": 3441,
"preview": "#include <metal_integer>\n\n#include <internal/kernel-args.h>\n\n#pragma METAL fp math_mode(safe)\n#pragma METAL fp contract("
},
{
"path": "gpt_oss/metal/source/embeddings.metal",
"chars": 867,
"preview": "#include <internal/kernel-args.h>\n\n#pragma METAL fp math_mode(safe)\n#pragma METAL fp contract(off)\n\n\nkernel void gptoss_"
},
{
"path": "gpt_oss/metal/source/expert_routing_metadata.metal",
"chars": 1606,
"preview": "#include <internal/kernel-args.h>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_stdlib>\n\nconstant uint "
},
{
"path": "gpt_oss/metal/source/gather_and_accumulate.metal",
"chars": 3011,
"preview": "#include <internal/kernel-args.h>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_stdlib>\n\n// TODO(ibrahi"
},
{
"path": "gpt_oss/metal/source/generate.c",
"chars": 12836,
"preview": "#include <assert.h>\n#include <inttypes.h>\n#include <math.h>\n#include <signal.h>\n#include <stdatomic.h>\n#include <stdbool"
},
{
"path": "gpt_oss/metal/source/include/internal/datatype.h",
"chars": 1203,
"preview": "#pragma once\n\n#include <stdint.h>\n\n#include <internal/macros.h>\n\n\ntypedef struct GPTOSS_DENSELY_PACKED_STRUCTURE {\n G"
},
{
"path": "gpt_oss/metal/source/include/internal/datatype.hpp",
"chars": 3671,
"preview": "#pragma once\n\n#include <bit>\n\n#include <internal/datatype.h>\n\n\nnamespace gptoss {\n\ntemplate <typename WideT, typename Na"
},
{
"path": "gpt_oss/metal/source/include/internal/kernel-args.h",
"chars": 4564,
"preview": "#pragma once\n\n#if !defined(__METAL_VERSION__)\n#include <stdint.h>\n#endif\n\n// TODO(ibahmed): specalize using metal functi"
},
{
"path": "gpt_oss/metal/source/include/internal/log.h",
"chars": 497,
"preview": "#pragma once\n\n#include <stdarg.h>\n\n\nvoid gptoss_format_log(const char* format, va_list args);\n\n__attribute__((__format__"
},
{
"path": "gpt_oss/metal/source/include/internal/macros.h",
"chars": 3321,
"preview": "#pragma once\n\n/***** Architecture detection macros *****/\n\n#ifdef GPTOSS_ARCH_X86_64\n #if GPTOSS_ARCH_X86_64 != 0 && "
},
{
"path": "gpt_oss/metal/source/include/internal/math.h",
"chars": 920,
"preview": "#pragma once\n\n#include <assert.h>\n#include <stddef.h>\n#include <stdint.h>\n\ninline static size_t math_ceil_div(size_t num"
},
{
"path": "gpt_oss/metal/source/include/internal/metal-kernels.h",
"chars": 18465,
"preview": "#pragma once\n\n#include <stddef.h>\n#include <stdint.h>\n\n#include <internal/metal.h>\n\n#ifdef __cplusplus\nextern \"C\" {\n#end"
},
{
"path": "gpt_oss/metal/source/include/internal/metal.h",
"chars": 4106,
"preview": "#pragma once\n\n#include <stddef.h>\n\n#include <gpt-oss/types.h>\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n\nstruct gptoss_met"
},
{
"path": "gpt_oss/metal/source/include/internal/metal.hpp",
"chars": 12401,
"preview": "#pragma once\n\n#include <array>\n#include <initializer_list>\n#include <cstring>\n#include <stdexcept>\n#include <vector>\n\n#i"
},
{
"path": "gpt_oss/metal/source/include/internal/model.h",
"chars": 5965,
"preview": "#pragma once\n\n#ifndef __cplusplus\n #include <stdatomic.h>\n#endif\n#include <stdbool.h>\n#include <stddef.h>\n#include <s"
},
{
"path": "gpt_oss/metal/source/include/internal/rng.h",
"chars": 466,
"preview": "#pragma once\n\n#include <stdint.h>\n\ninline static uint32_t rng_squares32(uint64_t offset, uint64_t seed) {\n const uint"
},
{
"path": "gpt_oss/metal/source/include/internal/rng.hpp",
"chars": 583,
"preview": "#pragma once\n\n#include <cstdint>\n\nnamespace gptoss {\n\nnamespace rng {\n\ninline static std::uint32_t squares32(std::uint64"
},
{
"path": "gpt_oss/metal/source/include/internal/storage.h",
"chars": 770,
"preview": "#pragma once\n\n#include <stdbool.h>\n#include <stdint.h>\n\nstruct gptoss_file_header {\n char magic[12];\n uint32_t zer"
},
{
"path": "gpt_oss/metal/source/include/internal/uuid.h",
"chars": 4661,
"preview": "#pragma once\n\n#include <stdbool.h>\n#include <stdint.h>\n#include <string.h>\n\n#include \"internal/macros.h\"\n\n\nstruct GPTOSS"
},
{
"path": "gpt_oss/metal/source/log.c",
"chars": 1639,
"preview": "#include <assert.h> // assert\n#include <stdarg.h> // va_list, va_copy, va_end\n#include <stdio.h> // vsnprintf\n#includ"
},
{
"path": "gpt_oss/metal/source/matmul.metal",
"chars": 23502,
"preview": "#include <metal_atomic>\n#include <metal_compute>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_simdgrou"
},
{
"path": "gpt_oss/metal/source/metal-kernels.c",
"chars": 67810,
"preview": "#include <inttypes.h>\n#include <stddef.h>\n#include <stdint.h>\n#include <math.h>\n\n#include <internal/kernel-args.h>\n#incl"
},
{
"path": "gpt_oss/metal/source/metal.m",
"chars": 18257,
"preview": "#import <Foundation/Foundation.h>\n#import <Metal/Metal.h>\n\n#include <dispatch/dispatch.h>\n#include <mach-o/getsect.h>\n\n#"
},
{
"path": "gpt_oss/metal/source/model.c",
"chars": 26427,
"preview": "#include <assert.h>\n#include <inttypes.h>\n#include <stdatomic.h>\n#include <stdint.h>\n#include <stdlib.h>\n#include <strin"
},
{
"path": "gpt_oss/metal/source/moematmul.metal",
"chars": 32543,
"preview": "#include <internal/kernel-args.h>\n#include <metal_common>\n#include <metal_compute>\n#include <metal_math>\n#include <metal"
},
{
"path": "gpt_oss/metal/source/random.metal",
"chars": 3691,
"preview": "#include <metal_integer>\n#include <metal_math>\n\n#include <internal/kernel-args.h>\n\n#pragma METAL fp math_mode(safe)\n#pra"
},
{
"path": "gpt_oss/metal/source/rmsnorm.metal",
"chars": 2066,
"preview": "#include <metal_compute>\n#include <metal_math>\n#include <metal_simdgroup>\n\n#include <internal/kernel-args.h>\n\n#pragma ME"
},
{
"path": "gpt_oss/metal/source/rope.metal",
"chars": 2244,
"preview": "#include <metal_common>\n#include <metal_math>\n\n#include <internal/kernel-args.h>\n\n#pragma METAL fp math_mode(safe)\n#prag"
},
{
"path": "gpt_oss/metal/source/sample.metal",
"chars": 7423,
"preview": "#include <metal_compute>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_simdgroup>\n\n#include <internal/k"
},
{
"path": "gpt_oss/metal/source/scatter.metal",
"chars": 3171,
"preview": "#include <internal/kernel-args.h>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_stdlib>\n\n// TODO(ibrahi"
},
{
"path": "gpt_oss/metal/source/sdpa.metal",
"chars": 15823,
"preview": "#include <metal_geometric>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_compute>\n#include <metal_simdg"
},
{
"path": "gpt_oss/metal/source/tokenizer.c",
"chars": 3121,
"preview": "#include <assert.h>\n#include <stdatomic.h>\n#include <stddef.h>\n#include <stdint.h>\n#include <stdlib.h>\n#include <string."
},
{
"path": "gpt_oss/metal/source/topk.metal",
"chars": 6205,
"preview": "#include <metal_compute>\n#include <metal_integer>\n#include <metal_math>\n#include <metal_simdgroup>\n\n#include <internal/k"
},
{
"path": "gpt_oss/metal/test/bf16-f32-embeddings.cc",
"chars": 789,
"preview": "#include <gtest/gtest.h>\n\n#include <cstddef>\n\n#include \"embeddings-kernel-tester.hpp\"\n\n\nusing gptoss::EmbeddingsKernelTe"
},
{
"path": "gpt_oss/metal/test/embeddings-kernel-tester.hpp",
"chars": 4128,
"preview": "#pragma once\n\n#include <gtest/gtest.h>\n\n#include <cstddef>\n#include <cstdint>\n\n#include <internal/datatype.hpp>\n#include"
},
{
"path": "gpt_oss/metal/test/f32-bf16w-matmul.cc",
"chars": 2504,
"preview": "#include <gtest/gtest.h>\n\n#include <cstddef>\n#include <cstdint>\n\n#include \"matmul-kernel-tester.hpp\"\n\n\nusing gptoss::Mat"
},
{
"path": "gpt_oss/metal/test/f32-bf16w-rmsnorm.cc",
"chars": 881,
"preview": "#include <gtest/gtest.h>\n\n#include <cstdint>\n\n#include \"rmsnorm-kernel-tester.hpp\"\n\n\nusing gptoss::RMSNormKernelTester;\n"
},
{
"path": "gpt_oss/metal/test/f32-random.cc",
"chars": 9402,
"preview": "#include <gtest/gtest.h>\n\n#include <cmath>\n\n#include <internal/metal.hpp>\n#include <internal/metal-kernels.h>\n#include <"
},
{
"path": "gpt_oss/metal/test/f32-rope.cc",
"chars": 2005,
"preview": "#include <gtest/gtest.h>\n\n#include <cstddef>\n#include <cstdint>\n\n#include \"rope-kernel-tester.hpp\"\n\n\nusing gptoss::RoPEK"
},
{
"path": "gpt_oss/metal/test/fill-random-kernel-tester.hpp",
"chars": 3029,
"preview": "#pragma once\n\n#include <gtest/gtest.h>\n\n#include <cstddef>\n#include <cstdint>\n\n#include <internal/datatype.hpp>\n#include"
},
{
"path": "gpt_oss/metal/test/matmul-kernel-tester.hpp",
"chars": 13397,
"preview": "#pragma once\n\n#include <gtest/gtest.h>\n\n#include <cmath>\n#include <cstddef>\n#include <cstdint>\n\n#include <internal/datat"
},
{
"path": "gpt_oss/metal/test/mf4-f32-convert.cc",
"chars": 5146,
"preview": "#include <gtest/gtest.h>\n\n#include <cmath>\n#include <ios>\n\n#include <internal/metal.hpp>\n#include <internal/metal-kernel"
},
{
"path": "gpt_oss/metal/test/rmsnorm-kernel-tester.hpp",
"chars": 5302,
"preview": "#pragma once\n\n#include <gtest/gtest.h>\n\n#include <cmath>\n#include <cstddef>\n#include <cstdint>\n\n#include <internal/datat"
},
{
"path": "gpt_oss/metal/test/rope-kernel-tester.hpp",
"chars": 7910,
"preview": "#pragma once\n\n#include <gtest/gtest.h>\n\n#include <cmath>\n#include <cstddef>\n#include <cstdint>\n\n#include <internal/datat"
},
{
"path": "gpt_oss/metal/test/u32-random.cc",
"chars": 2016,
"preview": "#include <gtest/gtest.h>\n\n#include <cstddef>\n#include <cstdint>\n\n#include \"fill-random-kernel-tester.hpp\"\n\n\nusing gptoss"
},
{
"path": "gpt_oss/responses_api/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/responses_api/api_server.py",
"chars": 60508,
"preview": "import os\nimport datetime\nimport uuid\nfrom typing import Callable, Literal, Optional, Union\n\nfrom fastapi import FastAPI"
},
{
"path": "gpt_oss/responses_api/events.py",
"chars": 5690,
"preview": "# torchrun --nproc-per-node=4 responses_api.py\nfrom typing import Literal, Optional, Union\n\nfrom pydantic import BaseMod"
},
{
"path": "gpt_oss/responses_api/inference/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/responses_api/inference/metal.py",
"chars": 1241,
"preview": "\"\"\"Metal backend for :mod:`gpt_oss.responses_api`.\"\"\"\n\nfrom typing import Callable\n\nfrom gpt_oss.metal import Context, M"
},
{
"path": "gpt_oss/responses_api/inference/ollama.py",
"chars": 7037,
"preview": "\"\"\"\nNOTE: this is a stitched together implementation that uses Ollama for inference. It's primarily used\nfor testing and"
},
{
"path": "gpt_oss/responses_api/inference/stub.py",
"chars": 2352,
"preview": "import time\nfrom typing import Callable\n\nfake_tokens = [\n 200005,\n 35644,\n 200008,\n 23483,\n 316,\n 1199"
},
{
"path": "gpt_oss/responses_api/inference/transformers.py",
"chars": 1709,
"preview": "\"\"\"\nNOTE: this is not the most efficient way to use transformers. It's a simple implementation that infers\none token at "
},
{
"path": "gpt_oss/responses_api/inference/triton.py",
"chars": 3037,
"preview": "import datetime\nimport os\nfrom typing import Callable\n\nos.environ[\"PYTORCH_CUDA_ALLOC_CONF\"] = \"expandable_segments:True"
},
{
"path": "gpt_oss/responses_api/inference/vllm.py",
"chars": 2899,
"preview": "\"\"\"\nNOTE: this is not the most efficient way to use vLLM. It's a simple implementation that infers \none token at a time "
},
{
"path": "gpt_oss/responses_api/serve.py",
"chars": 1850,
"preview": "# torchrun --nproc-per-node=4 serve.py\n\nimport argparse\n\nimport uvicorn\nfrom openai_harmony import (\n HarmonyEncoding"
},
{
"path": "gpt_oss/responses_api/types.py",
"chars": 5801,
"preview": "from typing import Any, Dict, Literal, Optional, Union\n\nfrom openai_harmony import ReasoningEffort\nfrom pydantic import "
},
{
"path": "gpt_oss/responses_api/utils.py",
"chars": 2184,
"preview": "import time\n\nfake_tokens = [\n 200005,\n 35644,\n 200008,\n 23483,\n 316,\n 1199,\n 1114,\n 717,\n 170"
},
{
"path": "gpt_oss/tokenizer.py",
"chars": 1002,
"preview": "import tiktoken\n\ndef get_tokenizer():\n o200k_base = tiktoken.get_encoding(\"o200k_base\")\n tokenizer = tiktoken.Enco"
},
{
"path": "gpt_oss/tools/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/tools/apply_patch.md",
"chars": 2567,
"preview": "When requested to perform coding-related tasks, you MUST adhere to the following criteria when executing the task:\n\n- Us"
},
{
"path": "gpt_oss/tools/apply_patch.py",
"chars": 17569,
"preview": "#!/usr/bin/env python3\n\n\"\"\"\nA self-contained **pure-Python 3.9+** utility for applying human-readable\n“pseudo-diff” patc"
},
{
"path": "gpt_oss/tools/python_docker/docker_tool.py",
"chars": 13394,
"preview": "# Run this before running the tool:\n# $ docker image pull python:3.11\nimport asyncio\nimport contextlib\nimport io\nimport "
},
{
"path": "gpt_oss/tools/simple_browser/__init__.py",
"chars": 177,
"preview": "from .simple_browser_tool import SimpleBrowserTool\nfrom .backend import ExaBackend, YouComBackend\n\n__all__ = [\n \"Simp"
},
{
"path": "gpt_oss/tools/simple_browser/backend.py",
"chars": 7611,
"preview": "\"\"\"\nSimple backend for the simple browser tool.\n\"\"\"\n\nimport functools\nimport asyncio\nimport logging\nimport os\nfrom abc i"
},
{
"path": "gpt_oss/tools/simple_browser/page_contents.py",
"chars": 9557,
"preview": "\"\"\"\nPage contents for the simple browser tool.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport dataclasses\nimport functo"
},
{
"path": "gpt_oss/tools/simple_browser/simple_browser_tool.py",
"chars": 22997,
"preview": "import contextvars\nimport dataclasses\nimport functools\nimport itertools\nimport json\nimport re\nimport textwrap\nfrom typin"
},
{
"path": "gpt_oss/tools/tool.py",
"chars": 3683,
"preview": "from abc import ABC, abstractmethod\nfrom uuid import UUID, uuid4\nfrom typing import AsyncIterator\n\nfrom openai_harmony i"
},
{
"path": "gpt_oss/torch/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/torch/model.py",
"chars": 16251,
"preview": "import json\nimport math\nimport os\nfrom dataclasses import dataclass\n\nimport torch\nimport torch.distributed as dist\n\nfrom"
},
{
"path": "gpt_oss/torch/utils.py",
"chars": 1179,
"preview": "import os\nimport torch\nimport torch.distributed as dist\n\n\ndef suppress_output(rank):\n \"\"\"Suppress printing on the cur"
},
{
"path": "gpt_oss/torch/weights.py",
"chars": 5144,
"preview": "import math\nimport os\n\nimport torch\nfrom safetensors import safe_open\n\n\n# Bytes per MXFP4 block: 32 FP4 numbers packed i"
},
{
"path": "gpt_oss/triton/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "gpt_oss/triton/attention.py",
"chars": 8367,
"preview": "\"\"\"FlashAttention w/support for learned sinks and banded attention.\n\nThis is an expanded version of the Flash Attention "
},
{
"path": "gpt_oss/triton/model.py",
"chars": 18470,
"preview": "import json\nimport math\nimport os\n\nimport torch\nfrom torch.profiler import record_function\n\nfrom gpt_oss.torch.model imp"
},
{
"path": "gpt_oss/triton/moe.py",
"chars": 2741,
"preview": "import torch\nfrom torch.profiler import record_function\n\nimport triton_kernels\nimport triton_kernels.swiglu\nfrom triton_"
},
{
"path": "gpt_oss/vllm/token_generator.py",
"chars": 2180,
"preview": "from vllm import LLMEngine, EngineArgs, SamplingParams, TokensPrompt\n\n\nclass TokenGenerator:\n def __init__(self, mode"
},
{
"path": "pyproject.toml",
"chars": 1267,
"preview": "[project]\nname = \"gpt-oss\"\ndescription = \"A collection of reference inference implementations for gpt-oss by OpenAI\"\n\nde"
},
{
"path": "tests/conftest.py",
"chars": 2953,
"preview": "import os\nimport sys\nimport pytest\nfrom typing import Generator, Any\nfrom unittest.mock import Mock, MagicMock\nfrom fast"
},
{
"path": "tests/gpt_oss/tools/simple_browser/test_backend.py",
"chars": 2792,
"preview": "import pytest\nfrom typing import Generator, Any\nfrom unittest import mock\nfrom aiohttp import ClientSession\n\nfrom gpt_os"
},
{
"path": "tests/test_api_endpoints.py",
"chars": 9608,
"preview": "import pytest\nimport json\nimport asyncio\nfrom fastapi import status\nfrom unittest.mock import patch, MagicMock, AsyncMoc"
},
{
"path": "tests/test_responses_api.py",
"chars": 1133,
"preview": "import time\n\nimport pytest\nfrom fastapi.testclient import TestClient\nfrom openai_harmony import (\n HarmonyEncodingNam"
},
{
"path": "tests-data/basic-event-stream.txt",
"chars": 6842,
"preview": "event: response.created\ndata: {\"type\":\"response.created\",\"sequence_number\":0,\"response\":{\"id\":\"resp_687937d6852c819199d1"
},
{
"path": "tests-data/web-search-event-stream.txt",
"chars": 52260,
"preview": "event: response.created\ndata: {\"type\":\"response.created\",\"sequence_number\":0,\"response\":{\"id\":\"resp_688867b6fb90819e9221"
}
]
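The array above has a uniform shape: each entry carries a `path`, a `chars` count, and a truncated `preview`. As a minimal sketch of how such a listing could be consumed, the snippet below totals the character counts per file suffix. The `summarize` helper and the `SAMPLE` string are illustrative assumptions rather than part of the repository or the extraction tool; the three sample entries are taken from the listing, with previews shortened.

```python
import json
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical sample: three entries from the listing above, previews shortened.
# A real run would load the complete array instead.
SAMPLE = r'''
[
  {"path": ".gitignore", "chars": 58, "preview": "build\n_skbuild\ntmp*"},
  {"path": "gpt_oss/tokenizer.py", "chars": 1002, "preview": "import tiktoken\n"},
  {"path": "gpt_oss/metal/source/log.c", "chars": 1639, "preview": "#include <assert.h>"}
]
'''


def summarize(entries):
    """Return (total_chars, chars_by_suffix) for a list of preview entries."""
    by_suffix = Counter()
    for entry in entries:
        # Dotfiles such as ".gitignore" have an empty suffix; group them together.
        suffix = PurePosixPath(entry["path"]).suffix or "(none)"
        by_suffix[suffix] += entry["chars"]
    return sum(by_suffix.values()), by_suffix


entries = json.loads(SAMPLE)
total, by_suffix = summarize(entries)
print("total chars:", total)
for suffix, chars in by_suffix.most_common():
    print(suffix, chars)
```

Grouping by suffix is only one possible view; the same loop could just as easily group by top-level directory (for example `gpt_oss/metal` versus `gpt_oss/evals`) to see where most of the source text lives.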
About this extraction
This page contains the full source code of the openai/gpt-oss GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 155 files (1.3 MB), approximately 368.2k tokens, and a symbol index with 1070 extracted functions, classes, methods, constants, and types.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.