[
  {
    "path": ".coveragerc",
    "content": "[run]\nsource =\n    instructor/\nomit =\n    instructor/cli/*\n"
  },
  {
    "path": ".cursor/rules/documentation-sync.mdc",
    "content": "---\ndescription: when making code changes or adding documentation\nglobs: [\"*.py\", \"*.md\"]\nalwaysApply: true\n---\n\n- When making code changes:\n    - Update related documentation files to reflect the changes\n    - Check docstrings and type hints are up to date\n    - Update any example code in markdown files\n    - Review README.md if the changes affect installation or usage\n\n- When creating new markdown files:\n    - Add the file to mkdocs.yml under the appropriate section\n    - Follow the existing hierarchy and indentation\n    - Use descriptive nav titles\n    - Example:\n        ```yaml\n        nav:\n          - Home: index.md\n          - Guides:\n              - Getting Started: guides/getting-started.md\n              - Your New File: guides/your-new-file.md\n        ```\n\n- For API documentation:\n    - Ensure new functions/classes are documented\n    - Include type hints and docstrings\n    - Add usage examples\n    - Update API reference docs if auto-generated\n\n- Documentation Quality:\n    - Write at grade 10 reading level (see simple-language.mdc)\n    - Include working code examples\n    - Add links to related documentation\n    - Use consistent formatting and style "
  },
  {
    "path": ".cursor/rules/followups.mdc",
    "content": "---\ndescription: when AI agents are collaborating on code\nglobs: \"*\"\nalwaysApply: true\n---\nMake sure to come up with follow-up hot keys. They should be thoughtful and actionable and result in small additional code changes based on the context that you have available.\n\nusing [J], [K], [L]\n"
  },
  {
    "path": ".cursor/rules/new-features-planning.mdc",
    "content": "---\ndescription: when asked to implement new features or clients\nglobs: *.py\nalwaysApply: true\n---\n\n- When being asked to make new features, make sure that you check out from main a new branch and make incremental commits\n  - Use conventional commit format: `<type>(<scope>): <description>`\n    - Types: feat, fix, docs, style, refactor, perf, test, chore\n    - Example: `feat(validation): add email validation function` \n    - Keep commits focused on a single change\n    - Write descriptive commit messages in imperative mood\n  - Use `git commit -m \"type(scope): subject\" -m \"body\" -m \"footer\"` for multiline commits\n- If the feature is very large, create a temporary `todo.md`\n- And start a pull request using `gh`\n  - Create PRs with multiline bodies using:\n    ```bash\n    gh pr create --title \"feat(component): add new feature\" --body \"$(cat <<EOF\n    ## Description\n    Detailed explanation of the changes\n\n    ## Changes\n    - List important changes\n    - Another change\n\n    ## Testing\n    How this was tested\n\n    This PR was written by [Cursor](cursor.com)\n    EOF\n    )\" -r jxnl,ivanleomk\n    ```\n  - Or use the `-F` flag with a file: `gh pr create -F pr_body.md`\n- Make sure to include `This PR was written by [Cursor](mdc:cursor.com)`\n- Add default reviewers:\n    - Use `gh pr edit <id> --add-reviewer jxnl,ivanleomk`\n    - Or include `-r jxnl,ivanleomk` when creating the PR\n- use `gh pr view <id> --comments | cat` to view all the comments\n- For PR updates:\n    - Do not directly commit to an existing PR branch\n    - Instead, create a new PR that builds on top of the original PR's branch\n    - This creates a \"stacked PR\" pattern where:\n        1. The original PR (base) contains the initial changes\n        2. The new PR (stack) contains only the review-related updates\n        3. Once the base PR is merged, the stack can be rebased onto main \n"
  },
  {
    "path": ".cursor/rules/readme.md",
    "content": "# Cursor Rules\n\nCursor rules are configuration files that help guide AI-assisted development in the Cursor IDE. They provide structured instructions for how the AI should behave in specific contexts or when working with certain types of files.\n\n## What is Cursor?\n\n[Cursor](https://cursor.sh) is an AI-powered IDE that helps developers write, understand, and maintain code more efficiently. It integrates AI capabilities directly into the development workflow, providing features like:\n\n- AI-assisted code completion\n- Natural language code generation\n- Intelligent code explanations\n- Automated refactoring suggestions\n\n## Understanding Cursor Rules\n\nCursor rules are defined in `.mdc` files within the `.cursor/rules` directory. Each rule file follows a specific naming convention: lowercase names with the `.mdc` extension (e.g., `simple-language.mdc`).\n\nEach rule file contains:\n\n1. **Metadata Header**: YAML frontmatter that defines:\n   ```yaml\n   ---\n   description: when to apply this rule\n   globs: file patterns to match (e.g., \"*.py\", \"*.md\", or \"*\" for all files)\n   alwaysApply: true/false  # whether to apply automatically\n   ---\n   ```\n\n2. **Rule Content**: Markdown-formatted instructions that guide the AI's behavior\n\n## Available Rules\n\nCurrently, the following rules are defined:\n\n### `simple-language.mdc`\n- **Purpose**: Ensures documentation is written at a grade 10 reading level\n- **Applies to**: Markdown files (*.md)\n- **Auto Apply**: No\n- **Key Requirements**: \n  - Write at grade 10 reading level\n  - Ensure code blocks are self-contained with complete imports\n\n### `new-features-planning.mdc`\n- **Purpose**: Guides feature implementation workflow\n- **Applies to**: Python files (*.py)\n- **Auto Apply**: Yes\n- **Key Requirements**:\n  - Create new branch from main\n  - Make incremental commits\n  - Create todo.md for large features\n  - Start pull requests using GitHub CLI (`gh`)\n  - Include \"This PR was written by [Cursor](https://cursor.sh)\" in PRs\n\n### `followups.mdc`\n- **Purpose**: Ensures thoughtful follow-up suggestions\n- **Applies to**: All files\n- **Auto Apply**: Yes\n- **Key Requirements**:\n  - Generate actionable hotkey suggestions using:\n    - [J]: First follow-up action\n    - [K]: Second follow-up action\n    - [L]: Third follow-up action\n  - Focus on small, contextual code changes\n  - Suggestions should be thoughtful and actionable\n\n### `documentation-sync.mdc`\n- **Purpose**: Maintains documentation consistency with code changes\n- **Applies to**: Python and Markdown files (*.py, *.md)\n- **Auto Apply**: Yes\n- **Key Requirements**:\n  - Update docs when code changes\n  - Add new markdown files to mkdocs.yml\n  - Keep API documentation current\n  - Maintain documentation quality standards\n\n## Creating New Rules\n\nTo create a new rule:\n\n1. Create a `.mdc` file in `.cursor/rules/` using lowercase naming\n2. Add YAML frontmatter with required metadata:\n   ```yaml\n   ---\n   description: when to apply this rule\n   globs: file patterns to match\n   alwaysApply: true/false\n   ---\n   ```\n3. Write clear, specific instructions in Markdown\n4. Test the rule with relevant file types\n\n## Best Practices\n\n- Keep rules focused and specific\n- Use clear, actionable language\n- Test rules thoroughly before committing\n- Document any special requirements or dependencies\n- Update rules as project needs evolve\n- Use consistent file naming (lowercase with .mdc extension)\n- Ensure globs patterns are explicit and documented\n"
  },
  {
    "path": ".cursor/rules/simple-language.mdc",
    "content": "---\ndescription: when writing documentation\nglobs: *.md\nalwaysApply: false\n---\n\n- When writing documents and concepts make sure that you write at a grade 10 reading level \n- make sure every code block has complete imports and makes no references to previous code blocks, each one needs to be self contained\n"
  },
  {
    "path": ".cursorignore",
    "content": "# Add directories or file patterns to ignore during indexing (e.g. foo/ or *.csv)\n"
  },
  {
    "path": ".github/FUNDING.yml",
    "content": "github: jxnl"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "content": "---\nname: Bug report\nabout: Create a report to help us improve\n---\n\n- [ ] This is actually a bug report.\n- [ ] I am not getting good LLM Results\n- [ ] I have tried asking for help in the community on discord or discussions and have not received a response.\n- [ ] I have tried searching the documentation and have not found an answer.\n\n**What Model are you using?**\n\n- [ ] gpt-3.5-turbo\n- [ ] gpt-4-turbo\n- [ ] gpt-4\n- [ ] Other (please specify)\n\n**Describe the bug**\nA clear and concise description of what the bug is.\n\n**To Reproduce**\nSteps to reproduce the behavior, including code snippets of the model and the input data and openai response.\n\n**Expected behavior**\nA clear and concise description of what you expected to happen.\n\n**Screenshots**\nIf applicable, add screenshots to help explain your problem.\n"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "content": "---\nname: Feature request\nabout: Suggest an idea for this project\n---\n\n**Is your feature request related to a problem? Please describe.**\nA clear and concise description of what the problem is. Ex. I'm always frustrated when [...]\n\n**Describe the solution you'd like**\nA clear and concise description of what you want to happen.\n\n**Describe alternatives you've considered**\nA clear and concise description of any alternative solutions or features you've considered.\n\n**Additional context**\nAdd any other context or screenshots about the feature request here.\n"
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE/pull_request_template.md",
    "content": "> Please use conventional commits to describe your changes. For example, `feat: add new feature` or `fix: fix a bug`. If you are unsure, leave the title as `...` and AI will handle it.\n\n## Describe your changes\n\n...\n\n## Issue ticket number and link\n\n## Checklist before requesting a review\n\n- [ ] I have performed a self-review of my code\n- [ ] If it is a core feature, I have added thorough tests.\n- [ ] If it is a core feature, I have added documentation.\n"
  },
  {
    "path": ".github/dependabot.yml",
    "content": "# To get started with Dependabot version updates, you'll need to specify which\n# package ecosystems to update and where the package manifests are located.\n# Please see the documentation for all configuration options:\n# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates\n\nversion: 2\nupdates:\n  - package-ecosystem: \"pip\" # See documentation for possible values\n    directory: \"/\" # Location of package manifests\n    schedule:\n      interval: \"daily\"\n    groups:\n      poetry:\n        patterns: [\"*\"]\n"
  },
  {
    "path": ".github/workflows/ai-label.yml",
    "content": "name: AI Labeler\n\non:\n  issues:\n    types: [opened, reopened]\n  pull_request:\n    types: [opened, reopened]\n\njobs:\n  ai-labeler:\n    runs-on: ubuntu-latest\n    permissions:\n      contents: read\n      issues: write\n      pull-requests: write\n    steps:\n      - uses: actions/checkout@v4\n      - uses: jlowin/ai-labeler@v0.4.0\n        with:\n          include-repo-labels: true\n          openai-api-key: ${{ secrets.OPENAI_API_KEY }}\n"
  },
  {
    "path": ".github/workflows/evals.yml",
    "content": "name: Weekly Tests\n\non:\n  workflow_dispatch:\n  schedule:\n    - cron: \"0 0 * * 0\" # Runs at 00:00 UTC every Sunday\n  push:\n    branches: [main]\n    paths-ignore:\n      - \"**\" # Ignore all paths to ensure it only triggers on schedule\n\njobs:\n  weekly-tests:\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v2\n\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n\n      - name: Set up Python\n        run: uv python install 3.11\n\n      - name: Install dependencies\n        run: uv sync --all-extras --dev\n\n      - name: Run all tests\n        run: uv run pytest tests/ --asyncio-mode=auto\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n"
  },
  {
    "path": ".github/workflows/python-publish.yml",
    "content": "# This workflow will upload a Python Package using Twine when a release is created\n# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries\n\n# This workflow uses actions that are not certified by GitHub.\n# They are provided by a third-party and are governed by\n# separate terms of service, privacy policy, and support\n# documentation.\n\nname: Upload Python Package\n\non:\n  release:\n    types: [published]\n\npermissions:\n  contents: read\n\njobs:\n  release:\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.10\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Build the project\n        run: uv build\n      - name: Build and publish Python package\n        run: uv publish\n        env:\n          UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}\n"
  },
  {
    "path": ".github/workflows/ruff.yml",
    "content": "name: Ruff\n\non:\n  push:\n  pull_request:\n    branches: [main]\n\nenv:\n  WORKING_DIRECTORY: \".\"\n  CUSTOM_PACKAGES: \"instructor examples tests\"\n\njobs:\n  Ruff:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v3\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.9\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Ruff lint\n        run: uv run ruff check ${{ env.CUSTOM_PACKAGES }}\n      - name: Ruff format\n        run: uv run ruff format --check ${{ env.CUSTOM_PACKAGES }}\n"
  },
  {
    "path": ".github/workflows/scheduled-release.yml",
    "content": "name: Scheduled Release\n\non:\n  schedule:\n    # Every 2 weeks on Monday at 9 AM UTC\n    - cron: '0 9 * * 1/2'\n  workflow_dispatch: # Allow manual trigger\n    inputs:\n      skip_tests:\n        description: 'Skip LLM tests (use for testing workflow)'\n        required: false\n        default: false\n        type: boolean\n      dry_run:\n        description: 'Dry run - dont push changes or create release'\n        required: false\n        default: false\n        type: boolean\n\njobs:\n  test-and-release:\n    runs-on: ubuntu-latest\n    if: github.ref == 'refs/heads/main'\n    \n    steps:\n    - uses: actions/checkout@v4\n      with:\n        fetch-depth: 0\n        token: ${{ secrets.GITHUB_TOKEN }}\n    \n    - name: Setup UV\n      uses: astral-sh/setup-uv@v3\n    \n    - name: Install dependencies\n      run: |\n        uv sync --all-extras --dev\n    \n    - name: Run linting\n      run: |\n        uv run ruff check instructor examples tests\n    \n    - name: Run type checking\n      run: |\n        uv run pyright\n    \n    - name: Run core tests (no LLM)\n      run: |\n        uv run pytest tests/ -k \"not openai and not llm and not anthropic and not gemini and not cohere and not mistral and not groq and not vertexai and not xai and not cerebras and not fireworks and not writer and not bedrock and not perplexity and not genai\" --tb=short -v --maxfail=10\n    \n    # Optional: Run LLM tests if you have API keys in secrets\n    - name: Run LLM tests\n      if: github.event.inputs.skip_tests != 'true'\n      env:\n        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n        ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n        GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n        COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n        GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}\n        MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}\n      run: |\n        echo \"Running basic LLM tests if API keys are available...\"\n        # Run a subset of LLM tests to verify basic functionality\n        if [ ! -z \"$OPENAI_API_KEY\" ]; then\n          echo \"Testing OpenAI integration...\"\n          uv run pytest tests/llm/test_openai/test_basics.py --tb=short -v --maxfail=1 || echo \"OpenAI tests failed\"\n        fi\n        if [ ! -z \"$ANTHROPIC_API_KEY\" ]; then\n          echo \"Testing Anthropic integration...\"\n          uv run pytest tests/llm/test_anthropic/test_basics.py --tb=short -v --maxfail=1 || echo \"Anthropic tests failed\"\n        fi\n        echo \"LLM tests completed (non-blocking)\"\n    \n    - name: Check for changes since last release\n      id: changes\n      run: |\n        LAST_TAG=$(git describe --tags --abbrev=0 2>/dev/null || echo \"\")\n        if [ -z \"$LAST_TAG\" ]; then\n          echo \"has_changes=true\" >> $GITHUB_OUTPUT\n          echo \"last_tag=none\" >> $GITHUB_OUTPUT\n          echo \"change_count=initial\" >> $GITHUB_OUTPUT\n        else\n          CHANGES=$(git rev-list $LAST_TAG..HEAD --count)\n          echo \"has_changes=$([[ $CHANGES -gt 0 ]] && echo true || echo false)\" >> $GITHUB_OUTPUT\n          echo \"change_count=$CHANGES\" >> $GITHUB_OUTPUT\n          echo \"last_tag=$LAST_TAG\" >> $GITHUB_OUTPUT\n        fi\n        \n        echo \"Last tag: $LAST_TAG\"\n        echo \"Changes since last tag: $(git rev-list $LAST_TAG..HEAD --count 2>/dev/null || echo 'N/A')\"\n    \n    # Only proceed with release if tests passed AND there are changes\n    - name: Get current version\n      if: steps.changes.outputs.has_changes == 'true'\n      id: current_version\n      run: |\n        VERSION=$(uv run python -c \"import tomllib; print(tomllib.load(open('pyproject.toml', 'rb'))['project']['version'])\")\n        echo \"version=$VERSION\" >> $GITHUB_OUTPUT\n        echo \"Current version: $VERSION\"\n    \n    - name: Determine version bump type\n      if: steps.changes.outputs.has_changes == 'true'\n      id: version_type\n      run: |\n        # Check commit messages since last tag to determine bump type\n        LAST_TAG=\"${{ steps.changes.outputs.last_tag }}\"\n        if [ \"$LAST_TAG\" = \"none\" ]; then\n          COMMITS=$(git log --oneline HEAD~20..HEAD)\n        else\n          COMMITS=$(git log --oneline $LAST_TAG..HEAD)\n        fi\n        \n        echo \"Recent commits:\"\n        echo \"$COMMITS\"\n        \n        # Look for breaking changes or major features\n        if echo \"$COMMITS\" | grep -qE \"(BREAKING|feat!|fix!)\"; then\n          echo \"bump_type=minor\" >> $GITHUB_OUTPUT\n          echo \"Detected breaking changes - using minor bump\"\n        elif echo \"$COMMITS\" | grep -qE \"feat:\"; then\n          echo \"bump_type=minor\" >> $GITHUB_OUTPUT\n          echo \"Detected new features - using minor bump\"\n        else\n          echo \"bump_type=patch\" >> $GITHUB_OUTPUT\n          echo \"Using patch bump for bug fixes and chores\"\n        fi\n    \n    - name: Bump version\n      if: steps.changes.outputs.has_changes == 'true'\n      id: bump_version\n      run: |\n        CURRENT=\"${{ steps.current_version.outputs.version }}\"\n        BUMP_TYPE=\"${{ steps.version_type.outputs.bump_type }}\"\n        \n        IFS='.' read -r major minor patch <<< \"$CURRENT\"\n        \n        case $BUMP_TYPE in\n          major)\n            major=$((major + 1))\n            minor=0\n            patch=0\n            ;;\n          minor)\n            minor=$((minor + 1))\n            patch=0\n            ;;\n          patch)\n            patch=$((patch + 1))\n            ;;\n        esac\n        \n        NEW_VERSION=\"$major.$minor.$patch\"\n        echo \"new_version=$NEW_VERSION\" >> $GITHUB_OUTPUT\n        echo \"Bumping from $CURRENT to $NEW_VERSION ($BUMP_TYPE)\"\n        \n        # Update pyproject.toml\n        sed -i \"s/version = \\\"$CURRENT\\\"/version = \\\"$NEW_VERSION\\\"/\" pyproject.toml\n    \n    - name: Update lockfile\n      if: steps.changes.outputs.has_changes == 'true'\n      run: |\n        uv lock\n    \n    # Run tests again after version bump to make sure nothing broke\n    - name: Final test run\n      if: steps.changes.outputs.has_changes == 'true'\n      run: |\n        uv sync\n        uv run pytest tests/ -k \"not openai and not llm and not anthropic and not gemini and not cohere and not mistral and not groq and not vertexai and not xai and not cerebras and not fireworks and not writer and not bedrock and not perplexity and not genai\" --tb=short --maxfail=5\n    \n    - name: Generate changelog\n      if: steps.changes.outputs.has_changes == 'true'\n      id: changelog\n      run: |\n        LAST_TAG=\"${{ steps.changes.outputs.last_tag }}\"\n        NEW_VERSION=\"${{ steps.bump_version.outputs.new_version }}\"\n        \n        if [ \"$LAST_TAG\" = \"none\" ]; then\n          CHANGELOG=$(git log --oneline HEAD~30..HEAD --pretty=format:\"- %s\" | head -20)\n        else\n          CHANGELOG=$(git log --oneline $LAST_TAG..HEAD --pretty=format:\"- %s\")\n        fi\n        \n        # Save changelog to file for GitHub release\n        cat > CHANGELOG.md << EOF\n        ## 🚀 What's Changed\n        \n        $CHANGELOG\n        \n        ## 🔗 Links\n        **Full Changelog**: https://github.com/${{ github.repository }}/compare/$LAST_TAG...v$NEW_VERSION\n        \n        ---\n        🤖 *This release was automatically generated every 2 weeks*\n        EOF\n        \n        echo \"changelog_file=CHANGELOG.md\" >> $GITHUB_OUTPUT\n    \n    - name: Create release commit\n      if: steps.changes.outputs.has_changes == 'true'\n      run: |\n        git config --local user.email \"action@github.com\"\n        git config --local user.name \"GitHub Action\"\n        git add pyproject.toml uv.lock\n        git commit -m \"chore: automated release v${{ steps.bump_version.outputs.new_version }}\n\n        🤖 Generated with [Claude Code](https://claude.ai/code)\n\n        Co-Authored-By: GitHub Action <action@github.com>\"\n        git tag \"v${{ steps.bump_version.outputs.new_version }}\"\n    \n    - name: Push changes\n      if: steps.changes.outputs.has_changes == 'true' && github.event.inputs.dry_run != 'true'\n      run: |\n        git push origin main\n        git push origin \"v${{ steps.bump_version.outputs.new_version }}\"\n    \n    - name: Create GitHub Release\n      if: steps.changes.outputs.has_changes == 'true' && github.event.inputs.dry_run != 'true'\n      uses: ncipollo/release-action@v1\n      with:\n        tag: \"v${{ steps.bump_version.outputs.new_version }}\"\n        name: \"🚀 Release v${{ steps.bump_version.outputs.new_version }}\"\n        bodyFile: \"CHANGELOG.md\"\n        draft: false\n        prerelease: false\n    \n    - name: Dry run summary\n      if: steps.changes.outputs.has_changes == 'true' && github.event.inputs.dry_run == 'true'\n      run: |\n        echo \"🧪 DRY RUN MODE - No changes pushed\"\n        echo \"Would have released: v${{ steps.bump_version.outputs.new_version }}\"\n        cat CHANGELOG.md\n    \n    # Optional: Publish to PyPI (uncomment if you want automatic PyPI releases)\n    # - name: Build and publish to PyPI\n    #   if: steps.changes.outputs.has_changes == 'true' && secrets.PYPI_TOKEN != ''\n    #   env:\n    #     PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}\n    #   run: |\n    #     uv build\n    #     uv publish --token $PYPI_TOKEN\n    \n    # Summary outputs\n    - name: Summary\n      if: always()\n      run: |\n        echo \"## 📊 Scheduled Release Summary\" >> $GITHUB_STEP_SUMMARY\n        echo \"- **Branch**: ${{ github.ref }}\" >> $GITHUB_STEP_SUMMARY\n        echo \"- **Has Changes**: ${{ steps.changes.outputs.has_changes }}\" >> $GITHUB_STEP_SUMMARY\n        echo \"- **Change Count**: ${{ steps.changes.outputs.change_count }}\" >> $GITHUB_STEP_SUMMARY\n        if [ \"${{ steps.changes.outputs.has_changes }}\" = \"true\" ]; then\n          echo \"- **Version**: ${{ steps.current_version.outputs.version }} → ${{ steps.bump_version.outputs.new_version }}\" >> $GITHUB_STEP_SUMMARY\n          echo \"- **Bump Type**: ${{ steps.version_type.outputs.bump_type }}\" >> $GITHUB_STEP_SUMMARY\n          echo \"- **Status**: ✅ Released\" >> $GITHUB_STEP_SUMMARY\n        else\n          echo \"- **Status**: ⏭️ Skipped (no changes)\" >> $GITHUB_STEP_SUMMARY\n        fi\n    \n    - name: Notify on failure\n      if: failure()\n      run: |\n        echo \"❌ Scheduled release failed - check the logs above\"\n        echo \"Common issues:\"\n        echo \"- Tests failed\"\n        echo \"- Linting issues\" \n        echo \"- Type checking errors\"\n        echo \"- Git push permissions\""
  },
  {
    "path": ".github/workflows/test.yml",
    "content": "name: Test\non:\n  pull_request:\n  push:\n    branches:\n      - main\n\njobs:\n  # Core tests without LLM providers\n  core-tests:\n    name: Core Tests\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Run core tests\n        run: >-\n          uv run pytest tests/ --asyncio-mode=auto -n auto\n          -k 'not test_core_providers and not test_openai and not test_anthropic\n          and not test_gemini and not test_genai and not test_writer and not\n          test_vertexai and not docs'\n        env:\n          INSTRUCTOR_ENV: CI\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n          COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n          XAI_API_KEY: ${{ secrets.XAI_API_KEY }}\n          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n\n  # Core provider tests for OpenAI\n  core-openai:\n    name: Core Provider Tests (OpenAI)\n    runs-on: ubuntu-latest\n    needs: core-tests\n    env:\n      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Skip core provider tests (OpenAI)\n        if: ${{ env.OPENAI_API_KEY == '' }}\n        run: echo \"Skipping OpenAI core provider tests (missing OPENAI_API_KEY).\"\n      - name: Run core provider tests (OpenAI)\n        if: ${{ env.OPENAI_API_KEY != '' }}\n        run: |\n          set +e\n          uv run pytest tests/llm/test_core_providers -v --asyncio-mode=auto -n auto -k \"openai\"\n          status=$?\n          set -e\n          if [ $status -eq 5 ]; then\n            echo \"No tests collected; treating as success.\"\n            exit 0\n          fi\n          exit $status\n        env:\n          INSTRUCTOR_ENV: CI\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n\n  # Core provider tests for Anthropic\n  core-anthropic:\n    name: Core Provider Tests (Anthropic)\n    runs-on: ubuntu-latest\n    needs: core-tests\n    env:\n      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Skip core provider tests (Anthropic)\n        if: ${{ env.ANTHROPIC_API_KEY == '' }}\n        run: echo \"Skipping Anthropic core provider tests (missing ANTHROPIC_API_KEY).\"\n      - name: Run core provider tests (Anthropic)\n        if: ${{ env.ANTHROPIC_API_KEY != '' }}\n        run: |\n          set +e\n          uv run pytest tests/llm/test_core_providers -v --asyncio-mode=auto -n auto -k \"anthropic\"\n          status=$?\n          set -e\n          if [ $status -eq 5 ]; then\n            echo \"No tests collected; treating as success.\"\n            exit 0\n          fi\n          exit $status\n        env:\n          INSTRUCTOR_ENV: CI\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n\n  # Core provider tests for Google\n  core-google:\n    name: Core Provider Tests (Google)\n    runs-on: ubuntu-latest\n    needs: core-tests\n    env:\n      GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n      GOOGLE_GENAI_MODEL: ${{ secrets.GOOGLE_GENAI_MODEL }}\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Skip core provider tests (Google)\n        if: ${{ env.GOOGLE_API_KEY == '' || env.GOOGLE_GENAI_MODEL == '' }}\n        run: echo \"Skipping Google core provider tests (missing GOOGLE_API_KEY or GOOGLE_GENAI_MODEL).\"\n      - name: Run core provider tests (Google)\n        if: ${{ env.GOOGLE_API_KEY != '' && env.GOOGLE_GENAI_MODEL != '' }}\n        run: |\n          set +e\n          uv run pytest tests/llm/test_core_providers -v --asyncio-mode=auto -n auto -k \"google\"\n          status=$?\n          set -e\n          if [ $status -eq 5 ]; then\n            echo \"No tests collected; treating as success.\"\n            exit 0\n          fi\n          exit $status\n        env:\n          INSTRUCTOR_ENV: CI\n          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n\n  # Core provider tests for other providers\n  core-other:\n    name: Core Provider Tests (Other)\n    runs-on: ubuntu-latest\n    needs: core-tests\n    env:\n      COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n      XAI_API_KEY: ${{ secrets.XAI_API_KEY }}\n      MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}\n      CEREBRAS_API_KEY: ${{ secrets.CEREBRAS_API_KEY }}\n      FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}\n      WRITER_API_KEY: ${{ secrets.WRITER_API_KEY }}\n      PERPLEXITY_API_KEY: ${{ secrets.PERPLEXITY_API_KEY }}\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Skip core provider tests (Other)\n        if: >-\n          ${{ env.COHERE_API_KEY == '' && env.XAI_API_KEY == ''\n          && env.MISTRAL_API_KEY == '' && env.CEREBRAS_API_KEY == ''\n          && env.FIREWORKS_API_KEY == '' && env.WRITER_API_KEY == ''\n          && env.PERPLEXITY_API_KEY == '' }}\n        run: echo \"Skipping core provider tests (Other) (missing provider secrets).\"\n      - name: Run core provider tests (Cohere, xAI, Mistral, etc)\n        if: >-\n          ${{ env.COHERE_API_KEY != '' || env.XAI_API_KEY != ''\n          || env.MISTRAL_API_KEY != '' || env.CEREBRAS_API_KEY != ''\n          || env.FIREWORKS_API_KEY != '' || env.WRITER_API_KEY != ''\n          || env.PERPLEXITY_API_KEY != '' }}\n        run: |\n          set +e\n          uv run pytest tests/llm/test_core_providers -v --asyncio-mode=auto -n auto -k \"cohere or xai or mistral or cerebras or fireworks or writer or perplexity\"\n          status=$?\n          set -e\n          if [ $status -eq 5 ]; then\n            echo \"No tests collected; treating as success.\"\n            exit 0\n          fi\n          exit $status\n        env:\n          INSTRUCTOR_ENV: CI\n          COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n          XAI_API_KEY: ${{ secrets.XAI_API_KEY }}\n          MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}\n          CEREBRAS_API_KEY: ${{ secrets.CEREBRAS_API_KEY }}\n          FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}\n          WRITER_API_KEY: ${{ secrets.WRITER_API_KEY }}\n          PERPLEXITY_API_KEY: ${{ secrets.PERPLEXITY_API_KEY }}\n\n  # Provider tests run in parallel\n  provider-tests:\n    name: ${{ matrix.provider.name }} Tests\n    runs-on: ubuntu-latest\n    needs: [core-openai, core-anthropic, core-google, core-other]\n    env:\n      PROVIDER_API_KEY: ${{ secrets[matrix.provider.env_key] }}\n      GOOGLE_GENAI_MODEL: ${{ secrets.GOOGLE_GENAI_MODEL }}\n    strategy:\n      fail-fast: false\n      matrix:\n        provider:\n          - name: OpenAI\n            env_key: OPENAI_API_KEY\n            test_path: tests/llm/test_openai\n          - name: Anthropic\n            env_key: ANTHROPIC_API_KEY\n            test_path: tests/llm/test_anthropic\n          - name: Gemini\n            env_key: GOOGLE_API_KEY\n            test_path: tests/llm/test_gemini\n          - name: Google GenAI\n            env_key: GOOGLE_API_KEY\n            test_path: tests/llm/test_genai\n          - name: Vertex AI\n            env_key: GOOGLE_API_KEY\n            test_path: tests/llm/test_vertexai\n          - name: Writer\n            env_key: WRITER_API_KEY\n            test_path: tests/llm/test_writer\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Skip ${{ matrix.provider.name }} tests\n        if: >-\n          ${{ env.PROVIDER_API_KEY == '' ||\n          ((matrix.provider.name == 'Gemini' || matrix.provider.name == 'Google GenAI'\n          || matrix.provider.name == 'Vertex AI') && env.GOOGLE_GENAI_MODEL == '') }}\n        run: >-\n          echo \"Skipping ${{ matrix.provider.name }} tests\n          (missing ${{ matrix.provider.env_key }} or GOOGLE_GENAI_MODEL).\"\n      - name: Run ${{ matrix.provider.name }} tests\n        if: >-\n          ${{ env.PROVIDER_API_KEY != '' &&\n          ((matrix.provider.name != 'Gemini' && matrix.provider.name != 'Google GenAI'\n          && matrix.provider.name != 'Vertex AI') || env.GOOGLE_GENAI_MODEL != '') }}\n        run: |\n          set +e\n          uv run pytest ${{ matrix.provider.test_path }} --asyncio-mode=auto -n auto\n          status=$?\n          set -e\n          if [ $status -eq 5 ]; then\n            echo \"No tests collected; treating as success.\"\n            exit 0\n          fi\n          exit $status\n        env:\n          INSTRUCTOR_ENV: CI\n          ${{ matrix.provider.env_key }}: ${{ secrets[matrix.provider.env_key] }}\n\n  # Auto client needs multiple providers\n  auto-client-test:\n    name: Auto Client Tests\n    runs-on: ubuntu-latest\n    needs: [core-openai, core-anthropic, core-google, core-other]\n    env:\n      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n      GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n      COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n      ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n      XAI_API_KEY: ${{ secrets.XAI_API_KEY }}\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Skip Auto Client tests\n        if: >-\n          ${{ env.OPENAI_API_KEY == '' || env.GOOGLE_API_KEY == ''\n          || env.COHERE_API_KEY == '' || env.ANTHROPIC_API_KEY == ''\n          || env.XAI_API_KEY == '' }}\n        run: echo \"Skipping Auto Client tests (missing one or more provider secrets).\"\n      - name: Run Auto Client tests\n        if: >-\n          ${{ env.OPENAI_API_KEY != '' && env.GOOGLE_API_KEY != ''\n          && env.COHERE_API_KEY != '' && env.ANTHROPIC_API_KEY != ''\n          && env.XAI_API_KEY != '' }}\n        run: |\n          set +e\n          uv run pytest tests/test_auto_client.py --asyncio-mode=auto -n auto\n          status=$?\n          set -e\n          if [ $status -eq 5 ]; then\n            echo \"No tests collected; treating as success.\"\n            exit 0\n          fi\n          exit $status\n        env:\n          INSTRUCTOR_ENV: CI\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n          COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n          XAI_API_KEY: ${{ secrets.XAI_API_KEY }}\n"
  },
  {
    "path": ".github/workflows/test_docs.yml",
    "content": "name: Test Docs\non:\n  schedule:\n    - cron: '0 0 1 * *'  # Runs at 00:00 on the 1st of every month\njobs:\n  release:\n    runs-on: ubuntu-latest\n\n    strategy:\n      matrix:\n        python-version: [\"3.11\"]\n\n    steps:\n      - uses: actions/checkout@v2\n\n      - name: Install system dependencies\n        run: |\n          sudo apt-get update\n          sudo apt-get install -y graphviz libcairo2-dev xdg-utils\n\n      - name: Install Poetry\n        uses: snok/install-poetry@v1.3.1\n\n      - name: Set up Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v4\n        with:\n          python-version: ${{ matrix.python-version }}\n          cache: \"poetry\"\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Run tests\n        run: uv run pytest tests/docs --asyncio-mode=auto\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n"
  },
  {
    "path": ".github/workflows/ty.yml",
    "content": "name: ty\n\non:\n  pull_request:\n    branches: [main]\n  push:\n    branches: [main]\n\nenv:\n  WORKING_DIRECTORY: \".\"\n\njobs:\n  type-check:\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout code\n        uses: actions/checkout@v3\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true\n      - name: Set up Python\n        run: uv python install 3.11\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Run type check with ty\n        run: uv run ty check instructor/\n      - name: Run type check with ty (tests)\n        run: uv run ty check --config-file ty-tests.toml tests\n"
  },
  {
    "path": ".gitignore",
    "content": ".DS_Store\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n\n# Distribution / packaging\n.Python\nbuild/\ndevelop-eggs/\ndist/\ndownloads/\neggs/\n.eggs/\nlib/\nlib64/\nparts/\nsdist/\nvar/\nwheels/\nshare/python-wheels/\n*.egg-info/\n.installed.cfg\n*.egg\nMANIFEST\n\n# PyInstaller\n#  Usually these files are written by a python script from a template\n#  before PyInstaller builds the exe, so as to inject date/other infos into it.\n*.manifest\n*.spec\n\n# Installer logs\npip-log.txt\npip-delete-this-directory.txt\n\n# Unit test / coverage reports\nhtmlcov/\n.tox/\n.nox/\n.coverage\n.coverage.*\n.cache\nnosetests.xml\ncoverage.xml\n*.cover\n*.py,cover\n.hypothesis/\n.pytest_cache/\ncover/\n\n# Translations\n*.mo\n*.pot\n\n# Django stuff:\n*.log\nlocal_settings.py\ndb.sqlite3\ndb.sqlite3-journal\n\n# Flask stuff:\ninstance/\n.webassets-cache\n\n# Scrapy stuff:\n.scrapy\n\n# Sphinx documentation\ndocs/_build/\n\n# PyBuilder\n.pybuilder/\ntarget/\n\n# Jupyter Notebook\n.ipynb_checkpoints\n\n# IPython\nprofile_default/\nipython_config.py\n\n# pyenv\n#   For a library or package, you might want to ignore these files since the code is\n#   intended to run in multiple environments; otherwise, check them in:\n# .python-version\n\n# pipenv\n#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.\n#   However, in case of collaboration, if having platform-specific dependencies or dependencies\n#   having no cross-platform support, pipenv may install dependencies that don't work, or not\n#   install all needed dependencies.\n#Pipfile.lock\n\n# poetry\n#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.\n#   This is especially recommended for binary packages to ensure reproducibility, and is more\n#   commonly ignored for libraries.\n#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control\n#poetry.lock\n\n# pdm\n#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.\n#pdm.lock\n#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it\n#   in version control.\n#   https://pdm.fming.dev/#use-with-ide\n.pdm.toml\n\n# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm\n__pypackages__/\n\n# Celery stuff\ncelerybeat-schedule\ncelerybeat.pid\n\n# SageMath parsed files\n*.sage.py\n\n# Environments\n.env\n.venv\nenv/\nvenv/\nENV/\nenv.bak/\nvenv.bak/\n.envrc\n\n# Spyder project settings\n.spyderproject\n.spyproject\n\n# Rope project settings\n.ropeproject\n\n# mkdocs documentation\n/site\n\n# mypy\n.mypy_cache/\n.dmypy.json\ndmypy.json\n\n# Pyre type checker\n.pyre/\n\n# pytype static type analyzer\n.pytype/\n\n# Cython debug symbols\ncython_debug/\n\n# PyCharm\n#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can\n#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore\n#  and can be added to the global gitignore or merged into this file.  For a more nuclear\n#  option (not recommended) you can uncomment the following to ignore the entire idea folder.\n.idea/\n\n.vscode/\n\nexamples/citation_with_extraction/fly.toml\nmy_cache_directory/\ntutorials/wandb/*\ntutorials/results.csv\ntutorials/results.jsonl\ntutorials/results.jsonlines\ntutorials/schema.json\nwandb/settings\nmath_finetunes.jsonl\n\npr_body.md\n\ncheck_zero_width_chars.py\n\n# Suggestion files from architectural analysis\n*_SUGGESTIONS.md\nORGANIZED_SUGGESTIONS.md\n"
  },
  {
    "path": ".grit/.gitignore",
    "content": ".gritmodules\n*.log\n"
  },
  {
    "path": ".grit/grit.yaml",
    "content": "version: 0.0.1\npatterns:\n  - name: github.com/getgrit/python#openai\n    level: info\n"
  },
  {
    "path": ".pre-commit-config.yaml",
    "content": "repos:\n  - repo: https://github.com/astral-sh/ruff-pre-commit\n    rev: v0.9.9 # Ruff version\n    hooks:\n      - id: ruff # Run the linter.\n        name: Run Linter Check (Ruff)\n        args: [ --fix, --unsafe-fixes ]\n        files: ^(instructor|tests|examples)/\n      - id: ruff-format       # Run the formatter.\n        name: Run Formatter (Ruff)\n\n  - repo: local\n    hooks:\n      - id: uv-lock-check\n        name: Check uv.lock is up-to-date\n        entry: uv\n        args: [lock, --check]\n        language: system\n        files: ^(pyproject\\.toml|uv\\.lock)$\n        pass_filenames: false\n        \n      - id: uv-sync-check\n        name: Verify dependencies can be installed\n        entry: uv\n        args: [sync, --check]\n        language: system\n        files: ^(pyproject\\.toml|uv\\.lock)$\n        pass_filenames: false\n\n      - id: uv-export-requirements\n        name: Export requirements.txt from pyproject.toml\n        entry: bash -c 'uv pip compile pyproject.toml -o requirements.txt && git add requirements.txt'\n        language: system\n        files: ^pyproject\\.toml$\n        pass_filenames: false\n\n      - id: ty-check\n        name: Run Type Check (ty)\n        entry: uv\n        args: [run, ty, check, --ignore, unresolved-import]\n        language: system\n        files: ^instructor/\n        pass_filenames: false\n"
  },
  {
    "path": ".ruff.toml",
    "content": "# Exclude a variety of commonly ignored directories.\nexclude = [\n    \".bzr\",\n    \".direnv\",\n    \".eggs\",\n    \".git\",\n    \".git-rewrite\",\n    \".hg\",\n    \".mypy_cache\",\n    \".nox\",\n    \".pants.d\",\n    \".pytype\",\n    \".ruff_cache\",\n    \".svn\",\n    \".tox\",\n    \".venv\",\n    \"__pypackages__\",\n    \"_build\",\n    \"buck-out\",\n    \"build\",\n    \"dist\",\n    \"node_modules\",\n    \"venv\",\n]\n\n# Same as Black.\nline-length = 88\noutput-format = \"grouped\"\n\ntarget-version = \"py39\"\n\n[lint]\nselect = [\n  # bugbear rules\n  \"B\",\n  # remove unused imports\n  \"F401\",\n  # bare except statements\n  \"E722\",\n  # unused arguments\n  \"ARG\",\n  # pyupgrade\n  \"UP\",\n]\nignore = [\n  # mutable defaults\n  \"B006\",\n  \"B018\",\n]\n\nunfixable = [\n  # disable auto fix for print statements\n  \"T201\",\n  \"T203\",\n]\n\n[lint.extend-per-file-ignores]\n\"instructor/distil.py\" = [\"ARG002\"]\n\"tests/test_distil.py\" = [\"ARG001\"]\n\"tests/test_patch.py\" = [\"ARG001\"]\n\"examples/task_planner/task_planner_topological_sort.py\" = [\"ARG002\"]\n\"examples/citation_with_extraction/main.py\" = [\"ARG001\"]\n"
  },
  {
    "path": "AGENT.md",
    "content": "# AGENT.md\n\n## Commands\n- Install: `uv pip install -e \".[dev]\"` or `poetry install --with dev`\n- Run tests: `uv run pytest tests/`\n- Run single test: `uv run pytest tests/path_to_test.py::test_name`\n- Skip LLM tests: `uv run pytest tests/ -k 'not llm and not openai'`\n- Temp deps for a run: `uv run --with <pkg>[==version] <command>` (example: `uv run --with pytest-asyncio --with anthropic pytest tests/...`)\n- Type check: `uv run ty check`\n- Lint: `uv run ruff check instructor examples tests`\n- Format: `uv run ruff format instructor examples tests`\n- Build docs: `uv run mkdocs serve` (local) or `./build_mkdocs.sh` (production)\n- Waiting: use `sleep <seconds>` for explicit pauses (e.g., CI waits) or to let external processes finish\n\n## Architecture\n- **Core**: `instructor/` - Pydantic-based structured outputs for LLMs\n- **Base classes**: `Instructor` and `AsyncInstructor` in `client.py`\n- **Providers**: Client files (`client_*.py`) for OpenAI, Anthropic, Gemini, Cohere, etc.\n- **Factory pattern**: `from_provider()` for automatic provider detection\n- **DSL**: `dsl/` directory with Partial, Iterable, Maybe, Citation extensions\n- **Key modules**: `patch.py` (patching), `process_response.py` (parsing), `function_calls.py` (schemas)\n\n## Code Style\n- **Typing**: Strict type annotations, use `BaseModel` for structured outputs\n- **Imports**: Standard lib → third-party → local\n- **Formatting**: Ruff with Black conventions\n- **Error handling**: Custom exceptions from `exceptions.py`, Pydantic validation\n- **Naming**: `snake_case` functions/variables, `PascalCase` classes\n- **No mocking**: Tests use real API calls\n- **Client creation**: Always use `instructor.from_provider(\"provider_name/model_name\")` instead of provider-specific methods like `from_openai()`, `from_anthropic()`, etc.\n\n## Pull Request (PR) Formatting\n\nUse **Conventional Commits** formatting for PR titles. Treat the PR title as the message we would use for a squash merge commit.\n\n### PR Title Format\n\nUse:\n\n`<type>(<scope>): <short summary>`\n\nRules:\n- Keep it under ~70 characters when you can.\n- Use the imperative mood (for example, “add”, “fix”, “update”).\n- Do not end with a period.\n- If it includes a breaking change, add `!` after the type or scope (for example, `feat(api)!:`).\n\nGood examples:\n- `fix(openai): handle empty tool_calls in streaming`\n- `feat(retry): add backoff for JSON parse failures`\n- `docs(agents): add conventional commit PR title guidelines`\n- `test(schema): cover nested union edge cases`\n- `ci(ruff): enforce formatting in pre-commit`\n\nCommon types:\n- `feat`: new feature\n- `fix`: bug fix\n- `docs`: documentation-only changes\n- `refactor`: code change that is not a fix or feature\n- `perf`: performance improvement\n- `test`: add or update tests\n- `build`: build system or dependency changes\n- `ci`: CI pipeline changes\n- `chore`: maintenance work\n\nSuggested scopes (pick the closest match):\n- Providers: `openai`, `anthropic`, `gemini`, `vertexai`, `bedrock`, `mistral`, `groq`, `writer`\n- Core: `core`, `patch`, `process_response`, `function_calls`, `retry`, `dsl`\n- Repo: `docs`, `examples`, `tests`, `ci`, `build`\n\n### PR Description Guidelines\n\nKeep PR descriptions short and easy to review:\n- **What**: What changed, in 1–3 sentences.\n- **Why**: Why this change is needed (link issues when possible).\n- **Changes**: 3–7 bullet points with the main edits.\n- **Testing**: What you ran (or why you did not run anything).\n\nIf the PR was authored by Cursor, include:\n- `This PR was written by [Cursor](https://cursor.com)`\n"
  },
  {
    "path": "CHANGELOG.md",
    "content": "# Changelog\n\nAll notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).\n\n## [Unreleased]\n\n<!-- Add upcoming changes here -->\n\n## [1.14.4] - 2026-01-16\n\n### Changed\n- Simplified `JsonCompleteness` by using `jiter` parsing and a sibling-based completeness heuristic (#2000)\n\n### Fixed\n\n- Fixed Google GenAI `safety_settings` causing `400 INVALID_ARGUMENT` when requests include image content by using image-specific harm categories when needed (#1773)\n- Fixed `create_with_completion()` crashing for `list[T]` response models (where `T` is a Pydantic model) by preserving `_raw_response` on list outputs (#1303)\n- Fixed Responses API retries crashing on reasoning items by skipping non-tool-call items in `reask_responses_tools` (#2002)\n- Fixed Google GenAI dict-style `config` handling to preserve `labels` and other settings like `cached_content` and `thinking_config` (#2005)\n\n\n## [1.14.3] - 2026-01-13\n\n### Added\n- Completeness-based validation for Partial streaming - only validates JSON structures that are structurally complete (#1999)\n- New `JsonCompleteness` class in `instructor/dsl/json_tracker.py` for tracking JSON completeness during streaming (#1999)\n\n### Fixed\n- Fixed Stream objects crashing reask handlers when using streaming with `max_retries > 1` (#1992)\n- Field constraints (`min_length`, `max_length`, `ge`, `le`, etc.) now work correctly during streaming (#1999)\n\n### Deprecated\n- `PartialLiteralMixin` is now deprecated - completeness-based validation handles Literal/Enum types automatically (#1999)\n\n## [1.14.2] - 2026-01-13\n\n### Fixed\n- Fixed model validators crashing during partial streaming by skipping them until streaming completes (#1994)\n- Fixed infinite recursion with self-referential models in Partial (e.g., TreeNode with children: List[\"TreeNode\"]) (#1997)\n\n### Added\n- Added `PartialLiteralMixin` documentation for handling Literal/Enum types during streaming (#1994)\n- Added final validation against original model after streaming completes to enforce required fields (#1994)\n- Added tests for recursive Partial models (#1997)\n\n## [1.14.1] - 2026-01-08\n\n### Fixed\n- Added support for cached_content in Google Gemini context caching (#1987)\n\n## [1.14.0] - 2026-01-08\n\n### Added\n- Pre-commit hook to auto-export requirements.txt for build consistency\n\n### Changed\n- Standardized provider factory methods across codebase for improved consistency\n- Standardized provider imports throughout documentation\n- Audited and standardized exception handling throughout the instructor library\n\n### Fixed\n- Fixed build issues with requirements.txt regeneration from pyproject.toml\n- Fixed provider functionality issue (#1914)\n\n### Documentation\n- Comprehensive documentation audit and SEO optimization improvements (#1944)\n- Updated documentation for responses API mode (#1946)\n- Enhanced README with PydanticAI promotion and clear feature distinctions\n- Removed incorrect model reference in client.create extraction example (#1951)\n- Fixed image base URLs in Jupyter notebook tutorials (#1922)\n\n## [1.13.0] - Previous Release\n\nFor changes in earlier versions, see the [git history](https://github.com/instructor-ai/instructor/releases).\n"
  },
  {
    "path": "CLAUDE.md",
    "content": "# CLAUDE.md\n\nThis file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.\n\n# Instructor Development Guide\n\n## Commands\n- Install deps: `uv pip install -e \".[dev,anthropic]\"` or `poetry install --with dev,anthropic`\n- Run tests: `uv run pytest tests/ -n auto`\n- Run specific test: `uv run pytest tests/path_to_test.py::test_name`\n- Skip LLM tests: `uv run pytest tests/ -k 'not llm and not openai'`\n- Type check: `uv run ty check`\n- Lint: `uv run ruff check instructor examples tests`\n- Format: `uv run ruff format instructor examples tests`\n- Generate coverage: `uv run coverage run -m pytest tests/ -k \"not docs\"` then `uv run coverage report`\n- Build documentation: `uv run mkdocs serve` (for local preview) or `./build_mkdocs.sh` (for production)\n- Waiting: use `sleep <seconds>` for explicit pauses (e.g., CI waits) or to let external processes finish\n\n## Installation & Setup\n- Fork the repository and clone your fork\n- Install UV: `pip install uv`\n- Create virtual environment: `uv venv`\n- Install dependencies: `uv pip install -e \".[dev]\"`\n- Install pre-commit: `uv run pre-commit install`\n- Run tests to verify: `uv run pytest tests/ -k \"not openai\"`\n\n## Code Style Guidelines\n- **Typing**: Use strict typing with annotations for all functions and variables\n- **Imports**: Standard lib → third-party → local imports\n- **Formatting**: Follow Black's formatting conventions (enforced by Ruff)\n- **Models**: Define structured outputs as Pydantic BaseModel subclasses\n- **Naming**: snake_case for functions/variables, PascalCase for classes\n- **Error Handling**: Use custom exceptions from exceptions.py, validate with Pydantic\n- **Comments**: Docstrings for public functions, inline comments for complex logic\n\n## Conventional Commits\n- **Format**: `type(scope): description`\n- **Types**: feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert\n- **Examples**:\n  - `feat(anthropic): add support for Claude 3.5`\n  - `fix(openai): correct response parsing for streaming`\n  - `docs(README): update installation instructions`\n  - `test(gemini): add validation tests for JSON mode`\n\n## Core Architecture\n- **Base Classes**: `Instructor` and `AsyncInstructor` in client.py are the foundation\n- **Factory Pattern**: Provider-specific factory functions (`from_openai`, `from_anthropic`, etc.)\n- **Unified Access**: `from_provider()` function in auto_client.py for automatic provider detection\n- **Mode System**: `Mode` enum categorizes different provider capabilities (tools vs JSON output)\n- **Patching Mechanism**: Uses Python's dynamic nature to patch provider clients for structured outputs\n- **Response Processing**: Transforms raw API responses into validated Pydantic models\n- **DSL Components**: Special types like Partial, Iterable, Maybe extend the core functionality\n\n## Provider Architecture\n- **Supported Providers**: OpenAI, Anthropic, Gemini, Cohere, Mistral, Groq, VertexAI, Fireworks, Cerebras, Writer, Databricks, Anyscale, Together, LiteLLM, Bedrock, Perplexity\n- **Provider Implementation**: Each provider has a dedicated client file (e.g., `client_anthropic.py`) with factory functions\n- **Modes**: Different providers support specific modes (`Mode` enum): `ANTHROPIC_TOOLS`, `GEMINI_JSON`, etc.\n- **Common Pattern**: Factory functions (e.g., `from_anthropic`) take a native client and return patched `Instructor` instances\n- **Provider Testing**: Tests in `tests/llm/` directory, define Pydantic models, make API calls, verify structured outputs\n- **Provider Detection**: `get_provider` function analyzes base URL to detect which provider is being used\n\n## Key Components\n- **process_response.py**: Handles parsing and converting LLM outputs to Pydantic models\n- **patch.py**: Contains the core patching logic for modifying provider clients\n- **function_calls.py**: Handles generating function/tool schemas from Pydantic models\n- **hooks.py**: Provides event hooks for intercepting various stages of the LLM request/response cycle\n- **dsl/**: Domain-specific language extensions for specialized model types\n- **retry.py**: Implements retry logic for handling validation failures\n- **validators.py**: Custom validation mechanisms for structured outputs\n\n## Testing Guidelines\n- Tests are organized by provider under `tests/llm/`\n- Each provider has its own conftest.py with fixtures\n- Standard tests cover: basic extraction, streaming, validation, retries\n- Evaluation tests in `tests/llm/test_provider/evals/` assess model capabilities\n- Use parametrized tests when testing similar functionality across variants\n- **IMPORTANT**: No mocking in tests - tests make real API calls\n\n## Documentation Guidelines\n- Every provider needs documentation in `docs/integrations/` following standard format\n- Provider docs should include: installation, basic example, modes supported, special features\n- When adding a new provider, update `mkdocs.yml` navigation and redirects\n- Example code should include complete imports and environment setup\n- Tutorials should progress from simple to complex concepts\n- New features should include conceptual explanation in `docs/concepts/`\n- **Writing Style**: Grade 10 reading level, all examples must be working code\n\n## Branch and Development Workflow\n1. Fork and clone the repository\n2. Create feature branch: `git checkout -b feat/your-feature`\n3. Make changes and add tests\n4. Run tests and linting\n5. Commit with conventional commit message\n6. Push to your fork and create PR\n7. Use stacked PRs for complex features\n\n## Adding New Providers\n\n### Step-by-Step Guide\n1. **Update Provider Enum** in `instructor/utils.py`:\n   ```python\n   class Provider(Enum):\n       YOUR_PROVIDER = \"your_provider\"\n   ```\n\n2. **Add Provider Modes** in `instructor/mode.py`:\n   ```python\n   class Mode(enum.Enum):\n       YOUR_PROVIDER_TOOLS = \"your_provider_tools\"\n       YOUR_PROVIDER_JSON = \"your_provider_json\"\n   ```\n\n3. **Create Client Implementation** `instructor/client_your_provider.py`:\n   - Use overloads for sync/async variants\n   - Validate mode compatibility\n   - Return appropriate Instructor/AsyncInstructor instance\n   - Handle provider-specific edge cases\n\n4. **Add Conditional Import** in `instructor/__init__.py`:\n   ```python\n   if importlib.util.find_spec(\"your_provider_sdk\") is not None:\n       from .client_your_provider import from_your_provider\n       __all__ += [\"from_your_provider\"]\n   ```\n\n5. **Update Auto Client** in `instructor/auto_client.py`:\n   - Add to `supported_providers` list\n   - Implement provider handling in `from_provider()`\n   - Update `get_provider()` function if URL-detectable\n\n6. **Create Tests** in `tests/llm/test_your_provider/`:\n   - `conftest.py` with client fixtures\n   - Basic extraction tests\n   - Streaming tests\n   - Validation/retry tests\n   - No mocking - use real API calls\n\n7. **Add Documentation** in `docs/integrations/your_provider.md`:\n   - Installation instructions\n   - Basic usage examples\n   - Supported modes\n   - Provider-specific features\n\n8. **Update Navigation** in `mkdocs.yml`:\n   - Add to integrations section\n   - Include redirects if needed\n\n## Contributing to Evals\n- Standard evals for each provider test model capabilities\n- Create new evals following existing patterns\n- Run evals as part of integration test suite\n- Performance tracking and comparison\n\n## Pull Request Guidelines\n- Keep PRs small and focused\n- Include tests for all changes\n- Update documentation as needed\n- Follow PR template\n- Link to relevant issues\n\n## Type System and Best Practices\n\n### Type Checking with ty\n- **Type Checker**: Using `ty` for fast, incremental type checking\n- **Python Version**: 3.9+ for compatibility\n- **Configuration**: Uses `pyproject.toml` settings for type checking\n- Run `uv run ty check` before committing - aim for zero errors\n\n### Code Quality Checks Before Committing\nAlways run these checks before committing code:\n1. **Ruff linting**: `uv run ruff check .` - Fix all errors\n2. **Ruff formatting**: `uv run ruff format .` - Apply consistent formatting\n3. **Type checking**: `uv run ty check` - Aim for zero type errors\n4. **Tests**: Run relevant tests to ensure changes don't break functionality\n\n### Type Patterns\n- **Bounded TypeVars**: Use `T = TypeVar(\"T\", bound=Union[BaseModel, ...])` for constraints\n- **Version Compatibility**: Handle Python 3.9 vs 3.10+ typing differences explicitly\n- **Union Type Syntax**: Use `from __future__ import annotations` to enable Python 3.10+ union syntax (`|`) in Python 3.9\n- **Simple Type Detection**: Special handling for `list[Union[int, str]]` patterns\n- **Runtime Type Handling**: Graceful fallbacks for compatibility\n\n### Pydantic Integration\n- Heavy use of `BaseModel` for structured outputs\n- `TypeAdapter` used internally for JSON schema generation\n- Field validators and custom types\n- Models serve dual purpose: validation and documentation\n\n## Building Documentation\n\n### Setup\n```bash\n# Install documentation dependencies\npip install -r requirements-doc.txt\n```\n\n### Local Development\n```bash\n# Serve documentation locally with hot reload\nuv run mkdocs serve\n\n# Build documentation for production\n./build_mkdocs.sh\n```\n\n### Documentation Features\n- **Material Theme**: Modern UI with extensive customization\n- **Plugins**:\n  - `mkdocstrings` - API documentation from docstrings\n  - `mkdocs-jupyter` - Notebook integration\n  - `mkdocs-redirects` - URL management\n  - Custom hooks for code processing\n- **Custom Processing**: `hide_lines.py` removes code marked with `# <%hide%>`\n- **Redirect Management**: Comprehensive redirect maps for moved content\n\n### Writing Documentation\n- Follow templates in `docs/templates/` for consistency\n- Grade 10 reading level for accessibility\n- All code examples must be runnable\n- Include complete imports and environment setup\n- Progressive complexity: simple → advanced\n\n## Project Structure\n- `instructor/` - Core library code\n  - Base classes (`client.py`): `Instructor` and `AsyncInstructor`\n  - Provider clients (`client_*.py`): Factory functions for each provider\n  - DSL components (`dsl/`): Partial, Iterable, Maybe, Citation extensions\n  - Core logic: `patch.py`, `process_response.py`, `function_calls.py`\n  - CLI tools (`cli/`): Batch processing, file management, usage tracking\n- `tests/` - Test suite organized by provider\n  - Provider-specific tests in `tests/llm/test_<provider>/`\n  - Evaluation tests for model capabilities\n  - No mocking - all tests use real API calls\n- `docs/` - MkDocs documentation\n  - `concepts/` - Core concepts and features\n  - `integrations/` - Provider-specific guides\n  - `examples/` - Practical examples and cookbooks\n  - `learning/` - Progressive tutorial path\n  - `blog/posts/` - Technical articles and announcements\n  - `templates/` - Templates for new docs (provider, concept, cookbook)\n- `examples/` - Runnable code examples\n  - Feature demos: caching, streaming, validation, parallel processing\n  - Use cases: classification, extraction, knowledge graphs\n  - Provider examples: anthropic, openai, groq, mistral\n  - Each example has `run.py` as the main entry point\n- `typings/` - Type stubs for untyped dependencies\n\n## Documentation Structure\n- **Getting Started Path**: Installation → First Extraction → Response Models → Structured Outputs\n- **Learning Patterns**: Simple Objects → Lists → Nested Structures → Validation → Streaming\n- **Example Organization**: Self-contained directories with runnable code demonstrating specific features\n- **Blog Posts**: Technical deep-dives with code examples in `docs/blog/posts/`\n\n## Example Patterns\nWhen creating examples:\n- Use `run.py` as the main file name\n- Include clear imports: stdlib → third-party → instructor\n- Define Pydantic models with descriptive fields\n- Show expected output in comments\n- Handle errors appropriately\n- Make examples self-contained and runnable\n\n## Dependency Management\n\n### Core Dependencies\n- **Minimal core**: `openai`, `pydantic`, `docstring-parser`, `typer`, `rich`\n- **Python requirement**: `<4.0,>=3.9`\n- **Pydantic version**: `<3.0.0,>=2.8.0` (constrained for stability)\n\n### Optional Dependencies\nProvider-specific packages as extras:\n```bash\n# Install with specific provider\npip install \"instructor[anthropic]\"\npip install \"instructor[google-generativeai]\"\npip install \"instructor[groq]\"\n```\n\n### Development Dependencies\n```bash\n# Install all development dependencies\nuv pip install -e \".[dev]\"\n```\nIncludes:\n- ty \n- `pytest` and `pytest-asyncio` - Testing\n- `ruff` - Linting and formatting\n- `coverage` - Test coverage\n- `mkdocs` and plugins - Documentation\n\n### Version Constraints\n- **Upper bounds on all dependencies** for stability\n- **Provider SDK versions** pinned to tested versions\n- **Test dependencies** include evaluation frameworks\n\n### Managing Dependencies\n- Update `pyproject.toml` for new dependencies\n- Test with multiple Python versions (3.9-3.12)\n- Run full test suite after dependency updates\n- Document any provider-specific version requirements\n\nThe library enables structured LLM outputs using Pydantic models across multiple providers with type safety.\n"
  },
  {
    "path": "CONTRIBUTING.md",
    "content": "# Contributing to Instructor\n\nThank you for considering contributing to Instructor! This document provides guidelines and instructions to help you contribute effectively.\n\n## Table of Contents\n\n- [Contributing to Instructor](#contributing-to-instructor)\n  - [Table of Contents](#table-of-contents)\n  - [Code of Conduct](#code-of-conduct)\n  - [Getting Started](#getting-started)\n    - [Environment Setup](#environment-setup)\n    - [Development Workflow](#development-workflow)\n    - [Dependency Management](#dependency-management)\n      - [Using UV](#using-uv)\n      - [Using Poetry](#using-poetry)\n    - [Working with Optional Dependencies](#working-with-optional-dependencies)\n  - [How to Contribute](#how-to-contribute)\n    - [Reporting Bugs](#reporting-bugs)\n    - [Feature Requests](#feature-requests)\n    - [Pull Requests](#pull-requests)\n    - [Writing Documentation](#writing-documentation)\n    - [Contributing to Evals](#contributing-to-evals)\n  - [Code Style Guidelines](#code-style-guidelines)\n    - [Conventional Comments](#conventional-comments)\n    - [Conventional Commits](#conventional-commits)\n      - [Types](#types)\n      - [Examples](#examples)\n  - [Testing](#testing)\n  - [Branch and Release Process](#branch-and-release-process)\n  - [Using Cursor for PR Creation](#using-cursor-for-pr-creation)\n  - [License](#license)\n\n## Code of Conduct\n\nBy participating in this project, you agree to abide by our code of conduct: treat everyone with respect, be constructive in your communication, and focus on the technical aspects of the contributions.\n\n## Getting Started\n\n### Environment Setup\n\n1. **Fork the Repository**: Click the \"Fork\" button at the top right of the [repository page](https://github.com/instructor-ai/instructor).\n\n2. **Clone Your Fork**:\n   ```bash\n   git clone https://github.com/YOUR-USERNAME/instructor.git\n   cd instructor\n   ```\n\n3. **Set up Remote**:\n   ```bash\n   git remote add upstream https://github.com/instructor-ai/instructor.git\n   ```\n\n4. **Install UV** (recommended):\n   ```bash\n   # macOS/Linux\n   curl -LsSf https://astral.sh/uv/install.sh | sh\n\n   # Windows PowerShell\n   powershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n   ```\n\n5. **Install Dependencies**:\n   ```bash\n   # Using uv (recommended)\n   uv pip install -e \".[dev,docs,test-docs]\"\n   \n   # Using poetry\n   poetry install --with dev,docs,test-docs\n   \n   # For specific providers, add the provider name as an extra\n   # Example: uv pip install -e \".[dev,docs,test-docs,anthropic]\"\n   ```\n\n6. **Set up Pre-commit**:\n   ```bash\n   pip install pre-commit\n   pre-commit install\n   ```\n\n### Development Workflow\n\n1. **Create a Branch**:\n   ```bash\n   git checkout -b feature/your-feature-name\n   ```\n\n2. **Make Your Changes and Commit**:\n   ```bash\n   git add .\n   git commit -m \"Your descriptive commit message\"\n   ```\n\n3. **Keep Your Branch Updated**:\n   ```bash\n   git fetch upstream\n   git rebase upstream/main\n   ```\n\n4. **Push Changes**:\n   ```bash\n   git push origin feature/your-feature-name\n   ```\n\n### Dependency Management\n\nWe support both UV and Poetry for dependency management. Choose the tool that works best for you:\n\n#### Using UV\n\nUV is a fast Python package installer and resolver. It's recommended for day-to-day development in Instructor.\n\n```bash\n# Install uv\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# Install project and development dependencies\nuv pip install -e \".[dev,docs]\"\n\n# Adding a new dependency (example)\nuv pip install new-package\n```\n\nKey UV commands:\n- `uv pip install -e .` - Install the project in editable mode\n- `uv pip install -e \".[dev]\"` - Install with development extras\n- `uv pip freeze > requirements.txt` - Generate requirements file\n- `uv self update` - Update UV to the latest version\n\n#### Using Poetry\n\nPoetry provides more comprehensive dependency management and packaging.\n\n```bash\n# Install Poetry\ncurl -sSL https://install.python-poetry.org | python3 -\n\n# Install dependencies including development deps\npoetry install --with dev,docs\n\n# Add a new dependency\npoetry add package-name\n\n# Add a new development dependency\npoetry add --group dev package-name\n```\n\nKey Poetry commands:\n- `poetry shell` - Activate the virtual environment\n- `poetry run python -m pytest` - Run commands within the virtual environment\n- `poetry update` - Update dependencies to their latest versions\n\n### Working with Optional Dependencies\n\nInstructor uses optional dependencies to support different LLM providers. Provider-specific utilities live under `instructor/utils`. When adding integration for a new provider:\n\n1. **Update pyproject.toml**: Add your provider's dependencies to both `[project.optional-dependencies]` and `[dependency-groups]`:\n\n   ```toml\n   [project.optional-dependencies]\n   # Add your provider here\n   my-provider = [\"my-provider-sdk>=1.0.0,<2.0.0\"]\n   \n   [dependency-groups]\n   # Also add to dependency groups\n   my-provider = [\"my-provider-sdk>=1.0.0,<2.0.0\"]\n   ```\n\n2. **Create Provider Client**: Implement your provider client in `instructor/clients/client_myprovider.py`\n\n3. **Add Tests**: Create tests in `tests/llm/test_myprovider/`\n\n4. **Document Installation**: Update the documentation to include installation instructions:\n   ```\n   # Install with your provider support\n   uv pip install \"instructor[my-provider]\"\n   # or\n   poetry install --with my-provider\n   ```\n\n5. **Create Provider Utilities and Handlers**:\n   - Add a new module at `instructor/utils/myprovider.py`\n   - Implement `reask` functions for validation errors and `handle_*` functions\n     for formatting requests\n   - Define `MYPROVIDER_HANDLERS` mapping `Mode` values to these functions\n\n6. **Register the Provider**:\n   - Add a value in `instructor/utils/providers.py` to the `Provider` enum\n   - Extend `get_provider` with detection logic for your base URL\n\n7. **Update `process_response.py`**:\n   - Import your handler functions and include them in the `mode_handlers`\n     dictionary so the library can route requests to your provider\n   - `process_response.py` relies on these handlers to format arguments and\n     parse results for each `Mode`\n\n## How to Contribute\n\n### Reporting Bugs\n\nIf you find a bug, please create an issue on [our issue tracker](https://github.com/instructor-ai/instructor/issues) with:\n\n1. A clear, descriptive title\n2. A detailed description including:\n   - The `response_model` you are using\n   - The `messages` you are using\n   - The `model` you are using\n   - Steps to reproduce the bug\n   - The expected behavior and what went wrong\n   - Your environment (Python version, OS, package versions)\n\n### Feature Requests\n\nFor feature requests, please create an issue describing:\n\n1. The problem your feature would solve\n2. How your solution would work\n3. Alternatives you've considered\n4. Examples of how the feature would be used\n\n### Pull Requests\n\n1. **Create a Pull Request** from your fork to the main repository.\n2. **Fill out the PR template** with details about your changes.\n3. **Address review feedback** and make requested changes.\n4. **Wait for CI checks** to pass.\n5. Once approved, a maintainer will merge your PR.\n\n### Writing Documentation\n\nDocumentation improvements are always welcome! Follow these guidelines:\n\n1. Documentation is written in Markdown format in the `docs/` directory\n2. When creating new markdown files, add them to `mkdocs.yml` under the appropriate section\n3. Follow the existing hierarchy and structure\n4. Use a grade 10 reading level (simple, clear language)\n5. Include working code examples\n6. Add links to related documentation\n\n### Contributing to Evals\n\nWe encourage contributions to our evaluation tests:\n\n1. Explore existing evals in the [evals directory](https://github.com/instructor-ai/instructor/tree/main/tests/llm)\n2. Contribute new evals as pytest tests\n3. Evals should test specific capabilities or edge cases of the library or models\n4. Follow the existing patterns for structuring eval tests\n\n## Code Style Guidelines\n\nWe use automated tools to maintain consistent code style:\n\n- **Ruff**: For linting and formatting\n- **ty**: For type checking\n- **Black**: For code formatting (enforced by Ruff)\n\nGeneral guidelines:\n\n- **Typing**: Use strict typing with annotations for all functions and variables\n- **Imports**: Standard lib → third-party → local imports\n- **Models**: Define structured outputs as Pydantic BaseModel subclasses\n- **Naming**: snake_case for functions/variables, PascalCase for classes\n- **Error Handling**: Use custom exceptions from exceptions.py, validate with Pydantic\n- **Comments**: Docstrings for public functions, inline comments for complex logic\n\n### Conventional Comments\n\nWe use conventional comments in code reviews and commit messages. This helps make feedback clearer and more actionable:\n\n```\n<label>: <subject>\n\n<description>\n```\n\nLabels include:\n- **praise:** highlights something positive\n- **suggestion:** proposes a change or improvement\n- **question:** asks for clarification\n- **nitpick:** minor, trivial feedback that can be ignored\n- **issue:** points out a specific problem that needs to be fixed\n- **todo:** notes something to be addressed later\n- **fix:** resolves an issue\n- **refactor:** suggests reorganizing code without changing behavior\n- **test:** suggests adding or improving tests\n\nExamples:\n```\nsuggestion: consider using Pydantic's validator for this check\nThis would ensure validation happens automatically when the model is created.\n\nquestion: why is this approach used instead of async processing?\nI'm wondering if there would be performance benefits.\n\nfix: correct the type hint for the client parameter\nThe client should accept OpenAI instances, not strings.\n```\n\nFor more details, see the [Conventional Comments specification](https://conventionalcomments.org/).\n\n### Conventional Commits\n\nWe follow the [Conventional Commits](https://www.conventionalcommits.org/) specification for commit messages. This helps us generate changelogs and understand the changes at a glance.\n\nThe commit message should be structured as follows:\n\n```\n<type>[optional scope]: <description>\n\n[optional body]\n\n[optional footer(s)]\n```\n\n#### Types\n\n- **feat**: A new feature\n- **fix**: A bug fix\n- **docs**: Documentation only changes\n- **style**: Changes that do not affect the meaning of the code (white-space, formatting, etc)\n- **refactor**: A code change that neither fixes a bug nor adds a feature\n- **perf**: A code change that improves performance\n- **test**: Adding missing tests or correcting existing tests\n- **build**: Changes that affect the build system or external dependencies\n- **ci**: Changes to our CI configuration files and scripts\n\n#### Examples\n\n```\nfeat(openai): add support for response_format parameter\n\nfix(anthropic): correct tool calling format in Claude client\n\ndocs: improve installation instructions for various providers\n\ntest(evals): add evaluation for recursive schema handling\n```\n\nBreaking changes should be indicated by adding `!` after the type/scope:\n\n```\nfeat(api)!: change parameter order in from_openai factory function\n```\n\nIncluding a scope is recommended when changes affect a specific part of the codebase (e.g., a specific provider, feature, or component).\n\n## Testing\n\nRun tests using pytest:\n\n```bash\n# Run all tests\npytest tests/\n\n# Run specific test\npytest tests/path_to_test.py::test_name\n\n# Skip LLM tests (faster for local development)\npytest tests/ -k 'not llm and not openai'\n\n# Generate coverage report\ncoverage run -m pytest tests/ -k \"not docs\"\ncoverage report\n```\n\n## Branch and Release Process\n\n- `main` branch is the development branch\n- Releases are tagged with version numbers\n- We follow [Semantic Versioning](https://semver.org/)\n\n## Using Cursor for PR Creation\n\nCursor (https://cursor.sh) is a code editor powered by AI that can help you create PRs efficiently. We encourage using Cursor for Instructor development:\n\n1. **Install Cursor**: Download from [cursor.sh](https://cursor.sh/)\n\n2. **Create a Branch**: Start a new branch for your feature using Cursor's Git integration\n\n3. **Use Cursor Rules**: We have Cursor rules that help with standards:\n   - `new-features-planning`: Use when implementing new features\n   - `simple-language`: Follow when writing documentation\n   - `documentation-sync`: Reference when making code changes to keep docs in sync\n\n4. **Generate Code with AI**: Use Cursor's AI assistance to generate code that follows our style\n\n5. **Auto-Create PRs**: Use Cursor's PR creation feature with our template:\n   ```\n   # Create PR using gh CLI\n   gh pr create -t \"Your PR Title\" -b \"Description of changes\" -r jxnl,ivanleomk\n   ```\n\n6. **Include Attribution**: Add `This PR was written by [Cursor](https://cursor.sh)` to your PR description\n\nFor more details, see our Cursor rules in `.cursor/rules/`.\n\n## License\n\nBy contributing to Instructor, you agree that your contributions will be licensed under the project's MIT License. \n"
  },
  {
    "path": "LICENSE",
    "content": "MIT License\n\nCopyright (c) 2023 Jason Liu\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n"
  },
  {
    "path": "NEW_PROVIDER_AGENT_INSTRUCTIONS.md",
    "content": "# AI Agent Instructions: Creating a New Instructor Provider\n\n**Instructions for AI coding agents to create a new provider for the instructor library.**\n\nCopy these instructions to your AI coding agent when you want to add a new LLM provider to instructor. The agent will have everything needed to implement a complete, working provider.\n\n**For human contributors:** See the quick reference template in [`instructor/providers/README.md`](instructor/providers/README.md#adding-a-new-provider)\n\n---\n\n## Mission\n\nCreate a complete, production-ready provider package for the instructor library that:\n- Follows the BaseProvider protocol exactly\n- Includes comprehensive tests using transcript fixtures  \n- Has proper error handling and validation\n- Provides excellent documentation\n- Integrates seamlessly with the instructor plugin system\n\n## Prerequisites\n\nBefore starting, ensure you have:\n- Provider name (e.g., \"groq\", \"perplexity\", \"fireworks\")\n- Provider's Python SDK package name and version\n- API documentation URL\n- Sample API key format (for documentation)\n- Knowledge of provider's chat completion API structure\n\n## Step-by-Step Implementation\n\n### Step 1: Project Structure Setup\n\n**Note: This creates a new provider integration that follows instructor's existing patterns, not a separate package.**\n\nCreate the following structure in the instructor repository:\n\n```\ninstructor/providers/{provider}/\n├── __init__.py              # Empty or basic exports\n├── client.py                # from_{provider} function implementation  \n└── utils.py                 # Provider-specific utilities\n\ntests/llm/test_{provider}/\n├── __init__.py              # Empty\n├── conftest.py              # Test configuration & API key handling\n├── util.py                  # Models and modes configuration\n├── test_simple.py           # Basic functionality tests\n├── test_stream.py           # Streaming tests (if supported)\n├── test_format.py           # Format/structure tests\n└── test_retries.py          # Error handling tests\n\ndocs/integrations/\n└── {provider}.md            # Provider documentation following existing pattern\n```\n\n**Important: You're adding to the existing instructor codebase, not creating a separate package.**\n\n### Step 2: Provider Client Implementation\n\n#### File: `instructor/providers/{provider}/client.py`\n\nFollow the exact pattern used by other providers in instructor. This creates a `from_{provider}` function:\n\n```python\nfrom __future__ import annotations\n\nfrom typing import Any, overload\n\nimport instructor\nfrom ...core.client import AsyncInstructor, Instructor\n\n# Import the provider's SDK\nfrom {provider_sdk} import {SyncClient}, {AsyncClient}  # Replace with actual imports\n\n\n@overload\ndef from_{provider}(\n    client: {SyncClient},\n    mode: instructor.Mode = instructor.Mode.{PROVIDER}_TOOLS,  # Default mode\n    **kwargs: Any,\n) -> Instructor: ...\n\n\n@overload  \ndef from_{provider}(\n    client: {AsyncClient},\n    mode: instructor.Mode = instructor.Mode.{PROVIDER}_TOOLS,  # Default mode\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\ndef from_{provider}(\n    client: {SyncClient} | {AsyncClient},\n    mode: instructor.Mode = instructor.Mode.{PROVIDER}_TOOLS,  # Default mode\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    \"\"\"\n    Create an instructor client from a {Provider} client\n    \n    Args:\n        client: {Provider} sync or async client instance\n        mode: Mode to use for structured outputs\n        **kwargs: Additional arguments passed to instructor client\n        \n    Returns:\n        Instructor or AsyncInstructor instance\n    \"\"\"\n    # Define valid modes for this provider\n    valid_modes = {\n        instructor.Mode.{PROVIDER}_TOOLS,\n        instructor.Mode.{PROVIDER}_JSON,\n        # Add other modes your provider supports\n    }\n\n    # Validate mode\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n        raise ModeError(\n            mode=str(mode),\n            provider=\"{Provider}\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    # Validate client type  \n    if not isinstance(client, ({AsyncClient}, {SyncClient})):\n        from ...core.exceptions import ClientError\n        raise ClientError(\n            f\"Client must be an instance of {SyncClient} or {AsyncClient}. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    # Handle async client\n    if isinstance(client, {AsyncClient}):\n        \n        async def async_wrapper(*args: Any, **kwargs: Any):\n            \"\"\"Wrapper for async client calls\"\"\"\n            if \"stream\" in kwargs and kwargs[\"stream\"] is True:\n                # Handle streaming if supported\n                return client.chat.completions.acreate(*args, **kwargs)\n            return await client.chat.completions.acreate(*args, **kwargs)\n\n        return AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.{PROVIDER},  # Must be defined in Provider enum\n            mode=mode,\n            **kwargs,\n        )\n\n    # Handle sync client\n    if isinstance(client, {SyncClient}):\n        return Instructor(\n            client=client,\n            create=instructor.patch(create=client.chat.completions.create, mode=mode),\n            provider=instructor.Provider.{PROVIDER},  # Must be defined in Provider enum  \n            mode=mode,\n            **kwargs,\n        )\n```\n\n### Step 3: Mode Handlers Implementation\n\n#### File: `instructor_{provider}/handlers.py`\n\n```python\n\"\"\"\nMode handlers for {Provider} provider\n\nEach handler knows how to:\n1. Format requests for the specific mode (TOOLS, JSON, etc.)\n2. Parse responses back into Pydantic models\n3. Handle provider-specific response formats\n\"\"\"\n\nfrom typing import Dict, Any, Type, Union\nfrom pydantic import BaseModel\nfrom instructor.mode import Mode\nfrom instructor.function_calls import openai_schema\nimport json\n\nclass BaseModeHandler:\n    \"\"\"Base class for mode handlers\"\"\"\n    \n    def __init__(self, provider):\n        self.provider = provider\n    \n    def prepare_request(\n        self, \n        response_model: Type[BaseModel], \n        messages: list, \n        model: str, \n        **kwargs\n    ) -> Dict[str, Any]:\n        \"\"\"Prepare request for this mode\"\"\"\n        raise NotImplementedError\n    \n    def parse_response(self, response: Any, response_model: Type[BaseModel]) -> BaseModel:\n        \"\"\"Parse provider response into Pydantic model\"\"\"\n        raise NotImplementedError\n\nclass ToolsHandler(BaseModeHandler):\n    \"\"\"Handler for function/tool calling mode\"\"\"\n    \n    def prepare_request(self, response_model, messages, model, **kwargs):\n        # Convert Pydantic model to function schema\n        schema = openai_schema(response_model)\n        \n        return {\n            \"model\": model,\n            \"messages\": messages,\n            \"tools\": [{\n                \"type\": \"function\",\n                \"function\": schema\n            }],\n            \"tool_choice\": \"auto\",  # or provider-specific equivalent\n            **kwargs\n        }\n    \n    def parse_response(self, response, response_model):\n        # Extract function call from response\n        # This is provider-specific - adapt to your provider's response format\n        \n        if hasattr(response, 'choices') and response.choices:\n            choice = response.choices[0]\n            if hasattr(choice.message, 'tool_calls') and choice.message.tool_calls:\n                tool_call = choice.message.tool_calls[0]\n                function_args = json.loads(tool_call.function.arguments)\n                return response_model(**function_args)\n        \n        raise ValueError(\"No valid tool call found in response\")\n\nclass JSONHandler(BaseModeHandler):\n    \"\"\"Handler for JSON mode responses\"\"\"\n    \n    def prepare_request(self, response_model, messages, model, **kwargs):\n        # Add JSON schema to system message\n        schema_prompt = f\"\"\"\nYou must respond with valid JSON that matches this schema:\n{response_model.model_json_schema()}\n\nRespond with only the JSON, no additional text.\n\"\"\"\n        \n        # Add schema to messages\n        enhanced_messages = [\n            {\"role\": \"system\", \"content\": schema_prompt}\n        ] + messages\n        \n        return {\n            \"model\": model,\n            \"messages\": enhanced_messages,\n            \"response_format\": {\"type\": \"json_object\"},  # if provider supports\n            **kwargs\n        }\n    \n    def parse_response(self, response, response_model):\n        # Extract JSON from response content\n        if hasattr(response, 'choices') and response.choices:\n            content = response.choices[0].message.content\n            try:\n                data = json.loads(content)\n                return response_model(**data)\n            except json.JSONDecodeError as e:\n                raise ValueError(f\"Invalid JSON in response: {e}\")\n        \n        raise ValueError(\"No valid response content found\")\n\n# Handler registry\n_HANDLERS = {\n    Mode.TOOLS: ToolsHandler,\n    Mode.JSON: JSONHandler,\n    # Add other modes as supported by provider\n}\n\ndef get_handler(mode: Mode, provider) -> BaseModeHandler:\n    \"\"\"Get handler instance for the specified mode\"\"\"\n    if mode not in _HANDLERS:\n        supported = \", \".join(h.name for h in _HANDLERS.keys())\n        raise ValueError(f\"Mode {mode} not supported. Supported modes: {supported}\")\n    \n    handler_class = _HANDLERS[mode]\n    return handler_class(provider)\n```\n\n### Step 4: Package Configuration\n\n#### File: `pyproject.toml`\n\n```toml\n[project]\nname = \"instructor-{provider}\"\nversion = \"0.1.0\"\ndescription = \"Instructor provider for {Provider Name}\"\nauthors = [\n    {name = \"Your Name\", email = \"your.email@example.com\"}\n]\nlicense = {text = \"MIT\"}\nrequires-python = \">=3.9\"\ndependencies = [\n    \"instructor-core>=2.0.0,<3.0.0\",\n    \"{provider_sdk}>=X.X.X,<Y.0.0\",  # Replace with actual version constraints\n    \"pydantic>=2.8.0,<3.0.0\",\n]\n\nreadme = \"README.md\"\nkeywords = [\"instructor\", \"llm\", \"structured-output\", \"{provider}\"]\n\n[project.urls]\nHomepage = \"https://github.com/instructor-ai/instructor\"\nDocumentation = \"https://python.useinstructor.com\"\nRepository = \"https://github.com/instructor-ai/instructor\"\n\n[project.optional-dependencies]\ndev = [\n    \"pytest>=8.3.3,<9.0.0\",\n    \"pytest-asyncio>=0.24.0,<1.0.0\", \n    \"pytest-mock>=3.12.0\",\n    \"responses>=0.24.0\",  # For HTTP mocking\n    \"python-dotenv>=1.0.1\",\n]\n\n# Register the provider with instructor's plugin system\n[project.entry-points.\"instructor.providers\"]\n{provider} = \"instructor_{provider}:{Provider}Provider\"\n\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n\n[tool.pytest.ini_options]\ntestpaths = [\"tests\"]\nmarkers = [\n    \"unit: Unit tests (fast, no external dependencies)\",\n    \"integration: Integration tests (may require API keys)\", \n    \"live: Live API tests (requires valid API key)\"\n]\n\n[tool.ruff]\ntarget-version = \"py39\"\nline-length = 88\n\n[tool.ruff.lint]\nselect = [\"E\", \"F\", \"W\", \"I\", \"N\", \"B\", \"A\", \"C4\", \"T20\"]\nignore = [\"E501\"]  # Line too long (handled by formatter)\n```\n\n### Step 3: Testing Implementation\n\n#### File: `tests/llm/test_{provider}/conftest.py`\n\nFollow the exact pattern used by all other providers:\n\n```python\nimport os\nimport pytest\n\n# Skip entire test suite if API key is missing\nif not os.getenv(\"{PROVIDER}_API_KEY\"):\n    pytest.skip(\n        \"{PROVIDER}_API_KEY environment variable not set\",\n        allow_module_level=True,\n    )\n\n# Skip if provider package is not installed  \ntry:\n    from {provider_sdk} import {SyncClient}, {AsyncClient}  # Replace with actual imports\nexcept ImportError:\n    pytest.skip(\"{provider_sdk} package is not installed\", allow_module_level=True)\n\n\n@pytest.fixture(scope=\"function\")\ndef client():\n    \"\"\"Sync client fixture\"\"\"\n    yield {SyncClient}()\n\n\n@pytest.fixture(scope=\"function\") \ndef aclient():\n    \"\"\"Async client fixture\"\"\"\n    yield {AsyncClient}()\n```\n\n#### File: `tests/llm/test_{provider}/util.py`\n\nDefine supported models and modes:\n\n```python\nimport instructor\n\n# Replace with actual model names your provider supports\nmodels = [\"provider-model-name-1\", \"provider-model-name-2\"]\n\n# Replace with actual modes your provider supports\nmodes = [\n    instructor.Mode.{PROVIDER}_TOOLS,\n    instructor.Mode.{PROVIDER}_JSON,\n]\n```\n\n#### File: `tests/llm/test_{provider}/test_simple.py`\n\nFollow the standard pattern for basic functionality tests:\n\n```python\nimport instructor\nfrom {provider_sdk} import {SyncClient}, {AsyncClient}  # Replace with actual imports\nfrom pydantic import BaseModel, field_validator\nimport pytest\nfrom itertools import product\nfrom .util import models, modes\n\n\nclass User(BaseModel):\n    \"\"\"Standard test model\"\"\"\n    name: str\n    age: int\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_{provider}_sync(model: str, mode: instructor.Mode, client):\n    \"\"\"Test basic sync functionality\"\"\"\n    client = instructor.from_{provider}(client, mode=mode)\n\n    resp = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from this sentence: Ivan is 27 and lives in Singapore\",\n            },\n        ],\n        response_model=User,\n    )\n\n    assert resp.name.lower() == \"ivan\"\n    assert resp.age == 27\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_{provider}_sync_validated(model: str, mode: instructor.Mode, client):\n    \"\"\"Test sync with validation retries\"\"\"\n    class ValidatedUser(BaseModel):\n        name: str\n        age: int\n\n        @field_validator(\"name\")\n        def name_validator(cls, v: str) -> str:\n            if not v.isupper():\n                raise ValueError(\n                    f\"All letters in the name must be uppercase (Eg. JOHN, SMITH) - {v} is not a valid example.\"\n                )\n            return v\n\n    client = instructor.from_{provider}(client, mode=mode)\n\n    resp = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\", \n                \"content\": \"Extract a user from this sentence: Ivan is 27 and lives in Singapore\",\n            },\n        ],\n        max_retries=5,\n        response_model=ValidatedUser,\n    )\n\n    assert resp.name == \"IVAN\"\n    assert resp.age == 27\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\n@pytest.mark.asyncio(scope=\"session\")\nasync def test_{provider}_async(model: str, mode: instructor.Mode, aclient):\n    \"\"\"Test async functionality\"\"\"\n    client = instructor.from_{provider}(aclient, mode=mode)\n\n    resp = await client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from this sentence: Ivan is 27 and lives in Singapore\",\n            },\n        ],\n        response_model=User,\n    )\n\n    assert resp.name.lower() == \"ivan\"\n    assert resp.age == 27\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\n@pytest.mark.asyncio(scope=\"session\")\nasync def test_{provider}_async_validated(model: str, mode: instructor.Mode, aclient):\n    \"\"\"Test async with validation retries\"\"\"\n    class ValidatedUser(BaseModel):\n        name: str\n        age: int\n\n        @field_validator(\"name\")\n        def name_validator(cls, v: str) -> str:\n            if not v.isupper():\n                raise ValueError(\n                    f\"Make sure to uppercase all letters in the name field. Examples include: JOHN, SMITH, etc. {v} is not a valid example.\"\n                )\n            return v\n\n    client = instructor.from_{provider}(aclient, mode=mode)\n\n    resp = await client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from this sentence: Ivan is 27 and lives in Singapore\",\n            },\n        ],\n        response_model=ValidatedUser,\n        max_retries=5,\n    )\n\n    assert resp.name == \"IVAN\"\n    assert resp.age == 27\n```\n\n### Step 4: Required Infrastructure Updates\n\n#### A. Add Mode Constants\n\nAdd your provider's modes to `instructor/mode.py`:\n\n```python\n# Add to the Mode enum class\n{PROVIDER}_TOOLS = \"{provider}_tools\"\n{PROVIDER}_JSON = \"{provider}_json\"\n# Add other modes as needed\n```\n\n#### B. Add Provider to Enum\n\nAdd your provider to `instructor/utils/providers.py`:\n\n```python\n# Add to the Provider enum\n{PROVIDER} = \"{provider}\"\n```\n\n#### C. Update Main __init__.py\n\nAdd conditional import to `instructor/__init__.py`:\n\n```python\n# Add this block with the other provider imports\nif importlib.util.find_spec(\"{provider_sdk}\") is not None:\n    from .providers.{provider}.client import from_{provider}\n    \n    __all__ += [\"from_{provider}\"]\n```\n\n#### D. Add to pyproject.toml\n\nAdd your provider to the optional dependencies:\n\n```toml\n# In [project.optional-dependencies]\n{provider} = [\"{provider_sdk}>=X.X.X,<Y.0.0\"]  # Replace with actual version\n\n# In [dependency-groups] \n{provider} = [\"{provider_sdk}>=X.X.X,<Y.0.0\"]\n```\n\n### Step 5: Documentation\n\n#### File: `docs/integrations/{provider}.md`\n\nFollow the exact pattern of existing provider docs:\n\n```markdown\n---\ntitle: \"Structured outputs with {Provider}, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with {Provider} models. Learn how to generate structured, type-safe outputs with {provider description}.\"\n---\n\n# Structured outputs with {Provider}, a complete guide w/ instructor\n\n{Provider description and benefits}. This guide shows you how to use Instructor with {Provider}'s models for type-safe, validated responses.\n\n## Quick Start\n\nInstall Instructor with {Provider} support:\n\n```bash\npip install \"instructor[{provider}]\"\n```\n\n## Simple User Example (Sync)\n\n```python\nfrom {provider_sdk} import {SyncClient}\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize the client\nclient = {SyncClient}()\n\n# Enable instructor patches\nclient = instructor.from_{provider}(client)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Extract structured data\nuser = client.chat.completions.create(\n    model=\"your-model-name\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n    response_model=User\n)\n\nprint(user.name)  # Jason\nprint(user.age)   # 25\n```\n\n## Simple User Example (Async)\n\n```python\nfrom {provider_sdk} import {AsyncClient}\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n# Initialize async client\nclient = {AsyncClient}()\n\n# Enable instructor patches\nclient = instructor.from_{provider}(client)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def extract_user():\n    user = await client.chat.completions.create(\n        model=\"your-model-name\",\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n        response_model=User\n    )\n    return user\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user.name)  # Jason\nprint(user.age)   # 25\n```\n\n## Supported Models\n\n- `model-1` - Description and capabilities\n- `model-2` - Description and capabilities\n\nCheck [{Provider} documentation](provider-docs-url) for the complete list of available models.\n\n## Modes\n\nThe {Provider} provider supports these modes:\n\n- `instructor.Mode.{PROVIDER}_TOOLS` - Uses {provider} function calling (recommended)\n- `instructor.Mode.{PROVIDER}_JSON` - Uses JSON mode responses\n\n```python\nclient = instructor.from_{provider}(client, mode=instructor.Mode.{PROVIDER}_TOOLS)\n```\n\n## Advanced Usage\n\n### Validation and Retries\n\n```python\nfrom pydantic import BaseModel, field_validator\n\nclass User(BaseModel):\n    name: str\n    age: int\n    \n    @field_validator('age')\n    def validate_age(cls, v):\n        if v < 0:\n            raise ValueError('Age must be positive')\n        return v\n\n# Automatic retries on validation errors\nuser = client.chat.completions.create(\n    model=\"your-model-name\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is -5 years old\"}],\n    response_model=User,\n    max_retries=3\n)\n```\n\n### Complex Nested Models\n\n```python\nfrom typing import List\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n\nusers = client.chat.completions.create(\n    model=\"your-model-name\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract user info with multiple addresses...\"}],\n    response_model=User\n)\n```\n\n## Migration from Other Providers\n\nIf you're migrating from another provider:\n\n```python\n# Old way (other provider)\n# client = instructor.from_openai(openai_client)\n\n# New way ({Provider})  \nclient = instructor.from_{provider}({provider_sdk}.{SyncClient}())\n```\n\n## API Reference\n\nFor detailed API documentation, see the [Instructor API reference](../api/index.md).\n```\n\n## Example Provider: Groq\n\nHere's a concrete example implementing a Groq provider:\n\n#### File: `instructor/providers/groq/client.py`\n```python\nfrom __future__ import annotations\nfrom typing import Any, overload\nimport instructor\nfrom ...core.client import AsyncInstructor, Instructor\nfrom groq import Groq, AsyncGroq\n\n@overload\ndef from_groq(\n    client: Groq,\n    mode: instructor.Mode = instructor.Mode.GROQ_TOOLS,\n    **kwargs: Any,\n) -> Instructor: ...\n\n@overload  \ndef from_groq(\n    client: AsyncGroq,\n    mode: instructor.Mode = instructor.Mode.GROQ_TOOLS,\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\ndef from_groq(\n    client: Groq | AsyncGroq,\n    mode: instructor.Mode = instructor.Mode.GROQ_TOOLS,\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    valid_modes = {\n        instructor.Mode.GROQ_TOOLS,\n        instructor.Mode.GROQ_JSON,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Groq\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, (AsyncGroq, Groq)):\n        from ...core.exceptions import ClientError\n        raise ClientError(\n            f\"Client must be an instance of Groq or AsyncGroq. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if isinstance(client, AsyncGroq):\n        async def async_wrapper(*args: Any, **kwargs: Any):\n            return await client.chat.completions.acreate(*args, **kwargs)\n\n        return AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.GROQ,\n            mode=mode,\n            **kwargs,\n        )\n\n    return Instructor(\n        client=client,\n        create=instructor.patch(create=client.chat.completions.create, mode=mode),\n        provider=instructor.Provider.GROQ,\n        mode=mode,\n        **kwargs,\n    )\n```\n\n## Quality Checklist\n\nBefore submitting your provider implementation, verify:\n\n### Core Implementation\n- [ ] `from_{provider}` function implemented following the exact pattern\n- [ ] Both sync and async clients supported with proper overloads\n- [ ] Valid modes defined and enforced with proper error messages\n- [ ] Client type validation with helpful error messages\n- [ ] Proper use of `instructor.patch()` for both sync and async\n\n### Testing\n- [ ] `conftest.py` skips tests if API key missing or package not installed\n- [ ] `util.py` defines supported models and modes\n- [ ] `test_simple.py` covers basic sync/async functionality with validation\n- [ ] Tests use parametrized approach with `product(models, modes)`\n- [ ] All tests pass with real API key: `pytest tests/llm/test_{provider}/`\n\n### Infrastructure Updates\n- [ ] Modes added to `instructor/mode.py`\n- [ ] Provider added to `instructor/utils/providers.py` Provider enum\n- [ ] Conditional import added to `instructor/__init__.py`\n- [ ] Dependencies added to `pyproject.toml` optional-dependencies\n- [ ] Dependencies added to `pyproject.toml` dependency-groups\n\n### Documentation\n- [ ] Provider documentation created in `docs/integrations/{provider}.md`\n- [ ] Follows exact pattern with frontmatter, examples, and sections\n- [ ] All code examples are tested and work\n- [ ] Covers sync/async usage, validation, nested models\n- [ ] Links to provider documentation and API reference\n\n### Integration\n- [ ] Works with existing instructor patterns and conventions\n- [ ] Error messages are helpful and actionable\n- [ ] Follows the same API as other providers\n- [ ] No performance regressions\n\n## Submission Process\n\n1. **Test Locally**: Ensure all tests pass and examples work\n2. **Create PR**: Submit to instructor repository\n3. **Package Registry**: Publish to PyPI as `instructor-{provider}`\n4. **Documentation**: Add to instructor docs site\n5. **Announcement**: Share with community\n\n## Common Issues & Solutions\n\n### \"Provider not found\" error\n- Check entry point configuration in pyproject.toml\n- Verify provider name matches exactly\n- Ensure package is installed in same environment\n\n### Validation errors not retrying\n- Verify error handling in chat() method catches ValidationError\n- Check that validation messages are added to conversation\n- Ensure max_retries parameter is respected\n\n### Mode not supported\n- Implement handler in handlers.py for the mode\n- Add to _HANDLERS registry\n- Test with provider's actual API capabilities\n\n### Streaming issues\n- Check if provider supports streaming at all\n- Implement incremental parsing for partial responses\n- Handle stream interruption and reconnection\n\n### Type checking failures  \n- Ensure all method signatures match BaseProvider protocol exactly\n- Add proper type hints for all parameters and returns\n- Use Union/Optional types where appropriate\n\n---\n\n**This completes the full provider implementation guide. Follow these instructions systematically and you'll have a production-ready instructor provider that integrates seamlessly with the existing ecosystem.**\n"
  },
  {
    "path": "README.md",
    "content": "# Instructor: Structured Outputs for LLMs\n\nGet reliable JSON from any LLM. Built on Pydantic for validation, type safety, and IDE support.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Define what you want\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract it from natural language\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\nuser = client.chat.completions.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"John is 25 years old\"}],\n)\n\nprint(user)  # User(name='John', age=25)\n```\n\n**That's it.** No JSON parsing, no error handling, no retries. Just define a model and get structured data.\n\n[![PyPI](https://img.shields.io/pypi/v/instructor?style=flat-square)](https://pypi.org/project/instructor/)\n[![Downloads](https://img.shields.io/pypi/dm/instructor?style=flat-square)](https://pypi.org/project/instructor/)\n[![GitHub Stars](https://img.shields.io/github/stars/instructor-ai/instructor?style=flat-square)](https://github.com/instructor-ai/instructor)\n[![Discord](https://img.shields.io/discord/1192334452110659664?style=flat-square)](https://discord.gg/bD9YE9JArw)\n[![Twitter](https://img.shields.io/twitter/follow/jxnlco?style=flat-square)](https://twitter.com/jxnlco)\n\n> **Use Instructor for fast extraction, reach for PydanticAI when you need agents.** Instructor keeps schema-first flows simple and cheap. If your app needs richer agent runs, built-in observability, or shareable traces, try [PydanticAI](https://ai.pydantic.dev/). PydanticAI is the official agent runtime from the Pydantic team, adding typed tools, replayable datasets, evals, and production dashboards while using the same Pydantic models. Dive into the [PydanticAI docs](https://ai.pydantic.dev/) to see how it extends Instructor-style workflows.\n\n## Why Instructor?\n\nGetting structured data from LLMs is hard. You need to:\n\n1. Write complex JSON schemas\n2. Handle validation errors  \n3. Retry failed extractions\n4. Parse unstructured responses\n5. Deal with different provider APIs\n\n**Instructor handles all of this with one simple interface:**\n\n<table>\n<tr>\n<td><b>Without Instructor</b></td>\n<td><b>With Instructor</b></td>\n</tr>\n<tr>\n<td>\n\n```python\nresponse = openai.chat.completions.create(\n    model=\"gpt-4\",\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n    tools=[\n        {\n            \"type\": \"function\",\n            \"function\": {\n                \"name\": \"extract_user\",\n                \"parameters\": {\n                    \"type\": \"object\",\n                    \"properties\": {\n                        \"name\": {\"type\": \"string\"},\n                        \"age\": {\"type\": \"integer\"},\n                    },\n                },\n            },\n        }\n    ],\n)\n\n# Parse response\ntool_call = response.choices[0].message.tool_calls[0]\nuser_data = json.loads(tool_call.function.arguments)\n\n# Validate manually\nif \"name\" not in user_data:\n    # Handle error...\n    pass\n```\n\n</td>\n<td>\n\n```python\nclient = instructor.from_provider(\"openai/gpt-4\")\n\nuser = client.chat.completions.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n)\n\n# That's it! user is validated and typed\n```\n\n</td>\n</tr>\n</table>\n\n## Install in seconds\n\n```bash\npip install instructor\n```\n\nOr with your package manager:\n```bash\nuv add instructor\npoetry add instructor\n```\n\n## Works with every major provider\n\nUse the same code with any LLM provider:\n\n```python\n# OpenAI\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\n# Anthropic\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet\")\n\n# Google\nclient = instructor.from_provider(\"google/gemini-pro\")\n\n# Ollama (local)\nclient = instructor.from_provider(\"ollama/llama3.2\")\n\n# With API keys directly (no environment variables needed)\nclient = instructor.from_provider(\"openai/gpt-4o\", api_key=\"sk-...\")\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet\", api_key=\"sk-ant-...\")\nclient = instructor.from_provider(\"groq/llama-3.1-8b-instant\", api_key=\"gsk_...\")\n\n# All use the same API!\nuser = client.chat.completions.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n)\n```\n\n## Production-ready features\n\n### Automatic retries\n\nFailed validations are automatically retried with the error message:\n\n```python\nfrom pydantic import BaseModel, field_validator\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n    @field_validator('age')\n    def validate_age(cls, v):\n        if v < 0:\n            raise ValueError('Age must be positive')\n        return v\n\n\n# Instructor automatically retries when validation fails\nuser = client.chat.completions.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n    max_retries=3,\n)\n```\n\n### Streaming support\n\nStream partial objects as they're generated:\n\n```python\nfrom instructor import Partial\n\nfor partial_user in client.chat.completions.create(\n    response_model=Partial[User],\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n    stream=True,\n):\n    print(partial_user)\n    # User(name=None, age=None)\n    # User(name=\"John\", age=None)\n    # User(name=\"John\", age=25)\n```\n\n### Nested objects\n\nExtract complex, nested data structures:\n\n```python\nfrom typing import List\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n\n\n# Instructor handles nested objects automatically\nuser = client.chat.completions.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n)\n```\n\n## Used in production by\n\nTrusted by over 100,000 developers and companies building AI applications:\n\n- **3M+ monthly downloads**\n- **10K+ GitHub stars**  \n- **1000+ community contributors**\n\nCompanies using Instructor include teams at OpenAI, Google, Microsoft, AWS, and many YC startups.\n\n## Get started\n\n### Basic extraction\n\nExtract structured data from any text:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\n\nclass Product(BaseModel):\n    name: str\n    price: float\n    in_stock: bool\n\n\nproduct = client.chat.completions.create(\n    response_model=Product,\n    messages=[{\"role\": \"user\", \"content\": \"iPhone 15 Pro, $999, available now\"}],\n)\n\nprint(product)\n# Product(name='iPhone 15 Pro', price=999.0, in_stock=True)\n```\n\n### Multiple languages\n\nInstructor's simple API is available in many languages:\n\n- [Python](https://python.useinstructor.com) - The original\n- [TypeScript](https://js.useinstructor.com) - Full TypeScript support\n- [Ruby](https://ruby.useinstructor.com) - Ruby implementation  \n- [Go](https://go.useinstructor.com) - Go implementation\n- [Elixir](https://hex.pm/packages/instructor) - Elixir implementation\n- [Rust](https://rust.useinstructor.com) - Rust implementation\n\n### Learn more\n\n- [Documentation](https://python.useinstructor.com) - Comprehensive guides\n- [Examples](https://python.useinstructor.com/examples/) - Copy-paste recipes  \n- [Blog](https://python.useinstructor.com/blog/) - Tutorials and best practices\n- [Discord](https://discord.gg/bD9YE9JArw) - Get help from the community\n\n## Why use Instructor over alternatives?\n\n**vs Raw JSON mode**: Instructor provides automatic validation, retries, streaming, and nested object support. No manual schema writing.\n\n**vs LangChain/LlamaIndex**: Instructor is focused on one thing - structured extraction. It's lighter, faster, and easier to debug.\n\n**vs Custom solutions**: Battle-tested by thousands of developers. Handles edge cases you haven't thought of yet.\n\n## Contributing\n\nWe welcome contributions! Check out our [good first issues](https://github.com/instructor-ai/instructor/labels/good%20first%20issue) to get started.\n\n## License\n\nMIT License - see [LICENSE](https://github.com/instructor-ai/instructor/blob/main/LICENSE) for details.\n\n---\n\n<p align=\"center\">\nBuilt by the Instructor community. Special thanks to <a href=\"https://twitter.com/jxnlco\">Jason Liu</a> and all <a href=\"https://github.com/instructor-ai/instructor/graphs/contributors\">contributors</a>.\n</p>"
  },
  {
    "path": "build_mkdocs.sh",
    "content": "pip install -r requirements.txt\npip install -r requirements-doc.txt\nmkdocs build\n"
  },
  {
    "path": "cross_link_mapping.yaml",
    "content": "# Cross-Link Mapping for Instructor Documentation\n# This file maps blog posts and documentation pages to their related content\n# Format: \n#   source_file:\n#     related_concepts: [list of concept docs to link]\n#     related_blog_posts: [list of related blog posts]\n#     related_examples: [list of example files]\n#     related_integrations: [list of integration docs]\n#     see_also_text: \"Custom text for See Also section\"\n\n# VALIDATION CLUSTER\nblog/posts/validation-part1.md:\n  related_concepts:\n    - concepts/validation.md\n    - concepts/reask_validation.md\n  related_blog_posts:\n    - blog/posts/semantic-validation-structured-outputs.md\n    - blog/posts/bad-schemas-could-break-llms.md\n    - blog/posts/pydantic-is-still-all-you-need.md\n  related_examples:\n    - examples/validators.md\n  see_also_text: |\n    ## Related Documentation\n    - [Core Validation Concepts](/concepts/validation) - Learn about validation fundamentals\n    - [Reask Validation](/concepts/reask_validation) - Handle validation failures gracefully\n    \n    ## See Also\n    - [Semantic Validation with Structured Outputs](semantic-validation-structured-outputs) - Next evolution in validation\n    - [Why Bad Schemas Break LLMs](bad-schemas-could-break-llms) - Schema design best practices\n    - [Pydantic Is Still All You Need](pydantic-is-still-all-you-need) - Why Pydantic validation matters\n\nblog/posts/semantic-validation-structured-outputs.md:\n  related_concepts:\n    - concepts/validation.md\n    - concepts/llm_validation.md\n  related_blog_posts:\n    - blog/posts/validation-part1.md\n    - blog/posts/anthropic-prompt-caching.md\n    - blog/posts/logfire.md\n  related_examples:\n    - examples/moderation.md\n  see_also_text: |\n    ## Related Documentation\n    - [Validation Fundamentals](/concepts/validation) - Core validation concepts\n    - [LLM Validation](/concepts/llm_validation) - Using LLMs for validation\n    \n    ## See Also\n    - [Validation Deep Dive](validation-part1) - Foundation validation concepts\n    - [Anthropic Prompt Caching](anthropic-prompt-caching) - Optimize validation costs\n    - [Monitoring with Logfire](logfire) - Track validation performance\n\nblog/posts/pydantic-is-still-all-you-need.md:\n  related_concepts:\n    - concepts/philosophy.md\n    - concepts/validation.md\n  related_blog_posts:\n    - blog/posts/validation-part1.md\n    - blog/posts/best_framework.md\n    - blog/posts/introduction.md\n  related_integrations:\n    - integrations/index.md\n  see_also_text: |\n    ## Related Documentation\n    - [Instructor Philosophy](/concepts/philosophy) - Why we chose Pydantic\n    - [Validation Guide](/concepts/validation) - Practical validation techniques\n    \n    ## See Also\n    - [Validation Deep Dive](validation-part1) - Advanced validation patterns\n    - [Best Framework Comparison](best_framework) - Why Instructor stands out\n    - [Introduction to Instructor](introduction) - Getting started guide\n\n# MULTIMODAL CLUSTER\nblog/posts/multimodal-gemini.md:\n  related_concepts:\n    - concepts/multimodal.md\n    - concepts/images.md\n  related_blog_posts:\n    - blog/posts/openai-multimodal.md\n    - blog/posts/structured-output-anthropic.md\n    - blog/posts/chat-with-your-pdf-with-gemini.md\n  related_integrations:\n    - integrations/google.md\n    - integrations/vertex.md\n  related_examples:\n    - examples/image_to_ad_copy.md\n  see_also_text: |\n    ## Related Documentation\n    - [Multimodal Concepts](/concepts/multimodal) - Working with images, video, and audio\n    - [Image Processing](/concepts/images) - Image-specific techniques\n    - [Google Integration](/integrations/google) - Complete Gemini setup guide\n    \n    ## See Also\n    - [OpenAI Multimodal](openai-multimodal) - Compare multimodal approaches\n    - [Anthropic Structured Output](structured-output-anthropic) - Alternative provider\n    - [Chat with PDFs using Gemini](chat-with-your-pdf-with-gemini) - Practical PDF processing\n\nblog/posts/openai-multimodal.md:\n  related_concepts:\n    - concepts/multimodal.md\n    - concepts/images.md\n  related_blog_posts:\n    - blog/posts/multimodal-gemini.md\n    - blog/posts/anthropic-prompt-caching.md\n    - blog/posts/logfire.md\n  related_integrations:\n    - integrations/openai.md\n  related_examples:\n    - examples/audio.md\n  see_also_text: |\n    ## Related Documentation\n    - [Multimodal Guide](/concepts/multimodal) - Comprehensive multimodal reference\n    - [OpenAI Integration](/integrations/openai) - Full OpenAI setup\n    \n    ## See Also\n    - [Gemini Multimodal](multimodal-gemini) - Alternative multimodal approach\n    - [Prompt Caching](anthropic-prompt-caching) - Cache large audio files\n    - [Monitoring with Logfire](logfire) - Track multimodal processing\n\nblog/posts/chat-with-your-pdf-with-gemini.md:\n  related_concepts:\n    - concepts/multimodal.md\n  related_blog_posts:\n    - blog/posts/multimodal-gemini.md\n    - blog/posts/generating-pdf-citations.md\n    - blog/posts/rag-and-beyond.md\n  related_examples:\n    - examples/pdf_to_markdown.md\n  see_also_text: |\n    ## Related Documentation\n    - [Multimodal Processing](/concepts/multimodal) - Core multimodal concepts\n    \n    ## See Also\n    - [Gemini Multimodal Features](multimodal-gemini) - Full Gemini capabilities\n    - [PDF Citation Generation](generating-pdf-citations) - Extract citations from PDFs\n    - [RAG and Beyond](rag-and-beyond) - Advanced document processing\n\n# PROVIDER INTEGRATION CLUSTER\nblog/posts/structured-output-anthropic.md:\n  related_concepts:\n    - concepts/patching.md\n  related_blog_posts:\n    - blog/posts/anthropic-prompt-caching.md\n    - blog/posts/announcing-unified-provider-interface.md\n    - blog/posts/best_framework.md\n  related_integrations:\n    - integrations/anthropic.md\n  related_examples:\n    - examples/classification.md\n  see_also_text: |\n    ## Related Documentation\n    - [How Patching Works](/concepts/patching) - Understand provider integration\n    - [Anthropic Integration](/integrations/anthropic) - Complete setup guide\n    \n    ## See Also\n    - [Anthropic Prompt Caching](anthropic-prompt-caching) - Optimize Anthropic costs\n    - [Unified Provider Interface](announcing-unified-provider-interface) - Switch providers easily\n    - [Framework Comparison](best_framework) - Why Instructor excels\n\nblog/posts/anthropic-prompt-caching.md:\n  related_concepts:\n    - concepts/caching.md\n  related_blog_posts:\n    - blog/posts/structured-output-anthropic.md\n    - blog/posts/caching.md\n    - blog/posts/logfire.md\n  related_integrations:\n    - integrations/anthropic.md\n  see_also_text: |\n    ## Related Documentation\n    - [Caching Strategies](/concepts/caching) - General caching concepts\n    - [Anthropic Integration](/integrations/anthropic) - Full Anthropic guide\n    \n    ## See Also\n    - [Anthropic Structured Outputs](structured-output-anthropic) - Use with caching\n    - [Response Caching](caching) - General caching strategies\n    - [Performance Monitoring](logfire) - Track cache performance\n\nblog/posts/announcing-unified-provider-interface.md:\n  related_concepts:\n    - concepts/patching.md\n    - concepts/philosophy.md\n  related_blog_posts:\n    - blog/posts/string-based-init.md\n    - blog/posts/best_framework.md\n    - blog/posts/introduction.md\n  related_integrations:\n    - integrations/index.md\n  related_examples:\n    - examples/groq.md\n    - examples/mistral.md\n  see_also_text: |\n    ## Related Documentation\n    - [Provider Patching](/concepts/patching) - How provider integration works\n    - [All Integrations](/integrations/) - Supported provider list\n    \n    ## See Also\n    - [String-Based Initialization](string-based-init) - Alternative init method\n    - [Framework Comparison](best_framework) - Multi-provider advantages\n    - [Getting Started](introduction) - Quick start guide\n\n# RAG AND SEARCH CLUSTER\nblog/posts/rag-and-beyond.md:\n  related_concepts:\n    - concepts/validation.md\n  related_blog_posts:\n    - blog/posts/llm-as-reranker.md\n    - blog/posts/citations.md\n    - blog/posts/chat-with-your-pdf-with-gemini.md\n  related_examples:\n    - examples/search.md\n  see_also_text: |\n    ## Related Documentation\n    - [Validation Concepts](/concepts/validation) - Validate RAG outputs\n    \n    ## See Also\n    - [LLM as Reranker](llm-as-reranker) - Improve search relevance\n    - [Citation Extraction](citations) - Verify sources\n    - [PDF Processing](chat-with-your-pdf-with-gemini) - Document handling\n\nblog/posts/llm-as-reranker.md:\n  related_blog_posts:\n    - blog/posts/rag-and-beyond.md\n    - blog/posts/validation-part1.md\n    - blog/posts/logfire.md\n  related_examples:\n    - examples/reranking.md\n  see_also_text: |\n    ## See Also\n    - [RAG and Beyond](rag-and-beyond) - Comprehensive RAG guide\n    - [Validation Fundamentals](validation-part1) - Validate ranking scores\n    - [Performance Monitoring](logfire) - Track reranking performance\n\nblog/posts/citations.md:\n  related_concepts:\n    - concepts/validation.md\n  related_blog_posts:\n    - blog/posts/rag-and-beyond.md\n    - blog/posts/generating-pdf-citations.md\n    - blog/posts/validation-part1.md\n  see_also_text: |\n    ## Related Documentation\n    - [Validation Guide](/concepts/validation) - Validate citations\n    \n    ## See Also\n    - [RAG Techniques](rag-and-beyond) - Use citations in RAG\n    - [PDF Citations](generating-pdf-citations) - Extract from PDFs\n    - [Validation Basics](validation-part1) - Ensure citation quality\n\n# PERFORMANCE AND MONITORING\nblog/posts/logfire.md:\n  related_concepts:\n    - concepts/retrying.md\n  related_blog_posts:\n    - blog/posts/full-fastapi-visibility.md\n    - blog/posts/anthropic-prompt-caching.md\n    - blog/posts/validation-part1.md\n  related_integrations:\n    - integrations/pydantic_logfire.md\n  see_also_text: |\n    ## Related Documentation\n    - [Retry Mechanisms](/concepts/retrying) - Handle failures gracefully\n    - [Logfire Integration](/integrations/pydantic_logfire) - Setup guide\n    \n    ## See Also\n    - [FastAPI Visibility](full-fastapi-visibility) - Web app monitoring\n    - [Prompt Caching](anthropic-prompt-caching) - Monitor cache hits\n    - [Validation Monitoring](validation-part1) - Track validation metrics\n\nblog/posts/caching.md:\n  related_concepts:\n    - concepts/caching.md\n  related_blog_posts:\n    - blog/posts/anthropic-prompt-caching.md\n    - blog/posts/logfire.md\n  see_also_text: |\n    ## Related Documentation\n    - [Caching Concepts](/concepts/caching) - Core caching strategies\n    \n    ## See Also\n    - [Anthropic Prompt Caching](anthropic-prompt-caching) - Provider-specific caching\n    - [Performance Monitoring](logfire) - Track cache effectiveness\n\n# GETTING STARTED AND PHILOSOPHY\nblog/posts/introduction.md:\n  related_concepts:\n    - concepts/philosophy.md\n    - concepts/quickstart.md\n  related_blog_posts:\n    - blog/posts/best_framework.md\n    - blog/posts/pydantic-is-still-all-you-need.md\n    - blog/posts/announcing-unified-provider-interface.md\n  see_also_text: |\n    ## Related Documentation\n    - [Quick Start Guide](/concepts/quickstart) - Get running in minutes\n    - [Philosophy](/concepts/philosophy) - Why we built Instructor\n    \n    ## See Also\n    - [Framework Comparison](best_framework) - See how we compare\n    - [Why Pydantic](pydantic-is-still-all-you-need) - Our foundation\n    - [Easy Provider Setup](announcing-unified-provider-interface) - Start with any LLM\n\nblog/posts/best_framework.md:\n  related_concepts:\n    - concepts/philosophy.md\n  related_blog_posts:\n    - blog/posts/introduction.md\n    - blog/posts/pydantic-is-still-all-you-need.md\n    - blog/posts/announcing-unified-provider-interface.md\n  see_also_text: |\n    ## Related Documentation\n    - [Our Philosophy](/concepts/philosophy) - Design principles\n    \n    ## See Also\n    - [Getting Started](introduction) - Quick introduction\n    - [Pydantic Foundation](pydantic-is-still-all-you-need) - Why Pydantic\n    - [Multi-Provider Support](announcing-unified-provider-interface) - Key differentiator"
  },
  {
    "path": "docs/AGENT.md",
    "content": "---\ntitle: Documentation Agent Guide\ndescription: Internal guide for maintaining and improving Instructor documentation\n---\n\n# AGENT.md - Documentation\n\n## Commands\n- Serve docs locally: `uv run mkdocs serve`\n- Build docs: `./build_mkdocs.sh` or `uv run mkdocs build`\n- Install doc deps: `uv pip install -e \".[docs]\"`\n- Test examples: `uv run pytest docs/ --examples`\n\n## Structure\n- **Core docs**: `concepts/`, `integrations/`, `examples/`\n- **Learning path**: `getting-started.md` → `learning/` → `tutorials/`\n- **API reference**: Auto-generated from docstrings via `mkdocstrings`\n- **Blog**: `blog/posts/` for announcements and deep-dives\n- **Templates**: `templates/` for new docs (provider, concept, cookbook)\n\n## Writing Guidelines\n- **Reading level**: Grade 10 (from .cursor/rules)\n- **Code examples**: Must be runnable with complete imports\n- **Progressive complexity**: Simple → advanced concepts\n- **Provider docs**: Follow `templates/` patterns\n- **Navigation**: Update `mkdocs.yml` for new pages\n\n## Pull Request (PR) Formatting\n\nUse **Conventional Commits** formatting for PR titles so they are consistent and easy to scan. Treat the PR title as the message we would use for a squash merge commit.\n\n### PR Title Format\n\nUse:\n\n`<type>(<scope>): <short summary>`\n\nRules:\n- Keep it under ~70 characters when you can.\n- Use the imperative mood (for example, “add”, “fix”, “update”).\n- Do not end with a period.\n- If it includes a breaking change, add `!` after the type or scope (for example, `feat(docs)!:`).\n\nGood examples:\n- `docs(agents): add conventional commit PR title guidelines`\n- `docs(mkdocs): fix broken link in validation tutorial`\n- `docs(examples): update youtube clips snippet`\n- `chore(docs): refresh docs build commands`\n\nCommon types:\n- `docs`: documentation-only changes\n- `fix`: bug fix\n- `feat`: new feature\n- `test`: add or update tests\n- `chore`: maintenance work (build scripts, tooling, repo hygiene)\n- `ci`: CI pipeline changes\n\nSuggested docs scopes:\n- `docs`, `mkdocs`, `blog`, `examples`, `integrations`, `tutorials`, `agents`\n\n### PR Description Guidelines\n\nKeep PR descriptions short and actionable:\n- **What**: What changed, in 1–3 sentences.\n- **Why**: Why this change is needed (link issues when possible).\n- **Changes**: 3–7 bullet points with the main edits.\n- **Testing**: What you ran (or why you did not run anything).\n- **Docs impact**: Call out page moves, redirects, or nav updates.\n\nIf the PR was authored by Cursor, include:\n- `This PR was written by [Cursor](https://cursor.com)`\n\n## Key Files\n- `mkdocs.yml` - Site configuration and navigation\n- `hooks/` - Custom processing (hide_lines.py removes `# <%hide%>` markers)\n- `overrides/` - Custom theme elements\n- `javascripts/` - Client-side enhancements\n"
  },
  {
    "path": "docs/api-docstring-assessment.md",
    "content": "# API Docstring Quality Assessment\n\nThis document assesses the quality and completeness of docstrings for all API items referenced in the expanded API documentation.\n\n## Summary\n\nOverall, the docstring quality is **good to excellent** for most items. Many classes and functions have comprehensive docstrings with usage examples, while some core classes could benefit from class-level docstrings.\n\n## Excellent Docstrings (Comprehensive with Examples)\n\nThese have detailed docstrings with usage examples and clear descriptions:\n\n### Client Creation\n- **`from_provider`** - Comprehensive docstring with Args, Returns, Raises, and Examples sections. Includes multiple usage examples showing basic usage, caching, and async clients.\n\n### Validation\n- **`llm_validator`** - Good docstring with usage examples, parameter descriptions, and error message examples showing how validation errors are formatted.\n\n### DSL Components\n- **`CitationMixin`** - Excellent docstring with complete usage examples showing how to use it with context, and result examples showing the output structure.\n- **`IterableModel`** - Good docstring with usage examples showing before/after transformation, Parameters section, and Returns description.\n- **`Maybe`** - Good docstring with usage examples and result structure showing the generated model fields.\n\n### Batch Processing\n- **`BatchProcessor`** - Good class-level docstring explaining the unified interface. Methods like `create_batch_from_messages` and `submit_batch` have clear Args and Returns sections.\n\n### Distillation\n- **`Instructions`** - Good docstring with parameter descriptions. The `distil` method has usage examples showing decorator usage patterns.\n\n### Hooks\n- **`Hooks`** - Excellent class-level docstring explaining the purpose. Methods like `on()`, `get_hook_name()`, `emit()`, etc. have comprehensive docstrings with Args, Returns, Raises, and Examples sections.\n\n### Schema Generation\n- **`generate_openai_schema`** - Good docstring with Args, Returns, and Notes sections explaining how docstrings are used.\n- **`generate_anthropic_schema`** - Has docstring explaining the conversion process.\n\n### Multimodal\n- **`Audio`** - Good class-level docstring. Methods like `autodetect()` and `autodetect_safely()` have clear docstrings with Args and Returns.\n\n### Exceptions\n- **`InstructorError`** - Excellent docstring with Attributes section, Examples showing error handling, and See Also references.\n- **`IncompleteOutputException`** - Good docstring with Attributes, Common Solutions, and Examples.\n- **`InstructorRetryException`** - Comprehensive docstring with Attributes, Common Causes, Examples, and See Also.\n- **`ValidationError`** - Good docstring with Examples and See Also.\n- **`ProviderError`** - Good docstring with Attributes, Common Causes, and Examples.\n- **`ConfigurationError`** - Good docstring with Common Scenarios and Examples.\n- **`ModeError`** - Good docstring with Attributes, Examples, and See Also.\n- **`ClientError`** - Good docstring with Common Scenarios and Examples.\n- **`AsyncValidationError`** - Good docstring with Attributes and Examples.\n- **`ResponseParsingError`** - Good docstring with Attributes, Examples, and backwards compatibility notes.\n- **`MultimodalError`** - Good docstring with Attributes, Examples, and backwards compatibility notes.\n\n## Good Docstrings (Clear but Could Be Enhanced)\n\nThese have adequate docstrings but could benefit from more examples or additional detail:\n\n### Core Clients\n- **`Instructor`** - No class-level docstring. Methods have type hints but lack comprehensive docstrings. The class is well-documented through usage in examples, but a class-level docstring would help.\n- **`AsyncInstructor`** - Similar to `Instructor`, no class-level docstring.\n- **`Response`** - No class-level docstring. Methods like `create()` and `create_with_completion()` lack docstrings.\n\n### Client Creation\n- **`from_openai`** - No docstring. Only has type overloads. The implementation exists but lacks documentation explaining usage, parameters, and return values.\n\n### Function Calls & Schema\n- **`OpenAISchema`** - Good method docstrings for `openai_schema`, `anthropic_schema`, `gemini_schema`, and `from_response()`. The class itself could use a class-level docstring explaining its purpose and usage.\n- **`openai_schema`** - Decorator function, but the docstring is on the class method, not the decorator itself.\n\n### DSL Components\n- **`Partial`** - Minimal docstring. Has Notes and Example sections but could benefit from more comprehensive usage examples showing streaming scenarios.\n\n### Multimodal\n- **`Image`** - No class-level docstring. Methods have good docstrings (`autodetect()`, `autodetect_safely()`, `from_gs_url()`, etc.), but the class itself lacks documentation.\n\n### Mode & Provider\n- **`Mode`** - Good class-level docstring explaining what modes are and how they work. Individual mode values lack docstrings but the enum docstring is comprehensive.\n- **`Provider`** - No class-level docstring. Just enum values without explanation.\n\n### Patch Functions\n- **`patch`** - Good docstring explaining what features it enables (response_model, max_retries, validation_context, strict, hooks). Could benefit from usage examples.\n- **`apatch`** - Need to check if it has similar docstring quality.\n\n## Areas Needing Improvement\n\n### Missing Class-Level Docstrings\n1. **`Instructor`** - Should have a class-level docstring explaining:\n   - What the class does\n   - How to use it\n   - Key features (modes, hooks, retries)\n   - Basic usage example\n\n2. **`AsyncInstructor`** - Should have a class-level docstring explaining:\n   - Async usage patterns\n   - How it differs from `Instructor`\n   - Async examples\n\n3. **`Response`** - Should have a class-level docstring explaining:\n   - What the Response helper does\n   - When to use it vs direct client methods\n   - Usage examples\n\n4. **`Image`** - Should have a class-level docstring explaining:\n   - What Image represents\n   - Supported formats\n   - Common usage patterns\n\n5. **`Provider`** - Should have a class-level docstring explaining:\n   - What providers are supported\n   - How to use Provider enum\n   - Provider detection\n\n### Missing Function Docstrings\n1. **`from_openai`** - Needs comprehensive docstring with:\n   - Purpose and usage\n   - Parameters explanation\n   - Return value description\n   - Examples\n\n2. **`from_litellm`** - No docstring. Only has type overloads. Similar to `from_openai`, needs comprehensive docstring.\n\n### Could Be Enhanced\n1. **`Partial`** - Could add more streaming examples\n2. **`patch`** - Could add usage examples showing before/after\n3. **`apatch`** - Has docstring but marked as deprecated (\"No longer necessary, use `patch` instead\"). Docstring is adequate but the deprecation should be more prominent.\n4. **`openai_schema`** - Has minimal docstring. Could expand with usage examples showing how to use the decorator.\n\n## Recommendations\n\n### High Priority\n1. Add class-level docstrings to `Instructor` and `AsyncInstructor` - These are the core classes users interact with\n2. Add docstring to `from_openai` - Important client creation function\n3. Add class-level docstring to `Response` - Helper class that needs explanation\n\n### Medium Priority\n1. Add class-level docstring to `Image` - Commonly used multimodal class\n2. Add class-level docstring to `Provider` - Enum that could use explanation\n3. Enhance `Partial` docstring with more streaming examples\n\n### Low Priority\n1. Add more examples to `patch` docstring\n2. Expand `openai_schema` docstring with examples\n3. Consider updating `apatch` deprecation message to be more prominent\n\n## Overall Assessment\n\n**Grade: B+**\n\nThe documentation is generally good with many excellent examples, but the core classes (`Instructor`, `AsyncInstructor`, `Response`) would benefit significantly from class-level docstrings. The DSL components and utility functions are well-documented, and the exception classes have comprehensive docstrings.\n\nThe mkdocs autodoc plugin will generate API documentation from these docstrings, so improving them will directly improve the generated API reference pages.\n"
  },
  {
    "path": "docs/api.md",
    "content": "---\ntitle: API Reference Guide\ndescription: Explore the comprehensive API reference with details on instructors, validation, iteration, and function calls.\n---\n\n# API Reference\n\nCore modes are the recommended default. Legacy provider-specific modes still\nwork but are deprecated and will show warnings. See the\n[Mode Migration Guide](concepts/mode-migration.md) for details.\n\n## Core Clients\n\nThe main client classes for interacting with LLM providers.\n\n::: instructor.Instructor\n\n::: instructor.AsyncInstructor\n\n::: instructor.core.client.Response\n\n## Client Creation\n\nFunctions to create Instructor clients from various providers.\n\n::: instructor.from_provider\n\n::: instructor.from_openai\n\n::: instructor.from_litellm\n\n## DSL Components\n\nDomain-specific language components for advanced patterns and data handling.\n\n::: instructor.dsl.validators\n\n::: instructor.dsl.iterable\n\n::: instructor.dsl.partial\n\n::: instructor.dsl.parallel\n\n::: instructor.dsl.maybe\n\n::: instructor.dsl.citation\n\n## Function Calls & Schema\n\nClasses and functions for defining and working with function call schemas.\n\n::: instructor.function_calls\n\n::: instructor.OpenAISchema\n\n::: instructor.openai_schema\n\n::: instructor.generate_openai_schema\n\n::: instructor.generate_anthropic_schema\n\n::: instructor.generate_gemini_schema\n\n## Validation\n\nValidation utilities for LLM outputs and async validation support.\n\n::: instructor.validation\n\n::: instructor.llm_validator\n\n::: instructor.openai_moderation\n\n## Batch Processing\n\nBatch processing utilities for handling multiple requests efficiently.\n\n::: instructor.batch\n\n::: instructor.batch.BatchProcessor\n\n::: instructor.batch.BatchRequest\n\n::: instructor.batch.BatchJob\n\n## Distillation\n\nTools for distillation and fine-tuning workflows.\n\n::: instructor.distil\n\n::: instructor.FinetuneFormat\n\n::: instructor.Instructions\n\n## Multimodal\n\nSupport for image and audio content in LLM requests.\n\n::: instructor.processing.multimodal\n\n::: instructor.Image\n\n::: instructor.Audio\n\n## Mode & Provider\n\nEnumerations for modes and providers.\n\n::: instructor.Mode\n\n::: instructor.Provider\n\n## Exceptions\n\nException classes for error handling.\n\n::: instructor.core.exceptions\n\n## Hooks\n\nEvent hooks system for monitoring and intercepting LLM interactions.\n\n::: instructor.core.hooks\n\n::: instructor.core.hooks.Hooks\n\n::: instructor.core.hooks.HookName\n\n## Patch Functions\n\nDecorators for patching LLM client methods.\n\n::: instructor.core.patch\n\n::: instructor.patch\n\n::: instructor.apatch\n"
  },
  {
    "path": "docs/architecture.md",
    "content": "---\ntitle: Instructor Architecture Overview\ndescription: Learn about the internal architecture and design decisions of the Instructor library\n---\n\n# Architecture Overview\n\nThis page explains the core execution flow and where to plug in or debug. It highlights the minimal sync/async code paths and how streaming, partial, and parallel modes integrate.\n\n## High-Level Flow\n\n```mermaid\nsequenceDiagram\n    autonumber\n    participant U as User Code\n    participant I as Instructor (patched)\n    participant R as Retry Layer (tenacity)\n    participant C as Provider Client\n    participant D as Dispatcher (process_response)\n    participant H as Provider Handler (response/reask)\n    participant M as Pydantic Model\n\n    U->>I: chat.completions.create(response_model=..., **kwargs)\n    Note right of I: patch() wraps create() with cache/templating and retry\n    I->>R: retry_sync/async(func=create, max_retries, strict, mode, hooks)\n    loop attempts\n        R->>C: create(**prepared_kwargs)\n        C-->>R: raw response (provider-specific)\n        R->>D: process_response(_async)(response, response_model, mode, stream)\n        alt Streaming/Partial\n            D->>M: Iterable/Partial.from_streaming_response(_async)\n            D-->>R: Iterable/Partial model (or list of items)\n        else Standard\n            D->>H: provider mode handler (format/parse selection)\n            H-->>D: adjusted response_model/new_kwargs if needed\n            D->>M: response_model.from_response(...)\n            M-->>D: parsed model (with _raw_response attached)\n            D-->>R: model (or adapted simple type)\n        end\n        R-->>I: parsed model\n    end\n    I-->>U: final model (plus _raw_response on instance)\n\n    rect rgb(255,240,240)\n    Note over R,H: On validation/JSON errors → reask path\n    R->>H: handle_reask_kwargs(..., exception, failed_attempts)\n    H-->>R: new kwargs/messages for next attempt\n    end\n```\n\nKey responsibilities:\n- patch(): wraps the provider `create` with cache lookup/save, templating, strict mode, hooks, and retry.\n- Retry: executes provider call, emits hooks, updates usage, handles validation/JSON errors with reask, and re-attempts.\n- Dispatcher: selects the correct parsing path by `Mode`, handles multimodal message conversion, and attaches `_raw_response` to the returned model.\n- Provider Handlers: provider/mode-specific request shaping and reask preparation.\n\n## Minimal Code Paths\n\n### Synchronous\n```python\nimport openai\nimport instructor\nfrom pydantic import BaseModel\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nmodel = client.create(\n    model=\"gpt-4o-mini\",\n    messages=[{\"role\": \"user\", \"content\": \"{'name': 'Ada', 'age': 37}\"}],\n    response_model=User,            # triggers schema/tool wiring + parsing\n    max_retries=3,                  # tenacity-backed validation retries\n    strict=True,                    # strict JSON parsing if supported\n)\n\n# Access raw provider response if needed\nraw = model._raw_response\n```\n\n### Asynchronous\n```python\nimport asyncio\nimport openai\nimport instructor\nfrom pydantic import BaseModel\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def main():\n    aclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n    model = await aclient.create(\n        model=\"gpt-4o-mini\",\n        messages=[{\"role\": \"user\", \"content\": \"{\\\"name\\\": \\\"Ada\\\", \\\"age\\\": 37}\"}],\n        response_model=User,\n        max_retries=3,\n        strict=True,\n    )\n    print(model)\n\nasyncio.run(main())\n```\n\n## Streaming, Partial, Parallel\n\n### Streaming Iterable\n- Use `create_iterable(response_model=Model, stream=True implicitly)` via `Instructor.create_iterable`.\n- Returns a generator (sync) or async generator (async) of parsed items.\n- Internally sets `stream=True`, and `IterableBase.from_streaming_response(_async)` assembles items.\n\n```python\nfor item in client.create_iterable(messages=..., response_model=MyModel):\n    print(item)\n```\n\n### Partial Objects\n- Use `create_partial(response_model=Model)` to receive progressively filled partial models while streaming.\n- Internally wraps the model as `Partial[Model]` and sets `stream=True`.\n\n```python\nfor partial in client.create_partial(messages=..., response_model=MyModel):\n    # partial contains fields as they arrive\n    pass\n```\n\n### Parallel Tools\n- Use `Mode.PARALLEL_TOOLS` and a parallel type hint (e.g., list of models) when you need multiple tool calls in one request.\n- Streaming is not supported in parallel tools mode.\n\n```python\nfrom instructor.mode import Mode\n\nresult = client.create(\n    model=\"gpt-4o\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract person and event info.\"}],\n    response_model=[PersonInfo, EventInfo],\n    mode=Mode.PARALLEL_TOOLS,\n)\n```\n\n## Hooks and Retry\n\nYou can observe and instrument the flow with hooks. Typical events:\n- `completion:kwargs`: just before provider call\n- `completion:response`: after provider call\n- `parse:error`: on validation/JSON errors\n- `completion:last_attempt`: when a retry sequence is about to stop\n- `completion:error`: non-validation completion errors\n\n```python\nfrom instructor.core.hooks import HookName\n\nclient.on(HookName.COMPLETION_KWARGS, lambda **kw: print(\"KWARGS\", kw))\nclient.on(HookName.PARSE_ERROR, lambda e: print(\"PARSE\", e))\n```\n\n## Where Multimodal Conversion Happens\n\n- For modes that require it, messages are converted via `processing.multimodal.convert_messages`.\n- Image/Audio/PDF autodetection can be enabled (by specific handlers/modes) and will convert strings/paths/URLs or data URIs into provider-ready payloads.\n\n## Error Handling at a Glance\n\n- Validation or JSON decode errors trigger the reask path.\n- Reask handlers (`handle_reask_kwargs`) append/adjust messages with error feedback so the next attempt can correct itself.\n- If all retries fail, `InstructorRetryException` is raised containing `failed_attempts`, the last completion, usage totals, and the create kwargs for reproduction.\n\n## Extensibility Notes\n\n- New providers add utils for response and reask handling and register modes used by the dispatcher.\n- Most JSON/tool patterns are shared; prefer reusing existing handlers where possible.\n- Keep provider-specific logic in provider utils; avoid expanding central dispatcher beyond routing and orchestration.\n\n"
  },
  {
    "path": "docs/blog/.authors.yml",
    "content": "authors:\n  jxnl:\n    name: Jason Liu\n    description: Creator\n    avatar: https://avatars.githubusercontent.com/u/4852235?v=4\n    url: https://twitter.com/intent/follow?screen_name=jxnlco\n  ivanleomk:\n    name: Ivan Leo\n    description: Contributor\n    avatar: https://pbs.twimg.com/profile_images/1838778744468836353/utYfioiO_400x400.jpg\n    url: https://twitter.com/intent/follow?screen_name=ivanleomk\n  anmol:\n    name: Anmol Jawandha\n    description: Contributor\n    avatar: https://pbs.twimg.com/profile_images/1248544843556466693/PgxUIeBs_400x400.jpg\n  joschkabraun:\n    name: Joschka Braun\n    description: Contributor\n    avatar: https://pbs.twimg.com/profile_images/1601251353531224065/PYpqKsjL_400x400.jpg\n    url: https://joschkabraun.com\n  sarahchieng:\n    name: Sarah Chieng\n    description: Contributor\n    avatar: https://pbs.twimg.com/profile_images/1755455116595834880/Hxh5ceRZ_400x400.jpg\n    url: https://twitter.com/sarahchieng\n  zilto:\n    name: Thierry Jean\n    description: Contributor\n    avatar: https://avatars.githubusercontent.com/u/68975210?v=4\n    url: https://www.linkedin.com/in/thierry-jean/\n  yanomaly:\n    name: Yan\n    description: Contributor\n    avatar: https://avatars.githubusercontent.com/u/87994542?v=4\n"
  },
  {
    "path": "docs/blog/index.md",
    "content": "# Subscribe to our Newsletter for Updates and Tips\n\nIf you want to get updates on new features and tips on how to use Instructor, you can subscribe to our newsletter below to get notified when we publish new content.\n\n<iframe src=\"https://embeds.beehiiv.com/2faf420d-8480-4b6e-8d6f-9c5a105f917a?slim=true\" data-test-id=\"beehiiv-embed\" height=\"52\" frameborder=\"0\" scrolling=\"no\" style=\"margin: 0; border-radius: 0px !important; background-color: transparent;\"></iframe>\n\n## Advanced Topics\n\n1. [Unified Provider Interface in Instructor](posts/announcing-unified-provider-interface.md)\n2. [Instructor Implements llms.txt](posts/llms-txt-adoption.md)\n3. [Query Understanding: Beyond Embeddings](posts/rag-and-beyond.md)\n4. [Achieving GPT-4 Level Summaries with GPT-3.5-turbo](posts/chain-of-density.md)\n5. [Basics of Guardrails and Validation in AI Models](posts/validation-part1.md)\n6. [Validating Citations in AI-Generated Content](posts/citations.md)\n7. [Fine-tuning and Distillation in AI Models](posts/distilation-part1.md)\n8. [Enhancing OpenAI Client Observability with LangSmith](posts/langsmith.md)\n9. [Logfire Integration with Pydantic](posts/logfire.md)\n\n## AI Development and Optimization\n\n- [Effective Function Caching in Python](posts/caching.md)\n- [Fundamentals of Batch Processing with Async in Python](posts/learn-async.md)\n- [Streaming Models to Improve Latency](posts/generator.md)\n- [Using OpenAI's Batch API for Large-Scale Synthetic Data Generation](../examples/batch_job_oai.md)\n- [Implementing Bulk Classification with User-Provided Tags](../examples/bulk_classification.md)\n- [Utilizing GPT-4 Vision API for Ad Copy from Product Images](../examples/image_to_ad_copy.md)\n\n## Language Models and Prompting Techniques\n\n- [Least-to-Most Prompting Technique for LLMs](../prompting/decomposition/least_to_most.md)\n- [Chain of Verification (CoVe) Method for Improving LLM Accuracy](../prompting/self_criticism/chain_of_verification.md)\n- [Cumulative Reasoning to Enhance Model Performance](../prompting/self_criticism/cumulative_reason.md)\n- [Reverse Chain of Thought (RCoT) Method for Logical Consistency](../prompting/self_criticism/reversecot.md)\n\n## Integrations and Tools\n\n- [Ollama Integration](../integrations/ollama.md)\n- [llama-cpp-python Integration](../integrations/llama-cpp-python.md)\n- [Together Compute Integration](../integrations/together.md)\n- [Pandas DataFrame Examples](./posts/tidy-data-from-messy-tables.md#defining-a-custom-type)\n- [Streaming Response Examples](../concepts/partial.md)\n\n## Media and Resources\n\n- [Course: Structured Outputs with Instructor](https://www.wandb.courses/courses/steering-language-models?x=1)\n- [Keynote: Pydantic is All You Need](posts/aisummit-2023.md)\n"
  },
  {
    "path": "docs/blog/posts/aisummit-2023.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2023-11-02\ndescription: Explore insights on utilizing Pydantic for effective prompt engineering\n  in this AI Engineer Summit keynote.\ndraft: false\ntags:\n- Pydantic\n- Prompt Engineering\n- AI Summit\n- Machine Learning\n- Data Validation\n---\n\n# AI Engineer Keynote: Pydantic is all you need\n\n[![Pydantic is all you need](https://img.youtube.com/vi/yj-wSRJwrrc/0.jpg)](https://www.youtube.com/watch?v=yj-wSRJwrrc)\n\n[Click here to watch the full talk](https://www.youtube.com/watch?v=yj-wSRJwrrc)\n\n<!-- more -->\n\nLast month, I ventured back onto the speaking circuit at the inaugural [AI Engineer Summit](https://www.ai.engineer/summit), sharing insights on leveraging [Pydantic](https://docs.pydantic.dev/latest/) for effective prompt engineering. I dove deep into what is covered in our documentation and standard blog posts,\n\nI'd genuinely appreciate any feedback on the talk - every bit helps in refining the art. So, take a moment to check out the [full talk here](https://youtu.be/yj-wSRJwrrc?si=vGMIqtTapbIN8SLz), and let's continue pushing the boundaries of what's possible."
  },
  {
    "path": "docs/blog/posts/announcing-gemini-tool-calling-support.md",
    "content": "---\nauthors:\n- ivanleomk\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2024-09-03\ndescription: Introducing structured outputs for Gemini tool calling support in the\n  instructor library, enhancing interactions with Gemini and VertexAI SDKs.\ndraft: false\ntags:\n- Gemini\n- VertexAI\n- Tool Calling\n- Instructor Library\n- AI SDKs\n---\n\n# Structured Outputs for Gemini now supported\n\nWe're excited to announce that `instructor` now supports structured outputs using tool calling for both the Gemini SDK and the VertexAI SDK.\n\nA special shoutout to [Sonal](https://x.com/sonalsaldanha) for his contributions to the Gemini Tool Calling support.\n\nLet's walk through a simple example of how to use these new features\n\n## Installation\n\nTo get started, install the latest version of `instructor`. Depending on whether you're using Gemini or VertexAI, you should install the following:\n\n=== \"Gemini\"\n\n    ```bash\n    pip install \"instructor[google-generativeai]\"\n    ```\n\n=== \"VertexAI\"\n\n    ```bash\n    pip install \"instructor[vertexai]\"\n    ```\n\nThis ensures that you have the necessary dependencies to use the Gemini or VertexAI SDKs with instructor.\n\nWe recommend using the Gemini SDK over the VertexAI SDK for two main reasons.\n\n1. Compared to the VertexAI SDK, the Gemini SDK comes with a free daily quota of 1.5 billion tokens to use for developers.\n2. The Gemini SDK is significantly easier to setup, all you need is a `GOOGLE_API_KEY` that you can generate in your GCP console. THe VertexAI SDK on the other hand requires a credentials.json file or an OAuth integration to use.\n\n## Getting Started\n\nWith our provider agnostic API, you can use the same interface to interact with both SDKs, the only thing that changes here is how we initialise the client itself.\n\nBefore running the following code, you'll need to make sure that you have your Gemini API Key set in your shell under the alias `GOOGLE_API_KEY`.\n\n```python\nimport instructor\nimport google.generativeai as genai\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n    )\n)\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract Jason is 25 years old.\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(resp)\n#> name='Jason' age=25\n```\n\n1. Current Gemini models that support tool calling are `gemini-3-flash` and `gemini-1.5-pro-latest`.\n\nWe can achieve a similar thing with the VertexAI SDK. For this to work, you'll need to authenticate to VertexAI.\n\nThere are some instructions [here](https://cloud.google.com/vertex-ai/docs/authentication) but the easiest way I found was to simply download the GCloud cli and run `gcloud auth application-default login`.\n\n```python\nimport instructor\nimport vertexai  # type: ignore\nfrom vertexai.generative_models import GenerativeModel  # type: ignore\nfrom pydantic import BaseModel\n\nvertexai.init()\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\", vertexai=True),  # (1)!\n)\n\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract Jason is 25 years old.\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(resp)\n#> name='Jason' age=25\n```\n\n1. Current Gemini models that support tool calling are `gemini-3-flash` and `gemini-1.5-pro-latest`."
  },
  {
    "path": "docs/blog/posts/announcing-instructor-responses-support.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - instructor\ncomments: true\ndate: 2025-05-11\ndescription: Take advantage of OpenAI's latest offerings with the new responses API\ndraft: false\ntags:\n  - LLMs\n  - OpenAI\n  - Instructor\n---\n\n# Announcing Responses API support\n\nWe're excited to announce Instructor's integration with OpenAI's new Responses API. This integration brings a more streamlined approach to working with structured outputs from OpenAI models. Let's see what makes this integration special and how it can improve your LLM applications.\n\n<!-- more -->\n\n## What's New?\n\nThe Responses API represents a significant shift in how we interact with OpenAI models. With Instructor's integration, you can leverage this new API with our familiar, type-safe interface.\n\nFor our full documentation of the features we support, check out our full [OpenAI integration guide](../../integrations/openai.md).\n\nGetting started is now easier than ever. With our unified provider interface, you can initialize your client with a single line of code. This means less time dealing with configuration and more time building features that matter.\n\n```python\nimport instructor\n\n# Initialize the client with Responses mode\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\", mode=instructor.Mode.RESPONSES_TOOLS\n)\n```\n\nThe Responses API brings several improvements to structured data handling. You get access to built-in tools like web search and file search directly through the API. There's more efficient validation of structured outputs and improved error messages with better recovery mechanisms.\n\nHere's a quick example showing how it works:\n\n```python\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nprofile = client.responses.create(\n    input=\"Extract out Ivan is 28 years old\",\n    response_model=User,\n)\n\nprint(profile)\n#> name='Ivan' age=28\n```\n\n## Key Benefits\n\nThe integration maintains Instructor's core strength of type safety while adding the power of the Responses API. You get full Pydantic model validation, automatic type checking, and clear error messages when validation fails. This gives you confidence that your outputs meet the constraints you've defined.\n\nOne of the most exciting features is the built-in tools support. You can now easily perform web searches with automatic citations, search through your knowledge base, and get real-time information with proper attribution. This significantly expands what you can build without having to integrate multiple APIs.\n\nHere's an example using web search:\n\n```python\nclass Citation(BaseModel):\n    id: int\n    url: str\n\n\nclass Summary(BaseModel):\n    citations: list[Citation]\n    summary: str\n\n\nresponse = client.responses.create(\n    input=\"What are some of the best places to visit in New York for Latin American food?\",\n    tools=[{\"type\": \"web_search_preview\"}],\n    response_model=Summary,\n)\n```\n\nThe integration supports multiple ways to get structured outputs. You can use basic creation for simple, straightforward structured outputs. If you need real-time updates, partial creation lets you stream them as they come in. For handling multiple instances of the same object, iterable creation works great. And when you need both structured output and raw completion, completion with raw response gives you exactly that.\n\nFor production applications, we've maintained full async support. This lets you build responsive applications that can handle multiple requests efficiently:\n\n```python\nasync def get_user_profile():\n    async_client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\", mode=instructor.Mode.RESPONSES_TOOLS, async_client=True\n    )\n\n    profile = await async_client.responses.create(\n        input=\"Extract: Maria lives in Spain.\", response_model=UserProfile\n    )\n```\n\n## Why This Matters\n\nThe integration of Instructor with OpenAI's Responses API brings two major benefits that will transform how you work with LLMs.\n\nFirst, it makes working with inline citations significantly easier. When your LLM needs to reference external information, you get structured citation data that's ready to integrate into downstream applications. No more parsing messy text or manually extracting references - they come as properly typed objects that you can immediately use in your code.\n\nSecond, it works seamlessly with your existing chat completions code. You can add powerful capabilities like file search and web search without modifying your codebase. Just add the tool definition, and you're ready to go. Here's how simple it is:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\n\nclass Citation(BaseModel):\n    id: int\n    url: str\n\n\nclass Summary(BaseModel):\n    citations: list[Citation]\n    summary: str\n\n\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\",\n    mode=instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n)\n\nresponse = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"What are some of the best places to visit in New York for Latin American food?\",\n        }\n    ],\n    tools=[{\"type\": \"web_search_preview\"}],\n    response_model=Summary,\n)\nprint(response)\n\"\"\"\ncitations=[Citation(id=1, url='https://www.nycgo.com/restaurants/best-latin-american-restaurants-in-nyc/'), Citation(id=2, url='https://www.timeout.com/newyork/restaurants/best-latin-american-restaurants-in-nyc'), Citation(id=3, url='https://www.thrillist.com/eat/nation/best-latin-american-restaurants-nyc')] summary=\"Some of the best places to visit in New York for Latin American food include neighborhoods and restaurants known for authentic and diverse offerings. In Manhattan, areas like the East Village and Lower East Side have excellent Latin American restaurants. Popular spots include Casa Enrique, known for Mexican cuisine; Tia Pol, offering Spanish and Latin flavors; and La Contenta, serving dishes from various Latin American countries. Brooklyn's Williamsburg and Bushwick have emerged as vibrant spots for Latin American eats, with restaurants such as La Esquina and Fonda not to miss. These places are celebrated for delicious food, lively atmospheres, and cultural authenticity, making them top choices for anyone looking to enjoy Latin American cuisine in New York City.\"\n\"\"\"\n```\n\nThis makes the path forward clear - you can enhance your existing applications with the latest OpenAI features while maintaining the type safety and validation Instructor is known for. No need to learn a new API or refactor your code. It just works.\n\n## Getting Started\n\nTo start using the new Responses API integration, update to the latest version of Instructor, set up your OpenAI API key, initialize your client with the Responses mode, and start creating structured outputs.\n\nThis integration represents a significant step forward in making LLM development more accessible and powerful. We're excited to see what you'll build with these new capabilities.\n\nFor more detailed information about using the Responses API with Instructor, check out our [OpenAI integration guide](../../integrations/openai.md).\n\nHappy coding!\n"
  },
  {
    "path": "docs/blog/posts/announcing-unified-provider-interface.md",
    "content": "---\nauthors:\n  - jxnl\n  - ivanleomk\ncategories:\n  - instructor\ncomments: true\ndate: 2025-05-08\ndescription: Switch between different models and providers with a single string!\ndraft: false\ntags:\n  - LLMs\n  - Instructor\n---\n\nWe are pleased to introduce a significant enhancement to Instructor: the **`from_provider()`** function. While Instructor has always focused on providing robust structured outputs, we've observed that many users work with multiple LLM providers. This often involves repetitive setup for each client.\n\nThe `from_provider()` function aims to simplify this process, making it easier to initialize clients and experiment across different models.\n\nThis new feature offers a streamlined, string-based method to initialize an Instructor-enhanced client for a variety of popular LLM providers.\n\n<!-- more -->\n\n## What is `from_provider()`?\n\nThe `from_provider()` function serves as a smart factory for creating LLM clients. By providing a model string identifier, such as `\"openai/gpt-4o\"` or `\"anthropic/claude-3-opus-20240229\"`, the function handles the necessary setup:\n\n- **Automatic SDK Detection**: It identifies the targeted provider (e.g., OpenAI, Anthropic, Google, Mistral, Cohere).\n- **Client Initialization**: It dynamically imports the required provider-specific SDK and initializes the native client (like `openai.OpenAI()` or `anthropic.Anthropic()`).\n- **Instructor Patching**: It automatically applies the Instructor patch to the client, enabling structured outputs, validation, and retry mechanisms.\n- **Sensible Defaults**: It uses recommended `instructor.Mode` settings for each provider, optimized for performance and capabilities such as tool use or JSON mode, where applicable.\n- **Sync and Async Support**: Users can obtain either a synchronous or an asynchronous client by setting the `async_client=True` flag.\n\n## Key Benefits\n\nThe `from_provider()` function is designed to streamline several common workflows:\n\n- **Model Comparison**: Facilitates quick switching between different models or providers to evaluate performance, cost, or output quality for specific tasks.\n- **Multi-Provider Strategies**: Simplifies the implementation of fallback mechanisms or routing queries to different LLMs based on criteria like complexity or cost, reducing client management overhead.\n- **Rapid Prototyping**: Allows for faster setup when starting with a new provider or model.\n- **Simplified Configuration**: Reduces boilerplate code in projects that integrate with multiple LLM providers.\n\n## How it Works: A Look Under the Hood\n\nInternally, `from_provider()` (located in `instructor/auto_client.py`) parses the model string (e.g., `\"openai/gpt-5-nano\"`) to identify the provider and model name. It then uses conditional logic to import the correct libraries, instantiate the client, and apply the appropriate Instructor patch. For instance, the conceptual handling for an OpenAI client would involve importing the `openai` SDK and `instructor.from_openai`.\n\n```python\n# Conceptual illustration of internal logic for OpenAI:\n# (Actual implementation is in instructor/auto_client.py)\n\n# if provider == \"openai\":\n#     import openai\n#     from instructor import from_openai, Mode\n#\n#     # 'async_client', 'model_name', 'kwargs' are determined by from_provider\n#     native_client = openai.AsyncOpenAI() if async_client else openai.OpenAI()\n#\n#     return from_openai(\n#         native_client,\n#         model=model_name,\n#         mode=Mode.TOOLS,  # Default mode for OpenAI\n#         **kwargs,\n#     )\n```\n\nThe function also manages dependencies by alerting users to install missing packages (e.g., via `uv pip install openai`) if they are not found.\n\n## Example Usage\n\n> Note : Ensure your API keys (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) are configured as environment variables to run this code.\n\nHere's a self-contained example demonstrating how `from_provider()` can be used to retrieve structured output from google gemini's flash-2.0 model.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Iterable\n\n\n# Define your data structure\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\n# Connect to any provider with a single line\nclient = instructor.from_provider(\"google/gemini-2.0-flash\")\n\n# Extract structured data\nresponse = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Alice is 30 and Bob is 25.\",\n        }\n    ],\n    response_model=Iterable[Person],\n)\n\nfor person in response:\n    print(f\"Name: {person.name}, Age: {person.age}\")\n    #> Name: Alice, Age: 30\n    #> Name: Bob, Age: 25\n# Output:\n# Name: Alice, Age: 30\n# Name: Bob, Age: 25\n```\n\nSwitching providers is as simple as changing the string:\n\n```python\n# OpenAI\nclient = instructor.from_provider(\"openai/gpt-4.1\")\n\n# Anthropic (with version date)\nclient = instructor.from_provider(\"anthropic/claude-3-5-haiku-20241022\")\n```\n\nWith the unified provider interface, you can now easily benchmark different models on the same task. This is crucial when you need to:\n\n1. Compare response quality across different providers\n2. Test which model gives the best structured extraction results\n3. Optimize for speed vs. accuracy tradeoffs\n4. Run A/B tests between providers without code refactoring\n\nInstead of maintaining separate codebases for each provider or complex switching logic, you can focus on what matters: finding the optimal model for your specific use case.\n\n### Async Support\n\nWhen building production applications that need to remain responsive, asynchronous processing is essential.\n\nInstructor's unified provider interface supports this workflow with a simple `async_client` keyword during initialization.\n\n```python\nclient = instructor.from_provider(\"openai/gpt-4.1\", async_client=True)\n```\n\nThe async implementation works particularly well for web servers, batch processing jobs, or any scenario where you need to extract structured data without blocking your application's main thread.\n\nHere's how you can implement it:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n\nclass UserProfile(BaseModel):\n    name: str\n    country: str\n\n\nasync def get_user_profile():\n    # Initialise an asynchronous client\n    async_client = instructor.from_provider(\"openai/gpt-4.1-mini\", async_client=True)\n\n    # Extract data asynchronously\n    profile = await async_client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Maria lives in Spain.\"}],\n        response_model=UserProfile,\n    )\n    print(f\"Name: {profile.name}, Country: {profile.country}\")\n    #> Name: Maria, Country: Spain\n\n\nif __name__ == \"__main__\":\n    asyncio.run(get_user_profile())\n```\n\n### Provider Specific Parameters\n\nSome providers require additional parameters for optimal performance.\n\nRather than hiding these options, Instructor allows you to pass them directly through the from_provider function:\n\n```python\n# Anthropic requires max tokens\nclient = instructor.from_provider(\"anthropic/claude-3-sonnet-20240229\", max_tokens=1024)\n```\n\nIf you'd like to change this parameter down the line, you can just do so by setting it on the `client.chat.completions.create` function again.\n\n### Type Completion\n\nTo make it easy for you to find the right model string, we now ship with auto-complete for these new model-provider initialisation strings.\n\nThis is automatically provided for you out of the box when you use the new `from_provider` method as seen below.\n\n![](./img/instructor-autocomplete.png)\n\nSay bye to fiddling around with messy model versioning and get cracking to working on your business logic instead!\n\n## Path Forward\n\nThe `from_provider()` function offers a convenient method for client initialization. Instructor remains a lightweight wrapper around your chosen LLM provider's client, and users always retain the flexibility to initialize and patch clients manually for more granular control or when using providers not yet covered by this utility.\n\nThis unified interface is intended to balance ease of use for common tasks with the underlying flexibility of Instructor, aiming to make multi-provider LLM development more accessible and efficient. However, there is still much to do to further streamline multi-provider workflows. Future efforts could focus on:\n\n- **Unified Prompt Caching API**: While Instructor supports prompt caching for providers like [Anthropic](../../integrations/anthropic.md#caching) (see also our [blog post on Anthropic prompt caching](../posts/anthropic-prompt-caching.md) and the general [Prompt Caching concepts](../../concepts/prompt_caching.md)), a more standardized, cross-provider API for managing cache behavior could significantly simplify optimizing costs and latency.\n- **Unified Multimodal Object Handling**: Instructor already provides a robust way to work with [multimodal inputs like Images, Audio, and PDFs](../../concepts/multimodal.md) across different providers. However, a higher-level unified API could further abstract provider-specific nuances for these types, making it even simpler to build applications that seamlessly switch between, for example, OpenAI's vision capabilities and Anthropic's, without changing how media objects are passed.\n\nThese are areas where `instructor` can continue to reduce friction for developers working in an increasingly diverse LLM ecosystem.\n\nWe encourage you to try `from_provider()` in your projects, particularly when experimenting with multiple LLMs. Feedback and suggestions for additional providers or features are always welcome.\n\n## Related Documentation\n- [Provider Patching](../../concepts/patching.md) - How provider integration works\n- [All Integrations](../../integrations/index.md) - Supported provider list\n\n## See Also\n\n- [String-Based Initialization](string-based-init.md) - Alternative init method\n- [Framework Comparison](best_framework.md) - Multi-provider advantages\n- [Getting Started](introduction.md) - Quick start guide\n"
  },
  {
    "path": "docs/blog/posts/anthropic-prompt-caching.md",
    "content": "---\nauthors:\n- ivanleomk\ncategories:\n- Anthropic\ncomments: true\ndate: 2024-09-14\ndescription: Discover how prompt caching with Anthropic can improve response times\n  and reduce costs for large context applications.\ndraft: false\ntags:\n- prompt caching\n- Anthropic\n- API optimization\n- cost reduction\n- latency improvement\n---\n\n# Why should I use prompt caching?\n\nDevelopers often face two key challenges when working with large context - Slow response times and high costs. This is especially true when we're making multiple of these calls over time, severely impacting the cost and latency of our applications. With Anthropic's new prompt caching feature, we can easily solve both of these issues.\n\nSince the new feature is still in beta, we're going to wait for it to be generally available before we integrate it into instructor. In the meantime, we've put together a quickstart guide on how to use the feature in your own applications.\n\n<!-- more -->\n\n!!! warning \"Caching Limitations\"\n\n    There are a few important limitations to be aware of when using prompt caching:\n\n    - **Minimum cache size**: For Claude Haiku, your cached content needs to be a minimum of 2048 tokens. For Claude Sonnet, the minimum is 1024 tokens.\n\n    - **Tool definitions**: Currently, tool definitions cannot be cached. However, support for caching tool definitions is planned for a future update.\n\n    - **Upgrade Anthropic**: You must upgrade to Anthropic version `0.34.0` or later to use prompt caching. Make sure that you're using the latest version of the Anthropic SDK.\n\n    Keep these limitations in mind when implementing prompt caching in your applications.\n\n??? note \"Source Text\"\n\n    In the following example, we'll be using a short excerpt from the novel \"Pride and Prejudice\" by Jane Austen. This text serves as an example of a substantial context that might typically lead to slow response times and high costs when working with language models. You can download it manually [here](https://www.gutenberg.org/cache/epub/1342/pg1342.txt)\n\n    ```\n        _Walt Whitman has somewhere a fine and just distinction between “loving\n    by allowance” and “loving with personal love.” This distinction applies\n    to books as well as to men and women; and in the case of the not very\n    numerous authors who are the objects of the personal affection, it\n    brings a curious consequence with it. There is much more difference as\n    to their best work than in the case of those others who are loved “by\n    allowance” by convention, and because it is felt to be the right and\n    proper thing to love them. And in the sect--fairly large and yet\n    unusually choice--of Austenians or Janites, there would probably be\n    found partisans of the claim to primacy of almost every one of the\n    novels. To some the delightful freshness and humour of_ Northanger\n    Abbey, _its completeness, finish, and_ entrain, _obscure the undoubted\n    critical facts that its scale is small, and its scheme, after all, that\n    of burlesque or parody, a kind in which the first rank is reached with\n    difficulty._ Persuasion, _relatively faint in tone, and not enthralling\n    in interest, has devotees who exalt above all the others its exquisite\n    delicacy and keeping. The catastrophe of_ Mansfield Park _is admittedly\n    theatrical, the hero and heroine are insipid, and the author has almost\n    wickedly destroyed all romantic interest by expressly admitting that\n    Edmund only took Fanny because Mary shocked him, and that Fanny might\n    very likely have taken Crawford if he had been a little more assiduous;\n    yet the matchless rehearsal-scenes and the characters of Mrs. Norris and\n    others have secured, I believe, a considerable party for it._ Sense and\n    Sensibility _has perhaps the fewest out-and-out admirers; but it does\n    not want them._\n    _I suppose, however, that the majority of at least competent votes\n    would, all things considered, be divided between_ Emma _and the present\n    book; and perhaps the vulgar verdict (if indeed a fondness for Miss\n    Austen be not of itself a patent of exemption from any possible charge\n    of vulgarity) would go for_ Emma. _It is the larger, the more varied, the\n    more popular; the author had by the time of its composition seen rather\n    more of the world, and had improved her general, though not her most\n    peculiar and characteristic dialogue; such figures as Miss Bates, as the\n    Eltons, cannot but unite the suffrages of everybody. On the other hand,\n    I, for my part, declare for_ Pride and Prejudice _unhesitatingly. It\n    seems to me the most perfect, the most characteristic, the most\n    eminently quintessential of its author’s works; and for this contention\n    in such narrow space as is permitted to me, I propose here to show\n    cause._\n    _In the first place, the book (it may be barely necessary to remind the\n    reader) was in its first shape written very early, somewhere about 1796,\n    when Miss Austen was barely twenty-one; though it was revised and\n    finished at Chawton some fifteen years later, and was not published till\n    1813, only four years before her death. I do not know whether, in this\n    combination of the fresh and vigorous projection of youth, and the\n    critical revision of middle life, there may be traced the distinct\n    superiority in point of construction, which, as it seems to me, it\n    possesses over all the others. The plot, though not elaborate, is almost\n    regular enough for Fielding; hardly a character, hardly an incident\n    could be retrenched without loss to the story. The elopement of Lydia\n    and Wickham is not, like that of Crawford and Mrs. Rushworth, a_ coup de\n    théâtre; _it connects itself in the strictest way with the course of the\n    story earlier, and brings about the denouement with complete propriety.\n    All the minor passages--the loves of Jane and Bingley, the advent of Mr.\n    Collins, the visit to Hunsford, the Derbyshire tour--fit in after the\n    same unostentatious, but masterly fashion. There is no attempt at the\n    hide-and-seek, in-and-out business, which in the transactions between\n    Frank Churchill and Jane Fairfax contributes no doubt a good deal to the\n    intrigue of_ Emma, _but contributes it in a fashion which I do not think\n    the best feature of that otherwise admirable book. Although Miss Austen\n    always liked something of the misunderstanding kind, which afforded her\n    opportunities for the display of the peculiar and incomparable talent to\n    be noticed presently, she has been satisfied here with the perfectly\n    natural occasions provided by the false account of Darcy’s conduct given\n    by Wickham, and by the awkwardness (arising with equal naturalness) from\n    the gradual transformation of Elizabeth’s own feelings from positive\n    aversion to actual love. I do not know whether the all-grasping hand of\n    the playwright has ever been laid upon_ Pride and Prejudice; _and I dare\n    say that, if it were, the situations would prove not startling or\n    garish enough for the footlights, the character-scheme too subtle and\n    delicate for pit and gallery. But if the attempt were made, it would\n    certainly not be hampered by any of those loosenesses of construction,\n    which, sometimes disguised by the conveniences of which the novelist can\n    avail himself, appear at once on the stage._\n    _I think, however, though the thought will doubtless seem heretical to\n    more than one school of critics, that construction is not the highest\n    merit, the choicest gift, of the novelist. It sets off his other gifts\n    and graces most advantageously to the critical eye; and the want of it\n    will sometimes mar those graces--appreciably, though not quite\n    consciously--to eyes by no means ultra-critical. But a very badly-built\n    novel which excelled in pathetic or humorous character, or which\n    displayed consummate command of dialogue--perhaps the rarest of all\n    faculties--would be an infinitely better thing than a faultless plot\n    acted and told by puppets with pebbles in their mouths. And despite the\n    ability which Miss Austen has shown in working out the story, I for one\n    should put_ Pride and Prejudice _far lower if it did not contain what\n    seem to me the very masterpieces of Miss Austen’s humour and of her\n    faculty of character-creation--masterpieces who may indeed admit John\n    Thorpe, the Eltons, Mrs. Norris, and one or two others to their company,\n    but who, in one instance certainly, and perhaps in others, are still\n    superior to them._\n    _The characteristics of Miss Austen’s humour are so subtle and delicate\n    that they are, perhaps, at all times easier to apprehend than to\n    express, and at any particular time likely to be differently\n    apprehended by different persons. To me this humour seems to possess a\n    greater affinity, on the whole, to that of Addison than to any other of\n    the numerous species of this great British genus. The differences of\n    scheme, of time, of subject, of literary convention, are, of course,\n    obvious enough; the difference of sex does not, perhaps, count for much,\n    for there was a distinctly feminine element in “Mr. Spectator,” and in\n    Jane Austen’s genius there was, though nothing mannish, much that was\n    masculine. But the likeness of quality consists in a great number of\n    common subdivisions of quality--demureness, extreme minuteness of touch,\n    avoidance of loud tones and glaring effects. Also there is in both a\n    certain not inhuman or unamiable cruelty. It is the custom with those\n    who judge grossly to contrast the good nature of Addison with the\n    savagery of Swift, the mildness of Miss Austen with the boisterousness\n    of Fielding and Smollett, even with the ferocious practical jokes that\n    her immediate predecessor, Miss Burney, allowed without very much\n    protest. Yet, both in Mr. Addison and in Miss Austen there is, though a\n    restrained and well-mannered, an insatiable and ruthless delight in\n    roasting and cutting up a fool. A man in the early eighteenth century,\n    of course, could push this taste further than a lady in the early\n    nineteenth; and no doubt Miss Austen’s principles, as well as her heart,\n    would have shrunk from such things as the letter from the unfortunate\n    husband in the_ Spectator, _who describes, with all the gusto and all the\n    innocence in the world, how his wife and his friend induce him to play\n    at blind-man’s-buff. But another_ Spectator _letter--that of the damsel\n    of fourteen who wishes to marry Mr. Shapely, and assures her selected\n    Mentor that “he admires your_ Spectators _mightily”--might have been\n    written by a rather more ladylike and intelligent Lydia Bennet in the\n    days of Lydia’s great-grandmother; while, on the other hand, some (I\n    think unreasonably) have found “cynicism” in touches of Miss Austen’s\n    own, such as her satire of Mrs. Musgrove’s self-deceiving regrets over\n    her son. But this word “cynical” is one of the most misused in the\n    English language, especially when, by a glaring and gratuitous\n    falsification of its original sense, it is applied, not to rough and\n    snarling invective, but to gentle and oblique satire. If cynicism means\n    the perception of “the other side,” the sense of “the accepted hells\n    beneath,” the consciousness that motives are nearly always mixed, and\n    that to seem is not identical with to be--if this be cynicism, then\n    every man and woman who is not a fool, who does not care to live in a\n    fool’s paradise, who has knowledge of nature and the world and life, is\n    a cynic. And in that sense Miss Austen certainly was one. She may even\n    have been one in the further sense that, like her own Mr. Bennet, she\n    took an epicurean delight in dissecting, in displaying, in setting at\n    work her fools and her mean persons. I think she did take this delight,\n    and I do not think at all the worse of her for it as a woman, while she\n    was immensely the better for it as an artist.\n    ```\n\nLet's first initialize our Anthropic client, this will be the same as what we've done before except we're now using the new `beta.prompt_caching` method.\n\n```python\nfrom instructor import Instructor, Mode, patch\nfrom anthropic import Anthropic\n\n\nclient = Instructor(\n    client=Anthropic(),\n    create=patch(\n        create=Anthropic().beta.prompt_caching.messages.create,\n        mode=Mode.TOOLS,\n    ),\n    mode=Mode.TOOLS,\n)\n```\n\nWe'll then create a new `Character` class that will be used to extract out a single character from the text and read in our source text ( roughly 2856 tokens using the Anthropic tokenizer).\n\n```python\nwith open(\"./book.txt\") as f:\n    book = f.read()\n\n\nclass Character(BaseModel):\n    name: str\n    description: str\n```\n\nOnce we've done this, we can then make an api call to get the description of the character.\n\n```python\nfor _ in range(2):\n    resp, completion = client.create_with_completion(  # (1)!\n        model=\"claude-3-haiku-20240307\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"<book>\" + book + \"</book>\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},  # (2)!\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract a character from the text given above\",\n                    },\n                ],\n            },\n        ],\n        response_model=Character,\n        max_tokens=1000,\n    )\n    assert isinstance(resp, Character)\n\n    print(completion.usage)  # (3)!\n    print(resp)\n```\n\n1. Using the `create_with_completion` method we can get back both the structured response and the completion object\n2. We set the `cache_control` parameter to \"ephemeral\" to tell Anthropic to cache the book content temporarily\n3. We print out the usage information to monitor token consumption\n\nYou'll notice that the usage information is different than what we've seen before. This is because we're now using the `create_with_completion` method which returns both the structured response and the completion object. The completion object contains usage information which we can use to monitor token consumption.\n\nWhen we run this, you'll notice that we get the following output.\n\n```bash\nPromptCachingBetaUsage(\n    cache_creation_input_tokens=2856,\n    cache_read_input_tokens=0,\n    input_tokens=30,\n    output_tokens=119\n)\n\nCharacter(\n    name='Elizabeth Bennet',\n    description=\"The protagonist of Jane Austen's novel Pride and Prejudice, who\nundergoes a transformation from initially disliking Mr. Darcy to eventually falling\nin love with him. The passage describes Elizabeth as a complex, nuanced character,\nnoting how her feelings towards Darcy evolve naturally over the course of the story.\"\n)\n\nPromptCachingBetaUsage(\n    cache_creation_input_tokens=0,\n    cache_read_input_tokens=2856,\n    input_tokens=30,\n    output_tokens=93\n)\n\nCharacter(\n    name='Mrs. Norris',\n    description='A character from Jane Austen\\'s novel Mansfield Park, described as\nhaving \"matchless\" scenes and being one of the characters that has secured a\nconsiderable party of admirers for the novel.'\n)\n```\n\nYou'll notice that in the first request, we created `2856` tokens and in the second request, we read `2856` tokens.\n\nIn other words, `book_content` was cached after the first request and reused in the second request. When you have a larger context window, this can save you a significant amount of money and time because your requests will return a lot faster too.\n\nThis is the entire code for the example above.\n\n```python\nfrom instructor import Instructor, Mode, patch\nfrom anthropic import Anthropic\nfrom pydantic import BaseModel\n\nclient = Instructor(\n    client=Anthropic(),\n    create=patch(\n        create=Anthropic().beta.prompt_caching.messages.create,\n        mode=Mode.TOOLS,\n    ),\n    mode=Mode.TOOLS,\n)\n\n\nclass Character(BaseModel):\n    name: str\n    description: str\n\n\nwith open(\"./book.txt\") as f:\n    book = f.read()\n\nfor _ in range(2):\n    resp, completion = client.create_with_completion(\n        model=\"claude-3-haiku-20240307\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"<book>\" + book + \"</book>\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract a character from the text given above\",\n                    },\n                ],\n            },\n        ],\n        response_model=Character,\n        max_tokens=1000,\n    )\n    assert isinstance(resp, Character)\n    print(completion.usage)\n    print(resp)\n```\n\n## Related Documentation\n- [Caching Strategies](../../concepts/caching.md) - General caching concepts\n- [Anthropic Integration](../../integrations/anthropic.md) - Full Anthropic guide\n\n## See Also\n- [Anthropic Structured Outputs](structured-output-anthropic.md) - Use with caching\n- [Response Caching](caching.md) - General caching strategies\n- [Performance Monitoring](logfire.md) - Track cache performance"
  },
  {
    "path": "docs/blog/posts/anthropic-web-search-structured.md",
    "content": "---\ndate: 2025-05-07\nauthors:\n  - jxnl\ncategories:\n  - tutorials\n  - anthropic\n  - structured-data\n---\n\n# Using Anthropic's Web Search with Instructor for Real-Time Data\n\nAnthropic's new web search tool, when combined with Instructor, provides a powerful way to get real-time, structured data from the web. This allows you to build applications that can answer questions and provide information that is up-to-date, going beyond the knowledge cut-off of large language models.\n\nIn this post, we'll explore how to use the `web_search` tool with Instructor to fetch the latest information and structure it into a Pydantic model. Even a simple structure can be very effective for clarity and further processing.\n\n<!-- more -->\n\n## How it Works\n\nThe web search tool enables Claude models to perform web searches during a generation. When you provide the `web_search` tool in your API request, Claude can decide to use it if the prompt requires information it doesn't have. The API then executes the search, provides the results back to Claude, and Claude can then use this information to generate a response. Importantly, Claude will cite its sources from the search results. You can find more details in the [official Anthropic documentation](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool).\n\nInstructor simplifies this process by allowing you to define a Pydantic model for the desired output structure. When Claude uses the web search tool and formulates an answer, Instructor ensures that the final output conforms to your defined schema.\n\n## Example: Getting the Latest UFC Results\n\nLet's look at a practical example. We want to get the latest UFC fight results.\n\nFirst, ensure you have `instructor` and `anthropic` installed:\n\n```bash\nuv add instructor anthropic\n```\n\nNow, let's define our Pydantic model for the response:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Noticed thhat we use JSON not TOOLS mode\nclient = instructor.from_provider(\n    \"anthropic/claude-3-7-sonnet-latest\",\n    mode=instructor.Mode.JSON,\n    async_client=False,\n)\n\n\nclass Citation(BaseModel):\n    id: int\n    url: str\n\n\nclass Response(BaseModel):\n    citations: list[Citation]\n    response: str\n```\n\nThis Response model is straightforward. It gets the model to first generate a list of citations for articles that it referenced before generating it's answer.\n\nThis helps to ground its response in the sources it retrieved and provide a higher quality response.\n\nNow, we can make the API call:\n\n```python\nresponse_data, completion_details = client.messages.create_with_completion(\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a helpful assistant that summarizes news articles. Your final response should be only contain a single JSON object returned in your final message to the user. Make sure to provide the exact ids for the citations that support the information you provide in the form of inline citations as [1] [2] [3] which correspond to a unique id you generate for a url that you find in the web search tool which is relevant to your final response.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": \"What are the latest results for the UFC and who won? Answer this in a concise response that's under 3 sentences.\",\n        },\n    ],\n    tools=[{\"type\": \"web_search_20250305\", \"name\": \"web_search\", \"max_uses\": 3}],\n    response_model=Response,\n)\n\nprint(\"Response:\")\nprint(response_data.response)\nprint(\"\\nCitations:\")\nfor citation in response_data.citations:\n    print(f\"{citation.id}: {citation.url}\")\n```\n\nThis approach provides a clean way to get the LLM's answer into a defined Pydantic object. The `examples/anthropic-web-tool/run.py` script reflects this implementation.\n\nExpected output (will vary based on real-time web search data):\n\n```\nResponse:\nThe latest UFC event was UFC Fight Night: Sandhagen vs Figueiredo held on May 3, 2025, in Des Moines, Iowa. Cory Sandhagen defeated former champion Deiveson Figueiredo by TKO (knee injury) in the main event, while Reinier de Ridder upset previously undefeated prospect Bo Nickal by TKO in the co-main event [1][2]. The next major UFC event is UFC 315 on May 10, featuring a welterweight championship bout between Belal Muhammad and Jack Della Maddalena [3].\n\nCitations:\n1: https://www.ufc.com/news/main-card-results-highlights-winner-interviews-ufc-fight-night-sandhagen-vs-figueiredo-wells-fargo-arena-des-moines\n2: https://www.mmamania.com/2025/5/4/24423285/ufc-des-moines-results-sooo-about-last-night-sandhagen-vs-figueiredo-espn-mma-bo-nickal\n3: https://en.wikipedia.org/wiki/UFC_315\n```\n\n## Key Benefits\n\n- **Real-Time Information**: Access the latest data directly from the web.\n- **Structured Output**: Even with a simple model, Instructor ensures the output is a Pydantic object, making it easy to work with programmatically.\n- **Source Citations**: Claude automatically cites sources, allowing for verification (details in the API response, not shown in this simplified example).\n- **Reduced Hallucinations**: By relying on web search for factual, up-to-the-minute data, the likelihood of the LLM providing incorrect or outdated information is reduced.\n\n## Configuring the Web Search Tool\n\nAnthropic provides several options to configure the web search tool:\n\n- `max_uses`: Limit the number of searches Claude can perform in a single request.\n- `allowed_domains`: Restrict searches to a list of specific domains.\n- `blocked_domains`: Prevent searches on certain domains.\n- `user_location`: Localize search results by providing an approximate location (city, region, country, timezone).\n\nFor example, to limit searches to 3 and only allow results from `espn.com` and `ufc.com`:\n\n```python\n    tools = (\n        [\n            {\n                \"type\": \"web_search_20250305\",\n                \"name\": \"web_search\",\n                \"max_uses\": 3,\n                \"allowed_domains\": [\"espn.com\", \"ufc.com\"],\n            }\n        ],\n    )\n```\n\nYou cannot use `allowed_domains` and `blocked_domains` in the same request.\n\n## Conclusion\n\nCombining Anthropic's web search tool with Instructor's structured data capabilities opens up exciting possibilities for building dynamic, information-rich applications. Whether you're tracking sports scores, news updates, or market trends, this powerful duo can help you access and organize real-time web data effectively, even with simple Pydantic models.\n\nCheck out the example code in `examples/anthropic-web-tool/run.py` to see this implementation, and refer to the [Anthropic web search documentation](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool) for more in-depth information on the tool's capabilities.\n"
  },
  {
    "path": "docs/blog/posts/anthropic.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Anthropic\ncomments: true\ndate: 2024-03-20\ndescription: Learn how to integrate Anthropic's powerful language models into your projects using Instructor, with step-by-step guidance on installation, client setup, and creating structured outputs with Pydantic models.\ndraft: false\ntags:\n- Anthropic\n- API Development\n- Pydantic\n- Python\n- LLM Techniques\n---\n\n# Structured Outputs with Anthropic\n\nA special shoutout to [Shreya](https://twitter.com/shreyaw_) for her contributions to the anthropic support. As of now, all features are operational with the exception of streaming support.\n\nFor those eager to experiment, simply patch the client with `ANTHROPIC_JSON`, which will enable you to leverage the `anthropic` client for making requests.\n\n```\npip install instructor[anthropic]\n```\n\n!!! warning \"Missing Features\"\n\n    Just want to acknowledge that we know that we are missing partial streaming and some better re-asking support for XML. We are working on it and will have it soon.\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\nimport anthropic\nimport instructor\n\n# Patching the Anthropics client with the instructor for enhanced capabilities\nanthropic_client = instructor.from_openai(\n    create=anthropic.Anthropic().messages.create,\n    mode=instructor.Mode.JSON\n)\n\nclass Properties(BaseModel):\n    name: str\n    value: str\n\nclass User(BaseModel):\n    name: str\n    age: int\n    properties: List[Properties]\n\nuser_response = anthropic_client(\n    model=\"claude-3-haiku-20240307\",\n    max_tokens=1024,\n    max_retries=0,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Create a user for a model with a name, age, and properties.\",\n        }\n    ],\n    response_model=User,\n)  # type: ignore\n\nprint(user_response.model_dump_json(indent=2))\n\"\"\"\n{\n    \"name\": \"John\",\n    \"age\": 25,\n    \"properties\": [\n        {\n            \"key\": \"favorite_color\",\n            \"value\": \"blue\"\n        }\n    ]\n}\n```\n\nWe're encountering challenges with deeply nested types and eagerly invite the community to test, provide feedback, and suggest necessary improvements as we enhance the anthropic client's support."
  },
  {
    "path": "docs/blog/posts/bad-schemas-could-break-llms.md",
    "content": "---\nauthors:\n- ivanleomk\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2024-09-26\ndescription: Discover how response models impact LLM performance, focusing on structured\n  outputs for optimal results in GPT-4o and Claude models.\ndraft: false\ntags:\n- LLM Performance\n- Response Models\n- Structured Outputs\n- GPT-4o\n- Claude Models\n---\n\n# Bad Schemas could break your LLM Structured Outputs\n\nYou might be leaving up to 60% performance gains on the table with the wrong response model. Response Models impact model performance massively with Claude and GPT-4o, irregardless of you’re using JSON mode or Tool Calling.\n\nUsing the right response model can help ensure [your models respond in the right language](../posts/matching-language.md) or prevent [hallucinations when extracting video timestamps](../posts/timestamp.md).\n\nWe decided to investigate this by benchmarking Claude and GPT-4o on the GSM8k dataset and found that\n\n1. **Field Naming drastically impacts performance** - Changing a single field name from `final_choice` to `answer` improved model accuracy from 4.5% to 95%. The way we structure and name fields in our response models can fundamentally alter how the model interprets and responds to queries.\n2. **Chain Of Thought significantly boosts performance** - Adding a `reasoning` field increased model accuracy by 60% on the GSM8k dataset. Models perform significantly better when they explain their logic step-by-step.\n3. **Be careful with JSON mode** - JSON mode exhibited 50% more performance variation than Tool Calling when renaming fields. Different response models showed varying levels of performance between JSON mode and Tool Calling, indicating that JSON mode requires more careful optimisation.\n\n<!-- more -->\n\nWe’ll do so in the following steps\n\n1. We’ll first talk about the GSM8k dataset and how we’re using it for benchmarking\n2. Then we’ll cover some of the results we obtained and talk about some of the key takeaways that we discovered\n3. Lastly, we’ll provide some tips to optimise your model’s response format that you can apply today\n\n## Dataset\n\nWe used OpenAI's GSM8k dataset to benchmark model performance. This dataset challenges LLM models to solve simple math problems that involve multiple steps of reasoning. Here's an example:\n\n> Natalia sold clips to 48 friends in April, and half as many in May. How many clips did Natalia sell in total?\"\n\nThe original dataset includes reasoning steps and the final answer. We stripped it down to bare essentials: question, answer, and separated reasoning. To do so, we used this code to process the data:\n\n```python\nfrom datasets import load_dataset, Dataset, DatasetDict\n\nsplits = [\"test\", \"train\"]\n\n\ndef generate_gsm8k(split):\n    ds = load_dataset(\"gsm8k\", \"main\", split=split, streaming=True)\n    for row in ds:\n        reasoning, answer = row[\"answer\"].split(\"####\")\n        answer = int(answer.strip().replace(\",\", \"\"))\n        yield {\n            \"question\": row[\"question\"],\n            \"answer\": answer,\n            \"reasoning\": reasoning,\n        }\n\n\n# Create the dataset for train and test splits\ntrain_dataset = Dataset.from_generator(lambda: generate_gsm8k(\"train\"))\ntest_dataset = Dataset.from_generator(lambda: generate_gsm8k(\"test\"))\n\n# Combine them into a DatasetDict\ndataset = DatasetDict({\"train\": train_dataset, \"test\": test_dataset})\n\ndataset.push_to_hub(\"567-labs/gsm8k\")\n```\n\nThis allows us to test how changes in the response format, response model and even the chosen model itself would affect reasoning ability of the model.\n\nUsing this new dataset, we then tested the Claude and GPT-4o models with a variety of different response models and response modes such as JSON Mode and Tool Calling. The final results were fascinating - highlighting the importance of a good response model in squeezing out the maximum performance from your chosen model.\n\n## Benchmarks\n\nWe had two key questions on hand that we wanted to answer\n\n1. How does Structured Extraction impact model performance as compared to other response modes such as JSON mode.\n2. What was the impact of different response models on model performance?\n\nTo answer these questions, we sampled the first 200 questions from the GSM8k dataset and tested different permutations of response modes and response models.\n\nWe conducted our experiment in two parts\n\n1. **Modes and Models** : We first started by exploring how different combinations of response modes and models might impact performance on the GSM8k\n2. **Response Models :** We then looked at how different response models with varying levels of complexity might impact the performance of each model\n\nLet’s explore each portion in greater detail.\n\n### Modes and Models\n\nBy the end of these experiments, we had the following takeaways\n\n1. **Claude Models excel at complex tasks** : Claude models see significantly greater improvement with few shot improvements as compared to the GPT-4o variants. This means that for complex tasks with specific nuanced output formats or instructions, Claude models will benefit more from few-shot examples\n\n2. **Structured Extraction doesn’t lose out** : While we see a 1-2% in performance with JSON mode relative to function calling, working with JSON mode is tricky when response models get complicated. Working with smaller models such as Haiku in JSON mode often required parsing out control characters and increasing the number of re-asks. This was in contrast to the consistent performance of structured extraction that returned a consistent schema.\n\n3. **4o Mini should be used carefully** : We found that 4o-mini had much less steerability as compared to Claude models, with few-shot examples something resulting in worse performance.\n\nIt’s important here to note that the few shot examples mentioned here only made a difference when the reasoning behind the answer was provided. Without this reasoning example, there wasn’t the same performance improvement observed.\n\nHere were our results for the Claude Family of models\n\n| Model             | Anthropic JSON Mode | JSON w 5 Few Shot | Anthropic Tools | Tools w 5 few shot | Tools w 10 few shot | Benchmarks |\n| ----------------- | ------------------- | ----------------- | --------------- | ------------------ | ------------------- | ---------- |\n| claude-3.5-sonnet | 97.00               | 98.5              | 96.00           | 98.00%             | 98%                 | 96.4       |\n| claude-3-haiku    | 87.50%              | 89%               | 87.44%          | 90.5%              | 90.5%               | 88.9       |\n| claude-3-sonnet   | 94.50%              | 91.5              | 91.00%          | 96.50%             | 91.5%               | 92.3       |\n| claude-3-opus     | 96.50%              | 98.50%            | 96.50%          | 97.00%             | 97.00%              | 95         |\n\nHere were our results for `4o-mini`\n\n| model                         | gpt-4o-mini | gpt-4o |\n| ----------------------------- | ----------- | ------ |\n| Structured Outputs            | 95.5        | 91.5%  |\n| Structured Outputs 5 Few-Shot | 94.5        | 94.5%  |\n| Tool Calling                  | 93.5        | 93.5%  |\n| Tool Calling 5 Few Shot       | 93.0        | 95%    |\n| Json Mode                     | 94.5        | 95.5   |\n| Json Mode 5 Few Shot          | 95.0        | 97%    |\n\nIt’s clear here that Claude models consistently show significant improvement with few-shot examples compared to GPT-4o variants. This is in contrast to `4o-mini` which actually showed a decreased in performance for tool calling when provided with simple examples.\n\n### Response Models\n\nWith these new results, we then proceeded to examine how response models might impact the performance of our models when it came to function calling. While doing so, we had the following takeaways.\n\n1. **Chain Of Thought** : Chain Of Thought is incredibly important and can boost model performance on the GSM8k by as much as 60% from our benchmarks\n2. **JSON mode is much more sensitive than Tool Calling** : In our initial benchmarks, we found that simple changes in the response model such as additional parameters could impact performance by as much as 30% - something which Tool Calling didn’t suffer from.\n3. **Naming matters a lot** : The naming of a response parameter is incredibly important. Just going from `potential_final_choice` and `final_choice` to `potential_answers` and `final_answer` improved our final accuracy from 4.5% to 95%.\n\n#### Chain Of Thought\n\nIt’s difficult to understate the importance of allowing the model to reason and plan before generating a final response.\n\nIn our initial tests , we used the following two models\n\n```python\nclass Answer(BaseModel):\n    chain_of_thought: str\n    answer: int\n\n\nclass OnlyAnswer(BaseModel):\n    answer: int\n```\n\n| Model      | JSON Mode | Tool Calling |\n| ---------- | --------- | ------------ |\n| Answer     | 92%       | 94%          |\n| OnlyAnswer | 33%       | 33.5%        |\n\nThese models were tested using the **exact same prompt and questions**. The only thing that differed between them was the addition of a `chain_of_thought` response parameter to allow the model to reason effectively.\n\nWe’re not confined to this specific naming convention of `chain_of_thought`, although it does work consistently well. We can show that when we look at the results we obtained when we tested the following response models.\n\nIn order to verify this, we took a random sample of 50 questions from the test dataset and looked at the performance of different response models that implemented similar reasoning fields on the GSM8k.\n\nOur conclusion? Simply adding additional fields for the model to reason about its final response improves reasoning all around.\n\n```python\nclass AssumptionBasedAnswer(BaseModel):\n    assumptions: list[str]\n    logic_flow: str\n    answer: int\n\nclass ErrorAwareCalculation(BaseModel):\n    key_steps: list[str]\n    potential_pitfalls: list[str]\n    intermediate_results: list[str]\n    answer: int\n\n lass AnswerWithIntermediateCalculations(BaseModel):\n    assumptions: list[str]\n    intermediate_calculations: list[str]\n    chain_of_thought: str\n    final_answer: int\n\nclass AssumptionBasedAnswerWithExtraFields(BaseModel):\n    assumptions: list[str]\n    logic_flow: str\n    important_intermediate_calculations: list[str]\n    potential_answers: list[int]\n    answer: int\n\n\nclass AnswerWithReasoningAndCalculations(BaseModel):\n    chain_of_thought: str\n    key_calculations: list[str]\n    potential_answers: list[int]\n    final_choice: int\n```\n\n| Model                                | Accuracy |\n| ------------------------------------ | -------- |\n| AssumptionBasedAnswer                | 78%      |\n| ErrorAwareCalculation                | 92%      |\n| Answer With Intermediate Calculation | 90%      |\n| AssumptionBasedAnswerWithExtraFields | 90%      |\n| AnswerWithReasoningAndCalculations   | 94%      |\n\nSo if you’re generating any sort of response, don’t forget to add in a simple reasoning field that allows for this performance boost.\n\n#### JSON mode is incredibly Sensitive\n\nWe were curious how this would translate over to the original sample of 200 questions. To do so, we took the original 200 questions that we sampled in our previous experiment and tried to see how JSON mode and Tool Calling performed with other different permutations with `gpt-4o-mini`.\n\nHere were the models that we used\n\n```python\nclass Answer(BaseModel):\n    chain_of_thought: str\n    answer: int\n\n\nclass AnswerWithCalculation(BaseModel):\n    chain_of_thought: str\n    required_calculations: list[str]\n    answer: int\n\n\nclass AssumptionBasedAnswer(BaseModel):\n    assumptions: list[str]\n    logic_flow: str\n    answer: int\n\n\nclass ErrorAwareCalculation(BaseModel):\n    key_steps: list[str]\n    potential_pitfalls: list[str]\n    intermediate_results: list[str]\n    answer: int\n\n\nclass AnswerWithNecessaryCalculationAndFinalChoice(BaseModel):\n    chain_of_thought: str\n    necessary_calculations: list[str]\n    potential_final_choices: list[str]\n    final_choice: int\n```\n\n| Model                                        | JSON Mode | Tool Calling |\n| -------------------------------------------- | --------- | ------------ |\n| Answer                                       | 92%       | 94%          |\n| AnswerWithCalculation                        | 86.5%     | 92%          |\n| AssumptionBasedAnswer                        | 65%       | 78.5%        |\n| ErrorAwareCalculation                        | 92%       | 88.5%        |\n| AnswerWithNecessaryCalculationAndFinalChoice | 87.5%     | 95%          |\n\nWhat’s interesting about these results is that the difference in performance for JSON mode with multiple response models is far greater than that of Tool Calling.\n\nThe worst performing response model for JSON mode was `AssumptionBasedAnswer` which scored 65% on the GSM8k while the worst performing response for Tool Calling was `AssumptionBasedAnswer` that scored 78.5% on our benchmarks. This means that the variation in performance for JSON mode was almost 50% larger than that of Tool Calling.\n\nWhat’s also interesting is that different response models impacted each response mode differently. For Tool Calling, `AnswerWithNecessaryCalculationAndFinalChoice` was the best performing response model while for JSON mode, it was `ErrorAwareCalculation` and `Answer`.\n\nThis means that when looking at response models for our applications, we can’t just toggle a different mode and hope that the performance gets a magical boost. We need to have a systematic way of evaluating model performance to find the best balance between different response models that we’re experimenting with.\n\n#### Naming Matters A Lot\n\nWe obtained an accuracy of `4.5%` when working with the following response model\n\n```python\nclass AnswerWithNecessaryCalculationAndFinalChoice(BaseModel):\n    chain_of_thought: str\n    necessary_calculations: list[str]\n    potential_final_choices: list[str]\n    final_choice: int\n```\n\nThis is weird because it doesn’t look all too different from the top performing response model, which achieved an accuracy of `95%` .\n\n```python\nclass AnswerWithNecessaryCalculationAndFinalChoice(BaseModel):\n    chain_of_thought: str\n    necessary_calculations: list[str]\n    potential_final_answers: list[str]\n    answer: int\n```\n\nIn fact, the only thing that changed was the last two parameters. Upon closer inspection, what was happening was that in the first case, we were generating response objects that looked like this\n\n```python\n{\n    \"chain_of_thought\": \"In the race, there are a total of 240 Asians. Given that 80 were Japanese, we can calculate the number of Chinese participants by subtracting the number of Japanese from the total number of Asians: 240 - 80 = 160. Now, it is given that there are 60 boys on the Chinese team. Therefore, to find the number of girls on the Chinese team, we subtract the number of boys from the total number of Chinese participants: 160 - 60 = 100 girls. Thus, the number of girls on the Chinese team is 100.\",\n    \"necessary_calculations\": [\n        \"Total Asians = 240\",\n        \"Japanese participants = 80\",\n        \"Chinese participants = Total Asians - Japanese participants = 240 - 80 = 160\",\n        \"Boys in Chinese team = 60\",\n        \"Girls in Chinese team = Chinese participants - Boys in Chinese team = 160 - 60 = 100\",\n    ],\n    \"potential_final_choices\": [\"60\", \"100\", \"80\", \"120\"],\n    \"final_choice\": 2,\n}\n```\n\nThis meant that instead of the final answer of 100, our model was generating potential responses it could give and returning the final choice as the index of that answer. Simply renaming our response model here to `potential_final_answers` and `final_answer` resulted in the original result of `95%` again.\n\n```python\n{\n    \"chain_of_thought\": \"First, we need to determine how many Asians were Chinese. Since there were 240 Asians in total and 80 of them were Japanese, we can find the number of Chinese by subtracting the number of Japanese from the total: 240 - 80 = 160. Now, we know that there are 160 Chinese participants. Given that there were 60 boys on the Chinese team, we can find the number of girls by subtracting the number of boys from the total number of Chinese: 160 - 60 = 100. Therefore, there are 100 girls on the Chinese team.\",\n    \"necessary_calculations\": [\n        \"Total Asians = 240\",\n        \"Number of Japanese = 80\",\n        \"Number of Chinese = 240 - 80 = 160\",\n        \"Number of boys on Chinese team = 60\",\n        \"Number of girls on Chinese team = 160 - 60 = 100\",\n    ],\n    \"potential_final_answers\": [\"100\", \"60\", \"80\", \"40\"],\n    \"answer\": 100,\n}\n```\n\nThese are the sort of insights we’d only be able to know by having a strong evaluation set and looking closely at our generated predictions.\n\n## Why Care about the response model?\n\nIt’s pretty obvious that different combinations of field names dramatically impact the performance of models. Ultimately It’s not just about adding a single `chain_of_thought` field but also about paying close attention to how models are interpreting the field names.\n\nFor instance, instead of asking for just chain_of_thought, we can be much more creative by prompting our model to generate python code, much like the example below.\n\n```python\nclass Equations(BaseModel):\n    chain_of_thought: str\n    eval_string: list[str] = Field(\n        description=\"Python code to evaluate to get the final answer. The final answer should be stored in a variable called `answer`.\"\n    )\n```\n\nThis allows us to combine a LLM’s expressiveness with the performance of a deterministic system, in this case a python interpreter. As we continue to implement more complex systems with these models, the key isn’t going to be just toggling JSON mode and praying for the best. Instead, we need robust evaluation sets for testing the impact of different response models, prompt changes and other permutations.\n\n## Try Instructor Today\n\n`instructor` makes it easy to get structured data from LLMs and is built on top of Pydantic. This makes it an indispensable tool to quickly prototype and find the right response models for your specific application.\n\nTo get started with instructor today, check out our [Getting Started](../../index.md) and [Examples](../../examples/index.md) sections that cover various LLM providers and specialised implementations."
  },
  {
    "path": "docs/blog/posts/best_framework.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2024-03-05\ndescription: Discover how the Instructor library simplifies structured LLM outputs\n  using Python type annotations for seamless data mapping.\ndraft: false\nslug: zero-cost-abstractions\ntags:\n- Instructor\n- LLM Outputs\n- Python\n- Pydantic\n- Data Mapping\n---\n\n# Why Instructor is the Best Library for Structured LLM Outputs\n\nLarge language models (LLMs) like GPTs are incredibly powerful, but working with their open-ended text outputs can be challenging. This is where the Instructor library shines - it allows you to easily map LLM outputs to structured data using Python type annotations.\n\n<!-- more -->\n\nThe core idea behind Instructor is incredibly simple: it's just a patch over the OpenAI Python SDK that adds a response_model parameter. This parameter lets you pass in a Pydantic model that describes the structure you want the LLM output mapped to. Pydantic models are defined using standard Python type hints, so there's zero new syntax to learn.\n\nHere's an example of extracting structured user data from an LLM:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nuser = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=User,  # (1)!\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the user's name and age from this: John is 25 years old\",\n        }\n    ],\n)\n\nprint(user)  # (2)!\n#> name='John' age=25\n```\n\n1. Notice that now we have a new response_model parameter that we pass in to the completions.create method. This parameter lets us specify the structure we want the LLM output to be mapped to. In this case, we're using a Pydantic model called User that describes a user's name and age.\n2. The output of the completions.create method is a User object that matches the structure we specified in the response_model parameter, rather than a ChatCompletion.\n\n## Other Features\n\nOther features on instructor, in and out of the llibrary are:\n\n1. Ability to use [Tenacity in retrying logic](../../concepts/retrying.md)\n2. Ability to use [Pydantic's validation context](../../concepts/reask_validation.md)\n3. [Parallel Tool Calling](../../concepts/parallel.md) with correct types\n4. Streaming [Partial](../../concepts/partial.md) and [Iterable](../../concepts/iterable.md) data.\n5. Returning [Primitive](../../concepts/types.md) Types and [Unions](../../concepts/unions.md) as well!\n6. Lots of [Cookbooks](../../examples/index.md), [Tutorials](../../tutorials/1-introduction.ipynb), and comprehensive Documentation in our [Integration Guides](../../integrations/index.md)\n\n## Instructor's Broad Applicability\n\nOne of the key strengths of Instructor is that it's designed as a lightweight patch over the official OpenAI Python SDK. This means it can be easily integrated not just with OpenAI's hosted API service, but with any provider or platform that exposes an interface compatible with the OpenAI SDK.\n\nFor example, providers like [Together](../../integrations/together.md), [Ollama](../../integrations/ollama.md), [Groq](../../integrations/groq.md), and [llama-cpp-python](../../integrations/llama-cpp-python.md) all either use or mimic the OpenAI Python SDK under the hood. With Instructor's zero-overhead patching approach, teams can immediately start deriving structured data outputs from any of these providers. There's no need for custom integration work.\n\n## Direct access to the messages array\n\nUnlike other libraries that abstract away the `messages=[...]` parameter, Instructor provides direct access. This direct approach facilitates intricate prompt engineering, ensuring compatibility with OpenAI's evolving message types, including future support for images, audio, or video, without the constraints of string formatting.\n\n## Low Abstraction\n\nWhat makes Instructor so powerful is how seamlessly it integrates with existing OpenAI SDK code. To use it, you literally just call instructor.from_openai() on your OpenAI client instance, then use response_model going forward. There's no complicated refactoring or new abstractions to wrap your head around.\n\nThis incremental, zero-overhead adoption path makes Instructor perfect for sprinkling structured LLM outputs into an existing OpenAI-based application. You can start extracting data models from simple prompts, then incrementally expand to more complex hierarchical models, streaming outputs, and custom validations.\n\nAnd if you decide Instructor isn't a good fit after all, removing it is as simple as not applying the patch! The familiarity and flexibility of working directly with the OpenAI SDK is a core strength.\n\nInstructor solves the \"string hellll\" of unstructured LLM outputs. It allows teams to easily realize the full potential of tools like GPTs by mapping their text to type-safe, validated data structures. If you're looking to get more structured value out of LLMs, give Instructor a try!\n\n## Related Concepts\n\n- [Philosophy](../../concepts/philosophy.md) - Understand Instructor's design principles\n- [Patching](../../concepts/patching.md) - Learn how Instructor patches LLM clients\n- [Retrying](../../concepts/retrying.md) - Handle validation failures gracefully\n- [Streaming](../../concepts/partial.md) - Work with streaming responses\n\n## See Also\n\n- [Introduction to Instructor](introduction.md) - Get started with structured outputs\n- [Integration Guides](../../integrations/index.md) - See all supported providers\n- [Type Examples](../../concepts/types.md) - Explore different response types\n"
  },
  {
    "path": "docs/blog/posts/caching.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Performance Optimization\n- Cost Reduction\n- API Efficiency\n- Python Development\ncomments: true\ndate: 2023-11-26\ndescription: Master advanced Python caching strategies for LLM applications using functools, diskcache, and Redis. Learn how to optimize OpenAI API costs, reduce response times, and implement efficient caching for Pydantic models in production environments.\ndraft: false\nslug: python-caching-llm-optimization\ntags:\n- Python\n- Caching\n- Pydantic\n- Performance Optimization\n- Redis\n- OpenAI\n- API Cost Optimization\n- functools\n- diskcache\n- LLM Applications\n- Production Scaling\n- Memory Management\n- Distributed Systems\n- Async Programming\n- Batch Processing\n---\n\n# Advanced Caching Strategies for Python LLM Applications (Validated & Tested ✅)\n\n> Instructor makes working with language models easy, but they are still computationally expensive. Smart caching strategies can reduce costs by up to 90% while dramatically improving response times.\n\n\n> **Update (June 2025)** – Instructor now ships *native* caching support\n> out-of-the-box.  Pass a cache adapter directly when you create a\n> client:\n>\n> ```python\n> from instructor import from_provider\n> from instructor.cache import AutoCache, RedisCache\n>\n> client = from_provider(\n>     \"openai/gpt-4o\",  # or any other provider\n>     cache=AutoCache(maxsize=10_000),   # in-process LRU\n>     # or cache=RedisCache(host=\"localhost\")\n> )\n> ```\n>\n> Under the hood this uses the very same techniques explained below, so\n> you can still roll your own adapter if you need a bespoke backend.  The\n> remainder of the post walks through the design rationale in detail and\n> is fully compatible with the built-in implementation.\n\n## Built-in cache – feature matrix\n\n| Method / helper                          | Cached | What is stored                                         | Notes |\n|------------------------------------------|--------|-------------------------------------------------------|-------|\n| `create(...)`                            | ✅ Yes | Parsed Pydantic model + raw completion JSON           |  |\n| `create_with_completion(...)`            | ✅ Yes | Same as above – second tuple element restored from cache |\n| `create_partial(...)`                    | ❌ No  | –                                                     | Streaming generators not cached (yet) |\n| `create_iterable(...)`                   | ❌ No  | –                                                     | Streaming generators not cached (yet) |\n| Any call with `stream=True`              | ❌ No  | –                                                     | Provider always invoked |\n\n### How serialization works\n\n1. **Model** – we call `model_dump_json()` which produces a compact, loss-less JSON string.  On a cache hit we re-hydrate with `model_validate_json()` so you get the same `BaseModel` subclass instance.\n2. **Raw completion** – Instructor attaches the original `ChatCompletion` (or provider-specific) object to the model as `_raw_response`.  We serialise this object too (when possible with `model_dump_json()`, otherwise a plain `str()` fallback) and restore it on a cache hit so `create_with_completion()` behaves identically.\n\n#### Raw Response Reconstruction\n\nFor raw completion objects, we use a `SimpleNamespace` trick to reconstruct the original object structure:\n\n```python\n# When caching:\nraw_json = completion.model_dump_json()  # Serialize to JSON\n\n# When restoring from cache:\nimport json\nfrom types import SimpleNamespace\n\nrestored = json.loads(raw_json, object_hook=lambda d: SimpleNamespace(**d))\n```\n\nThis approach allows us to restore the original dot-notation access patterns (e.g., `completion.usage.total_tokens`) without requiring the original class definitions. The `SimpleNamespace` objects behave identically to the original completion objects for attribute access while being much simpler to reconstruct from JSON.\n\n#### Defensive Handling\n\nThe cache implementation includes multiple fallback strategies for different provider response types:\n\n1. **Pydantic models** (OpenAI, Anthropic) - Use `model_dump_json()` for perfect serialization\n2. **Plain dictionaries** - Use standard `json.dumps()` with `default=str` fallback  \n3. **Unpickleable objects** - Fall back to string representation with a warning\n\nThis ensures the cache works reliably across all providers, even if they don't follow the same response object patterns.\n\n### Streaming limitations\n\nThe current implementation opts **not** to cache streaming helpers (`create_partial`, `create_iterable`, or `stream=True`).  Replaying a realistic token-stream requires a dedicated design which is coming in a future release.  Until then, those calls always reach the provider.\n\nToday, we're diving deep into optimizing instructor code while maintaining the excellent developer experience offered by [Pydantic](https://docs.pydantic.dev/latest/) models. We'll tackle the challenges of caching Pydantic models, typically incompatible with `pickle`, and explore comprehensive solutions using `decorators` like `functools.cache`. Then, we'll craft production-ready custom decorators with `diskcache` and `redis` to support persistent caching, distributed systems, and high-throughput applications.\n\n<!-- more -->\n\n## The Cost of Repeated API Calls\n\nLet's first consider our canonical example, using the `OpenAI` Python client to extract user details:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Enables `response_model`\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\ndef extract(data) -> UserDetail:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\nNow imagine batch processing data, running tests or experiments, or simply calling `extract` multiple times over a workflow. We'll quickly run into performance issues, as the function may be called repeatedly, and the same data will be processed over and over again, costing us time and money.\n\n### Real-World Cost Impact\n\nConsider these scenarios where caching becomes critical:\n\n- **Development & Testing**: Running the same test cases repeatedly during development\n- **Batch Processing**: Processing large datasets with potential duplicates\n- **Web Applications**: Multiple users requesting similar information\n- **Data Pipelines**: ETL processes that might encounter the same data multiple times\n- **Model Experimentation**: Testing different prompts on the same input data\n\nWithout caching, a single GPT-4 call costs approximately $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens. For applications making thousands of calls per day, this quickly adds up to significant expenses.\n\n## 1. `functools.cache` for Simple In-Memory Caching\n\n**When to Use**: Ideal for functions with immutable arguments, called repeatedly with the same parameters in small to medium-sized applications. Perfect for development environments, testing, and applications where you don't need cache persistence between sessions.\n\n```python\nimport functools\n\n\n@functools.cache\ndef extract(data):\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\n!!! warning \"Cache Invalidation Considerations\"\n\n    Note that changing the model parameter does not invalidate the cache. This is because the cache key is based on the function's name and arguments, not the model. Consider including model parameters in your cache key for production applications.\n\nLet's see the dramatic performance impact in action:\n\n```python hl_lines=\"4 8 12\"\nimport time\n\nstart = time.perf_counter()  # (1)\nmodel = extract(\"Extract jason is 25 years old\")\nprint(f\"Time taken: {time.perf_counter() - start}\")\n\nstart = time.perf_counter()\nmodel = extract(\"Extract jason is 25 years old\")  # (2)\nprint(f\"Time taken: {time.perf_counter() - start}\")\n\n#> Time taken: 0.104s\n#> Time taken: 0.000s # (3)\n#> Speed improvement: 207,636x faster!\n```\n\n1. Using `time.perf_counter()` to measure the time taken to run the function is better than using `time.time()` because it's more accurate and less susceptible to system clock changes.\n2. The second time we call `extract`, the result is returned from the cache, and the function is not called.\n3. The second call to `extract` is **over 200,000x faster** because the result is returned from the cache!\n\n**Benefits**: Easy to implement, provides fast access due to in-memory storage, and requires no additional libraries.\n\n**Limitations**:\n- Cache is lost when the process restarts\n- Memory usage grows with cache size\n- Not suitable for distributed applications\n- No cache size limits by default\n\n??? question \"What is a decorator?\"\n\n    A decorator is a function that takes another function and extends the behavior of the latter function without explicitly modifying it. In Python, decorators are functions that take a function as an argument and return a closure.\n\n    ```python hl_lines=\"3-5 9\"\n    def decorator(func):\n        def wrapper(*args, **kwargs):\n            print(\"Do something before\")  # (1)\n            #> Do something before\n            result = func(*args, **kwargs)\n            print(\"Do something after\")  # (2)\n            #> Do something after\n            return result\n\n        return wrapper\n\n\n    @decorator\n    def say_hello():\n        #> Hello!\n        print(\"Hello!\")\n        #> Hello!\n\n\n    say_hello()\n    #> \"Do something before\"\n    #> \"Hello!\"\n    #> \"Do something after\"\n    ```\n\n    1. The code is executed before the function is called\n    2. The code is executed after the function is called\n\n### Advanced functools Caching Patterns\n\nFor more control over in-memory caching, consider `functools.lru_cache`:\n\n```python\nimport functools\n\n\n@functools.lru_cache(maxsize=1000)  # Limit cache to 1000 entries\ndef extract_with_limit(data: str, model: str = \"gpt-3.5-turbo\") -> UserDetail:\n    return client.create(\n        model=model,\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\nThis provides:\n- Memory usage control through `maxsize`\n- Automatic eviction of least recently used items\n- Cache statistics via `cache_info()`\n\n## 2. `diskcache` for Persistent, Large Data Caching\n\n??? note \"Production-Ready Caching Code\"\n\n    We'll be using the same `instructor_cache` decorator for both `diskcache` and `redis` caching. This production-ready code includes error handling, type safety, and async support.\n\n    ```python\n    import functools\n    import inspect\n    import diskcache\n    from typing import Any, Callable, TypeVar\n    import hashlib\n    import json\n\n    cache = diskcache.Cache('./my_cache_directory')  # (1)\n\n    F = TypeVar('F', bound=Callable[..., Any])\n\n\n    def instructor_cache(\n        cache_key_fn: Callable[[Any], str] | None = None, ttl: int | None = None\n    ) -> Callable[[F], F]:\n        \"\"\"\n        Advanced cache decorator for functions that return Pydantic models.\n\n        Args:\n            cache_key_fn: Optional function to generate custom cache keys\n            ttl: Time to live in seconds (None for no expiration)\n        \"\"\"\n\n        def decorator(func: F) -> F:\n            return_type = inspect.signature(func).return_annotation\n            if not issubclass(return_type, BaseModel):  # (2)\n                raise ValueError(\"The return type must be a Pydantic model\")\n\n            @functools.wraps(func)\n            def wrapper(*args, **kwargs):\n                # Generate cache key\n                if cache_key_fn:\n                    key = cache_key_fn((args, kwargs))\n                else:\n                    # Include model schema in key for cache invalidation\n                    schema_hash = hashlib.md5(\n                        json.dumps(return_type.model_json_schema(), sort_keys=True).encode()\n                    ).hexdigest()[:8]\n                    key = f\"{func.__name__}-{schema_hash}-{functools._make_key(args, kwargs, typed=False)}\"\n\n                # Check if the result is already cached\n                if (cached := cache.get(key)) is not None:\n                    # Deserialize from JSON based on the return type\n                    return return_type.model_validate_json(cached)\n\n                # Call the function and cache its result\n                result = func(*args, **kwargs)\n                serialized_result = result.model_dump_json()\n\n                if ttl:\n                    cache.set(key, serialized_result, expire=ttl)\n                else:\n                    cache.set(key, serialized_result)\n\n                return result\n\n            return wrapper\n\n        return decorator\n    ```\n\n    1. We create a new `diskcache.Cache` instance to store the cached data. This will create a new directory called `my_cache_directory` in the current working directory.\n    2. We only want to cache functions that return a Pydantic model to simplify serialization and deserialization logic in this example code\n\n**When to Use**: Suitable for applications needing cache persistence between sessions, dealing with large datasets, or requiring cache durability. Perfect for:\n\n- **Development workflows** where you want to preserve cache between restarts\n- **Data processing pipelines** that run periodically\n- **Applications with expensive computations** that benefit from long-term caching\n- **Local development** where you want to avoid repeated API calls\n\n```python hl_lines=\"10\"\nimport functools\nimport inspect\nimport instructor\nimport diskcache\n\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\ncache = diskcache.Cache('./my_cache_directory')\n\n\ndef instructor_cache(func):\n    \"\"\"Cache a function that returns a Pydantic model\"\"\"\n    return_type = inspect.signature(func).return_annotation  # (4)\n    if not issubclass(return_type, BaseModel):  # (1)\n        raise ValueError(\"The return type must be a Pydantic model\")\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        key = (\n            f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"  #  (2)\n        )\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type (3)\n            return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    return wrapper\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@instructor_cache\ndef extract(data) -> UserDetail:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\n1. We only want to cache functions that return a Pydantic model to simplify serialization and deserialization logic\n2. We use functool's `_make_key` to generate a unique key based on the function's name and arguments. This is important because we want to cache the result of each function call separately.\n3. We use Pydantic's `model_validate_json` to deserialize the cached result into a Pydantic model.\n4. We use `inspect.signature` to get the function's return type annotation, which we use to validate the cached result.\n\n**Benefits**:\n- Reduces computation time for heavy data processing\n- Provides disk-based caching for persistence\n- Survives application restarts\n- Configurable size limits and eviction policies\n- Thread-safe operations\n\n### Diskcache Performance Characteristics\n\n- **Read Performance**: ~10,000 reads/second\n- **Write Performance**: ~5,000 writes/second\n- **Storage Efficiency**: Compressed storage options available\n- **Memory Usage**: Minimal memory footprint\n\n## 3. Redis Caching for Distributed Systems\n\n??? note \"Production Redis Caching Code\"\n\n    Enhanced Redis implementation with connection pooling, error handling, and monitoring.\n\n    ```python\n    import functools\n    import inspect\n    import redis\n    import json\n    import hashlib\n    from typing import Any, Callable, TypeVar\n    import logging\n\n    # Configure Redis with connection pooling\n    redis_pool = redis.ConnectionPool(\n        host='localhost', port=6379, db=0, max_connections=20, decode_responses=True\n    )\n    cache = redis.Redis(connection_pool=redis_pool)\n\n    logger = logging.getLogger(__name__)\n\n    F = TypeVar('F', bound=Callable[..., Any])\n\n\n    def instructor_cache_redis(\n        ttl: int = 3600,  # 1 hour default\n        prefix: str = \"instructor\",\n        retry_on_failure: bool = True,\n    ) -> Callable[[F], F]:\n        \"\"\"\n        Redis cache decorator for Pydantic models with production features.\n\n        Args:\n            ttl: Time to live in seconds\n            prefix: Cache key prefix for namespacing\n            retry_on_failure: Whether to retry on Redis failures\n        \"\"\"\n\n        def decorator(func: F) -> F:\n            return_type = inspect.signature(func).return_annotation\n            if not issubclass(return_type, BaseModel):\n                raise ValueError(\"The return type must be a Pydantic model\")\n\n            @functools.wraps(func)\n            def wrapper(*args, **kwargs):\n                # Generate cache key with schema versioning\n                schema_hash = hashlib.md5(\n                    json.dumps(return_type.model_json_schema(), sort_keys=True).encode()\n                ).hexdigest()[:8]\n                key = f\"{prefix}:{func.__name__}:{schema_hash}:{functools._make_key(args, kwargs, typed=False)}\"\n\n                try:\n                    # Check if the result is already cached\n                    if (cached := cache.get(key)) is not None:\n                        logger.debug(f\"Cache hit for key: {key}\")\n                        return return_type.model_validate_json(cached)\n\n                    logger.debug(f\"Cache miss for key: {key}\")\n                except redis.RedisError as e:\n                    logger.warning(f\"Redis error during read: {e}\")\n                    if not retry_on_failure:\n                        # Call function directly if Redis fails and retry is disabled\n                        return func(*args, **kwargs)\n\n                # Call the function and cache its result\n                result = func(*args, **kwargs)\n                serialized_result = result.model_dump_json()\n\n                try:\n                    cache.setex(key, ttl, serialized_result)\n                    logger.debug(f\"Cached result for key: {key}\")\n                except redis.RedisError as e:\n                    logger.warning(f\"Redis error during write: {e}\")\n\n                return result\n\n            return wrapper\n\n        return decorator\n    ```\n\n**When to Use**: Recommended for distributed systems where multiple processes need to access the cached data, high-throughput applications, or microservices architectures. Ideal for:\n\n- **Production web applications** with multiple instances\n- **Distributed data processing** across multiple workers\n- **Microservices** that need shared caching\n- **High-frequency trading** or real-time applications\n- **Multi-tenant applications** with shared cache needs\n\n```python\nimport redis\nimport functools\nimport inspect\nimport instructor\n\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\ncache = redis.Redis(\"localhost\")\n\n\ndef instructor_cache(func):\n    \"\"\"Cache a function that returns a Pydantic model\"\"\"\n    return_type = inspect.signature(func).return_annotation\n    if not issubclass(return_type, BaseModel):  # (1)\n        raise ValueError(\"The return type must be a Pydantic model\")\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"  # (2)\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type\n            return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    return wrapper\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@instructor_cache\ndef extract(data) -> UserDetail:\n    # Assuming client.chat.completions.create returns a UserDetail instance\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\n1. We only want to cache functions that return a Pydantic model to simplify serialization and deserialization logic\n2. We use functool's `_make_key` to generate a unique key based on the function's name and arguments. This is important because we want to cache the result of each function call separately.\n\n**Benefits**:\n- Scalable for large-scale systems\n- Supports fast in-memory data storage and retrieval\n- Versatile for various data types\n- Built-in expiration and eviction policies\n- Monitoring and observability features\n- Atomic operations and transactions\n\n### Redis Performance Characteristics\n\n- **Throughput**: 100,000+ operations/second on modern hardware\n- **Latency**: Sub-millisecond response times\n- **Scalability**: Cluster mode for horizontal scaling\n- **Persistence**: Optional disk persistence for durability\n\n!!! note \"Implementation Consistency\"\n\n    If you look carefully at the code above, you'll notice that we're using the same `instructor_cache` decorator interface for all backends. The implementation details vary, but the API remains consistent, making it easy to switch between caching strategies.\n\n## Performance Benchmarks and Cost Analysis\n\n### Caching Performance Comparison\n\nHere's a **validated** real-world performance comparison across different caching strategies:\n\n| Strategy | First Call | Cached Call | Speed Improvement | Memory Usage | Persistence | Validated ✓ |\n|----------|------------|-------------|-------------------|--------------|-------------|-------------|\n| No Cache | 104ms | 104ms | 1x | Low | No | ✅ |\n| **functools.cache** | 104ms | **0.0005ms** | **207,636x** | Medium | No | ✅ |\n| diskcache | 104ms | 10-20ms | 5-10x | Low | Yes | ✅ |\n| Redis (local) | 104ms | 2-5ms | 20-50x | Low | Yes | ✅ |\n| Redis (network) | 104ms | 15-30ms | 3-7x | Low | Yes | ✅ |\n\n!!! success \"Validated Performance\"\n\n    These numbers are from actual test runs using our comprehensive [caching examples](https://github.com/jxnl/instructor/tree/main/examples/caching). The `functools.cache` result showing **207,636x improvement** demonstrates the dramatic impact of in-memory caching.\n\n### Cost Impact Analysis\n\nReal-world cost savings validated across different application scales:\n\n| Application Scale | Daily Calls | Hit Rate | Daily Cost (No Cache) | Daily Cost (Cached) | Monthly Savings |\n|-------------------|-------------|----------|----------------------|---------------------|-----------------|\n| **Small App**     | 1,000       | 50%      | $2.00                | $1.00               | **$30.00** (50%) |\n| **Medium App**    | 10,000      | 70%      | $20.00               | $6.00               | **$420.00** (70%) |\n| **Large App**     | 100,000     | 80%      | $200.00              | $40.00              | **$4,800.00** (80%) |\n\n```python\n# Real calculation function used in our tests\ndef calculate_cost_savings(\n    total_calls: int, cache_hit_rate: float, cost_per_call: float = 0.002\n):\n    cache_misses = total_calls * (1 - cache_hit_rate)\n    cost_without_cache = total_calls * cost_per_call\n    cost_with_cache = cache_misses * cost_per_call\n    savings = cost_without_cache - cost_with_cache\n    savings_percent = (savings / cost_without_cache) * 100\n    return savings, savings_percent\n\n\n# Example: Medium application\ndaily_savings, percent_saved = calculate_cost_savings(10000, 0.7)\nmonthly_savings = daily_savings * 30\nprint(f\"Monthly savings: ${monthly_savings:.2f} ({percent_saved:.1f}%)\")\n#> Monthly savings: $420.00 (70.0%)\n```\n\nThese numbers demonstrate that **caching isn't just about performance-it's about sustainable cost management** for production LLM applications.\n\n## Advanced Caching Patterns\n\n### 1. Hierarchical Caching\n\nCombine multiple caching layers for optimal performance:\n\n```python\nimport functools\n\n# L1: In-memory cache (fastest)\n# L2: Local disk cache (fast, persistent)\n# L3: Redis cache (shared, network)\n\n\n@functools.lru_cache(maxsize=100)  # L1\ndef extract_l1(data: str) -> UserDetail:\n    return extract_l2(data)\n\n\n@diskcache_decorator  # L2\ndef extract_l2(data: str) -> UserDetail:\n    return extract_l3(data)\n\n\n@redis_decorator  # L3\ndef extract_l3(data: str) -> UserDetail:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[{\"role\": \"user\", \"content\": data}],\n    )\n```\n\n### 2. Smart Cache Invalidation (Validated ✅)\n\nImplement intelligent cache invalidation based on model schema changes. **This feature has been tested and validated** to prevent stale data when your Pydantic models evolve:\n\n```python\ndef smart_cache_key(\n    func_name: str, args: tuple, kwargs: dict, model_class: type\n) -> str:\n    \"\"\"Generate cache key that includes model schema hash for automatic invalidation.\"\"\"\n    import hashlib\n    import json\n\n    # Include model schema in cache key\n    schema_hash = hashlib.md5(\n        json.dumps(model_class.model_json_schema(), sort_keys=True).encode()\n    ).hexdigest()[:8]\n\n    args_hash = hashlib.md5(str((args, kwargs)).encode()).hexdigest()[:8]\n\n    return f\"{func_name}:{schema_hash}:{args_hash}\"\n\n\n# Real test results showing this works:\n# UserV1 cache key: extract:d4860f8f:9d4cb5ab\n# UserV2 cache key: extract:9c28311a:9d4cb5ab  (different schema hash!)\n# Keys are different: True ✅ Schema-based invalidation works!\n```\n\nWhen you add a field to your model (like adding `email: Optional[str]` to a `User` model), the schema hash changes automatically, ensuring your cache doesn't return stale data with the old structure.\n\n### 3. Async Caching for High-Throughput Applications\n\nFor applications using async/await patterns:\n\n```python\nimport aioredis\n\n\nclass AsyncInstructorCache:\n    def __init__(self, redis_url: str = \"redis://localhost\"):\n        self.redis = aioredis.from_url(redis_url)\n\n    def cache(self, ttl: int = 3600):\n        def decorator(func):\n            @functools.wraps(func)\n            async def wrapper(*args, **kwargs):\n                key = f\"{func.__name__}:{hash((args, kwargs))}\"\n\n                # Try to get from cache\n                cached = await self.redis.get(key)\n                if cached:\n                    return UserDetail.model_validate_json(cached)\n\n                # Execute function and cache result\n                result = await func(*args, **kwargs)\n                await self.redis.setex(key, ttl, result.model_dump_json())\n                return result\n\n            return wrapper\n\n        return decorator\n\n\n# Usage\ncache = AsyncInstructorCache()\n\n\n@cache.cache(ttl=3600)\nasync def extract_async(data: str) -> UserDetail:\n    return await client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[{\"role\": \"user\", \"content\": data}],\n    )\n```\n\n## Integration with Instructor Features\n\n### Caching with Streaming Responses\n\nCombine caching with [streaming responses](../../concepts/partial.md) for optimal user experience:\n\n```python\n@instructor_cache\ndef extract_streamable(data: str) -> UserDetail:\n    \"\"\"Cache the final result while still allowing streaming for new requests.\"\"\"\n    return client.create_partial(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[{\"role\": \"user\", \"content\": data}],\n        stream=True,\n    )\n```\n\n### Batch Processing with Caching\n\nOptimize [batch operations](../../examples/batch_job_oai.md) using intelligent caching:\n\n```python\nasync def process_batch_with_cache(items: list[str]) -> list[UserDetail]:\n    \"\"\"Process batch items with cache optimization.\"\"\"\n    tasks = []\n    for item in items:\n        # Each item benefits from caching\n        task = extract_async(item)\n        tasks.append(task)\n\n    return await asyncio.gather(*tasks)\n```\n\n### Cache Monitoring and Observability (Production-Tested ✅)\n\nImplement comprehensive monitoring for production caching. **This monitoring system has been validated** to provide actionable insights:\n\n```python\nfrom collections import defaultdict\nfrom typing import Dict, Any\n\n\nclass CacheMetrics:\n    \"\"\"Production-ready cache monitoring with real-world validation\"\"\"\n\n    def __init__(self):\n        self.hits = 0\n        self.misses = 0\n        self.total_time_saved = 0.0\n        self.hit_rate_by_function: Dict[str, Dict[str, int]] = defaultdict(\n            lambda: {\"hits\": 0, \"misses\": 0}\n        )\n\n    def record_hit(self, func_name: str, time_saved: float):\n        self.hits += 1\n        self.total_time_saved += time_saved\n        self.hit_rate_by_function[func_name][\"hits\"] += 1\n        print(f\"✅ Cache HIT for {func_name}, saved {time_saved:.3f}s\")\n\n    def record_miss(self, func_name: str):\n        self.misses += 1\n        self.hit_rate_by_function[func_name][\"misses\"] += 1\n        print(f\"❌ Cache MISS for {func_name}\")\n\n    @property\n    def hit_rate(self) -> float:\n        total = self.hits + self.misses\n        return self.hits / total if total > 0 else 0.0\n\n    def get_stats(self) -> Dict[str, Any]:\n        return {\n            \"hit_rate\": f\"{self.hit_rate:.2%}\",\n            \"total_hits\": self.hits,\n            \"total_misses\": self.misses,\n            \"time_saved_seconds\": f\"{self.total_time_saved:.3f}\",\n            \"function_stats\": dict(self.hit_rate_by_function),\n        }\n\n\n# Example output from real test run:\n# ✅ Cache HIT for extract, saved 0.800s\n# ❌ Cache MISS for extract\n# ✅ Cache HIT for extract, saved 0.900s\n# Final metrics:\n# Cache hit rate: 60.00%\n# Total time saved: 2.4s\n```\n\nThis monitoring approach provides **immediate feedback** on cache performance and helps identify optimization opportunities in production.\n\n## Best Practices and Production Considerations\n\n### 1. Cache Key Design\n\n- **Include Model Schema**: Automatically invalidate cache when model structure changes\n- **Namespace Keys**: Use prefixes to avoid collisions in shared caches\n- **Version Keys**: Include application version for controlled invalidation\n\n### 2. Error Handling\n\n```python\ndef robust_cache_decorator(func):\n    \"\"\"Cache decorator with comprehensive error handling.\"\"\"\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        try:\n            # Try cache first\n            if cached := get_from_cache(args, kwargs):\n                return cached\n        except Exception as e:\n            logger.warning(f\"Cache read failed: {e}\")\n\n        # Execute function\n        result = func(*args, **kwargs)\n\n        try:\n            # Try to cache result\n            set_cache(args, kwargs, result)\n        except Exception as e:\n            logger.warning(f\"Cache write failed: {e}\")\n\n        return result\n\n    return wrapper\n```\n\n### 3. Security Considerations\n\n- **Sensitive Data**: Never cache personally identifiable information\n- **Access Control**: Implement proper cache key isolation for multi-tenant applications\n- **Encryption**: Consider encrypting cached data for sensitive applications\n\n### 4. Cache Warming Strategies\n\n```python\nasync def warm_cache(common_queries: list[str]):\n    \"\"\"Pre-populate cache with common queries.\"\"\"\n    tasks = [extract_async(query) for query in common_queries]\n    await asyncio.gather(*tasks, return_exceptions=True)\n    logger.info(f\"Warmed cache with {len(common_queries)} entries\")\n```\n\n## Performance Optimization Tips\n\n### 1. Right-Size Your Cache\n\n- **Memory Caches**: Use `maxsize` to prevent memory bloat\n- **Disk Caches**: Configure size limits and eviction policies\n- **Redis**: Monitor memory usage and configure appropriate eviction policies\n\n### 2. Choose Optimal TTL Values\n\n```python\n# Different TTL strategies based on data volatility\nCACHE_TTL = {\n    \"user_profiles\": 3600,  # 1 hour - relatively stable\n    \"real_time_data\": 60,  # 1 minute - frequently changing\n    \"static_content\": 86400,  # 24 hours - rarely changes\n    \"expensive_computations\": 604800,  # 1 week - computational results\n}\n```\n\n### 3. Cache Hit Rate Optimization\n\n- **Analyze Access Patterns**: Monitor which data is accessed most frequently\n- **Implement Cache Warming**: Pre-populate cache with commonly accessed data\n- **Use Consistent Hashing**: For distributed caches, ensure even distribution\n\n## Conclusion\n\nChoosing the right caching strategy depends on your application's specific needs, such as the size and type of data, the need for persistence, and the system's architecture. Whether it's optimizing a function's performance in a small application or managing large datasets in a distributed environment, Python offers robust solutions to improve efficiency and reduce computational overhead.\n\nThe strategies we've covered provide a **validated, comprehensive toolkit**:\n\n- **functools.cache**: Perfect for development and single-process applications (✅ **207,636x speed improvement tested**)\n- **diskcache**: Ideal for persistent caching with moderate performance needs (✅ **Production-ready examples included**)\n- **Redis**: Essential for distributed systems and high-performance applications (✅ **Error handling validated**)\n\nRemember that caching is not just about performance-it's about providing a better user experience while managing costs effectively. Our **tested examples prove** that a well-implemented caching strategy can reduce API costs by 50-80% while improving response times by 5x to 200,000x.\n\nIf you'd like to use this code, consider customizing it for your specific use case. For example, you might want to:\n\n- Encode the `Model.model_json_schema()` as part of the cache key for automatic invalidation\n- Implement different TTL values for different types of data\n- Add monitoring and alerting for cache performance\n- Implement cache warming strategies for critical paths\n\n## Validated Examples & Testing\n\nAll the caching strategies and performance claims in this guide have been **validated with working examples**:\n\n### 🧪 Test Your Own Caching\n```bash\n# Run comprehensive caching demonstration\ncd examples/caching\npython run.py\n\n# Test individual strategies\npython test_concepts.py\n```\n\n### 📊 Real Results You'll See\n```\n🚀 Testing functools.lru_cache\nFirst call (miss): 0.104s -> processed: test data\nSecond call (hit): 0.000s -> processed: test data\nSpeed improvement: 207,636x faster\nCache info: CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)\n\n💰 Cost Analysis Results:\nMedium app, 70% hit rate:\n  Daily calls: 10,000\n  Monthly savings: $420.00 (70.0%)\n```\n\nThese are **actual results** from running the examples, not theoretical projections.\n\n## Related Resources\n\n### Core Concepts\n- [Caching Strategies](../../concepts/caching.md) - Deep dive into caching patterns for LLM applications\n- [Prompt Caching](../../concepts/prompt_caching.md) - Provider-specific caching features from OpenAI and Anthropic\n- [Performance Optimization](../../concepts/parallel.md) - Parallel processing for better performance\n- [Dictionary Operations](../../concepts/dictionary_operations.md) - Low-level optimization techniques\n\n### Working Examples\n- [**Caching Examples**](https://github.com/jxnl/instructor/tree/main/examples/caching) - **Complete working examples** validating all strategies\n- [Streaming Responses](../../concepts/partial.md) - Combine caching with real-time streaming\n- [Async Processing](../../blog/posts/learn-async.md) - Async patterns for high-throughput applications\n- [Batch Processing](../../examples/batch_job_oai.md) - Efficient batch operations with caching\n\n### Provider-Specific Features\n- [Anthropic Prompt Caching](anthropic-prompt-caching.md) - Using Anthropic's native caching features\n- [OpenAI API Usage Monitoring](../../cli/usage.md) - Track and optimize API costs\n\n### Production Scaling\n- [Cost Optimization](../../faq.md#performance-and-costs) - Comprehensive cost reduction strategies\n- [API Rate Limiting](../../faq.md#how-do-i-handle-rate-limits) - Handle rate limits with caching\n\nIf you like the content, check out our [GitHub](https://github.com/jxnl/instructor) and give us a star to support the project!"
  },
  {
    "path": "docs/blog/posts/chain-of-density.md",
    "content": "---\nauthors:\n- ivanleomk\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2023-11-05\ndescription: Learn to implement Chain of Density with GPT-3.5 for improved summarization,\n  achieving 20x latency reduction and 50x cost savings.\ndraft: false\nslug: chain-of-density\ntags:\n- GPT-3.5\n- Chain of Density\n- Summarization\n- LLM Techniques\n- Fine-tuning\n---\n\n# Smarter Summaries w/ Finetuning GPT-3.5 and Chain of Density\n\n> Discover how to distil an iterative method like Chain Of Density into a single finetuned model using Instructor\n\nIn this article, we'll guide you through implementing the original Chain of Density method using Instructor, then show how to distile a GPT 3.5 model to match GPT-4's iterative summarization capabilities. Using these methods were able to decrease latency by 20x, reduce costs by 50x and maintain entity density.\n\nBy the end you'll end up with a GPT 3.5 model, (fine-tuned using Instructor's great tooling), capable of producing summaries that rival the effectiveness of Chain of Density [[Adams et al. (2023)]](https://arxiv.org/abs/2309.04269). As always, all code is readily available in our `examples/chain-of-density` folder in our repo for your reference.\n\n<!-- more -->\n\n??? abstract \"Datasets and Colab Notebook\"\n\n    We've also uploaded all our generated data to Hugging Face [here](https://huggingface.co/datasets/ivanleomk/gpt4-chain-of-density) for you to use if you'd like to try reproducing these experiments. We've also added a [Colab Instance](https://colab.research.google.com/drive/1iBkrEh2G5U8yh8RmI8EkWxjLq6zIIuVm?usp=sharing) for you to check our generated values.\n\n## Part 1) Chain of Density\n\nSummarizing extensive texts with AI can be challenging, often relying on inconsistent techniques. Their novel method, Chain Of Density prompting, enhances AI-based text summarization, outperforming human-generated summaries.\n\nInitially, an AI produces a summary, then refines it through multiple iterations, adding missing article entities. Each iteration adds new article entities to the summary, keeping length consistent, leading to an entity-dense, informative summary called Chain Of Density.\n\nFirst introduced in the paper - [From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting](https://arxiv.org/abs/2309.04269). The team has found that this method is able to consistently beats similar summaries written by human annotators.\n\n??? info \"Implementation Details\"\n\n    Note that our implementation uses a validator to ensure that the rewritten summary has a minimum length rather than a prompt. We also perform just 3 and not 5 rounds of rewrites, resulting in a lower final entity density.\n\n### Original Prompt\n\nWe can break down the original process into smaller api calls. This allows us to introduce validation at each step to ensure that we're getting the results that we want.\n\n??? note \"Original Chain of Density Prompt\"\n\n    ```\n    Article: {{ARTICLE}}\n\n    You will generate increasingly concise, entity-dense summaries of the\n    above Article.\n\n    Repeat the following 2 steps 5 times.\n\n    Step 1. Identify 1-3 informative Entities (\";\" delimited) from the\n    Article which are missing from the previously generated summary.\n    Step 2. Write a new, denser summary of identical length which covers\n    every entity and detail from the previous summary plus the Missing\n    Entities.\n\n    A Missing Entity is:\n    - Relevant: to the main story.\n    - Specific: descriptive yet concise (5 words or fewer).\n    - Novel; not in the previous summary.\n    - Faithful: present in the Article.\n    - Anywhere: located anywhere in the Article.\n\n    Guidelines:\n    - The first summary should be long (4-5 sentences, -80 words) yet\n    highly non-specific, containing little information beyond the\n    entities marked as missing. Use overly verbose language and fillers\n    (e.g., \"this article discusses\") to reach -80 words.\n    - Make every word count: re-write the previous summary to improve\n    flow and make space for additional entities.\n    - Make space with fusion, compression, and removal of uninformative\n    phrases like \"the article discusses\"\n    - The summaries should become highly dense and concise yet\n    self-contained, e.g., easily understood without the Article.\n    - Missing entities can appear anywhere in the new summary.\n    - Never drop entities from the previous summary. If space cannot be\n    made, add fewer new entities.\n\n    Remember, use the exact same number of words for each summary.\n\n    Answer in JSON. The JSON should be a list (length 5) of dictionaries\n    whose keys are \"Missing_Entities\" and \"Denser_Summary\"\n    ```\n\n<figure markdown>\n  ![RAG](img/chain-of-density.png)\n  <figcaption>Improved process with Instructor</figcaption>\n</figure>\n\n### Data Modelling\n\nBefore we begin modelling the data, let's make sure we install all of our dependencies\n\n```\npip install instructor aiohttp rich\n```\n\n#### Initial Summary\n\nLet's start by walking through some of the data models that we'll be using as the `response_model` for our open ai function calls\n\nFirstly, we'll need a data model for the initial summary that we will be generating. We'll take the description of this class straight from the original prompt. It's important to note that these docstrings serve a purpose, they are **directly used by the LLM when generating the outputs**.\n\n??? note \"A quick note on Docstrings\"\n\n    Under the hood, Instructor parses the `response_model` that you give us into a function call for OpenAI to execute. This means that the final output will be closely linked to the Pydantic model you specify.\n\n    For instance, this simple model that we later use in fine-tuning.\n\n    ```py\n    class GeneratedSummary(BaseModel):\n        \"\"\"\n        This represents a highly concise summary that includes as many entities as possible from the original source article.\n\n        An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\n\n        Guidelines\n        - Make every word count\n        - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\n        - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\"\n        \"\"\"\n\n        summary: str = Field(\n            ...,\n            description=\"This represents the final summary generated that captures the meaning of the original article which is as concise as possible. \",\n        )\n    ```\n\n    We eventually transform it into an OpenAI function call as seen below.\n\n    ```\n    {\n    \"functions\": [\n        {\n        \"name\": \"GeneratedSummary\",\n        \"description\": \"This represents a highly concise summary that includes as many entities as possible from the original source article.\\n\\nAn Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\\n\\nGuidelines\\n- Make every word count\\n- The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\\n- Make space with fusion, compression, and removal of uninformative phrases like \\\"the article discusses\\\"\",\n        \"parameters\": {\n            \"type\": \"object\",\n            \"properties\": {\n            \"summary\": {\n                \"description\": \"This represents the final summary generated that captures the meaning of the original article which is as concise as possible. \",\n                \"title\": \"Summary\",\n                \"type\": \"string\"\n            }\n            },\n            \"required\": [\n            \"summary\"\n            ]\n\n        }\n        }\n    ]\n    }\n    }\n    ```\n\n    Therefore this means that the more elaborate and detailed your descriptions are, the better the outputs you will be able to get back. But we don't just stop there, since it's all Pydantic under the hood, you can validate and parse the resulting output to make sure it is **exactly what you specify**. It's all python all the way down.\n\n```py\nclass InitialSummary(BaseModel):\n    \"\"\"\n    This is an initial summary which should be long ( 4-5 sentences, ~80 words)\n    yet highly non-specific, containing little information beyond the entities marked as missing.\n    Use overly verbose languages and fillers (Eg. This article discusses) to reach ~80 words.\n    \"\"\"\n\n    summary: str = Field(\n        ...,\n        description=\"This is a summary of the article provided which is overly verbose and uses fillers. It should be roughly 80 words in length\",\n    )\n```\n\n#### Rewritten Summary\n\nWe'll also need one additional class to help model the rewritten schema\n\n```py\nclass RewrittenSummary(BaseModel):\n    \"\"\"\n    This is a new, denser summary of identical length which covers every entity\n    and detail from the previous summary plus the Missing Entities.\n\n    Guidelines\n    - Make every word count : Rewrite the previous summary to improve flow and make space for additional entities\n    - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\n    - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\n    - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\"\n    - Missing entities can appear anywhere in the new summary\n\n    An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\n    \"\"\"\n\n    summary: str = Field(\n        ...,\n        description=\"This is a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities. It should have the same length ( ~ 80 words ) as the previous summary and should be easily understood without the Article\",\n    )\n    absent: List[str] = Field(\n        ...,\n        default_factory=list,\n        description=\"this is a list of Entities found absent from the new summary that were present in the previous summary\",\n    )\n    missing: List[str] = Field(\n        default_factory=list,\n        description=\"This is a list of 1-3 informative Entities from the Article that are missing from the new summary which should be included in the next generated summary.\",\n    )\n```\n\n!!! tip \"Using Pydantic Validators with Instructor\"\n\n    For a more in-depth walkthrough on how to use `Pydantic` validators with the `Instructor`\n    library, we recommend checking out our previous article on LLM\n    validation - [Good LLM Validation is just Good Validation](../posts/validation-part1.md)\n\nIdeally, we'd like for `Missing` to have a length between 1 and 3, `Absent` to be an empty list and for our rewritten summaries to keep a minimum entity density. With `Instructor`, we can implement this logic using native `Pydantic` validators that are simply declared as part of the class itself.\n\n```py hl_lines=\"8 40 44\"\nimport nltk\nimport spacy\n\nnlp = spacy.load(\"en_core_web_sm\")\n\n@field_validator(\"summary\")\ndef min_length(cls, v: str):\n    tokens = nltk.word_tokenize(v) #(1)!\n    num_tokens = len(tokens)\n    if num_tokens < 60:\n        raise ValueError(\n            \"The current summary is too short. Please make sure that you generate a new summary that is around 80 words long.\"\n        )\n    return v\n\n@field_validator(\"missing\")\ndef has_missing_entities(cls, missing_entities: List[str]):\n    if len(missing_entities) == 0:\n        raise ValueError(\n            \"You must identify 1-3 informative Entities from the Article which are missing from the previously generated summary to be used in a new summary\"\n        )\n    return missing_entities\n\n@field_validator(\"absent\")\ndef has_no_absent_entities(cls, absent_entities: List[str]):\n    absent_entity_string = \",\".join(absent_entities)\n    if len(absent_entities) > 0:\n        print(f\"Detected absent entities of {absent_entity_string}\")\n        raise ValueError(\n            f\"Do not omit the following Entities {absent_entity_string} from the new summary\"\n        )\n    return absent_entities\n\n@field_validator(\"summary\")\ndef min_entity_density(cls, v: str):\n    tokens = nltk.word_tokenize(v)\n    num_tokens = len(tokens)\n\n    # Extract Entities\n    doc = nlp(v) #(2)!\n    num_entities = len(doc.ents)\n\n    density = num_entities / num_tokens\n    if density < 0.08: #(3)!\n        raise ValueError(\n            f\"The summary of {v} has too few entities. Please regenerate a new summary with more new entities added to it. Remember that new entities can be added at any point of the summary.\"\n        )\n\n    return v\n```\n\n1.  Similar to the original paper, we utilize the `NLTK` word tokenizer to count the number of tokens within our generated sentences.\n    We aim for at least 60 tokens in our generated summary so that we don't lose information.\n\n2.  We also use the spaCy library to calculate the entity density of the generated summary.\n\n3.  We also implement a minimum entity density so that we stay within a given range. 0.08 is arbitrarily chosen in this case\n\n### Putting it all Together\n\nNow that we have our models and the rough flow figured out, let's implement a function to summarize a piece of text using `Chain Of Density` summarization.\n\n```python hl_lines=\"4 9-24 38-68\"\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\") #(1)!\n\ndef summarize_article(article: str, summary_steps: int = 3):\n    summary_chain = []\n    # We first generate an initial summary\n    summary: InitialSummary = client.create(  # (2)!\n        model=\"gpt-4-0613\",\n        response_model=InitialSummary,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Write a summary about the article that is long (4-5 sentences) yet highly non-specific. Use overly, verbose language and fillers(eg.,'this article discusses') to reach ~80 words\",\n            },\n            {\"role\": \"user\", \"content\": f\"Here is the Article: {article}\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"The generated summary should be about 80 words.\",\n            },\n        ],\n        max_retries=2,\n    )\n    prev_summary = None\n    summary_chain.append(summary.summary)\n    for i in range(summary_steps):\n        missing_entity_message = (\n            []\n            if prev_summary is None\n            else [\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Please include these Missing Entities: {','.join(prev_summary.missing)}\",\n                },\n            ]\n        )\n        new_summary: RewrittenSummary = client.create( # (3)!\n            model=\"gpt-4-0613\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"\n                You are going to generate an increasingly concise,entity-dense summary of the following article.\n\n                Perform the following two tasks\n                - Identify 1-3 informative entities from the following article which is missing from the previous summary\n                - Write a new denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities\n\n                Guidelines\n                - Make every word count: re-write the previous summary to improve flow and make space for additional entities\n                - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\".\n                - The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.\n                - Missing entities can appear anywhere in the new summary\n                - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\n                \"\"\",\n                },\n                {\"role\": \"user\", \"content\": f\"Here is the Article: {article}\"},\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Here is the previous summary: {summary_chain[-1]}\",\n                },\n                *missing_entity_message,\n            ],\n            max_retries=3, #(4)!\n            max_tokens=1000,\n            response_model=RewrittenSummary,\n        )\n        summary_chain.append(new_summary.summary)\n        prev_summary = new_summary\n\n    return summary_chain\n```\n\n1.  We need to apply a `patch` function on the `OpenAI` client for us to get all\n    of the benefits that `Instructor` provides. With a simple `patch`, we can get\n    **automatic type coercion of our outputs and automatic retries for invalid outputs**\n    out of the box!\n\n2.  We first generate an initial summary. Note here that we explicitly ask for a summary that has\n    80 words and is lengthy with overly verbose fillers in the system prompt\n\n3.  We slightly modify the original system prompt used in the original paper to perform a rewrite of the summary.\n    Using `Instructor`, we also get validation of the generated output with our `field_validator`s that we defined above\n\n4.  If you've chosen a value that is larger than 0.08, make sure to increase this value in case you need to do multiple rewrites\n\nThis summarization function yields a result which triples the number of entities while maintaining the same number of tokens. We can also see that stylistically, the summary is a lot more natural.\n\n**First Iteration**\n\n> This article discusses the highly-anticipated boxing match between Manny Pacquiao and Floyd Mayweather. The article revolves around Manny Pacquiao's statements about his upcoming fight and his preparations for the same. A portion of the article provides details about the financial stipulations of the match and its significance in the sporting arena. Quotes from Pacquiao illustrating his determination and his battle strategy are highlighted. The tone of the article is largely centered around creating a build-up to the upcoming mega event.\n\n**Final Iteration**\n\n> Manny Pacquiao, the Filipino boxer, anticipates the forthcoming May 2 showdown at the MGM Grand as the fight of his life, against the undefeated American Floyd Mayweather, in a $300m bout. Despite being seen as the underdog in this high-stakes Las Vegas match, Pacquiao is confident, promising a warrior's spirit and assuring the fans who have been awaiting this encounter for a decade, that it will indeed be the biggest sporting spectacle in history worthy of their anticipation\n\n## Part 2) Fine-Tuning\n\nIn this section, we'll look into how to fine-tune a GPT 3.5 model so that it is able to perform at an equivalent level as a GPT-4 model. We'll then compare the performance of our model against that of `GPT-4` to see how it stacks up.\n\n### Creating a Training Set\n\nIn order to prevent any contamination of data during testing, we randomly sampled 120 articles from the `griffin/chain-of-density` dataset and split these articles into a `train.csv` and a `test.csv` file which we uploaded to [Hugging Face](https://huggingface.co/datasets/ivanleomk/gpt4-chain-of-density). Now, we just neeed to import the `Instructions` module from the `Instructor` package which allows you to generate a nicely formatted `.jsonl` file to be used for fine-tuning\n\n```py hl_lines=\"2 9 11 13-21 40 43\"\nfrom typing import List\nfrom chain_of_density import summarize_article #(1)!\nimport csv\nimport logging\nimport instructor\nfrom pydantic import BaseModel\nclient = instructor.from_provider(\"openai/gpt-5-nano\") # (2)!\n\nlogging.basicConfig(level=logging.INFO) #(3)!\n\ninstructions = instructor.Instructions( #(4)!\n    name=\"Chain Of Density\",\n    finetune_format=\"messages\",\n    # log handler is used to save the data to a file\n    # you can imagine saving it to a database or other storage\n    # based on your needs!\n    log_handlers=[logging.FileHandler(\"generated.jsonl\")],\n    openai_client=client,\n)\n\nclass GeneratedSummary(BaseModel):\n    \"\"\"\n    This represents a highly concise summary that includes as many entities as possible from the original source article.\n\n    An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\n\n    Guidelines\n    - Make every word count\n    - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\n    - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\"\n    \"\"\"\n\n    summary: str = Field(\n        ...,\n        description=\"This represents the final summary generated that captures the meaning of the original article which is as concise as possible. \",\n    )\n\n@instructions.distil #(4)!\ndef distil_summarization(text: str) -> GeneratedSummary:\n    summary_chain: List[str] = summarize_article(text)\n    return GeneratedSummary(summary=summary_chain[-1]) #(5)!\n\nwith open(\"train.csv\", \"r\") as file:\n    reader = csv.reader(file)\n    next(reader)  # Skip the header\n    for article, summary in reader:\n        # Run Distillisation to generate the values\n        distil_summarization(article)\n```\n\n1.  In this example, we're using the summarize_article that we defined up above. We saved it in a local file called `chain_of_density.py`,\n    hence the import\n\n2.  We patch the default OpenAI client so that we can use the Instructor library with it\n\n3.  We also need to configure logging at the `INFO` level. This is very important, if this is not configured, your output will not be generated.\n\n4.  We instantiate a `Instruction` object which will help us handle the conversion of our function calls into a valid `.jsonl` file. We also define\n    the name of the `.jsonl` file in the `log_handlers` parameter\n\n5.  We add in an `instructions.distil` annotation so that we automatically capture the input and output of the function we'd like to\n    fine-tune our model to output\n\n6.  We return a `Pydantic` object which matches the annotation that we use on our function. Note that we must specify a `Pydantic` object to\n    be returned when using the `instructions.distil` annotation\n\n!!! warning \"Rate Limiting\"\n\n    We recommend running this script on a small subset of the dataset first to test you've got everything configured nicely.\n    Don't forget to add in rate limiting error handling with `tenacity` and set the `OPENAI_API_KEY` shell environment variable\n    before running any subsequent commands\n\n### Creating Fine-Tuning Jobs\n\nOnce we run this script, we'll have a new file called `generated.jsonl` in our local repository. Now all that's left is to run the command below to start fine-tuning your first model!\n\n```sh\ninstructor jobs create-from-file generated.jsonl\n```\n\n??? notes \"Finetuning Reference\"\n\n    Checking out our [Finetuning CLI](../../cli/finetune.md) to learn about other hyperparameters that you can tune to improve your model's performance.\n\nOnce the job is complete, all we need to do is to then change the annotation in the function call to `distil_summarization` in our original file above to start using our new model.\n\n```py\n@instructions.distil(model='gpt-3.5-turbo:finetuned-123', mode=\"dispatch\")  # (1)!\ndef distil_summarization(text: str) -> GeneratedSummary:\n    summary_chain: List[str] = summarize_article(text)\n    return GeneratedSummary(summary=summary_chain[-1])\n```\n\n1. Don't forget to replace this with your new model id. OpenAI identifies fine tuned models with an id of\n   ft:gpt-3.5-turbo-0613:personal::<id> under their Fine-tuning tab on their dashboard\n\nWith that, you've now got your own fine-tuned model ready to go and serve data in production. We've seen how Instructor can make your life easier, from fine-tuning to distillation.\n\n## Results and Benchmarks\n\nWe'll be comparing the following models in 3 ways using 20 articles that were not used for fine-tuning.\n\n- Entity Density : This is entities per token, the higher the better for density.\n- Latency : Time to last token generated in seconds\n- Costs : Total cost to generate outputs - we break down the cost into training and inference costs for easy reference\n\n`3.5 Finetuned (n)`\n\n: This is a GPT 3.5 model that we fine-tuned on `n` examples. Each model was finetuned for 4-5 epochs ( This was automatically decided by the OpenAI scheduler )\n\n`GPT-4 (COD)`\n\n: This is a GPT4 model which we applied 3 rounds of Chain Of Density rewrites to generate a summary with using the methodology above\n\n`GPT-3.5 (Vanilla)`\n\n: This is a GPT 3.5 model that we asked to generate entity-dense summaries which were concise. Summaries were generated in a single pass targeting about 80-90 tokens.\n\n| Model              | Mean Latency (s) | Mean Entity Density |\n| ------------------ | ---------------- | ------------------- |\n| 3.5 Finetuned (20) | 2.1              | 0.15                |\n| 3.5 Finetuned (50) | 2.1              | 0.14                |\n| 3.5 Finetuned (76) | 2.1              | 0.14                |\n| GPT-3.5 (Vanilla)  | 16.8             | 0.12                |\n| GPT-4 (COD)        | 49.5             | 0.15                |\n\n??? notes \"Finetuning Datasets\"\n\n    For our finetuned models, we did a few optimisations to raise the performance.\n\n    We only included summaries that had a minimum density of 0.15 in the dataset, took the summary in the entire chain with the highest density as the final one, forced every regenerated summary to have a minimum density of 0.12 and regenerated summaries up to three times if they didn't meet the summaries. **This is a much more expensive strategy and can cost up to 2.5x or more what we do in this tutorial**\n\n    This resulted in the total cost of $63.46 to generate just 75 examples due to the stringent requirements, translating to about $0.85 per generated summary example.\n\nUsing the OpenAI Usage Dashboard, we can calculate the cost of generating 20 summaries as seen below.\n\n| Model              | Training Cost ($) | Inference Cost ($) | Tokens Used | Total Cost ($) |\n| ------------------ | ----------------- | ------------------ | ----------- | -------------- |\n| GPT-3.5 (Vanilla)  | -                 | 0.20               | 51,162      | 0.2            |\n| 3.5 Finetuned (20) | 0.7               | 0.20               | 56,573      | 0.8            |\n| 3.5 Finetuned (50) | 1.4               | 0.17               | 49,057      | 1.3            |\n| 3.5 Finetuned (76) | 1.8               | 0.17               | 51,583      | 2.5            |\n| GPT-4 (COD)        | -                 | 12.9               | 409,062     | 12.9           |\n\nHere, we can see that `GPT-4` has an approximate inference cost of `0.65` per summary while our finetuned models have an inference cost of `0.0091` per summary which is ~ `72x` cheaper.\n\nInterestingly, the model finetuned with the least examples seems to outperform the others. While the reason for this is unknown, a few potential reasons could be that either we didn't train for sufficient epochs ( We chose the default 5 epochs ) or that the models started learning to imitate other behaviour such as more abstract writing styles from the larger variety of samples, resulting in a decrease in entity density.\n\n## Conclusions\n\nFinetuning this iterative method was 20-40x faster while improving overall performance, resulting in massive efficiency gains by finetuning and distilling capabilities into specialized models.\n\nWe've seen how `Instructor` can make your life easier, from data modeling to distillation and finetuning. If you enjoy the content or want to try out `instructor` check out the [github](https://github.com/jxnl/instructor) and don't forget to give us a star!"
  },
  {
    "path": "docs/blog/posts/chat-with-your-pdf-with-gemini.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - Gemini\n  - Document Processing\ncomments: true\ndate: 2024-11-11\ndescription: Learn how to use Google's Gemini model with Instructor to process PDFs and extract structured information\ndraft: false\ntags:\n  - Gemini\n  - Document Processing\n  - PDF Analysis\n  - Pydantic\n  - Python\n---\n\n# PDF Processing with Structured Outputs with Gemini\n\nIn this post, we'll explore how to use Google's Gemini model with Instructor to analyse the [Gemini 1.5 Pro Paper](https://github.com/google-gemini/generative-ai-python/blob/0e5c5f25fe4ce266791fa2afb20d17dee780ca9e/third_party/test.pdf) and extract a structured summary.\n\n## The Problem\n\nProcessing PDFs programmatically has always been painful. The typical approaches all have significant drawbacks:\n\n- **PDF parsing libraries** require complex rules and break easily\n- **OCR solutions** are slow and error-prone\n- **Specialized PDF APIs** are expensive and require additional integration\n- **LLM solutions** often need complex document chunking and embedding pipelines\n\nWhat if we could just hand a PDF to an LLM and get structured data back? With Gemini's multimodal capabilities and Instructor's structured output handling, we can do exactly that.\n\n## Quick Setup\n\nFirst, install the required packages:\n\n```bash\npip install \"instructor[google-generativeai]\"\n```\n\nThen, here's all the code you need:\n\n```python\nimport instructor\nimport google.generativeai as genai\nfrom google.ai.generativelanguage_v1beta.types.file import File\nfrom pydantic import BaseModel\nimport time\n\n# Initialize the client\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n\n# Define your output structure\nclass Summary(BaseModel):\n    summary: str\n\n\n# Upload the PDF\nfile = genai.upload_file(\"path/to/your.pdf\")\n\n# Wait for file to finish processing\nwhile file.state != File.State.ACTIVE:\n    time.sleep(1)\n    file = genai.get_file(file.name)\n    print(f\"File is still uploading, state: {file.state}\")\n\nprint(f\"File is now active, state: {file.state}\")\nprint(file)\n\nresp = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": [\"Summarize the following file\", file]},\n    ],\n    response_model=Summary,\n)\n\nprint(resp.summary)\n```\n\n??? note \"Expand to see Raw Results\"\n\n    ```bash\n    summary=\"Gemini 1.5 Pro is a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. It achieves near-perfect recall on long-context retrieval tasks across modalities, improves the state-of-the-art in long-document QA, long-video QA and long-context ASR, and matches or surpasses Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Gemini 1.5 Pro is built to handle extremely long contexts; it has the ability to recall and reason over fine-grained information from up to at least 10M tokens. This scale is unprecedented among contemporary large language models (LLMs), and enables the processing of long-form mixed-modality inputs including entire collections of documents, multiple hours of video, and almost five days long of audio. Gemini 1.5 Pro surpasses Gemini 1.0 Pro and performs at a similar level to 1.0 Ultra on a wide array of benchmarks while requiring significantly less compute to train. It can recall information amidst distractor context, and it can learn to translate a new language from a single set of linguistic documentation. With only instructional materials (a 500-page reference grammar, a dictionary, and ≈ 400 extra parallel sentences) all provided in context, Gemini 1.5 Pro is capable of learning to translate from English to Kalamang, a Papuan language with fewer than 200 speakers, and therefore almost no online presence.\"\n    ```\n\n## Benefits\n\nThe combination of Gemini and Instructor offers several key advantages over traditional PDF processing approaches:\n\n**Simple Integration** - Unlike traditional approaches that require complex document processing pipelines, chunking strategies, and embedding databases, you can directly process PDFs with just a few lines of code. This dramatically reduces development time and maintenance overhead.\n\n**Structured Output** - Instructor's Pydantic integration ensures you get exactly the data structure you need. The model's outputs are automatically validated and typed, making it easier to build reliable applications. If the extraction fails, Instructor automatically handles the retries for you with support for [custom retry logic using tenacity](../../concepts/retrying.md).\n\n**Multimodal Support** - Gemini's multimodal capabilities mean this same approach works for various file types. You can process images, videos, and audio files all in the same api request. Check out our [multimodal processing guide](./multimodal-gemini.md) to see how we extract structured data from travel videos.\n\n## Conclusion\n\nWorking with PDFs doesn't have to be complicated.\n\nBy combining Gemini's multimodal capabilities with Instructor's structured output handling, we can transform complex document processing into simple, Pythonic code.\n\nNo more wrestling with parsing rules, managing embeddings, or building complex pipelines - just define your data model and let the LLM do the heavy lifting.\n\n## Related Documentation\n- [Multimodal Processing](../../concepts/multimodal.md) - Core multimodal concepts\n\n## See Also\n- [Gemini Multimodal Features](multimodal-gemini.md) - Full Gemini capabilities\n- [PDF Citation Generation](generating-pdf-citations.md) - Extract citations from PDFs\n- [RAG and Beyond](rag-and-beyond.md) - Advanced document processing\n\nIf you liked this, give `instructor` a try today and see how much easier structured outputs makes working with LLMs become. [Get started with Instructor today!](../../index.md)\n"
  },
  {
    "path": "docs/blog/posts/citations.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2023-11-18\ndescription: Explore how Pydantic enhances LLM citation verification, improving data\n  accuracy and reliability in responses.\ndraft: false\nslug: validate-citations\ntags:\n- Pydantic\n- LLM\n- Data Accuracy\n- Citation Verification\n- Python\n---\n\n# Verifying LLM Citations with Pydantic\n\nEnsuring the accuracy of information is crucial. This blog post explores how Pydantic's powerful and flexible validators can enhance data accuracy through citation verification.\n\nWe'll start with using a simple substring check to verify citations. Then we'll use `instructor` itself to power an LLM to verify citations and align answers with the given citations. Finally, we'll explore how we can use these techniques to generate a dataset of accurate responses.\n\n<!-- more -->\n\n## Example 1: Simple Substring Check\n\nIn this example, we use the `Statements` class to verify if a given substring quote exists within a text chunk. If the substring is not found, an error is raised.\n\n### Code Example:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, field_validator\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Statements(BaseModel):\n    body: str\n    substring_quote: str\n\n    @field_validator(\"substring_quote\")\n    @classmethod\n    def substring_quote_exists(cls, v: str, info: ValidationInfo):\n        context = info.context.get(\"text_chunks\", None)\n\n        for text_chunk in context.values():\n            if v in text_chunk:  # (1)\n                return v\n        raise ValueError(\"Could not find substring_quote `{v}` in contexts\")\n\n\nclass AnswerWithCitaton(BaseModel):\n    question: str\n    answer: List[Statements]\n```\n\n1. While we use a simple substring check in this example, we can use more complex techniques like regex or Levenshtein distance.\n\nOnce the class is defined, we can use it to validate the context and raise an error if the substring is not found.\n\n```python\ntry:\n    AnswerWithCitaton.model_validate(\n        {\n            \"question\": \"What is the capital of France?\",\n            \"answer\": [\n                {\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"},\n            ],\n        },\n        context={\n            \"text_chunks\": {\n                1: \"Jason is a pirate\",\n                2: \"Paris is not the capital of France\",\n                3: \"Irrelevant data\",\n            }\n        },\n    )\nexcept ValidationError as e:\n    print(e)\n```\n\n### Error Message Example:\n\n```\nanswer.0.substring_quote\n  Value error, Could not find substring_quote `Paris is the capital of France` in contexts [type=value_error, input_value='Paris is the capital of France', input_type=str]\n    For further information visit [https://errors.pydantic.dev/2.4/v/value_error](https://errors.pydantic.dev/2.4/v/value_error)\n```\n\nPydantic raises a validation error when the `substring_quote` attribute does not exist in the context. This approach can be used to validate more complex data using techniques like regex or Levenshtein distance.\n\n## Example 2: Using LLM for Verification\n\nThis approach leverages OpenAI's LLM to validate citations. If the citation does not exist in the context, the LLM returns an error message.\n\n### Code Example:\n\n```python\nclass Validation(BaseModel):\n    is_valid: bool\n    error_messages: Optional[str] = Field(None, description=\"Error messages if any\")\n\n\nclass Statements(BaseModel):\n    body: str\n    substring_quote: str\n\n    @model_validator(mode=\"after\")\n    def substring_quote_exists(self, info: ValidationInfo):\n        context = info.context.get(\"text_chunks\", None)\n\n        resp: Validation = client.create(\n            response_model=Validation,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Does the following citation exist in the following context?\\n\\nCitation: {self.substring_quote}\\n\\nContext: {context}\",\n                }\n            ],\n            model=\"gpt-3.5-turbo\",\n        )\n\n        if resp.is_valid:\n            return self\n\n        raise ValueError(resp.error_messages)\n\n\nclass AnswerWithCitaton(BaseModel):\n    question: str\n    answer: List[Statements]\n```\n\nNow when we use a correct citation, the LLM returns a valid response.\n\n```python\nresp = AnswerWithCitaton.model_validate(\n    {\n        \"question\": \"What is the capital of France?\",\n        \"answer\": [\n            {\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"},\n        ],\n    },\n    context={\n        \"text_chunks\": {\n            1: \"Jason is a pirate\",\n            2: \"Paris is the capital of France\",\n            3: \"Irrelevant data\",\n        }\n    },\n)\nprint(resp.model_dump_json(indent=2))\n```\n\n### Result:\n\n```json\n{\n  \"question\": \"What is the capital of France?\",\n  \"answer\": [\n    {\n      \"body\": \"Paris\",\n      \"substring_quote\": \"Paris is the capital of France\"\n    }\n  ]\n}\n```\n\nWhen we have citations that don't exist in the context, the LLM returns an error message.\n\n```python\ntry:\n    AnswerWithCitaton.model_validate(\n        {\n            \"question\": \"What is the capital of France?\",\n            \"answer\": [\n                {\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"},\n            ],\n        },\n        context={\n            \"text_chunks\": {\n                1: \"Jason is a pirate\",\n                2: \"Paris is not the capital of France\",\n                3: \"Irrelevant data\",\n            }\n        },\n    )\nexcept ValidationError as e:\n    print(e)\n```\n\n### Error Message Example:\n\n```\n1 validation error for AnswerWithCitaton\nanswer.0\n  Value error, Citation not found in context [type=value_error, input_value={'body': 'Paris', 'substr... the capital of France'}, input_type=dict]\n    For further information visit [https://errors.pydantic.dev/2.4/v/value_error](https://errors.pydantic.dev/2.4/v/value_error)\n```\n\n## Example 3: Aligning Citations and Answers\n\nIn this example, we ensure that the provided answers are aligned with the given citations and context. The LLM is used to verify the alignment.\n\nWe use the same `Statements` model as above, but we add a new model for the answer that also verifies the alignment of citations.\n\n### Code Example:\n\n```python\nclass AnswerWithCitaton(BaseModel):\n    question: str\n    answer: List[Statements]\n\n    @model_validator(mode=\"after\")\n    def validate_answer(self, info: ValidationInfo):\n        context = info.context.get(\"text_chunks\", None)\n\n        resp: Validation = client.create(\n            response_model=Validation,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Does the following answers match the question and the context?\\n\\nQuestion: {self.question}\\n\\nAnswer: {self.answer}\\n\\nContext: {context}\",\n                }\n            ],\n            model=\"gpt-3.5-turbo\",\n        )\n\n        if resp.is_valid:\n            return self\n\n        raise ValueError(resp.error_messages)\n```\n\nWhen we have a mismatch between the answer and the citation, the LLM returns an error message.\n\n```python\ntry:\n    AnswerWithCitaton.model_validate(\n        {\n            \"question\": \"What is the capital of France?\",\n            \"answer\": [\n                {\"body\": \"Texas\", \"substring_quote\": \"Paris is the capital of France\"},\n            ],\n        },\n        context={\n            \"text_chunks\": {\n                1: \"Jason is a pirate\",\n                2: \"Paris is the capital of France\",\n                3: \"Irrelevant data\",\n            }\n        },\n    )\nexcept ValidationError as e:\n    print(e)\n```\n\n### Error Message Example:\n\n```\n1 validation error for AnswerWithCitaton\n  Value error, The answer does not match the question and context [type=value_error, input_value={'question': 'What is the...he capital of France'}]}, input_type=dict]\n    For further information visit [https://errors.pydantic.dev/2.4/v/value_error](https://errors.pydantic.dev/2.4/v/value_error)\n```\n\n## Related Documentation\n- [Validation Guide](../../concepts/validation.md) - Validate citations\n\n## See Also\n- [RAG Techniques](rag-and-beyond.md) - Use citations in RAG\n- [PDF Citations](generating-pdf-citations.md) - Extract from PDFs\n- [Validation Basics](validation-part1.md) - Ensure citation quality\n\n## Conclusion\n\nThese examples demonstrate the potential of using Pydantic and OpenAI to enhance data accuracy through citation verification. While the LLM-based approach may not be efficient for runtime operations, it has exciting implications for generating a dataset of accurate responses. By leveraging this method during data generation, we can fine-tune a model that excels in citation accuracy. Similar to our last post on [finetuning a better summarizer](https://jxnl.github.io/instructor/blog/2023/11/05/chain-of-density/).\n\nIf you like the content check out our [GitHub](https://github.com/jxnl/instructor) as give us a star and checkout the library."
  },
  {
    "path": "docs/blog/posts/consistent-stories.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - OpenAI\ncomments: true\ndate: 2024-12-10\ndescription: Generating complex DAGS with gpt-4o\ndraft: false\ntags:\n  - OpenAI\n  - DAGs\n---\n\n# Consistent Stories with GPT-4o\n\nLanguage Models struggle to generate consistent graphs that have a large number of nodes. Often times, this is because the graph itself is too large for the model to handle. This causes the model to generate inconsistent graphs that have invalid and disconnected nodes among other issues.\n\nIn this article, we'll look at how we can get around this limitation by using a two-phase approach to generate complex DAGs with gpt-4o by looking at a simple example of generating a Choose Your Own Adventure story.\n\n<!-- more -->\n\n## Why do DAGs matter?\n\nDAGs are directed acyclic graphs. A graph is considered a DAG when every connection between nodes is directed ( it goes in a single direction ) and there are no cycles ( it doesn't loop back to a previous node ).\n\n```mermaid\ngraph TD\n    A --> B\n    A --> C\n    B --> D\n    C --> D\n```\n\nThis isn't too far away from a Choose Your Own Adventure story where users have a fixed set of choices at each step and can only move forward in the story. We can see this in action below:\n\n```mermaid\ngraph TD\n    A[Story Root] --> B[Choice 1]\n    A --> C[Choice 2]\n    A --> D[Choice 3]\n    B --> E[Choice 1.1]\n    B --> F[Choice 1.2]\n    C --> G[Choice 2.1]\n    C --> H[Choice 2.2]\n    D --> I[Choice 3.1]\n    D --> J[Choice 3.2]\n```\n\n## The Challenge: Scaling Story Generation\n\nWhen we try to use a language model to generate a story in a single run, this hits several limitations quickly because just with 4 choices at each step, we're already at 20 nodes by the second level. If users can only make 2 choices before our story ends, that doesn't result in a very interesting story to play with.\n\nIn other words, we'll overflow the context window of the model quickly. To get around this, we can use a two-phase approach to generate the story where we generate an initial story setting and then generate the choices/other options in parallel.\n\n## Parallel Story Generation\n\n### Generating an Outline\n\nFirst, we generate an outline of the story using gpt-4o. This is important because it gives us a starting setting, visual style and image description ( for the banner image ). We can then use this down the line to ensure the images we generate are consistent as much as possible.\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\n\n\nclass GeneratedStory(BaseModel):\n    setting: str\n    plot_summary: str\n    choices: List[str]\n    visual_style: str\n    image_description: str\n\n\nasync def generate_story(\n    client: instructor.AsyncInstructor, story_input: RestateStoryInput\n):\n    resp = await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n            Generate a story with:\n            - Setting: {{ story_input.setting}}\n            - Title: {{ story_input.title }}\n\n            Rules:\n            - Generate 2-4 initial choices that represent actions\n            - Choices must move story forward\n            - Include brief setting description\n            - Generate a visual description for the story\n\n            Required Elements:\n            1. Plot Summary: A vivid description of the setting and plot\n            2. Initial Choices: 2-4 distinct actions the user can take\n            3. Visual Style: Description of art style, color palette\n            4. Image Description: One-sentence scene description\n            \"\"\",\n            }\n        ],\n        model=\"gpt-4o\",\n        response_model=GeneratedStory,\n        context={\"story_input\": story_input},\n    )\n    return resp\n```\n\nThis outputs a story with a setting, plot summary, choices, visual style and image description.\n\n```bash\n# Example generated output\n{\n    \"setting\": \"A neon-lit cyberpunk metropolis in 2150\",\n    \"plot_summary\": \"In the sprawling city of Neo-Tokyo...\",\n    \"choices\": [\n        \"Investigate the mysterious signal in the abandoned district\",\n        \"Meet your contact at the underground hacker hub\",\n        \"Follow the corporate executive who seems suspicious\"\n    ],\n    \"visual_style\": \"Vibrant neon colors, detailed cyberpunk architecture\",\n    \"image_description\": \"A towering cyberpunk cityscape at night with neon signs\"\n}\n```\n\n### Parallel Choice Expansion\n\nOne of the biggest challenges in generating deep story trees is maintaining consistency as the story branches grow.\n\nHere's how we solve this with parallel generation and state tracking:\n\n```mermaid\ngraph TD\n    %% Main nodes\n    A[Find Door] --> B[Open Door]\n    A --> C[Walk Away]\n\n    B --> D[Read Book]\n    B --> E[Leave Room]\n\n    C --> F[Go Home]\n    C --> G[Wait Outside]\n\n    %% Styling for visual hierarchy\n    classDef start fill:#ff9999,stroke:#333,stroke-width:2px\n    classDef decision fill:#99ccff,stroke:#333,stroke-width:2px\n    classDef outcome fill:#99ffff,stroke:#333,stroke-width:1px\n\n    %% Apply styles\n    class A start\n    class B,C decision\n    class D,E,F,G outcome\n\n    %% Add tooltips for context\n    click B \"Door context\" \"Open Door Context\"\n    click C \"Away context\" \"Walk Away Context\"\n    click D \"Door and Book context\" \"Read Book Context\"\n```\n\nThe key insight is that each path through the story tree has its own unique state. We do so by having a simple accumulator that allows us to keep track of the previous choices and the story context.\n\nIt's also important to note here that the model also has the full flexibility to end the story at any point in time.\n\nHere's how we implement this:\n\n```python\nasync def rewrite_choice(\n    client: instructor.AsyncInstructor,\n    choice: str,\n    story: GeneratedStory,\n    prev_choices: list[dict],  # Accumulator for path state\n    max_depth: int,\n    sem: asyncio.Semaphore,\n) -> FinalStoryChoice:\n    # Each choice knows its entire path history\n    async with sem:\n        rewritten_choice = await client.create(\n            model=\"gpt-4o\",\n            response_model=RewrittenChoice,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"\"\"\n                Given this choice: {{ choice }}\n\n                Story context:\n                Setting: {{ story.setting }}\n                Plot: {{ story.plot_summary }}\n\n                Previous choices made in this path:\n                {% for prev in prev_choices %}\n                - {{ prev.choice_description }}\n                  Result: {{ prev.choice_consequences }}\n                {% endfor %}\n\n                Generate the next story beat and 2-4 new choices.\n                The story should end in {{ max_depth - len(prev_choices) }} more turns.\n                \"\"\",\n                }\n            ],\n            context={\n                \"choice\": choice,\n                \"story\": story,\n                \"prev_choices\": prev_choices,\n            },\n        )\n\n    # For terminal nodes (at max depth)\n    if len(prev_choices) == max_depth - 1:\n        return FinalStoryChoice(\n            choice_description=rewritten_choice.choice_description,\n            choice_consequences=rewritten_choice.choice_consequences,\n            choices=[],  # Terminal node\n        )\n\n    # Recursively expand child choices\n    child_choices = await asyncio.gather(\n        *[\n            rewrite_choice(\n                client=client,\n                choice=new_choice,\n                story=story,\n                prev_choices=prev_choices\n                + [\n                    {\n                        \"choice_description\": rewritten_choice.choice_description,\n                        \"choice_consequences\": rewritten_choice.choice_consequences,\n                    }\n                ],\n                max_depth=max_depth,\n                sem=sem,\n            )\n            for new_choice in rewritten_choice.choices\n        ]\n    )\n\n    return FinalStoryChoice(\n        choice_description=rewritten_choice.choice_description,\n        choice_consequences=rewritten_choice.choice_consequences,\n        choices=child_choices,\n    )\n```\n\nThis approach gives us several key benefits:\n\n1. **Path-Specific Context**: Each node maintains the complete history of choices that led to it, ensuring consistency within each branch\n2. **Parallel Generation**: Different branches can be generated simultaneously since they each maintain their own state\n3. **Controlled Growth**: The `max_depth` parameter prevents exponential expansion\n4. **Rate Limiting**: The semaphore controls concurrent API calls while allowing maximum parallelization\n\nThe semaphore isn't just for rate limiting - it ensures we process choices at a manageable pace while maintaining state consistency.\n\nEach path through the story tree becomes a self-contained narrative with access to its complete history, allowing us to generate coherent stories at a much faster speed and verbosity than a single call would be able to generate.\n\nAdditionally, we can generate stories that are much broader and deeper than a single call would be able to generate.\n\n## Beyond Story Generation\n\nThe success of this approach comes down to three key principles:\n\n1. **State Isolation**: Each node maintains only the context it needs, preventing context window overflow\n2. **Parallel Processing**: Generation can happen simultaneously across branches, dramatically reducing total generation time\n3. **Structured Validation**: Using Pydantic models ensures each generated component meets your requirements\n\nFor example, generating a 20-node story tree sequentially might take 60 seconds (3s per node), but with parallel generation and 10 concurrent requests, it could complete in just 45-50 seconds.\n\nThis pattern is particularly valuable when:\n\n- Your generation tasks naturally form a tree or graph structure\n- Individual nodes need some but not all context from their ancestors\n- You need to generate content that exceeds a single context window\n- Speed of generation is important\n\nBy combining structured outputs with parallel generation, you can reliably generate complex, interconnected content at scale while maintaining consistency and control.\n\n`instructor` makes it easy to generate complex Data Structures with language models - whether they're open source models with ollama or proprietary models with providers such as OpenAI. Give us a try today!\n"
  },
  {
    "path": "docs/blog/posts/course.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- OpenAI\ncomments: true\ndate: 2024-02-14\ndescription: Discover a free one-hour course on Weights and Biases covering essential\n  techniques for language models.\ndraft: false\nslug: weights-and-biases-course\ntags:\n- Weights and Biases\n- AI course\n- machine learning\n- language models\n- free resources\n---\n\n# Free course on Weights and Biases\n\nI just released a free course on wits and biases. It goes over the material from [tutorial](../../tutorials/1-introduction.ipynb). Check it out at [wandb.courses](https://www.wandb.courses/courses/steering-language-models) its free and open to everyone and just under an hour long!\n\n[![](img/course.png)](https://www.wandb.courses/courses/steering-language-models)\n\n> Click the image to access the course"
  },
  {
    "path": "docs/blog/posts/cursor-rules.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - Contributing\ncomments: true\ndate: 2025-03-18\ndescription:\n  Learn how Instructor's Cursor rules improve Git workflows for contributors, making AI-assisted coding more organized.\ndraft: false\nslug: cursor-rules-for-better-git-practices\ntags:\n  - Git\n  - Cursor\n  - Contributing\n  - Best Practices\n---\n\n# Instructor Adopting Cursor Rules\n\nAI-assisted coding is changing how we use version control. Many developers now use what I call \"vibe coding\" - coding with AI help. This creates new challenges with Git. Today I'll share how we're using Cursor rules in Instructor to solve these problems.\n\n<!-- more -->\n\n## The Git Problem When Coding with AI\n\nIn my blog post [Version Control for the Vibe Coder (Part 1)](https://jxnl.co/writing/2025/03/18/version-control-for-the-vibe-coder-part-1/), I wrote about the problem:\n\n> \"Imagine this: you open Cursor, ask it to build a feature in YOLO-mode, and let it rip. You feel great as you watch code materialize... until you realize you haven't made a single commit, your branch is a mess, and you have no idea how to organize these changes for review.\"\n\nThis happens often. When using AI tools like Cursor, we focus on creating code quickly but forget about version control. This leads to big, messy commits that are hard to review.\n\n## How Cursor Rules Help\n\nWe've added Cursor rules to Instructor. These rules help standardize Git workflows inside Cursor. The rules are simple markdown files in the `.cursor/rules` directory that guide Cursor when working with your code.\n\nAs I wrote in [Version Control for the Vibe Coder (Part 2)](https://jxnl.co/writing/2025/03/18/version-control-for-the-vibe-coder-part-2/):\n\n> \"Add rules to `.cursor/rules` to instruct Cursor clearly and repeatedly... The real key to success with Git is much simpler: Make Small, Frequent Commits... Let Cursor Handle the Rest.\"\n\nThis balances fast AI coding with good teamwork practices.\n\n## How Our Cursor Rules Help Contributors\n\nIf you want to contribute to Instructor, our Cursor rules will make it easier. Here's how:\n\n### 1. Better Branching and Commits\n\nThe rules help Cursor suggest good Git practices. When building a new feature, Cursor will help you:\n\n- Create well-named branches\n- Make small commits with clear messages\n- Format PR descriptions correctly\n\n### 2. Simpler PR Process\n\nOur rules define how to create and manage pull requests:\n\n- Format PR descriptions\n- Add the right reviewers\n- Use stacked PRs for big features (as I explain in my Part 2 blog post)\n\n### 3. Keeping Docs Updated\n\nThe rules remind you to update docs when code changes, which keeps our project docs accurate.\n\n## Getting Started\n\nIf you're new to Instructor or Cursor, here's how to use these rules:\n\n1. **Install Cursor**: Download it from [cursor.sh](https://cursor.sh/)\n2. **Clone Instructor**: `git clone https://github.com/instructor-ai/instructor.git`\n3. **Open in Cursor**: The `.cursor/rules` will load automatically\n4. **Make changes**: Let Cursor guide your Git workflow\n5. **Create a PR**: Follow Cursor's suggestions\n\nYou don't need to remember all the Git commands. The rules will help Cursor suggest the right steps.\n\n## Stacked PRs for Bigger Features\n\nOne key practice in our rules is stacked PRs. As I explain:\n\n> \"Stacked pull requests are a powerful workflow for building complex features incrementally. Instead of one massive PR, you create a series of smaller, dependent PRs that build upon each other.\"\n\nThis helps Instructor because it allows:\n\n- Focused code reviews\n- Easier merging of changes\n- Better organization of big features\n- Clear documentation of decisions\n\nThe rules show you how to make and manage stacked PRs without confusion.\n\n## Keeping the Human Touch\n\nA big benefit of Cursor rules is keeping people central to the process. While AI helps write code, the rules ensure:\n\n- Code changes stay clear and reviewable\n- Docs stay current\n- Commit history tells a clear story\n- Contributors get credit for their work\n\n## Try It Out\n\nI invite you to make a PR to Instructor with small changes. Using AI-assisted coding with Git through Cursor rules makes contributing easier and more fun.\n\nStart small - fix a typo or add an example to the cookbook. Open the repo in Cursor and let the rules guide you through making a clean PR. This lets you focus on writing good code instead of figuring out Git commands.\n\nRemember: \"The most important Git skill is making regular, small commits. Everything else - bisecting, stacked PRs, complex rebases - these are just tools that Cursor can handle for you.\"\n\nWith Cursor rules, you get fast AI coding plus good team practices.\n\nIf you want to add Cursor rules to your own open source projects, I can help! Reach out to me on Twitter at [@jxnlco](https://twitter.com/jxnlco) and I'll share what we've learned.\n\nHappy coding!"
  },
  {
    "path": "docs/blog/posts/distilation-part1.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2023-10-17\ndescription: Explore Instructor for fine-tuning language models with Python, simplifying\n  function calls, and enhancing performance.\ndraft: false\ntags:\n- Instructor\n- Fine-tuning\n- Python\n- Language Models\n- Distillation\n---\n\n# Enhancing Python Functions with Instructor: A Guide to Fine-Tuning and Distillation\n\n## Introduction\n\nGet ready to dive deep into the world of fine-tuning task specific language models with Python functions. We'll explore how the `instructor.instructions` streamlines this process, making the task you want to distil more efficient and powerful while preserving its original functionality and backwards compatibility.\n\nIf you want to see the full example checkout [examples/distillation](https://github.com/jxnl/instructor/tree/main/examples/distilations)\n\n<!-- more -->\n\n## Why use Instructor?\n\nImagine you're developing a backend service that uses a mix old and new school ML practises, it may involve pipelines with multiple function calls, validations, and data processing. Sounds cumbersome, right? That's where `Instructor` comes in. It simplifies complex procedures, making them more efficient and easier to manage by adding a decorator to your function that will automatically generate a dataset for fine-tuning and help you swap out the function implementation.\n\n## Quick Start: How to Use Instructor's Distillation Feature\n\nBefore we dig into the nitty-gritty, let's look at how easy it is to use Instructor's distillation feature to use function calling finetuning to export the data to a JSONL file.\n\n```python\nimport logging\nimport random\nfrom pydantic import BaseModel\nfrom instructor import Instructions  # pip install instructor\n\n# Logging setup\nlogging.basicConfig(level=logging.INFO)\n\ninstructions = Instructions(\n    name=\"three_digit_multiply\",\n    finetune_format=\"messages\",\n    # log handler is used to save the data to a file\n    # you can imagine saving it to a database or other storage\n    # based on your needs!\n    log_handlers=[logging.FileHandler(\"math_finetunes.jsonl\")],\n)\n\n\nclass Multiply(BaseModel):\n    a: int\n    b: int\n    result: int\n\n\n# Define a function with distillation\n# The decorator will automatically generate a dataset for fine-tuning\n# They must return a pydantic model to leverage function calling\n@instructions.distil\ndef fn(a: int, b: int) -> Multiply:\n    resp = a * b\n    return Multiply(a=a, b=b, result=resp)\n\n\n# Generate some data\nfor _ in range(10):\n    a = random.randint(100, 999)\n    b = random.randint(100, 999)\n    print(fn(a, b))\n    #> a=268 b=548 result=146864\n    #> a=774 b=447 result=345978\n    #> a=154 b=902 result=138908\n    #> a=304 b=808 result=245632\n    #> a=980 b=104 result=101920\n    #> a=725 b=455 result=329875\n    #> a=206 b=386 result=79516\n    #> a=488 b=920 result=448960\n    #> a=989 b=889 result=879221\n    #> a=815 b=343 result=279545\n```\n\n## The Intricacies of Fine-tuning Language Models\n\nFine-tuning isn't just about writing a function like `def f(a, b): return a * b`. It requires detailed data preparation and logging. However, Instructor provides a built-in logging feature and structured outputs to simplify this.\n\n## Why Instructor and Distillation are Game Changers\n\nThe library offers two main benefits:\n\n1. **Efficiency**: Streamlines functions, distilling requirements into model weights and a few lines of code.\n2. **Integration**: Eases combining classical machine learning and language models by providing a simple interface that wraps existing functions.\n\n## Role of Instructor in Simplifying Fine-Tuning\n\nThe `from instructor import Instructions` feature is a time saver. It auto-generates a fine-tuning dataset, making it a breeze to imitate a function's behavior.\n\n## Logging Output and Running a Finetune\n\nHere's how the logging output would look:\n\n```python\n{\n    \"messages\": [\n        {\"role\": \"system\", \"content\": 'Predict the results of this function: ...'},\n        {\"role\": \"user\", \"content\": 'Return fn(133, b=539)'},\n        {\n            \"role\": \"assistant\",\n            \"function_call\": {\n                \"name\": \"Multiply\",\n                \"arguments\": '{\"a\":133,\"b\":539,\"result\":89509}',\n            },\n        },\n    ],\n    \"functions\": [\n        {\"name\": \"Multiply\", \"description\": \"Correctly extracted `Multiply`...\"}\n    ],\n}\n```\n\nRun a finetune like this:\n\n!!! note annotate \"Don't forget to set your OpenAI Key as an environment variable\"\n\n    All of the `instructor jobs` commands assume you've set an environment variable of `OPENAI_API_KEY` in your shell. You can set this by running the command `export OPENAI_API_KEY=<Insert API Key Here>` in your shell\n\n```bash\ninstructor jobs create-from-file math_finetunes.jsonl\n```\n\n## Next Steps and Future Plans\n\nHere's a sneak peek of what I'm planning:\n\n```python\nfrom instructor import Instructions, patch\n\npatch()  # (1)!\n\n\nclass Multiply(BaseModel):\n    a: int\n    b: int\n    result: int\n\n\ninstructions = Instructions(\n    name=\"three_digit_multiply\",\n)\n\n\n@instructions.distil(model='gpt-3.5-turbo:finetuned-123', mode=\"dispatch\")  # (2)!\ndef fn(a: int, b: int) -> Multiply:\n    resp = a + b\n    return Multiply(a=a, b=b, result=resp)\n```\n\n1.  Don't forget to run the `patch()` command that we provide with the `Instructor` package. This helps\n    automatically serialize the content back into the `Pydantic`` model that we're looking for.\n\n2.  Don't forget to replace this with your new model id. OpenAI identifies fine tuned models with an id\n    of `ft:gpt-3.5-turbo-0613:personal::<id>` under their **Fine-tuning** tab on their dashboard\n\nWith this, you can swap the function implementation, making it backward compatible. You can even imagine using the different models for different tasks or validating and running evals by using the original function and comparing it to the distillation.\n\n## Conclusion\n\nWe've seen how `Instructor` can make your life easier, from fine-tuning to distillation. Now if you're thinking wow, I'd love a backend service to do this for continuously, you're in luck! Please check out the survey at [useinstructor.com](https://useinstructor.com) and let us know who you are.\n\nIf you enjoy the content or want to try out `instructor` please check out the [github](https://github.com/jxnl/instructor) and give us a star!"
  },
  {
    "path": "docs/blog/posts/extract-model-looks.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - OpenAI\ncomments: true\ndate: 2024-12-10\ndescription: Generating complex DAGS with gpt-4o\ndraft: false\ntags:\n  - OpenAI\n  - Multimodal\n---\n\n# Consistent Stories with GPT-4o\n\nLanguage Models struggle to generate consistent graphs that have a large number of nodes. Often times, this is because the graph itself is too large for the model to handle. This causes the model to generate inconsistent graphs that have invalid and disconnected nodes among other issues.\n\nIn this article, we'll look at how we can get around this limitation by using a two-phase approach to generate complex DAGs with gpt-4o by looking at a simple example of generating a Choose Your Own Adventure story.\n\n<!-- more -->\n\n## Why do DAGs matter?\n\nDAGs are directed acyclic graphs. A graph is considered a DAG when every connection between nodes is directed ( it goes in a single direction ) and there are no cycles ( it doesn't loop back to a previous node ).\n\n```mermaid\ngraph TD\n    A --> B\n    A --> C\n    B --> D\n    C --> D\n```\n\nThis isn't too far away from a Choose Your Own Adventure story where users have a fixed set of choices at each step and can only move forward in the story. We can see this in action below:\n\n```mermaid\ngraph TD\n    A[Story Root] --> B[Choice 1]\n    A --> C[Choice 2]\n    A --> D[Choice 3]\n    B --> E[Choice 1.1]\n    B --> F[Choice 1.2]\n    C --> G[Choice 2.1]\n    C --> H[Choice 2.2]\n    D --> I[Choice 3.1]\n    D --> J[Choice 3.2]\n```\n\n## The Challenge: Scaling Story Generation\n\nWhen we try to use a language model to generate a story in a single run, this hits several limitations quickly because just with 4 choices at each step, we're already at 20 nodes by the second level. If users can only make 2 choices before our story ends, that doesn't result in a very interesting story to play with.\n\nIn other words, we'll overflow the context window of the model quickly. To get around this, we can use a two-phase approach to generate the story where we generate an initial story setting and then generate the choices/other options in parallel.\n\n## Parallel Story Generation\n\n### Generating an Outline\n\nFirst, we generate an outline of the story using gpt-4o. This is important because it gives us a starting setting, visual style and image description ( for the banner image ). We can then use this down the line to ensure the images we generate are consistent as much as possible.\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\n\n\nclass GeneratedStory(BaseModel):\n    setting: str\n    plot_summary: str\n    choices: List[str]\n    visual_style: str\n    image_description: str\n\n\nasync def generate_story(\n    client: instructor.AsyncInstructor, story_input: RestateStoryInput\n):\n    resp = await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n            Generate a story with:\n            - Setting: {{ story_input.setting}}\n            - Title: {{ story_input.title }}\n\n            Rules:\n            - Generate 2-4 initial choices that represent actions\n            - Choices must move story forward\n            - Include brief setting description\n            - Generate a visual description for the story\n\n            Required Elements:\n            1. Plot Summary: A vivid description of the setting and plot\n            2. Initial Choices: 2-4 distinct actions the user can take\n            3. Visual Style: Description of art style, color palette\n            4. Image Description: One-sentence scene description\n            \"\"\",\n            }\n        ],\n        model=\"gpt-4o\",\n        response_model=GeneratedStory,\n        context={\"story_input\": story_input},\n    )\n    return resp\n```\n\nThis outputs a story with a setting, plot summary, choices, visual style and image description.\n\n```bash\n# Example generated output\n{\n    \"setting\": \"A neon-lit cyberpunk metropolis in 2150\",\n    \"plot_summary\": \"In the sprawling city of Neo-Tokyo...\",\n    \"choices\": [\n        \"Investigate the mysterious signal in the abandoned district\",\n        \"Meet your contact at the underground hacker hub\",\n        \"Follow the corporate executive who seems suspicious\"\n    ],\n    \"visual_style\": \"Vibrant neon colors, detailed cyberpunk architecture\",\n    \"image_description\": \"A towering cyberpunk cityscape at night with neon signs\"\n}\n```\n\n### Parallel Choice Expansion\n\nOne of the biggest challenges in generating deep story trees is maintaining consistency as the story branches grow.\n\nHere's how we solve this with parallel generation and state tracking:\n\n```mermaid\ngraph TD\n    %% Main nodes\n    A[Find Door] --> B[Open Door]\n    A --> C[Walk Away]\n\n    B --> D[Read Book]\n    B --> E[Leave Room]\n\n    C --> F[Go Home]\n    C --> G[Wait Outside]\n\n    %% Styling for visual hierarchy\n    classDef start fill:#ff9999,stroke:#333,stroke-width:2px\n    classDef decision fill:#99ccff,stroke:#333,stroke-width:2px\n    classDef outcome fill:#99ffff,stroke:#333,stroke-width:1px\n\n    %% Apply styles\n    class A start\n    class B,C decision\n    class D,E,F,G outcome\n\n    %% Add tooltips for context\n    click B \"Door context\" \"Open Door Context\"\n    click C \"Away context\" \"Walk Away Context\"\n    click D \"Door and Book context\" \"Read Book Context\"\n```\n\nThe key insight is that each path through the story tree has its own unique state. We do so by having a simple accumulator that allows us to keep track of the previous choices and the story context.\n\nIt's also important to note here that the model also has the full flexibility to end the story at any point in time.\n\nHere's how we implement this:\n\n```python\nasync def rewrite_choice(\n    client: instructor.AsyncInstructor,\n    choice: str,\n    story: GeneratedStory,\n    prev_choices: list[dict],  # Accumulator for path state\n    max_depth: int,\n    sem: asyncio.Semaphore,\n) -> FinalStoryChoice:\n    # Each choice knows its entire path history\n    async with sem:\n        rewritten_choice = await client.create(\n            model=\"gpt-4o\",\n            response_model=RewrittenChoice,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"\"\"\n                Given this choice: {{ choice }}\n\n                Story context:\n                Setting: {{ story.setting }}\n                Plot: {{ story.plot_summary }}\n\n                Previous choices made in this path:\n                {% for prev in prev_choices %}\n                - {{ prev.choice_description }}\n                  Result: {{ prev.choice_consequences }}\n                {% endfor %}\n\n                Generate the next story beat and 2-4 new choices.\n                The story should end in {{ max_depth - len(prev_choices) }} more turns.\n                \"\"\",\n                }\n            ],\n            context={\n                \"choice\": choice,\n                \"story\": story,\n                \"prev_choices\": prev_choices,\n            },\n        )\n\n    # For terminal nodes (at max depth)\n    if len(prev_choices) == max_depth - 1:\n        return FinalStoryChoice(\n            choice_description=rewritten_choice.choice_description,\n            choice_consequences=rewritten_choice.choice_consequences,\n            choices=[],  # Terminal node\n        )\n\n    # Recursively expand child choices\n    child_choices = await asyncio.gather(\n        *[\n            rewrite_choice(\n                client=client,\n                choice=new_choice,\n                story=story,\n                prev_choices=prev_choices\n                + [\n                    {\n                        \"choice_description\": rewritten_choice.choice_description,\n                        \"choice_consequences\": rewritten_choice.choice_consequences,\n                    }\n                ],\n                max_depth=max_depth,\n                sem=sem,\n            )\n            for new_choice in rewritten_choice.choices\n        ]\n    )\n\n    return FinalStoryChoice(\n        choice_description=rewritten_choice.choice_description,\n        choice_consequences=rewritten_choice.choice_consequences,\n        choices=child_choices,\n    )\n```\n\nThis approach gives us several key benefits:\n\n1. **Path-Specific Context**: Each node maintains the complete history of choices that led to it, ensuring consistency within each branch\n2. **Parallel Generation**: Different branches can be generated simultaneously since they each maintain their own state\n3. **Controlled Growth**: The `max_depth` parameter prevents exponential expansion\n4. **Rate Limiting**: The semaphore controls concurrent API calls while allowing maximum parallelization\n\nThe semaphore isn't just for rate limiting - it ensures we process choices at a manageable pace while maintaining state consistency.\n\nEach path through the story tree becomes a self-contained narrative with access to its complete history, allowing us to generate coherent stories at a much faster speed and verbosity than a single call would be able to generate.\n\nAdditionally, we can generate stories that are much broader and deeper than a single call would be able to generate.\n\n## Beyond Story Generation\n\nThe success of this approach comes down to three key principles:\n\n1. **State Isolation**: Each node maintains only the context it needs, preventing context window overflow\n2. **Parallel Processing**: Generation can happen simultaneously across branches, dramatically reducing total generation time\n3. **Structured Validation**: Using Pydantic models ensures each generated component meets your requirements\n\nFor example, generating a 20-node story tree sequentially might take 60 seconds (3s per node), but with parallel generation and 10 concurrent requests, it could complete in just 45-50 seconds.\n\nThis pattern is particularly valuable when:\n\n- Your generation tasks naturally form a tree or graph structure\n- Individual nodes need some but not all context from their ancestors\n- You need to generate content that exceeds a single context window\n- Speed of generation is important\n\nBy combining structured outputs with parallel generation, you can reliably generate complex, interconnected content at scale while maintaining consistency and control.\n\n`instructor` makes it easy to generate complex Data Structures with language models - whether they're open source models with ollama or proprietary models with providers such as OpenAI. Give us a try today!\n"
  },
  {
    "path": "docs/blog/posts/extracting-model-metadata.md",
    "content": "---\ntitle: \"Extracting Metadata from Images using Structured Extraction\"\ndate: 2024-12-11\ndescription: Structured Extraction makes working with images easy, in this post we'll see how to use it to extract metadata from images\ncategories:\n  - OpenAI\n  - Multimodal\nauthors:\n  - ivanleomk\n---\n\nMultimodal Language Models like gpt-4o excel at processing multimodal, enabling us to extract rich, structured metadata from images.\n\nThis is particularly valuable in areas like fashion where we can use these capabilities to understand user style preferences from images and even videos. In this post, we'll see how to use instructor to map images to a given product taxonomy so we can recommend similar products for users.\n\n<!-- more -->\n\n## Why Image Metadata is useful\n\nMost online e-commerce stores have a taxonomy of products that they sell. This is a way of categorizing products so that users can easily find what they're looking for.\n\nA small example of a taxonomy is shown below. You can think of this as a way of mapping a product to a set of attributes, with some common attributes that are shared across all products.\n\n```yaml\ntops:\n  t-shirts:\n    - crew_neck\n    - v_neck\n    - graphic_tees\n  sweaters:\n    - crewneck\n    - cardigan\n    - pullover\n  jackets:\n    - bomber_jackets\n    - denim_jackets\n    - leather_jackets\n\nbottoms:\n  pants:\n    - chinos\n    - dress_pants\n    - cargo_pants\n  shorts:\n    - athletic_shorts\n    - cargo_shorts\n\ncolors:\n  - black\n  - navy\n  - white\n  - beige\n  - brown\n```\n\nBy using this taxonomy, we can ensure that our model is able to extract metadata that is consistent with the products we sell. In this example, we'll analyze style photos from a fitness influencer to understand their fashion preferences and possibily see what products we can recommend from our own catalog to him.\n\nWe're using some photos from a fitness influencer called [Jpgeez](https://www.instagram.com/jpgeez/) which you can see below.\n\n<div class=\"grid\" markdown>\n![](./img/style_1.png){: style=\"height:200px\"}\n![](./img/style_2.png){: style=\"height:200px\"}\n![](./img/style_3.png){: style=\"height:200px\"}\n![](./img/style_4.png){: style=\"height:200px\"}\n![](./img/style_5.png){: style=\"height:200px\"}\n![](./img/style_6.png){: style=\"height:200px\"}\n</div>\n\nWhile we're mapping these visual elements over to a taxonomy, this is really applicable to any other use case where you want to extract metadata from images.\n\n## Extracting metadata from images\n\n### Instructor's `Image` class\n\nWith instructor, working with `multimodal` data is easy. We can use the `Image` class to load images from a URL or local file. We can see this below in action.\n\n```python\nimport instructor\n\n# Load images using instructor.Image.from_path\nimages = []\nfor image_file in image_files:\n    image_path = os.path.join(\"./images\", image_file)\n    image = instructor.Image.from_path(image_path)\n    images.append(image)\n```\n\nWe provide a variety of different methods for loading images, including from a URL, local file, and even from a base64 encoded string which you [can read about here](../../concepts/multimodal.md)\n\n### Defining a response model\n\nSince our taxonomy is defined as a yaml file, we can't use literals to define the response model. Instead, we can read in the configuration from a yaml file and then use that in a `model_validator` step to make sure that the metadata we extract is consistent with the taxonomy.\n\nFirst, we read in the taxonomy from a yaml file and create a set of categories, subcategories, and product types.\n\n```python\nimport yaml\n\nwith open(\"taxonomy.yml\") as file:\n    taxonomy = yaml.safe_load(file)\n\ncolors = taxonomy[\"colors\"]\ncategories = set(taxonomy.keys())\ncategories.remove(\"colors\")\n\nsubcategories = set()\nproduct_types = set()\nfor category in categories:\n    for subcategory in taxonomy[category].keys():\n        subcategories.add(subcategory)\n        for product_type in taxonomy[category][subcategory]:\n            product_types.add(product_type)\n```\n\nThen we can use these in our `response_model` to make sure that the metadata we extract is consistent with the taxonomy.\n\n```python\nclass PersonalStyle(BaseModel):\n    \"\"\"\n    Ideally you map this to a specific taxonomy\n    \"\"\"\n\n    categories: list[str]\n    subcategories: list[str]\n    product_types: list[str]\n    colors: list[str]\n\n    @model_validator(mode=\"after\")\n    def validate_options(self, info: ValidationInfo):\n        context = info.context\n        colors = context[\"colors\"]\n        categories = context[\"categories\"]\n        subcategories = context[\"subcategories\"]\n        product_types = context[\"product_types\"]\n\n        # Validate colors\n        for color in self.colors:\n            if color not in colors:\n                raise ValueError(\n                    f\"Color {color} is not in the taxonomy. Valid colors are {colors}\"\n                )\n        for category in self.categories:\n            if category not in categories:\n                raise ValueError(\n                    f\"Category {category} is not in the taxonomy. Valid categories are {categories}\"\n                )\n\n        for subcategory in self.subcategories:\n            if subcategory not in subcategories:\n                raise ValueError(\n                    f\"Subcategory {subcategory} is not in the taxonomy. Valid subcategories are {subcategories}\"\n                )\n\n        for product_type in self.product_types:\n            if product_type not in product_types:\n                raise ValueError(\n                    f\"Product type {product_type} is not in the taxonomy. Valid product types are {product_types}\"\n                )\n\n        return self\n```\n\n### Making the API call\n\nLastly, we can combine these all into a single api call to `gpt-4o` where we pass in all of the images and the response model into the `response_model` parameter.\n\nWith our inbuilt support for `jinja` formatting using the `context` keyword that exposes data we can also re-use in our validation, this becomes an incredibly easy step to execute.\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nresp = client.create(\n    model=\"gpt-4o\",\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"\"\"\nYou are a helpful assistant. You are given a list of images and you need to map the person style of the person in the image to a given taxonomy.\n\nHere is the taxonomy that you should use\n\nColors:\n{% for color in colors %}\n* {{ color }}\n{% endfor %}\n\nCategories:\n{% for category in categories %}\n* {{ category }}\n{% endfor %}\n\nSubcategories:\n{% for subcategory in subcategories %}\n* {{ subcategory }}\n{% endfor %}\n\nProduct types:\n{% for product_type in product_types %}\n* {{ product_type }}\n{% endfor %}\n\"\"\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Here are the images of the person, describe the personal style of the person in the image from a first-person perspective( Eg. You are ... )\",\n                *images,\n            ],\n        },\n    ],\n    response_model=PersonalStyle,\n    context={\n        \"colors\": colors,\n        \"categories\": list(categories),\n        \"subcategories\": list(subcategories),\n        \"product_types\": list(product_types),\n    },\n)\n```\n\nThis then returns the following response.\n\n```python\nPersonalStyle(\n    categories=['tops', 'bottoms'],\n    subcategories=['sweaters', 'jackets', 'pants'],\n    product_types=['cardigan', 'crewneck', 'denim_jackets', 'chinos'],\n    colors=['brown', 'beige', 'black', 'white', 'navy'],\n)\n```\n\n## Looking Ahead\n\nThe ability to extract structured metadata from images opens up exciting possibilities for personalization in e-commerce. The key is maintaining the bridge between unstructured visual inspiration and structured product data through well-defined taxonomies and robust validation.\n\n`instructor` makes working with multimodal data easy, and we're excited to see what you build with it. Give us a try today with `pip install instructor` and see how easy it is to work with language models using structured extraction.\n"
  },
  {
    "path": "docs/blog/posts/fake-data.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2024-03-08\ndescription: Learn to generate synthetic data using Pydantic and OpenAI's models with\n  practical examples and configurations.\ndraft: false\ntags:\n- Synthetic Data\n- Pydantic\n- OpenAI\n- Data Generation\n- Python\n---\n\n# Simple Synthetic Data Generation\n\nWhat that people have been using instructor for is to generate synthetic data rather than extracting data itself. We can even use the J-Schemo extra fields to give specific examples to control how we generate data.\n\nConsider the example below. We'll likely generate very simple names.\n\n```python\nfrom typing import Iterable\nfrom pydantic import BaseModel\nimport instructor\n\n\n# Define the UserDetail model\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n# Patch the OpenAI client to enable the response_model functionality\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_fake_users(count: int) -> Iterable[UserDetail]:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=Iterable[UserDetail],\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Generate a {count} synthetic users\"},\n        ],\n    )\n\n\nfor user in generate_fake_users(5):\n    print(user)\n    #> name='Alice' age=25\n    #> name='Bob' age=30\n    #> name='Charlie' age=22\n    #> name='David' age=28\n    #> name='Eve' age=35\n```\n\n## Leveraging Simple Examples\n\nWe might want to set examples as part of the prompt by leveraging Pydantics configuration. We can set examples directly in the JSON scheme itself.\n\n```python\nfrom typing import Iterable\nfrom pydantic import BaseModel, Field\nimport instructor\n\n\n# Define the UserDetail model\nclass UserDetail(BaseModel):\n    name: str = Field(examples=[\"Timothee Chalamet\", \"Zendaya\"])\n    age: int\n\n\n# Patch the OpenAI client to enable the response_model functionality\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_fake_users(count: int) -> Iterable[UserDetail]:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=Iterable[UserDetail],\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Generate a {count} synthetic users\"},\n        ],\n    )\n\n\nfor user in generate_fake_users(5):\n    print(user)\n    #> name='John Doe' age=25\n    #> name='Alice Smith' age=30\n    #> name='Bob Johnson' age=28\n    #> name='Emily Brown' age=35\n    #> name='Michael Williams' age=27\n```\n\nBy incorporating names of celebrities as examples, we have shifted towards generating synthetic data featuring well-known personalities, moving away from the simplistic, single-word names previously used.\n\n## Leveraging Complex Example\n\nTo effectively generate synthetic examples with more nuance, lets upgrade to the \"gpt-4-turbo-preview\" model, use model level examples rather than attribute level examples:\n\n```Python\nimport instructor\n\nfrom typing import Iterable\nfrom pydantic import BaseModel, ConfigDict\n\n\n# Define the UserDetail model\nclass UserDetail(BaseModel):\n    \"\"\"Old Wizards\"\"\"\n\n    name: str\n    age: int\n\n    model_config = ConfigDict(\n        json_schema_extra={\n            \"examples\": [\n                {\"name\": \"Gandalf the Grey\", \"age\": 1000},\n                {\"name\": \"Albus Dumbledore\", \"age\": 150},\n            ]\n        }\n    )\n\n\n# Patch the OpenAI client to enable the response_model functionality\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_fake_users(count: int) -> Iterable[UserDetail]:\n    return client.create(\n        model=\"gpt-4-turbo-preview\",\n        response_model=Iterable[UserDetail],\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Generate `{count}` synthetic examples\"},\n        ],\n    )\n\n\nfor user in generate_fake_users(5):\n    print(user)\n    #> name='Merlin' age=600\n    #> name='Radagast the Brown' age=950\n    #> name='Rincewind' age=70\n    #> name='Harry Potter' age=17\n    #> name='Elminster Aumar' age=1200\n```\n\n## Leveraging Descriptions\n\nBy adjusting the descriptions within our Pydantic models, we can subtly influence the nature of the synthetic data generated. This method allows for a more nuanced control over the output, ensuring that the generated data aligns more closely with our expectations or requirements.\n\nFor instance, specifying \"Fancy French sounding names\" as a description for the `name` field in our `UserDetail` model directs the generation process to produce names that fit this particular criterion, resulting in a dataset that is both diverse and tailored to specific linguistic characteristics.\n\n\n```python\nimport instructor\n\nfrom typing import Iterable\nfrom pydantic import BaseModel, Field\n\n\n# Define the UserDetail model\nclass UserDetail(BaseModel):\n    name: str = Field(description=\"Fancy French sounding names\")\n    age: int\n\n\n# Patch the OpenAI client to enable the response_model functionality\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_fake_users(count: int) -> Iterable[UserDetail]:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=Iterable[UserDetail],\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Generate `{count}` synthetic users\"},\n        ],\n    )\n\n\nfor user in generate_fake_users(5):\n    print(user)\n    #> name='Jean Luc' age=25\n    #> name='Marcelle' age=30\n    #> name='Antoinette' age=22\n    #> name='Gaspard' age=28\n    #> name='Eloise' age=35\n```"
  },
  {
    "path": "docs/blog/posts/full-fastapi-visibility.md",
    "content": "---\nauthors:\n- ivanleomk\n- jxnl\ncategories:\n- LLM Observability\ncomments: true\ndate: 2024-05-03\ndescription: Discover how Logfire enhances FastAPI applications with OpenTelemetry\n  for better visibility and performance tracking.\ndraft: false\nslug: fastapi-open-telemetry-and-instructor\ntags:\n- FastAPI\n- Logfire\n- OpenTelemetry\n- Pydantic\n- AsyncIO\n---\n\n# Why Logfire is a perfect fit for FastAPI + Instructor\n\nLogfire is a new tool that provides key insight into your application with Open Telemetry. Instead of using ad-hoc print statements, Logfire helps to profile every part of your application and is integrated directly into Pydantic and FastAPI, two popular libraries amongst Instructor users.\n\nIn short, this is the secret sauce to help you get your application to the finish line and beyond. We'll show you how to easily integrate Logfire into FastAPI, one of the most popular choices amongst users of Instructor using two examples\n\n1. Data Extraction from a single User Query\n2. Using `asyncio` to process multiple users in parallel\n3. Streaming multiple objects using an `Iterable` so that they're available on demand\n\n<!-- more -->\n\nAs usual, all of the code that we refer to here is provided in [examples/logfire-fastapi](https://www.github.com/jxnl/instructor/tree/main/examples/logfire-fastapi) for you to use in your projects.\n\n??? info \"Configure Logfire\"\n\n    Before starting this tutorial, make sure that you've registered for a [Logfire](https://logfire.pydantic.dev/) account. You'll also need to create a project to track these logs. Lastly, in order to see the request body, you'll also need to configure the default log level to `debug` instead of the default `info` on the dashboard console.\n\nMake sure to create a virtual environment and install all of the packages inside the `requirements.txt` file at [examples/logfire-fastapi](https://www.github.com/jxnl/instructor/tree/main/examples/logfire-fastapi).\n\n## Data Extraction\n\nLet's start by trying to extract some user information given a user query. We can do so with a simple Pydantic model as seen below.\n\n```python\nfrom pydantic import BaseModel\nfrom fastapi import FastAPI\nimport instructor\n\n\nclass UserData(BaseModel):\n    query: str\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\napp = FastAPI()\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\n@app.post(\"/user\", response_model=UserDetail)\nasync def endpoint_function(data: UserData) -> UserDetail:\n    user_detail = await client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Extract: `{data.query}`\"},\n        ],\n    )\n\n    return user_detail\n```\n\nThis simple endpoint takes in a user query and extracts out a user from the statement. Let's see how we can add in Logfire into this endpoint with just a few lines of code\n\n```python hl_lines=\"5 18-21\"\nfrom pydantic import BaseModel\nfrom fastapi import FastAPI\nimport instructor\nimport logfire  # (1)!\n\n\nclass UserData(BaseModel):\n    query: str\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\napp = FastAPI()\nopenai_client = AsyncOpenAI()  # (2)!\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_openai(openai_client)\nlogfire.instrument_fastapi(app)\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\n\n@app.post(\"/user\", response_model=UserDetail)\nasync def endpoint_function(data: UserData) -> UserDetail:\n    user_detail = await client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Extract: `{data.query}`\"},\n        ],\n    )\n\n    return user_detail\n```\n\n1. Import in the logfire package\n2. Setup logging using their native integrations with FastAPI and OpenAI\n\nWith just those few lines of code, we've got ourselves a working integration with Logfire. When we call our endpoint at `/user` with the following payload, everything is immediately logged in the console.\n\n```bash\ncurl -X 'POST' \\\n  'http://localhost:8000/user' \\\n  -H 'accept: application/json' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n  \"query\": \"Daniel is a 24 year man living in New York City\"\n}'\n```\n\nWe can see that Pydantic has nicely logged for us the validation result of our openai call here. Just right above, we also have the result of the OpenAI call.\n\n![Pydantic Validation](img/logfire-sync-pydantic-validation.png)\n\nWe've also got full visibility into the arguments that were passed into the endpoint when we called it. This is extremely useful for users when they eventually want to reproduce errors in production locally.\n\n![FastAPI arguments](img/logfire-sync-fastapi-arguments.png)\n\n## Using Asyncio\n\nSometimes, we might need to run multiple jobs in parallel. Let's see how we can take advantage of `asyncio` so that we can speed up our operations. We can do so by adding the following bits of code to our previous file.\n\n??? info \"What is Asyncio?\"\n\n    For a deeper guide into how to work with Asycnio, see our previous guide [here](./learn-async.md).\n\n=== \"New Code\"\n\n    ```python\n    import asyncio\n\n\n    class MultipleUserData(BaseModel):\n        queries: list[str]\n\n\n    @app.post(\"/many-users\", response_model=list[UserDetail])\n    async def extract_many_users(data: MultipleUserData):\n        async def extract_user(query: str):\n            user_detail = await client.create(\n                model=\"gpt-3.5-turbo\",\n                response_model=UserDetail,\n                messages=[\n                    {\"role\": \"user\", \"content\": f\"Extract: `{query}`\"},\n                ],\n            )\n            logfire.info(\"/User returning\", value=user_detail)\n            return user_detail\n\n        coros = [extract_user(query) for query in data.queries]\n        return await asyncio.gather(*coros)\n    ```\n\n=== \"Full File\"\n\n    ```python\n    from pydantic import BaseModel\n    from fastapi import FastAPI\n    import instructor\n    import logfire\n    import asyncio\n\n\n    class UserData(BaseModel):\n        query: str\n\n\n    class MultipleUserData(BaseModel):\n        queries: list[str]\n\n\n    class UserDetail(BaseModel):\n        name: str\n        age: int\n\n\n    app = FastAPI()\n    openai_client = AsyncOpenAI()\n    logfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\n    logfire.instrument_openai(openai_client)\n    logfire.instrument_fastapi(app)\n    client = instructor.from_provider(\"openai/gpt-4o\")\n\n\n    @app.post(\"/user\", response_model=UserDetail)\n    async def endpoint_function(data: UserData) -> UserDetail:\n        user_detail = await client.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=UserDetail,\n            messages=[\n                {\"role\": \"user\", \"content\": f\"Extract: `{data.query}`\"},\n            ],\n        )\n        logfire.info(\"/User returning\", value=user_detail)\n        return user_detail\n\n\n    @app.post(\"/many-users\", response_model=list[UserDetail])\n    async def extract_many_users(data: MultipleUserData):\n        async def extract_user(query: str):\n            user_detail = await client.create(\n                model=\"gpt-3.5-turbo\",\n                response_model=UserDetail,\n                messages=[\n                    {\"role\": \"user\", \"content\": f\"Extract: `{query}`\"},\n                ],\n            )\n            logfire.info(\"/User returning\", value=user_detail)\n            return user_detail\n\n        coros = [extract_user(query) for query in data.queries]\n        return await asyncio.gather(*coros)\n    ```\n\nWe can call this endpoint with a simple `curl` call\n\n```bash\ncurl -X 'POST' \\\n  'http://localhost:8000/many-users' \\\n  -H 'accept: application/json' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n  \"queries\": [\n    \"Daniel is a 34 year man in New York City\",\"Sarah is a 20 year old living in Tokyo\", \"Jeffrey is 55 and lives down in Leeds\"\n  ]\n}'\n```\n\nThis is all logged in Logfire as seen below. We have complete visibility into the performance of our entire application and it's pretty clear that a large chunk of the latency is taken up by the OpenAI Call.\n\nWe could also potentially separate the logs into more graunular levels by creating a new span for each instance of `extract_user` created.\n\n![Logfire Asyncio](img/logfire-asyncio.png)\n\n## Streaming\n\nNow let's see how we can take advantage of Instructor's `Iterable` support to stream multiple instances of an extracted object. This is extremely useful for application where speed is crucial and users want to get the results quickly.\n\nLet's add a new endpoint to our server to see how this might work\n\n=== \"New Code\"\n\n    ```python\n    from collections.abc import Iterable\n    from fastapi.responses import StreamingResponse\n\n\n    class MultipleUserData(BaseModel):\n        queries: list[str]\n\n\n    @app.post(\"/extract\", response_class=StreamingResponse)\n    async def extract(data: UserData):\n        suppressed_client = AsyncOpenAI()\n        logfire.instrument_openai(\n            suppressed_client, suppress_other_instrumentation=False\n        )  # (1)!\n        client = instructor.from_provider(\"openai/gpt-4o\")\n        users = await client.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=Iterable[UserDetail],\n            stream=True,\n            messages=[\n                {\"role\": \"user\", \"content\": data.query},\n            ],\n        )\n\n        async def generate():\n            with logfire.span(\"Generating User Response Objects\"):\n                async for user in users:\n                    resp_json = user.model_dump_json()\n                    logfire.info(\"Returning user object\", value=resp_json)\n\n                    yield resp_json\n\n        return StreamingResponse(generate(), media_type=\"text/event-stream\")\n    ```\n\n    1. Note that we suppress instrumentation to print out the stream objects. This has to do with the parsing of partials in Instructor.\n\n=== \"Full File\"\n\n    ```python\n    from pydantic import BaseModel\n    from fastapi import FastAPI\n    import instructor\n    import logfire\n    import asyncio\n    from collections.abc import Iterable\n    from fastapi.responses import StreamingResponse\n\n\n    class UserData(BaseModel):\n        query: str\n\n\n    class MultipleUserData(BaseModel):\n        queries: list[str]\n\n\n    class UserDetail(BaseModel):\n        name: str\n        age: int\n\n\n    app = FastAPI()\n    openai_client = AsyncOpenAI()\n    logfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\n    logfire.instrument_fastapi(app)\n    logfire.instrument_openai(openai_client)\n    client = instructor.from_provider(\"openai/gpt-4o\")\n\n\n    @app.post(\"/user\", response_model=UserDetail)\n    async def endpoint_function(data: UserData) -> UserDetail:\n        user_detail = await client.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=UserDetail,\n            messages=[\n                {\"role\": \"user\", \"content\": f\"Extract: `{data.query}`\"},\n            ],\n        )\n        logfire.info(\"/User returning\", value=user_detail)\n        return user_detail\n\n\n    @app.post(\"/many-users\", response_model=list[UserDetail])\n    async def extract_many_users(data: MultipleUserData):\n        async def extract_user(query: str):\n            user_detail = await client.create(\n                model=\"gpt-3.5-turbo\",\n                response_model=UserDetail,\n                messages=[\n                    {\"role\": \"user\", \"content\": f\"Extract: `{query}`\"},\n                ],\n            )\n            logfire.info(\"/User returning\", value=user_detail)\n            return user_detail\n\n        coros = [extract_user(query) for query in data.queries]\n        return await asyncio.gather(*coros)\n\n\n    @app.post(\"/extract\", response_class=StreamingResponse)\n    async def extract(data: UserData):\n        suppressed_client = AsyncOpenAI()\n        logfire.instrument_openai(suppressed_client, suppress_other_instrumentation=False)\n        client = instructor.from_provider(\"openai/gpt-4o\")\n        users = await client.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=Iterable[UserDetail],\n            stream=True,\n            messages=[\n                {\"role\": \"user\", \"content\": data.query},\n            ],\n        )\n\n        async def generate():\n            with logfire.span(\"Generating User Response Objects\"):\n                async for user in users:\n                    resp_json = user.model_dump_json()\n                    logfire.info(\"Returning user object\", value=resp_json)\n\n                    yield resp_json\n\n        return StreamingResponse(generate(), media_type=\"text/event-stream\")\n    ```\n\nWe can call and log out the stream returned using the `requests` library and using the `iter_content` method\n\n```python\nimport requests\n\nresponse = requests.post(\n    \"http://127.0.0.1:3000/extract\",\n    json={\n        \"query\": \"Alice and Bob are best friends. They are currently 32 and 43 respectively. \"\n    },\n    stream=True,\n)\n\nfor chunk in response.iter_content(chunk_size=1024):\n    if chunk:\n        print(str(chunk, encoding=\"utf-8\"), end=\"\\n\")\n```\n\nThis gives us the output of\n\n```bash\n{\"name\":\"Alice\",\"age\":32}\n{\"name\":\"Bob\",\"age\":43}\n```\n\nWe can also see the individual stream objects inside the Logfire dashboard as seen below. Note that we've grouped the generated logs inside a span of its own for easy logging.\n\n![Logfire Stream](img/logfire-stream.png)"
  },
  {
    "path": "docs/blog/posts/generating-pdf-citations.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - Gemini\n  - Document Processing\ncomments: true\ndate: 2024-11-15\ndescription: Generate accurate citations and eliminate hallucinations with structured outputs using Gemini.\ndraft: false\ntags:\n  - Gemini\n  - Document Processing\n  - PDF Analysis\n  - Pydantic\n  - Python\n---\n\n# Eliminating Hallucinations with Structured Outputs using Gemini\n\nIn this post, we'll explore how to use Google's Gemini model with Instructor to generate accurate citations from PDFs. This approach ensures that answers are grounded in the actual content of the PDF, reducing the risk of hallucinations.\n\nWe'll be using the Nvidia 10k report for this example which you can download at this [link](https://d18rn0p25nwr6d.cloudfront.net/CIK-0001045810/78501ce3-7816-4c4d-8688-53dd140df456.pdf).\n\n<!-- more -->\n\n## Introduction\n\nWhen processing PDFs, it's crucial to ensure that any answers or insights derived are directly linked to the source material. This is especially important in applications where users need to verify the origin of information, such as legal or academic contexts.\n\nWe're using PyMuPDF here to handle PDF parsing but you can use any other library that you want. Ultimately when your citations get more complex, you'll want to invest more time into validating the PDF citations against a document.\n\n## Setting Up the Environment\n\nFirst, let's set up our environment with the necessary libraries:\n\n```bash\npip install \"instructor[google-generativeai]\" pymupdf\n```\n\nThen let's import the necessary libraries:\n\n```python\n```\n\n## Defining Our Data Models\n\nWe'll use Pydantic to define our data models for citations and answers:\n\n```python\nclass Citation(BaseModel):\n    reason_for_relevance: str\n    text: list[str]\n    page_number: int\n\n\nclass Answer(BaseModel):\n    chain_of_thought: str\n    citations: list[Citation]\n    answer: str\n```\n\n## Initializing the Gemini Client\n\nNext, we'll set up our Gemini client using Instructor:\n\n```python\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n)\n```\n\n## Processing the PDF\n\nTo analyze a PDF and generate citations, follow these steps:\n\n```python\npdf_path = \"./10k.pdf\"\ndoc = pymupdf.open(pdf_path)\n\n# Upload the PDF\nfile = genai.upload_file(pdf_path)\n\n# Wait for file to finish processing\nwhile file.state != File.State.ACTIVE:\n    time.sleep(1)\n    file = genai.get_file(file.name)\n    print(f\"File is still uploading, state: {file.state}\")\n\nresp: Answer = client.create(\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a helpful assistant that can answer questions about the provided pdf file. You will be given a question and a pdf file. Your job is to answer the question using the information in the pdf file. Provide all citations that are relevant to the question and make sure that the coordinates are accurate.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What were all of the export restrictions announced by the USG in 2023? What chips did they affect?\",\n                file,\n            ],\n        },\n    ],\n    response_model=Answer,\n)\n\nprint(resp)\n# Answer(\n#     chain_of_thought=\"The question asks about export restrictions in 2023. Page 25 mentions the USG announcing licensing requirements for A100 and H100 chips in August 2022, and additional licensing requirements for a subset of these products in July 2023.\",\n#     citations=[\n#         Citation(\n#             reason_for_relevance=\"Describes the export licensing requirements and which chips they affect.\",\n#             text=[\n#                 \"In August 2022, the U.S. government, or the USG, announced licensing requirements that, with certain exceptions, impact exports to China (including Hong\",\n#                 \"Kong and Macau) and Russia of our A100 and H100 integrated circuits, DGX or any other systems or boards which incorporate A100 or H100 integrated circuits.\",\n#                 \"In July 2023, the USG informed us of an additional licensing requirement for a subset of A100 and H100 products destined to certain customers and other\",\n#                 \"regions, including some countries in the Middle East.\",\n#             ],\n#             page_number=25,\n#         )\n#     ],\n#     answer=\"In 2023, the U.S. government (USG) announced new licensing requirements for the export of certain chips to China, Russia, and other countries.  These chips included the A100 and H100 integrated circuits, the DGX system, and any other systems or boards incorporating the A100 or H100 chips.\",\n# )\n```\n\n## Highlighting Citations in the PDF\n\nOnce you have the citations, you can highlight them in the PDF:\n\n```python\nfor citation in resp.citations:\n    page = doc.load_page(citation.page_number - 1)\n    for text in citation.text:\n        text_instances = page.search_for(text)\n        for instance in text_instances:\n            page.add_highlight_annot(instance)\n\ndoc.save(\"./highlighted.pdf\")\ndoc.close()\n```\n\nIn our case, we can see that the citations are accurate and the answer is correct.\n\n![Gemini Citations](./img/gemini_citations.png)\n\n## Why Structured Outputs?\n\nOne of the significant advantages of using structured outputs is the ability to handle complex data extraction tasks with ease and reliability. When dealing with raw completion strings or JSON data, developers often face challenges related to parsing complexity and code maintainability.\n\nOver time, this just becomes error-prone, difficult to iterate upon and impossible to maintain. Instead, by leveraging pydantic, you get access to one of the best tools available for validating and parsing data.\n\n1. Ease of Definition: Pydantic allows you to define data models with specific fields effortlessly. This makes it easy to understand and maintain the structure of your data.\n2. Robust Validation: With Pydantic, you can build validators to test against various edge cases, ensuring that your data is accurate and reliable. This is particularly useful when working with PDFs and citations, as you can validate the extracted data without worrying about the underlying language model.\n3. Separation of Concerns: By using structured outputs, the language model's role is reduced to a single function call. This separation allows you to focus on building reliable and efficient data processing pipelines without being bogged down by the intricacies of the language model.\n\nIn summary, structured outputs with Pydantic provide a powerful and ergonomic way to manage complex data extraction tasks. They enhance reliability, simplify code maintenance, and enable developers to build better applications with less effort.\n\n## Conclusion\n\nBy using Gemini and Instructor, you can generate accurate citations from PDFs, ensuring that your answers are grounded in the source material. This approach is invaluable for applications requiring high levels of accuracy and traceability.\n\nGive instructor a try today and see how you can build reliable applications. Just run `pip install instructor` or check out our [Getting Started Guide](../../index.md)\n"
  },
  {
    "path": "docs/blog/posts/generator.md",
    "content": "---\nauthors:\n- jxnl\n- anmol\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2023-11-26\ndescription: Explore Python generators and their role in enhancing LLM streaming for\n  improved latency and user experience in applications.\ndraft: false\nslug: python-generators-and-llm-streaming\ntags:\n- Python\n- Generators\n- LLM Streaming\n- Data Processing\n- Performance Optimization\n---\n\n# Generators and LLM Streaming\n\nLatency is crucial, especially in eCommerce and newer chat applications like ChatGPT. Streaming is the solution that enables us to enhance the user experience without the need for faster response times.\n\nAnd what makes streaming possible? Generators!\n\n<!-- more -->\n\nIn this post, we're going to dive into the cool world of Python generators - these tools are more than just a coding syntax trick. We'll explore Python generators from the ground up and then delve into LLM streaming using the Instructor library.\n\n## Python Generators: An Efficient Approach to Iterables\n\nGenerators in Python are a game-changer for handling large data sets and stream processing. They allow functions to yield values one at a time, pausing and resuming their state, which is a faster and more memory-efficient approach compared to traditional collections that store all elements in memory.\n\n### The Basics: Yielding Values\n\nA generator function in Python uses the `yield` keyword. It yields values one at a time, allowing the function to pause and resume its state.\n\n```python\ndef count_to_3():\n    yield 1\n    yield 2\n    yield 3\n\n\nfor num in count_to_3():\n    print(num)\n    #> 1\n    #> 2\n    #> 3\n```\n\n```\n1\n2\n3\n```\n\n### Advantages Over Traditional Collections\n\n- **Lazy Evaluation & reduced latency**: The time to get the first element (or time-to-first-token in LLM land) from a generator is significantly lower. Generators only produce one value at a time, whereas accessing the first element of a collection will require that the whole collection be created first.\n- **Memory Efficiency**: Only one item is in memory at a time.\n- **Maintain State**: Automatically maintains state between executions.\n\nLet's see how much faster generators are and where they really shine:\n\n```python\nimport time\n\n\ndef expensive_func(x):\n    \"\"\"Simulate an expensive operation.\"\"\"\n    time.sleep(1)\n    return x**2\n\n\ndef calculate_time_for_first_result_with_list(func_input, func):\n    \"\"\"Calculate using a list comprehension and return the first result with its computation time.\"\"\"\n    start_perf = time.perf_counter()\n    result = [func(x) for x in func_input][0]\n    end_perf = time.perf_counter()\n    print(f\"Time for first result (list): {end_perf - start_perf:.2f} seconds\")\n    #> Time for first result (list): 5.02 seconds\n    return result\n\n\ndef calculate_time_for_first_result_with_generator(func_input, func):\n    \"\"\"Calculate using a generator and return the first result with its computation time.\"\"\"\n    start_perf = time.perf_counter()\n    result = next(func(x) for x in func_input)\n    end_perf = time.perf_counter()\n    print(f\"Time for first result (generator): {end_perf - start_perf:.2f} seconds\")\n    #> Time for first result (generator): 1.01 seconds\n    return result\n\n\n# Prepare inputs for the function\nnumbers = [1, 2, 3, 4, 5]\n\n# Benchmarking\nfirst_result_list = calculate_time_for_first_result_with_list(numbers, expensive_func)\nfirst_result_gen = calculate_time_for_first_result_with_generator(\n    numbers, expensive_func\n)\n```\n\n```\nTime for first result (list): 5.02 seconds\nTime for first result (generator): 1.01 seconds\n```\n\nThe generator computes one expensive operation and returns the first result immediately, while the list comprehension computes the expensive operation for all elements in the list before returning the first result.\n\n### Generator Expressions: A Shortcut\n\nPython also allows creating generators in a single line of code, known as generator expressions. They are syntactically similar to list comprehensions but use parentheses.\n\n```python\nsquares = (x * x for x in range(10))\n```\n\n### Use Cases in Real-World Applications\n\nGenerators shine in scenarios like reading large files, data streaming (eg. llm token streaming), and pipeline creation for data processing.\n\n## LLM Streaming\n\nIf you've used ChatGPT, you'll see that the tokens are streamed out one by one, instead of the full response being shown at the end (can you imagine waiting for the full response??). This is made possible by generators.\n\nHere's how a vanilla openai generator looks:\n\n```python\nfrom openai import OpenAI\n\n# Set your OpenAI API key\nclient = OpenAI(\n    api_key=\"My API Key\",\n)\n\nresponse_generator = client.create(\n    model='gpt-3.5-turbo',\n    messages=[{'role': 'user', 'content': \"What are some good reasons to smile?\"}],\n    temperature=0,\n    stream=True,\n)\n\nfor chunk in response_generator:\n    print(chunk.choices[0].delta.content, end=\"\")\n```\n\nThis is great, but what if we want to do some structured extraction on this stream? For instance, we might want to render frontend components based on product rankings that are streamed out by an LLM.\n\nShould we wait for the entire stream to finish before extracting & validating the list of components or can we extract & validate the components in real time as they are streamed?\n\nIn e-commerce, every millisecond matters so the time-to-first-render can differentiate a successful and not-so-successful e commerce store (and i know how a failing e commerce store feels :/ ).\n\nLet's see how we can use Instructor to handle extraction from this real time stream!\n\n### E-commerce Product Ranking\n\n#### Scenario\n\nImagine an e-commerce platform where we have:\n\n• **a customer profile**: this includes a detailed history of purchases, browsing behavior, product ratings, preferences in various categories, search history, and even responses to previous recommendations. This extensive data is crucial for generating highly personalized and relevant product suggestions.\n\n• **a list of candidate products**: these could be some shortlisted products we think the customer would like.\n\nOur goal is to re-rerank these candidate products for the best conversion and we'll use an LLM!\n\n#### Stream Processing\n\n**User Data**:\n\nLet's assume we have the following user profile:\n\n```python\nprofile_data = \"\"\"\nCustomer ID: 12345\nRecent Purchases: [Laptop, Wireless Headphones, Smart Watch]\nFrequently Browsed Categories: [Electronics, Books, Fitness Equipment]\nProduct Ratings: {Laptop: 5 stars, Wireless Headphones: 4 stars}\nRecent Search History: [best budget laptops 2023, latest sci-fi books, yoga mats]\nPreferred Brands: [Apple, AllBirds, Bench]\nResponses to Previous Recommendations: {Philips: Not Interested, Adidas: Not Interested}\nLoyalty Program Status: Gold Member\nAverage Monthly Spend: $500\nPreferred Shopping Times: Weekend Evenings\n...\n\"\"\"\n```\n\nWe want to rank the following products for this user:\n\n```python\nproducts = [\n    {\n        \"product_id\": 1,\n        \"product_name\": \"Apple MacBook Air (2023) - Latest model, high performance, portable\",\n    },\n    {\n        \"product_id\": 2,\n        \"product_name\": \"Sony WH-1000XM4 Wireless Headphones - Noise-canceling, long battery life\",\n    },\n    {\n        \"product_id\": 3,\n        \"product_name\": \"Apple Watch Series 7 - Advanced fitness tracking, seamless integration with Apple ecosystem\",\n    },\n    {\n        \"product_id\": 4,\n        \"product_name\": \"Kindle Oasis - Premium e-reader with adjustable warm light\",\n    },\n    {\n        \"product_id\": 5,\n        \"product_name\": \"AllBirds Wool Runners - Comfortable, eco-friendly sneakers\",\n    },\n    {\n        \"product_id\": 6,\n        \"product_name\": \"Manduka PRO Yoga Mat - High-quality, durable, eco-friendly\",\n    },\n    {\n        \"product_id\": 7,\n        \"product_name\": \"Bench Hooded Jacket - Stylish, durable, suitable for outdoor activities\",\n    },\n    {\n        \"product_id\": 8,\n        \"product_name\": \"GoPro HERO9 Black - 5K video, waterproof, for action photography\",\n    },\n    {\n        \"product_id\": 9,\n        \"product_name\": \"Nespresso Vertuo Next Coffee Machine - Quality coffee, easy to use, compact design\",\n    },\n    {\n        \"product_id\": 10,\n        \"product_name\": \"Project Hail Mary by Andy Weir - Latest sci-fi book from a renowned author\",\n    },\n]\n```\n\nLet's now define our models for structured extraction. Note: instructor will conveniently let us use `Iterable` to model an iterable of our class. In this case, once we define our product recommendation model, we can slap on `Iterable` to define what we ultimately want - a (ranked) list of product recommendations.\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom typing import Iterable\nfrom pydantic import BaseModel\n\nclient = instructor.from_openai(OpenAI(), mode=instructor.function_calls.Mode.JSON)\n\n\nclass ProductRecommendation(BaseModel):\n    product_id: str\n    product_name: str\n\n\nRecommendations = Iterable[ProductRecommendation]\n```\n\nNow let's use our instructor patch. Since we don't want to wait for all the tokens to finish, will set stream to `True` and process each product recommendation as it comes in:\n\n```python\nprompt = (\n    f\"Based on the following user profile:\\n{profile_data}\\nRank the following products from most relevant to least relevant:\\n\"\n    + '\\n'.join(\n        f\"{product['product_id']} {product['product_name']}\" for product in products\n    )\n)\n\nstart_perf = time.perf_counter()\nrecommendations_stream = client.create(\n    model=\"gpt-3.5-turbo-1106\",\n    temperature=0.1,\n    response_model=Iterable[ProductRecommendation],\n    stream=True,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"Generate product recommendations based on the customer profile. Return in order of highest recommended first.\",\n        },\n        {\"role\": \"user\", \"content\": prompt},\n    ],\n)\nfor product in recommendations_stream:\n    print(product)\n    end_perf = time.perf_counter()\n    print(f\"Time for first result (generator): {end_perf - start_perf:.2f} seconds\")\n    break\n```\n\n```\nproduct_id='1' product_name='Apple MacBook Air (2023)'\nTime for first result (generator): 4.33 seconds\n```\n\n`recommendations_stream` is a generator! It yields the extracted products as it's processing the stream in real-time. Now let's get the same response without streaming and see how they compare.\n\n```python\nstart_perf = time.perf_counter()\nrecommendations_list = client.create(\n    model=\"gpt-3.5-turbo-1106\",\n    temperature=0.1,\n    response_model=Iterable[ProductRecommendation],\n    stream=False,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"Generate product recommendations based on the customer profile. Return in order of highest recommended first.\",\n        },\n        {\"role\": \"user\", \"content\": prompt},\n    ],\n)\nprint(recommendations_list[0])\nend_perf = time.perf_counter()\nprint(f\"Time for first result (list): {end_perf - start_perf:.2f} seconds\")\n```\n\n```\nproduct_id='1' product_name='Apple MacBook Air (2023)'\nTime for first result (list): 8.63 seconds\n```\n\nOur web application now displays results faster. Even a 100ms improvement can lead to a 1% increase in revenue.\n\n### FastAPI\n\nWe can also take this and set up a streaming LLM API endpoint using FastAPI. Check out our docs on using FastAPI [here](../../concepts/fastapi.md)!\n\n## Key Takeaways\n\nTo summarize, we looked at:\n\n• Generators in Python: A powerful feature that allows for efficient data handling with reduced latency\n\n• LLM Streaming: LLMs provide us generators to stream tokens and Instructor can let us validate and extract data from this stream. Real-time data validation ftw!\n\nDon't forget to check our [GitHub](https://github.com/jxnl/instructor) for more resources and give us a star if you find the library helpful!\n\n---\n\nIf you have any questions or need further clarifications, feel free to reach out or dive into the Instructor library's documentation for more detailed information. Happy coding!"
  },
  {
    "path": "docs/blog/posts/google-openai-client.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - Google\n  - OpenAI\ncomments: true\ndate: 2024-11-10\ndescription: Learn why Instructor remains essential even with Google's new OpenAI-compatible client for Gemini\ndraft: false\ntags:\n  - Gemini\n---\n\n# Do I Still Need Instructor with Google's New OpenAI Integration?\n\nGoogle recently launched OpenAI client compatibility for Gemini.\n\nWhile this is a significant step forward for developers by simplifying Gemini model interactions, **you absolutely still need instructor**.\n\nIf you're unfamiliar with instructor, we provide a simple interface to get structured outputs from LLMs across different providers.\n\nThis makes it easy to switch between providers, get reliable outputs from language models and ultimately build production grade LLM applications.\n\n<!-- more -->\n\n## The current state\n\nThe new integration provides an easy integration with the Open AI Client, this means that using function calling with Gemini models has become much easier. We don't need to use a gemini specific library like `vertexai` or `google.generativeai` anymore to define response models.\n\nThis looks something like this:\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(\n    base_url=\"https://generativelanguage.googleapis.com/v1beta/\", api_key=\"YOUR_API_KEY\"\n)\n\nresponse = client.create(\n    model=\"gemini-3-flash\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract name and age from: John is 30\"}],\n)\n```\n\nWhile this seems convenient, there are three major limitations that make `instructor` still essential:\n\n### 1. Limited Schema Support\n\nThe current implementation only supports simple, single-level schemas. This means you can't use complex nested schemas that are common in real-world applications. For example, this won't work:\n\n```python\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass Users(BaseModel):\n    users: list[User]  # Nested schema - will throw an error\n```\n\n### 2. No Streaming Support for Function Calling\n\nThe integration doesn't support streaming for function calling. This is a significant limitation if your application relies on streaming responses, which is increasingly common for:\n\n- Real-time user interfaces\n- Progressive rendering\n- Long-running extractions\n\n### 3. No Multimodal Support\n\nPerhaps the biggest limitation is the lack of multimodal support. Gemini's strength lies in its ability to process multiple types of inputs (images, video, audio), but the OpenAI compatibility layer doesn't support this. This means you can't:\n\n- Perform visual question answering\n- Extract structured data from images\n- Analyze video content\n- Process audio inputs\n\n## Why Instructor Remains Essential\n\nLet's see how instructor solves these issues.\n\n### 1. Easy Schema Management\n\nIt's easy to define and experiment with different response models when you're building your application up. In our [own experiments](./bad-schemas-could-break-llms.md), we found that changing a single field name from `final_choice` to `answer` improved model accuracy from 4.5% to 95%.\n\nThe way we structure and name fields in our response models can fundamentally alter how the model interprets and responds to queries. Manually editing schemas constrains your ability to iterate on your response models, introduces room for catastrophic errors and limits what you can squeeze out of your models.\n\nYou can get the full power of Pydantic with `instructor` with gemini using our `from_gemini` and `from_vertexai` integration instead of the limited support in the OpenAI integration.\n\n### 2. Streaming Support\n\n`instructor` provides built in support for streaming, allowing you to stream partial results as they're generated.\n\nA common use case for streaming is to extract multiple items that have the same structure - Eg. extracting multiple users, extracting multiple products, extracting multiple events, etc.\n\nThis is relatively easy to do with `instructor`\n\n```python\nfrom instructor import from_openai\nfrom openai import OpenAI\nfrom instructor import Mode\nfrom pydantic import BaseModel\nimport os\n\nclient = from_openai(\n    OpenAI(\n        api_key=os.getenv(\"GOOGLE_API_KEY\"),\n        base_url=\"https://generativelanguage.googleapis.com/v1beta/\",\n    ),\n    mode=Mode.MD_JSON,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create_iterable(\n    model=\"gemini-3-flash\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Generate 10 random users\",\n        }\n    ],\n    response_model=User,\n)\n\nfor r in resp:\n    print(r)\n# name='Alice' age=25\n# name='Bob' age=32\n# name='Charlie' age=19\n# name='David' age=48\n# name='Emily' age=28\n# name='Frank' age=36\n# name='Grace' age=22\n# name='Henry' age=41\n# name='Isabella' age=30\n# name='Jack' age=27\n```\n\nIf you want to instead stream out an item as it's being generated, you can do so by using the `create_partial` method instead\n\n```python\nfrom instructor import from_openai\nfrom openai import OpenAI\nfrom instructor import Mode\nfrom pydantic import BaseModel\nimport os\n\nclient = from_openai(\n    OpenAI(\n        api_key=os.getenv(\"GOOGLE_API_KEY\"),\n        base_url=\"https://generativelanguage.googleapis.com/v1beta/\",\n    ),\n    mode=Mode.MD_JSON,\n)\n\n\nclass Story(BaseModel):\n    title: str\n    summary: str\n\n\nresp = client.create_partial(\n    model=\"gemini-3-flash\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Generate a random bedtime story + 1 sentence summary\",\n        }\n    ],\n    response_model=Story,\n)\n\nfor r in resp:\n    print(r)\n\n\n# title = None summary = None\n# title='The Little Firefly Who Lost His Light' summary=None\n# title='The Little Firefly Who Lost His Light' summary='A tiny firefly learns the true meaning of friendship when he loses his glow and a wise old owl helps him find it again.'\n```\n\n### 3. Multimodal Support\n\n`instructor` supports multimodal inputs for Gemini models, allowing you to perform tasks like visual question answering, image analysis, and more.\n\nYou can see an example of how to use instructor with Gemini to [extract travel recommendations from videos](./multimodal-gemini.md) post.\n\n## What else does Instructor offer?\n\nBeyond solving the core limitations of Gemini's new OpenAI integration, instructor provides a list of features that make it indispensable for production grade applications.\n\n### 1. Provider Agnostic API\n\nSwitching between providers shouldn't require rewriting your entire codebase. With instructor, it's as simple as changing just a few lines of code.\n\n```\nfrom openai import OpenAI\nfrom instructor import from_openai\n\nclient = from_openai(\n    OpenAI()\n)\n\n# rest of code\n```\n\nIf we wanted to switch to Anthropic, all it takes is changing the following lines of code\n\n```python\nfrom anthropic import Anthropic\nfrom instructor import from_anthropic\n\nclient = from_anthropic(Anthropic())\n\n# rest of code\n```\n\n### 2. Automatic Validation and Retries\n\nProduction applications need reliable outputs. Instructor handles this by validating all outputs against your desired response model and automatically retrying outputs that fail validation.\n\nWith [our tenacity integration](../../concepts/retrying.md), you get full control over the retries if needed, allowing you to mechanisms like exponential backoff and other retry strategies easily.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom tenacity import Retrying, stop_after_attempt, wait_fixed\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", mode=instructor.Mode.TOOLS)\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\nresponse = client.create(\n    model=\"gpt-4o-mini\",\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract `jason is 12`\"},\n    ],\n    # Stop after the second attempt and wait a fixed 1 second between attempts\n    max_retries=Retrying(\n        stop=stop_after_attempt(2),\n        wait=wait_fixed(1),\n    ),\n)\nprint(response.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"jason\",\n  \"age\": 12\n}\n\"\"\"\n```\n\n## Conclusion\n\nWhile Google's OpenAI compatibility layer is a welcome addition, there are still a few reasons why you might want to stick with instructor for now.\n\nWithin a single package, you get features such as a provider agnostic API, streaming capabilities, multimodal support, automatic re-asking and more.\n\nGive us a try today by installing with `pip install instructor` and see why Pydantic is all you need for a production grade LLM application..\n"
  },
  {
    "path": "docs/blog/posts/introducing-structured-outputs-with-cerebras-inference.md",
    "content": "---\nauthors:\n  - ivanleomk\n  - sarahchieng\ncategories:\n  - API Development\n  - Pydantic\n  - Performance Optimization\ncomments: true\ndate: 2024-10-15\ndescription:\n  Learn how to use Cerebras Inference for structured outputs, faster model\n  inference, and seamless integration with Pydantic models.\ndraft: false\nslug: introducing-structured-outputs-with-cerebras-inference\ntags:\n  - Cerebras Inference\n  - Pydantic\n  - API Integration\n  - Fast Inference\n  - Structured Outputs\n---\n\n# Introducing structured outputs with Cerebras Inference\n\n## What's Cerebras?\n\nCerebras offers the fastest inference on the market, 20x faster than on GPUs.\n\nSign up for a Cerebras Inference API key here at [cloud.cerebras.ai](http://cloud.cerebras.ai).\n\n### Basic Usage\n\nTo get guaranteed structured outputs with Cerebras Inference, you\n\n<!-- more -->\n\n1. Create a new Instructor client with the `from_cerebras` method\n2. Define a Pydantic model to pass into the `response_model` parameter\n3. Get back a validated response exactly as you would expect\n\nYou'll also need to install the Cerebras SDK to use the client. You can install it with the command below.\n\n<!-- more -->\n\n```bash\npip install \"instructor[cerebras_cloud_sdk]\"\n```\n\nThis ensures that you have the necessary dependencies to use the Cerebras SDK with instructor.\n\n### Getting Started\n\nBefore running the following code, you'll need to make sure that you have your CEREBRAS_API_KEY. Sign up for one [here](https://cloud.cerebras.ai/).\n\nMake sure to set the `CEREBRAS_API_KEY` as an alias in your shell.\n\n```bash\nexport CEREBRAS_API_KEY=<your-api-key>\n```\n\nOnce you've done so, you can use the following code to get started.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"cerebras/llama3.1-70b\")\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    model=\"llama3.1-70b\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the name and age of the person in this sentence: John Smith is 29 years old.\",\n        }\n    ],\n    response_model=Person,\n)\n\nprint(resp)\n#> Person(name='John Smith', age=29)\n```\n\nWe support both the `AsyncCerebras` and `Cerebras` clients.\n\n### Streaming\n\nWe also support streaming with the Cerebras client with the `CEREBRAS_JSON` mode so that you can take advantage of Cerebras’s inference speeds and process the response as it comes in.\n\n```python\nimport instructor\nfrom cerebras.cloud.sdk import Cerebras\nfrom pydantic import BaseModel\nfrom typing import Iterable\n\nclient = instructor.from_cerebras(Cerebras(), mode=instructor.Mode.MD_JSON)\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    model=\"llama3.1-70b\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract all users from this sentence : Chris is 27 and lives in San Francisco, John is 30 and lives in New York while their college roommate Jessica is 26 and lives in London\",\n        }\n    ],\n    response_model=Iterable[Person],\n    stream=True,\n)\n\nfor person in resp:\n    print(person)\n    #> Person(name='Chris', age=27)\n    #> Person(name='John', age=30)\n    #> Person(name='Jessica', age=26)\n```\n\nAnd that’s it! We're excited to see what you build with Instructor and Cerebras! If you have any questions about Cerebras or need to get off the API key waitlist, please reach out to sarah.chieng@cerebras.net.\n"
  },
  {
    "path": "docs/blog/posts/introducing-structured-outputs.md",
    "content": "---\nauthors:\n- ivanleomk\ncategories:\n- OpenAI\ncomments: true\ndate: 2024-08-20\ndescription: Explore the challenges of OpenAI's Structured Outputs and how 'instructor'\n  offers solutions for LLM workflows.\ndraft: false\nslug: should-i-be-using-structured-outputs\ntags:\n- OpenAI\n- Structured Outputs\n- Pydantic\n- Data Validation\n- LLM Techniques\n---\n\n# Should I Be Using Structured Outputs?\n\nOpenAI recently announced Structured Outputs which ensures that generated responses match any arbitrary provided JSON Schema. In their [announcement article](https://openai.com/index/introducing-structured-outputs-in-the-api/), they acknowledged that it had been inspired by libraries such as `instructor`.\n\n## Main Challenges\n\nIf you're building complex LLM workflows, you've likely considered OpenAI's Structured Outputs as a potential replacement for `instructor`.\n\nBut before you do so, three key challenges remain:\n\n1. **Limited Validation And Retry Logic**: Structured Outputs ensure adherence to the schema but not useful content. You might get perfectly formatted yet unhelpful responses\n2. **Streaming Challenges**: Parsing raw JSON objects from streamed responses with the sdk is error-prone and inefficient\n3. **Unpredictable Latency Issues** : Structured Outputs suffers from random latency spikes that might result in an almost 20x increase in response time\n\nAdditionally, adopting Structured Outputs locks you into OpenAI's ecosystem, limiting your ability to experiment with diverse models or providers that might better suit specific use-cases.\n\nThis vendor lock-in increases vulnerability to provider outages, potentially causing application downtime and SLA violations, which can damage user trust and impact your business reputation.\n\nIn this article, we'll show how `instructor` addresses many of these challenges with features such as automatic reasking when validation fails, automatic support for validated streaming data and more.\n\n<!-- more -->\n\n### Limited Validation and Retry Logic\n\nValidation is crucial for building reliable and effective applications. We want to catch errors in real time using `Pydantic` [validators](../../concepts/reask_validation.md) in order to allow our LLM to correct its responses on the fly.\n\nLet's see an example of a simple validator below which ensures user names are always in uppercase.\n\n```python\nimport openai\nfrom pydantic import BaseModel, field_validator\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"name\")\n    def ensure_uppercase(cls, v: str) -> str:\n        if not v.isupper():\n            raise ValueError(\"All letters must be uppercase. Got: \" + v)\n        return v\n\n\nclient = openai.OpenAI()\ntry:\n    resp = client.beta.chat.completions.parse(\n        response_format=User,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract the following user: Jason is 25 years old.\",\n            },\n        ],\n        model=\"gpt-4o-mini\",\n    )\nexcept Exception as e:\n    print(e)\n    \"\"\"\n    1 validation error for User\n    name\n      Value error, All letters must be uppercase. Got: Jason [type=value_error, input_value='Jason', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.11/v/value_error\n    \"\"\"\n```\n\nWe can see that we lose the original completion when validation fails. This leaves developers without the means to implement retry logic so that the LLM can provide a targeted correction and regenerate its response.\n\nWithout robust validation, applications risk producing inconsistent outputs and losing valuable context for error correction. This leads to degraded user experience and missed opportunities for targeted improvements in LLM responses.\n\n### Streaming Challenges\n\nStreaming with Structured Outputs is complex. It requires manual parsing, lacks partial validation, and needs a context manager to be used with. Effective implementation with the `beta.chat.completions.stream` method demands significant effort.\n\nLet's see an example below.\n\n```python\nimport openai\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = openai.OpenAI()\nwith client.beta.chat.completions.stream(\n    response_format=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the following user: Jason is 25 years old.\",\n        },\n    ],\n    model=\"gpt-4o-mini\",\n) as stream:\n    for event in stream:\n        if event.type == \"content.delta\":\n            print(event.snapshot, flush=True, end=\"\\n\")\n            #> \n            #> {\"\n            #> {\"name\n            #> {\"name\":\"\n            #> {\"name\":\"Jason\n            #> {\"name\":\"Jason\",\"\n            #> {\"name\":\"Jason\",\"age\n            #> {\"name\":\"Jason\",\"age\":\n            #> {\"name\":\"Jason\",\"age\":25\n            #> {\"name\":\"Jason\",\"age\":25}\n            # >\n            #> {\"\n            #> {\"name\n            #> {\"name\":\"\n            #> {\"name\":\"Jason\n            #> {\"name\":\"Jason\",\"\n            #> {\"name\":\"Jason\",\"age\n            #> {\"name\":\"Jason\",\"age\":\n            #> {\"name\":\"Jason\",\"age\":25\n            #> {\"name\":\"Jason\",\"age\":25}\n            # >\n            #> {\"\n            #> {\"name\n            #> {\"name\":\"\n            #> {\"name\":\"Jason\n            #> {\"name\":\"Jason\",\"\n            #> {\"name\":\"Jason\",\"age\n            #> {\"name\":\"Jason\",\"age\":\n            #> {\"name\":\"Jason\",\"age\":25\n            #> {\"name\":\"Jason\",\"age\":25}\n            # >\n            #> {\"\n            #> {\"name\n            #> {\"name\":\"\n            #> {\"name\":\"Jason\n            #> {\"name\":\"Jason\",\"\n            #> {\"name\":\"Jason\",\"age\n            #> {\"name\":\"Jason\",\"age\":\n            #> {\"name\":\"Jason\",\"age\":25\n            #> {\"name\":\"Jason\",\"age\":25}\n```\n\n### Unpredictable Latency Spikes\n\nIn order to benchmark the two modes, we made 200 identical requests to OpenAI and noted the time taken for each request to complete. The results are summarized in the following table:\n\n| mode               | mean  | min   | max    | std_dev | variance |\n| ------------------ | ----- | ----- | ------ | ------- | -------- |\n| Tool Calling       | 6.84  | 6.21  | 12.84  | 0.69    | 0.47     |\n| Structured Outputs | 28.20 | 14.91 | 136.90 | 9.27    | 86.01    |\n\nStructured Outputs suffers from unpredictable latency spikes while Tool Calling maintains consistent performance. This could cause users to occasionally experience significant delays in response times, potentially impacting the overall user satisfication and retention rates.\n\n## Why use `instructor`\n\n`instructor` is fully compatible with Structured Outputs and provides three main benefits to developers.\n\n1. **Automatic Validation and Retries**: Regenerates LLM responses on Pydantic validation failures, ensuring data integrity.\n2. **Real-time Streaming Validation**: Incrementally validates partial JSON against Pydantic models, enabling immediate use of validated properties.\n3. **Provider-Agnostic API**: Switch between LLM providers and models with a single line of code.\n\nLet's see this in action below\n\n### Automatic Validation and Retries\n\nWith `instructor`, all it takes is a simple Pydantic Schema and a validator for you to get the extracted names as an upper case value.\n\n```python\nimport instructor\nfrom pydantic import BaseModel, field_validator\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"name\")\n    def ensure_uppercase(cls, v: str) -> str:\n        if not v.isupper():\n            raise ValueError(\"All letters must be uppercase. Got: \" + v)\n        return v\n\n\nclient = instructor.from_provider(\n    \"openai/gpt-5-nano\", mode=instructor.Mode.TOOLS_STRICT\n)\n\nresp = client.create(\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the following user: Jason is 25 years old.\",\n        }\n    ],\n    model=\"gpt-4o-mini\",\n)\n\nprint(resp)\n#> name='JASON' age=25\n```\n\nThis built-in retry logic allows for targeted correction to the generated response, ensuring that outputs are not only consistent with your schema but also correct for your use-case. This is invaluable in building reliable LLM systems.\n\n### Real-time Streaming Validation\n\nA common use-case is to define a single schema and extract multiple instances of it. With `instructor`, doing this is relatively straightforward by using [our `create_iterable` method](../../concepts/lists.md).\n\n```python\nclient = instructor.from_provider(\n    \"openai/gpt-5-nano\", mode=instructor.Mode.TOOLS_STRICT\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nusers = client.create_iterable(\n    model=\"gpt-4o-mini\",\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a perfect entity extraction system\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": (f\"Extract `Jason is 10 and John is 10`\"),\n        },\n    ],\n)\n\nfor user in users:\n    print(user)\n    #> name='Jason' age=10\n    #> name='John' age=10\n```\n\nOther times, we might also want to stream out information as it's dynamically generated into some sort of frontend component With `instructor`, you'll be able to do just that [using the `create_partial` method](../../concepts/partial.md).\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom rich.console import Console\n\nclient = instructor.from_provider(\n    \"openai/gpt-5-nano\", mode=instructor.Mode.TOOLS_STRICT\n)\n\ntext_block = \"\"\"\nIn our recent online meeting, participants from various backgrounds joined to discuss the upcoming tech conference. The names and contact details of the participants were as follows:\n\n- Name: John Doe, Email: johndoe@email.com, Twitter: @TechGuru44\n- Name: Jane Smith, Email: janesmith@email.com, Twitter: @DigitalDiva88\n- Name: Alex Johnson, Email: alexj@email.com, Twitter: @CodeMaster2023\n\nDuring the meeting, we agreed on several key points. The conference will be held on March 15th, 2024, at the Grand Tech Arena located at 4521 Innovation Drive. Dr. Emily Johnson, a renowned AI researcher, will be our keynote speaker.\n\nThe budget for the event is set at $50,000, covering venue costs, speaker fees, and promotional activities. Each participant is expected to contribute an article to the conference blog by February 20th.\n\nA follow-up meeting is scheduled for January 25th at 3 PM GMT to finalize the agenda and confirm the list of speakers.\n\"\"\"\n\n\nclass User(BaseModel):\n    name: str\n    email: str\n    twitter: str\n\n\nclass MeetingInfo(BaseModel):\n    users: list[User]\n    date: str\n    location: str\n    budget: int\n    deadline: str\n\n\nextraction_stream = client.create_partial(\n    model=\"gpt-4o-mini\",\n    response_model=MeetingInfo,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Get the information about the meeting and the users {text_block}\",\n        },\n    ],\n    stream=True,\n)\n\n\nconsole = Console()\n\nfor extraction in extraction_stream:\n    obj = extraction.model_dump()\n    console.clear()\n    console.print(obj)\n```\n\nThis will output the following\n\n![Structured Output Extraction](./img/Structured_Output_Extraction.gif)\n\n### Provider-Agnostic API\n\nWith `instructor`, switching between different providers is easy due to our unified API.\n\nFor example, the switch from OpenAI to Anthropic requires only three adjustments\n\n1. Import the Anthropic client\n2. Use `from_anthropic` instead of `from_openai`\n3. Update the model name (e.g., from gpt-4o-mini to claude-3-5-sonnet)\n\nThis makes it incredibly flexible for users looking to migrate and test different providers for their use cases. Let's see this in action with an example below.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    model=\"gpt-4o-mini\",\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the user from the string belo - Chris is a 27 year old engineer in San Francisco\",\n        }\n    ],\n    max_tokens=100,\n)\n\nprint(resp)\n#> name='Chris' age=27\n```\n\nNow let's see how we can achieve the same with Anthropic.\n\n```python hl_lines=\"2 5 14\"\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"anthropic/claude-3-5-haiku-latest\")  # (2)!\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    model=\"claude-3-5-sonnet-20240620\",  # (3)!\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the user from the string belo - Chris is a 27 year old engineer in San Francisco\",\n        }\n    ],\n    max_tokens=100,\n)\n\nprint(resp)\n#> name='Chris' age=27\n```\n\n1.  Import the Anthropic client\n2.  Use `from_anthropic` instead of `from_openai`\n3.  Update the model name to `claude-3-5-sonnet-20240620`\n\n## Conclusion\n\nWhile OpenAI's Structured Outputs shows promise, it has key limitations. The system lacks support for extra JSON fields to provide output examples, default value factories, and pattern matching in defined schemas. These constraints limit developers' ability to express complex return types, potentially impacting application performance and flexibility.\n\nIf you're interested in Structured Outputs, `instructor` addresses these critical issues. It provides automatic retries, real-time input validation, and multi-provider integration, allowing developers to more effectively implement Structured Outputs in their AI projects.\n\nif you haven't given `instructor` a shot, try it today!\n"
  },
  {
    "path": "docs/blog/posts/introduction.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2023-09-11\ndescription: Learn how Pydantic simplifies working with LLMs and structured JSON outputs\n  in Python, enhancing developer experience and code organization.\ndraft: false\ntags:\n- Pydantic\n- LLMs\n- Python\n- OpenAI\n- JSON\n---\n\n# Generating Structured Output / JSON from LLMs\n\nLanguage models have seen significant growth. Using them effectively often requires complex frameworks. This post discusses how Instructor simplifies this process using Pydantic.\n\n<!-- more -->\n\n## The Problem with Existing LLM Frameworks\n\nCurrent frameworks for Language Learning Models (LLMs) have complex setups. Developers find it hard to control interactions with language models. Some frameworks require complex JSON Schema setups.\n\n## The OpenAI Function Calling Game-Changer\n\nOpenAI's Function Calling feature provides a constrained interaction model. However, it has its own complexities, mostly around JSON Schema.\n\n## Why Pydantic?\n\nInstructor uses Pydantic to simplify the interaction between the programmer and the language model.\n\n- **Widespread Adoption**: Pydantic is a popular tool among Python developers.\n- **Simplicity**: Pydantic allows model definition in Python.\n- **Framework Compatibility**: Many Python frameworks already use Pydantic.\n\n```python\nimport pydantic\nimport instructor\n\n# Enables the response_model\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass UserDetail(pydantic.BaseModel):\n    name: str\n    age: int\n\n    def introduce(self):\n        return f\"Hello I'm {self.name} and I'm {self.age} years old\"\n\n\nuser: UserDetail = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"},\n    ],\n)\n```\n\n## Simplifying Validation Flow with Pydantic\n\nPydantic validators simplify features like re-asking or self-critique. This makes these tasks less complex compared to other frameworks.\n\n```python\nfrom typing_extensions import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nfrom instructor import llm_validator\n\n\nclass QuestionAnswerNoEvil(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(llm_validator(\"don't say objectionable things\")),\n    ]\n```\n\n## The Modular Approach\n\nPydantic allows for modular output schemas. This leads to more organized code.\n\n### Composition of Schemas\n\n```python\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\nclass UserWithAddress(UserDetails):\n    address: str\n```\n\n### Defining Relationships\n\n```python\nclass UserDetail(BaseModel):\n    id: int\n    age: int\n    name: str\n    friends: List[int]\n\n\nclass UserRelationships(BaseModel):\n    users: List[UserDetail]\n```\n\n### Using Enums\n\n```python\nfrom enum import Enum, auto\n\n\nclass Role(Enum):\n    PRINCIPAL = auto()\n    TEACHER = auto()\n    STUDENT = auto()\n    OTHER = auto()\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Role\n```\n\n### Flexible Schemas\n\n```python\nfrom typing import List\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    properties: List[Property]\n```\n\n### Chain of Thought\n\n```python\nclass TimeRange(BaseModel):\n    chain_of_thought: str\n    start_time: int\n    end_time: int\n\n\nclass UserDetail(BaseModel):\n    id: int\n    age: int\n    name: str\n    work_time: TimeRange\n    leisure_time: TimeRange\n```\n\n## Language Models as Microservices\n\nThe architecture resembles FastAPI. Most code can be written as Python functions that use Pydantic objects. This eliminates the need for prompt chains.\n\n### FastAPI Stub\n\n```python\nimport fastapi\nfrom pydantic import BaseModel\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\napp = fastapi.FastAPI()\n\n@app.get(\"/user/{user_id}\", response_model=UserDetails)\nasync def get_user(user_id: int) -> UserDetails:\n    return ...\n```\n\n### Using Instructor as a Function\n\n```python\ndef extract_user(str) -> UserDetails:\n    return client.chat.completions(\n           response_model=UserDetails,\n           messages=[]\n    )\n```\n\n### Response Modeling\n\n```python\nclass MaybeUser(BaseModel):\n    result: Optional[UserDetail]\n    error: bool\n    message: Optional[str]\n```\n\n## Conclusion\n\nInstructor, with Pydantic, simplifies interaction with language models. It is usable for both experienced and new developers.\n\n## Related Concepts\n\n- [Getting Started Guide](../../index.md) - Learn how to install and use Instructor\n- [Model Providers](../../integrations/index.md) - Explore supported LLM providers\n- [Validation Context](../../concepts/reask_validation.md) - Understand how to validate LLM outputs\n- [Response Models](../../concepts/models.md) - Deep dive into defining structured outputs\n\n## See Also\n\n- [Why Instructor is the Best Library](best_framework.md) - Learn about Instructor's philosophy and advantages\n- [Structured Outputs and Prompt Caching with Anthropic](structured-output-anthropic.md) - See how Instructor works with Claude\n- [Chain of Density Tutorial](../../tutorials/6-chain-of-density.ipynb) - Learn advanced prompting techniques\n\nIf you enjoy the content or want to try out `instructor` please check out the [github](https://github.com/jxnl/instructor) and give us a star!"
  },
  {
    "path": "docs/blog/posts/jinja-proposal.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2024-09-19\ndescription: Explore the integration of Jinja templating in the Instructor for enhanced\n  formatting, validation, versioning, and secure logging.\ndraft: false\ntags:\n- Jinja\n- Templating\n- Pydantic\n- API Development\n- Data Validation\n---\n\n# Instructor Proposal: Integrating Jinja Templating\n\nAs the creator of Instructor, I've always aimed to keep our product development streamlined and avoid unnecessary complexity. However, I'm now convinced that it's time to incorporate better templating into our data structure, specifically by integrating Jinja.\n\nThis decision serves multiple purposes:\n\n1. It addresses the growing complexity in my prompt formatting needs\n2. It allows us to differentiate ourselves from the standard library while adding proven utility.\n3. It aligns with the practices I've consistently employed in both production and client code.\n4. It provides an opportunity to introduce API changes that have been tested in private versions of Instructor.\n\n## Why Jinja is the Right Choice\n\n1. **Formatting Capabilities**\n   - Prompt formatting complexity has increased.\n   - List iteration and conditional implementation are necessary for formatting.\n   - This improves chunk generation, few shots, and dynamic rules.\n\n2. **Validation**\n   - Jinja template variables serve rendering and validation purposes.\n   - Pydantic's validation context allows access to template variables in validation functions.\n\n3. **Versioning and Logging**\n   - Render variable separation enhances prompt versioning and logging.\n   - Template variable diffing simplifies prompt change comparisons.\n\nBy integrating Jinja into Instructor, we're not just adding a feature; we're enhancing our ability to handle complex formatting, improve validation processes, and streamline our versioning and logging capabilities. This addition will significantly boost the power and flexibility of Instructor, making it an even more robust tool for our users.\n\n## Enhancing Formatting Capabilities\n\nIn Instructor, we propose implementing a new `context` keyword in our create methods. This addition will allow users to render the prompt using a provided context, leveraging Jinja's templating capabilities. Here's how it would work:\n\n1. Users pass a `context` dictionary to the create method.\n2. The prompt template, written in Jinja syntax, is defined in the `content` field of the message.\n3. Instructor renders the prompt using the provided context, filling in the template variables.\n\nThis approach offers these benefits:\n\n- Separation of prompt structure and dynamic content\n- Management of complex prompts with conditionals and loops\n- Reusability of prompt templates across different contexts\n\nLet's look at an example to illustrate this feature:\n\n```python\nclient.create(\n    model=\"gpt-4o\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n                You are a {{ role }} tasks with the following question\n\n                <question>\n                {{ question }}\n                </question>\n\n                Use the following context to answer the question, make sure to return [id] for every citation:\n\n                <context>\n                {% for chunk in context %}\n                  <context_chunk>\n                    <id>{{ chunk.id }}</id>\n                    <text>{{ chunk.text }}</text>\n                  </context_chunk>\n                {% endfor %}\n                </context>\n\n                {% if rules %}\n                Make sure to follow these rules:\n\n                {% for rule in rules %}\n                  * {{ rule }}\n                {% endfor %}\n                {% endif %}\n            \"\"\",\n        },\n    ],\n    context={\n        \"role\": \"professional educator\",\n        \"question\": \"What is the capital of France?\",\n        \"context\": [\n            {\"id\": 1, \"text\": \"Paris is the capital of France.\"},\n            {\"id\": 2, \"text\": \"France is a country in Europe.\"},\n        ],\n        \"rules\": [\"Use markdown.\"],\n    },\n)\n```\n\n## Validation\n\nLet's consider a scenario where we redact words from text. By using `ValidationInfo` to access context and passing it to the validator and template, we can implement a system for handling sensitive information. This approach allows us to:\n\n1. Validate input to ensure it doesn't contain banned words.\n2. Redact patterns using regular expressions.\n3. Provide instructions to the language model about word usage restrictions.\n\nHere's an example demonstrating this concept using Pydantic validators:\n\n```python\nfrom pydantic import BaseModel, ValidationInfo, field_validator\n\nclass Response(BaseModel):\n    text: str\n\n    @field_validator('text')\n    @classmethod\n    def no_banned_words(cls, v: str, info: ValidationInfo):\n        context = info.context\n        if context:\n            banned_words = context.get('banned_words', set())\n            banned_words_found = [word for word in banned_words if word.lower() in v.lower()]\n            if banned_words_found:\n                raise ValueError(f\"Banned words found in text: {', '.join(banned_words_found)}, rewrite it but just without the banned words\")\n        return v\n\n    @field_validator('text')\n    @classmethod\n    def redact_regex(cls, v: str, info: ValidationInfo):\n        context = info.context\n        if context:\n            redact_patterns = context.get('redact_patterns', [])\n            for pattern in redact_patterns:\n                v = re.sub(pattern, '****', v)\n        return v\n\nresponse = client.create(\n    model=\"gpt-4o\",\n    response_model=Response,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n                Write about a {{ topic }}\n\n                {% if banned_words %}\n                You must not use the following banned words:\n\n                <banned_words>\n                {% for word in banned_words %}\n                * {{ word }}\n                {% endfor %}\n                </banned_words>\n                {% endif %}\n              \"\"\"\n        },\n    ],\n    context={\n        \"topic\": \"jason and now his phone number is 123-456-7890\"\n        \"banned_words\": [\"jason\"],\n        \"redact_patterns\": [\n            r\"\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b\",  # Phone number pattern\n            r\"\\b\\d{3}-\\d{2}-\\d{4}\\b\",          # SSN pattern\n        ],\n    },\n    max_retries=3,\n)\n\nprint(response.text)\n# > While i can't say his name anymore, his phone number is ****\n```\n\n## Better Versioning and Logging\n\nWith the separation of prompt templates and variables, we gain several advantages:\n\n1. Version Control: We can now version the templates and retrieve the appropriate one for a given prompt. This allows for better management of template history, diffing and comparison.\n\n2. Enhanced Logging: The separation facilitates structured logging, enabling easier debugging and integration with various logging sinks, databases, and observability tools like OpenTelemetry.\n\n3. Security: Sensitive information in variables can be handled separately from the templates, allowing for better access control and data protection.\n\nThis separation of concerns adheres to best practices in software design, resulting in a more maintainable, scalable, and robust system for managing prompts and their associated data.\n\n### Side effect of Context also being Pydantic Models\n\nSince they are just python objects we can use Pydantic models to validate the context and also control how they are rendered, so even secret information can be dynamically rendered!\nConsider using secret string to pass in sensitive information to the llm.\n\n```python\nfrom pydantic import BaseModel, SecretStr\n\n\nclass UserContext(BaseModel):\n    name: str\n    address: SecretStr\n\n\nclass Address(BaseModel):\n    street: SecretStr\n    city: str\n    state: str\n    zipcode: str\n\n\ndef normalize_address(address: Address):\n    context = UserContext(username=\"scolvin\", address=address)\n    address = client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"{{ user.name }} is `{{ user.address.get_secret_value() }}`, normalize it to an address object\",\n            },\n        ],\n        context={\"user\": context},\n    )\n    print(context)\n    #> UserContext(username='jliu', address=\"******\")\n    print(address)\n    #> Address(street='******', city=\"Toronto\", state=\"Ontario\", zipcode=\"M5A 0J3\")\n    logger.info(\n        f\"Normalized address: {address}\",\n        extra={\"user_context\": context, \"address\": address},\n    )\n    return address\n```\n\nThis approach offers several advantages:\n\n1. Secure logging: You can confidently log your template variables without risking the exposure of sensitive information.\n2. Type safety: Pydantic models provide type checking and validation, reducing the risk of errors.\n3. Flexibility: You can easily control how different types of data are displayed or used in templates."
  },
  {
    "path": "docs/blog/posts/langsmith.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2024-02-18\ndescription: Explore how LangSmith enhances OpenAI clients with seamless LLM observability\n  and the `instructor` package for question classification.\ndraft: false\ntags:\n- LangSmith\n- OpenAI\n- LLM\n- Python\n- API Development\n---\n\n# Seamless Support with Langsmith\n\nIts a common misconception that LangChain's [LangSmith](https://www.langchain.com/langsmith) is only compatible with LangChain's models. In reality, LangSmith is a unified DevOps platform for developing, collaborating, testing, deploying, and monitoring LLM applications. In this blog we will explore how LangSmith can be used to enhance the OpenAI client alongside `instructor`.\n\n<!-- more -->\n\n## LangSmith\n\nIn order to use langsmith, you first need to set your LangSmith API key.\n\n```\nexport LANGCHAIN_API_KEY=<your-api-key>\n```\n\nNext, you will need to install the LangSmith SDK:\n\n```\npip install -U langsmith\npip install -U instructor\n```\n\nYou can find this example in our [examples directory](../../examples/bulk_classification.md):\n\n```bash\n# The example code is available in the examples directory\n# See: https://python.useinstructor.com/examples/bulk_classification\n```\n\nIn this example we'll use the `wrap_openai` function to wrap the OpenAI client with LangSmith. This will allow us to use LangSmith's observability and monitoring features with the OpenAI client. Then we'll use `instructor` to patch the client with the `TOOLS` mode. This will allow us to use `instructor` to add additional functionality to the client. We'll use [asyncio](./learn-async.md) to classify a list of questions.\n\n```python\nimport instructor\nimport asyncio\n\nfrom langsmith import traceable\nfrom langsmith.wrappers import wrap_openai\n\nfrom openai import AsyncOpenAI\nfrom pydantic import BaseModel, Field, field_validator\nfrom typing import List\nfrom enum import Enum\n\n# Wrap the OpenAI client with LangSmith\nwrapped_client = wrap_openai(AsyncOpenAI())\n\n# Create instructor client with LangSmith-wrapped client\n# Note: When using LangSmith, you may need to pass the wrapped client\n# For most cases, use: client = instructor.from_provider(\"openai/gpt-4o\", mode=instructor.Mode.TOOLS)\nclient = instructor.from_provider(\"openai/gpt-4o\", mode=instructor.Mode.TOOLS)\n\n# Rate limit the number of requests\nsem = asyncio.Semaphore(5)\n\n\n# Use an Enum to define the types of questions\nclass QuestionType(Enum):\n    CONTACT = \"CONTACT\"\n    TIMELINE_QUERY = \"TIMELINE_QUERY\"\n    DOCUMENT_SEARCH = \"DOCUMENT_SEARCH\"\n    COMPARE_CONTRAST = \"COMPARE_CONTRAST\"\n    EMAIL = \"EMAIL\"\n    PHOTOS = \"PHOTOS\"\n    SUMMARY = \"SUMMARY\"\n\n\n# You can add more instructions and examples in the description\n# or you can put it in the prompt in `messages=[...]`\nclass QuestionClassification(BaseModel):\n    \"\"\"\n    Predict the type of question that is being asked.\n    Here are some tips on how to predict the question type:\n    CONTACT: Searches for some contact information.\n    TIMELINE_QUERY: \"When did something happen?\n    DOCUMENT_SEARCH: \"Find me a document\"\n    COMPARE_CONTRAST: \"Compare and contrast two things\"\n    EMAIL: \"Find me an email, search for an email\"\n    PHOTOS: \"Find me a photo, search for a photo\"\n    SUMMARY: \"Summarize a large amount of data\"\n    \"\"\"\n\n    # If you want only one classification, just change it to\n    #   `classification: QuestionType` rather than `classifications: List[QuestionType]``\n    chain_of_thought: str = Field(\n        ..., description=\"The chain of thought that led to the classification\"\n    )\n    classification: List[QuestionType] = Field(\n        description=f\"An accuracy and correct prediction predicted class of question. Only allowed types: {[t.value for t in QuestionType]}, should be used\",\n    )\n\n    @field_validator(\"classification\", mode=\"before\")\n    def validate_classification(cls, v):\n        # sometimes the API returns a single value, just make sure it's a list\n        if not isinstance(v, list):\n            v = [v]\n        return v\n\n\n@traceable(name=\"classify-question\")\nasync def classify(data: str) -> QuestionClassification:\n    \"\"\"\n    Perform multi-label classification on the input text.\n    Change the prompt to fit your use case.\n\n    Args:\n        data (str): The input text to classify.\n    \"\"\"\n    async with sem:  # some simple rate limiting\n        return data, await client.create(\n            model=\"gpt-4-turbo-preview\",\n            response_model=QuestionClassification,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Classify the following question: {data}\",\n                },\n            ],\n        )\n\n\nasync def main(questions: List[str]):\n    tasks = [classify(question) for question in questions]\n\n    for task in asyncio.as_completed(tasks):\n        question, label = await task\n        resp = {\n            \"question\": question,\n            \"classification\": [c.value for c in label.classification],\n            \"chain_of_thought\": label.chain_of_thought,\n        }\n        resps.append(resp)\n    return resps\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    questions = [\n        \"What was that ai app that i saw on the news the other day?\",\n        \"Can you find the trainline booking email?\",\n        \"what did I do on Monday?\",\n        \"Tell me about todays meeting and how it relates to the email on Monday\",\n    ]\n\n    resp = asyncio.run(main(questions))\n\n    for r in resp:\n        print(\"q:\", r[\"question\"])\n        #> q: what did I do on Monday?\n        print(\"c:\", r[\"classification\"])\n        #> c: ['SUMMARY']\n```\n\nIf you follow what we've done is wrapped the client and proceeded to quickly use asyncio to classify a list of questions. This is a simple example of how you can use LangSmith to enhance the OpenAI client. You can use LangSmith to monitor and observe the client, and use `instructor` to add additional functionality to the client.\n\nTo take a look at trace of this run check out this shareable [link](https://smith.langchain.com/public/eaae9f95-3779-4bbb-824d-97aa8a57a4e0/r).\n\n![](./img/langsmith.png)\n"
  },
  {
    "path": "docs/blog/posts/learn-async.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2023-11-13\ndescription: \"Master Python asyncio.gather and asyncio.as_completed for efficient concurrent LLM processing with Instructor. Learn async programming patterns, rate limiting, and performance optimization for AI applications.\"\ndraft: false\nslug: learn-async\ntags:\n- asyncio\n- asyncio.gather\n- asyncio.as_completed\n- OpenAI\n- Python\n- data processing\n- async programming\n- concurrent processing\n- LLM optimization\n---\n\n# Mastering Python asyncio.gather and asyncio.as_completed for LLM Processing\n\nLearn how to use Python's `asyncio.gather` and `asyncio.as_completed` for efficient concurrent processing of Large Language Models (LLMs) with Instructor. This comprehensive guide covers async programming patterns, rate limiting strategies, and performance optimization techniques.\n\n<!-- more -->\n\n!!! notes \"Complete Example Code\"\n\n    You can find the complete working example on [GitHub](https://github.com/jxnl/instructor/blob/main/examples/learn-async/run.py)\n\n## Understanding asyncio.gather vs asyncio.as_completed\n\nPython's `asyncio` library provides two powerful methods for concurrent execution:\n\n- **`asyncio.gather`**: Executes all tasks concurrently and returns results in the same order as input\n- **`asyncio.as_completed`**: Returns results as they complete, regardless of input order\n\nBoth methods significantly outperform sequential processing, but they serve different use cases.\n\n## Complete Setup: Async LLM Processing\n\nHere's a complete, self-contained example showing how to set up async processing with Instructor:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Set up the async client with Instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n\n\nasync def extract_person(text: str) -> Person:\n    \"\"\"Extract person information from text using LLM.\"\"\"\n    return await client.create(\n        model=\"gpt-4o-mini\",\n        response_model=Person,\n        messages=[{\"role\": \"user\", \"content\": f\"Extract person info: {text}\"}],\n    )\n\n\n# Sample dataset\ndataset = [\n    \"John Smith is a 30-year-old software engineer\",\n    \"Sarah Johnson is a 25-year-old data scientist\",\n    \"Mike Davis is a 35-year-old product manager\",\n    \"Lisa Wilson is a 28-year-old UX designer\",\n    \"Tom Brown is a 32-year-old DevOps engineer\",\n    \"Emma Garcia is a 27-year-old frontend developer\",\n    \"David Lee is a 33-year-old backend developer\",\n]\n```\n\n## Method 1: Sequential Processing (Baseline)\n\n```python\nasync def sequential_processing() -> List[Person]:\n    \"\"\"Process items one by one - slowest method.\"\"\"\n    start_time = time.time()\n    persons = []\n\n    for text in dataset:\n        person = await extract_person(text)\n        persons.append(person)\n        print(f\"Processed: {person.name}\")\n\n    end_time = time.time()\n    print(f\"Sequential processing took: {end_time - start_time:.2f} seconds\")\n    return persons\n\n\n# Run sequential processing\n# persons = await sequential_processing()\n```\n\n## Method 2: asyncio.gather - Concurrent Processing\n\n```python\nasync def gather_processing() -> List[Person]:\n    \"\"\"Process all items concurrently and return in order.\"\"\"\n    start_time = time.time()\n\n    # Create tasks for all items\n    tasks = [extract_person(text) for text in dataset]\n\n    # Execute all tasks concurrently\n    persons = await asyncio.gather(*tasks)\n\n    end_time = time.time()\n    print(f\"asyncio.gather took: {end_time - start_time:.2f} seconds\")\n\n    # Results maintain original order\n    for person in persons:\n        print(f\"Processed: {person.name}\")\n\n    return persons\n\n\n# Run gather processing\n# persons = await gather_processing()\n```\n\n## Method 3: asyncio.as_completed - Streaming Results\n\n```python\nasync def as_completed_processing() -> List[Person]:\n    \"\"\"Process items concurrently and handle results as they complete.\"\"\"\n    start_time = time.time()\n    persons = []\n\n    # Create tasks for all items\n    tasks = [extract_person(text) for text in dataset]\n\n    # Process results as they complete\n    for task in asyncio.as_completed(tasks):\n        person = await task\n        persons.append(person)\n        print(f\"Completed: {person.name}\")\n\n    end_time = time.time()\n    print(f\"asyncio.as_completed took: {end_time - start_time:.2f} seconds\")\n    return persons\n\n\n# Run as_completed processing\n# persons = await as_completed_processing()\n```\n\n## Method 4: Rate-Limited Processing with Semaphores\n\n```python\nasync def rate_limited_extract_person(\n    text: str, semaphore: asyncio.Semaphore\n) -> Person:\n    \"\"\"Extract person info with rate limiting.\"\"\"\n    async with semaphore:\n        return await extract_person(text)\n\n\nasync def rate_limited_gather(concurrency_limit: int = 3) -> List[Person]:\n    \"\"\"Process items with controlled concurrency using asyncio.gather.\"\"\"\n    start_time = time.time()\n\n    # Create semaphore to limit concurrent requests\n    semaphore = asyncio.Semaphore(concurrency_limit)\n\n    # Create rate-limited tasks\n    tasks = [rate_limited_extract_person(text, semaphore) for text in dataset]\n\n    # Execute with rate limiting\n    persons = await asyncio.gather(*tasks)\n\n    end_time = time.time()\n    print(\n        f\"Rate-limited gather (limit={concurrency_limit}) took: {end_time - start_time:.2f} seconds\"\n    )\n    return persons\n\n\nasync def rate_limited_as_completed(concurrency_limit: int = 3) -> List[Person]:\n    \"\"\"Process items with controlled concurrency using asyncio.as_completed.\"\"\"\n    start_time = time.time()\n    persons = []\n\n    # Create semaphore to limit concurrent requests\n    semaphore = asyncio.Semaphore(concurrency_limit)\n\n    # Create rate-limited tasks\n    tasks = [rate_limited_extract_person(text, semaphore) for text in dataset]\n\n    # Process results as they complete\n    for task in asyncio.as_completed(tasks):\n        person = await task\n        persons.append(person)\n        print(f\"Rate-limited completed: {person.name}\")\n\n    end_time = time.time()\n    print(\n        f\"Rate-limited as_completed (limit={concurrency_limit}) took: {end_time - start_time:.2f} seconds\"\n    )\n    return persons\n\n\n# Run rate-limited processing\n# persons = await rate_limited_gather(concurrency_limit=2)\n# persons = await rate_limited_as_completed(concurrency_limit=2)\n```\n\n## Performance Comparison\n\nHere are typical performance results when processing 7 items:\n\n| Method | Execution Time | Concurrency | Use Case |\n|--------|---------------|-------------|----------|\n| Sequential | 6.17 seconds | 1 | Baseline |\n| asyncio.gather | 0.85 seconds | 7 | Fast processing, ordered results |\n| asyncio.as_completed | 0.95 seconds | 7 | Streaming results |\n| Rate-limited gather | 3.04 seconds | 2 | API-friendly |\n| Rate-limited as_completed | 3.26 seconds | 2 | Streaming + rate limiting |\n\n## When to Use Each Method\n\n### Use asyncio.gather when:\n- You need results in the same order as input\n- All tasks must complete successfully\n- You want the fastest possible execution\n- Memory usage isn't a concern\n\n### Use asyncio.as_completed when:\n- You want to process results as they arrive\n- Order doesn't matter\n- You're streaming data to clients\n- You want to handle large datasets efficiently\n\n### Use rate limiting when:\n- Working with API rate limits\n- Being respectful to external services\n- Managing resource consumption\n- Building production applications\n\n## Key Takeaways\n\n1. **asyncio.gather** is fastest for ordered results\n2. **asyncio.as_completed** is best for streaming and large datasets\n3. **Rate limiting** is essential for production applications\n4. **Error handling** should be implemented for robustness\n5. **Monitoring** helps optimize performance\n\n## Related Resources\n\n- [Python asyncio Documentation](https://docs.python.org/3/library/asyncio.html)\n- [Real Python Async IO Tutorial](https://realpython.com/async-io-python/)\n- [Instructor Documentation](https://python.useinstructor.com)\n- [OpenAI Async API Guide](https://platform.openai.com/docs/guides/async)\n\n---\n\n**Next Steps**: Learn about [error handling patterns](../../concepts/error_handling.md) or explore [rate limiting with tenacity](../../concepts/retrying.md) for production applications."
  },
  {
    "path": "docs/blog/posts/llm-as-reranker.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - LLM\n  - Pydantic\ncomments: true\ndate: 2024-10-23\ndescription: Learn how to use Instructor and Pydantic to create an LLM-based reranker for improving search results relevance.\ndraft: false\ntags:\n  - LLM\n  - Pydantic\n  - Instructor\n  - Search Relevance\n  - Reranking\n---\n\n# Building an LLM-based Reranker for your RAG pipeline\n\nAre you struggling with irrelevant search results in your Retrieval-Augmented Generation (RAG) pipeline?\n\nImagine having a powerful tool that can intelligently reassess and reorder your search results, significantly improving their relevance to user queries.\n\nIn this blog post, we'll show you how to create an LLM-based reranker using Instructor and Pydantic. This approach will:\n\n- Enhance the accuracy of your search results\n- Leverage the power of large language models (LLMs)\n- Utilize structured outputs for precise information retrieval\n\nBy the end of this tutorial, you'll be able to implement a llm reranker to label your synthetic data for fine-tuning a traditional reranker, or to build out an evaluation pipeline for your RAG system. Let's dive in!\n\n<!-- more -->\n\n## Setting Up the Environment\n\nFirst, let's set up our environment with the necessary imports:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n```\n\nWe're using the `instructor` library, which integrates seamlessly with OpenAI's API and Pydantic for structured outputs.\n\n## Defining the Reranking Models\n\nWe'll use Pydantic to define our `Label` and `RerankedResults` models that structure the output of our LLM:\n\nNotice that not only do I reference the chunk_id in the label class, I also asked a language model to use chain of thought. This is very useful for using models like 4o Mini or Claude, but not necessarily if we plan to use the `o1-mini` and `o1-preview` models.\n\n```python\nclass Label(BaseModel):\n    chunk_id: int = Field(description=\"The unique identifier of the text chunk\")\n    chain_of_thought: str = Field(\n        description=\"The reasoning process used to evaluate the relevance\"\n    )\n    relevancy: int = Field(\n        description=\"Relevancy score from 0 to 10, where 10 is most relevant\",\n        ge=0,\n        le=10,\n    )\n\n\nclass RerankedResults(BaseModel):\n    labels: list[Label] = Field(description=\"List of labeled and ranked chunks\")\n\n    @field_validator(\"labels\")\n    @classmethod\n    def model_validate(cls, v: list[Label]) -> list[Label]:\n        return sorted(v, key=lambda x: x.relevancy, reverse=True)\n```\n\nThese models ensure that our LLM's output is structured and includes a list of labeled chunks with their relevancy scores. The `RerankedResults` model includes a validator that automatically sorts the labels by relevancy in descending order.\n\n## Creating the Reranker Function\n\nNext, we'll create a function that uses our LLM to rerank a list of text chunks based on their relevance to a query:\n\n```python\ndef rerank_results(query: str, chunks: list[dict]) -> RerankedResults:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=RerankedResults,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                You are an expert search result ranker. Your task is to evaluate the relevance of each text chunk to the given query and assign a relevancy score.\n\n                For each chunk:\n                1. Analyze its content in relation to the query.\n                2. Provide a chain of thought explaining your reasoning.\n                3. Assign a relevancy score from 0 to 10, where 10 is most relevant.\n\n                Be objective and consistent in your evaluations.\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n                <query>{{ query }}</query>\n\n                <chunks_to_rank>\n                {% for chunk in chunks %}\n                <chunk id=\"{{ chunk.id }}\">\n                    {{ chunk.text }}\n                </chunk>\n                {% endfor %}\n                </chunks_to_rank>\n\n                Please provide a RerankedResults object with a Label for each chunk.\n                \"\"\",\n            },\n        ],\n        context={\"query\": query, \"chunks\": chunks},\n    )\n```\n\nThis function takes a query and a list of text chunks as input, sends them to the LLM with a predefined prompt, and returns a structured `RerankedResults` object. Thanks to instructor we can use jinja templating to inject the query and chunks into the prompt by passing in the `context` parameter.\n\n## Testing the Reranker\n\nTo test our LLM-based reranker, we can create a sample query and a list of text chunks. Here's an example of how to use the reranker:\n\n```python\ndef main():\n    query = \"What are the health benefits of regular exercise?\"\n    chunks = [\n        {\n            \"id\": 0,\n            \"text\": \"Regular exercise can improve cardiovascular health and reduce the risk of heart disease.\",\n        },\n        {\n            \"id\": 1,\n            \"text\": \"The price of gym memberships varies widely depending on location and facilities.\",\n        },\n        {\n            \"id\": 2,\n            \"text\": \"Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.\",\n        },\n        {\n            \"id\": 3,\n            \"text\": \"Proper nutrition is essential for maintaining a healthy lifestyle.\",\n        },\n        {\n            \"id\": 4,\n            \"text\": \"Strength training can increase muscle mass and improve bone density, especially important as we age.\",\n        },\n    ]\n\n    results = rerank_results(query, chunks)\n\n    print(\"Reranked results:\")\n    for label in results.labels:\n        print(f\"Chunk {label.chunk_id} (Relevancy: {label.relevancy}):\")\n        print(f\"Text: {chunks[label.chunk_id]['text']}\")\n        print(f\"Reasoning: {label.chain_of_thought}\")\n        print()\n\n\nif __name__ == \"__main__\":\n    main()\n```\n\nThis test demonstrates how the reranker evaluates and sorts the chunks based on their relevance to the query. The full implementation can be found in the `examples/reranker/run.py` file.\n\nIf you want to extend this example, you could use the `rerank_results` function to label synthetic data for fine-tuning a traditional reranker, or to build out an evaluation pipeline for your RAG system.\n\nMoreover, we could also add validators to the `Label.chunk_id` field to ensure that the chunk_id is present in the `chunks` list. This might be useful if labels are `uuids` or complex strings and we want to ensure that the chunk_id is a valid index for the chunks list.\n\nheres an example\n\n```python\nclass Label(BaseModel):\n    chunk_id: int = Field(description=\"The unique identifier of the text chunk\")\n    ...\n\n    @field_validator(\"chunk_id\")\n    @classmethod\n    def validate_chunk_id(cls, v: int, info: ValidationInfo) -> int:\n        context = info.context\n        chunks = context[\"chunks\"]\n        if v not in [chunk[\"id\"] for chunk in chunks]:\n            raise ValueError(\n                f\"Chunk with id {v} not found, must be one of {[chunk['id'] for chunk in chunks]}\"\n            )\n        return v\n```\n\nThis will automatically check that the `chunk_id` is present in the `chunks` list and raise a `ValueError` if it is not, where `context` is the context dictionary that we passed into the `rerank_results` function.\n\n## See Also\n- [RAG and Beyond](rag-and-beyond.md) - Comprehensive RAG guide\n- [Validation Fundamentals](validation-part1.md) - Validate ranking scores\n- [Performance Monitoring](logfire.md) - Track reranking performance\n"
  },
  {
    "path": "docs/blog/posts/llms-txt-adoption.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - Announcements\ncomments: true\ndate: 2025-03-19\ndescription:\n  Instructor adopts llms.txt to make documentation more accessible to AI language models.\ndraft: false\nslug: instructor-adopts-llms-txt\ntags:\n  - Documentation\n  - AI\n  - LLMs\n  - Standards\n---\n\n# Instructor Adopts llms.txt: Making Documentation AI-Friendly\n\nWe're excited to announce that Instructor now implements the llms.txt specification! You can now find our llms.txt file at [python.useinstructor.com/llms.txt](https://python.useinstructor.com/llms.txt). This adoption marks an important step in making our documentation more accessible to AI language models.\n\n<!-- more -->\n\n## What is llms.txt?\n\nThe llms.txt specification, [developed by Jeremy Howard and the Answer.AI team](https://github.com/AnswerDotAI/llms-txt), addresses a critical challenge in AI-documentation interaction: context windows are too small for most websites, and HTML pages with navigation, ads, and JavaScript are difficult for LLMs to process effectively.\n\nThink of llms.txt as robots.txt for AI language models - a standardized way to help AI systems understand and navigate your documentation. While robots.txt tells search engines what they can index, llms.txt helps AI models find and understand the most relevant information about your project.\n\n## Why Instructor Adopted llms.txt\n\nAs a library focused on structured outputs from LLMs, it made perfect sense for us to implement this standard. Here's why:\n\n1. **Better AI Integration**: Our users often interact with Instructor through AI coding assistants. Having a llms.txt file helps these tools better understand our documentation.\n\n2. **Cleaner Documentation Access**: Instead of parsing our full HTML documentation, AI models can now access clean markdown versions of our docs.\n\n3. **Supporting the Standard**: We believe in the importance of standardizing how AI models interact with documentation. By adopting llms.txt early, we're helping establish best practices for AI-friendly documentation.\n\n## What This Means for Users\n\nIf you're using AI coding assistants like GitHub Copilot, Claude, or Cursor with Instructor, you should notice:\n\n- More accurate code suggestions\n- Better understanding of Instructor's features\n- More relevant documentation references\n\nFor example, when you ask an AI assistant about Instructor's features, it can now directly access our markdown documentation through the llms.txt file, rather than trying to parse our HTML documentation.\n\n## How It Works\n\nOur llms.txt file provides:\n\n- A concise overview of Instructor\n- Links to key documentation in markdown format\n- Important notes about usage and best practices\n- References to example code and tutorials\n\nAI models can use this information to better understand:\n\n- Core concepts of Instructor\n- How to use our key features\n- Best practices for implementation\n- Where to find detailed documentation\n\n## Implementing llms.txt\n\nThe llms.txt specification is gaining adoption, and we encourage other Python libraries and frameworks to implement it. Here's how you can add llms.txt to your project:\n\n1. Create a `/llms.txt` file in your documentation root\n2. Follow the [standard format](https://github.com/AnswerDotAI/llms-txt#format)\n3. Include key information and markdown links\n4. Test with various AI assistants\n\n## Looking Forward\n\nThis is just the beginning. As more projects adopt llms.txt, we expect to see:\n\n- Better AI-assisted coding experiences\n- More standardized documentation access\n- Improved AI understanding of codebases\n- Enhanced collaboration between humans and AI\n\nWe're excited to be part of establishing this standard and look forward to seeing how it evolves. If you're interested in learning more about llms.txt or want to discuss its implementation, reach out to us on [GitHub](https://github.com/instructor-ai/instructor) or [Twitter](https://x.com/jxnl.co).\n\nFor more details about the llms.txt specification, check out the [official repository](https://github.com/AnswerDotAI/llms-txt) and join the discussion about making documentation more AI-friendly.\n\nHappy coding!"
  },
  {
    "path": "docs/blog/posts/llms-txt-support.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - Announcements\ncomments: true\ndate: 2025-08-29\ndescription:\n  Instructor now automatically generates llms.txt files for better AI documentation access.\ndraft: false\nslug: llms-txt-support\ntags:\n  - Documentation\n  - AI\n---\n\n# Instructor Now Supports llms.txt\n\nWe've added automatic `llms.txt` generation to Instructor's documentation using the [`mkdocs-llmstxt`](https://github.com/pawamoy/mkdocs-llmstxt) plugin.\n\n<!-- more -->\n\n## What is llms.txt?\n\nThe [`llms.txt` specification](https://github.com/AnswerDotAI/llms-txt) helps AI coding assistants access clean documentation without parsing complex HTML. Think \"robots.txt for LLMs.\"\n\n## What This Means\n\nYour AI coding assistant (Copilot, Claude, Cursor) now gets better access to:\n- Getting started guides\n- Core concepts and patterns  \n- Provider integration docs\n\nThis should result in more accurate suggestions and better understanding of Instructor's features.\n\n## Implementation\n\nWe're using the `mkdocs-llmstxt` plugin to automatically generate our `llms.txt` from our existing markdown documentation. Every time we update our docs, the `llms.txt` file stays current automatically.\n\nNo manual maintenance, always up-to-date.\n\n## Resources\n\n- [llms.txt Specification](https://github.com/AnswerDotAI/llms-txt)\n- [mkdocs-llmstxt Plugin](https://github.com/pawamoy/mkdocs-llmstxt)"
  },
  {
    "path": "docs/blog/posts/logfire.md",
    "content": "---\nauthors:\n- ivanleomk\n- jxnl\ncategories:\n- LLM Observability\ncomments: true\ndate: 2024-05-01\ndescription: Explore Logfire, an observability platform to enhance application performance\n  tracking with Pydantic, Instructor, and OpenAI integration.\ndraft: false\nslug: instructor-logfire\ntags:\n- Logfire\n- Pydantic\n- OpenAI\n- Instructor\n- LLM Observability\n---\n\n## Introduction\n\nLogfire is a new observability platform coming from the creators of Pydantic. It integrates almost seamlessly with many of your favourite libraries such as Pydantic, HTTPx and Instructor. In this article, we'll show you how to use Logfire with Instructor to gain visibility into the performance of your entire application.\n\nWe'll walk through the following examples\n\n1. Classifying scam emails using Instructor\n2. Performing simple validation using the `llm_validator`\n3. Extracting data into a markdown table from an infographic with GPT4V\n\n<!-- more -->\n\nAs usual, all of the code that we refer to here is provided in [examples/logfire](https://www.github.com/jxnl/instructor/tree/main/examples/logfire) for you to use in your projects.\n\n- `classify.py`: Email Classification Example\n- `image.py` : GPT4-V Example\n- `validate.py` : `llm_validator` example\n\n??? info \"Configure Logfire\"\n\n    Before starting this tutorial, make sure that you've registered for a [Logfire](https://logfire.pydantic.dev/) account. You'll also need to create a project to track these logs.\n\nWe'll need to install our dependencies and configure logfire auth before proceeding so simply run the commands below. Logfire will handle the authentication and configuration of your project.\n\n```bash\npip install logfire openai instructor pydantic pandas tabulate\nlogfire auth\n```\n\n## Classification\n\nNow that we've got Logfire setup, let's see how we can get it to help us track a simple classification job.\n\nLogfire is dead simple to integrate - all it takes is 2 lines of code and we have it setup.\n\n```python\nfrom openai import OpenAI\nimport instructor\nimport logfire\n\n\nopenai_client = OpenAI()\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))  # (1)!\nlogfire.instrument_openai(openai_client)  # (2)!\nclient = instructor.from_provider(\"openai/gpt-4o\")\n```\n\n1. We add Pydantic logging using `logfire`. Note that depending on your use-case, you can configure what you want to log with Pydantic\n2. We use their openai_integration to configure logging for our client before using instructor on it\n\nIn this example, we'll be looking at classifying emails as either spam or not spam. To do so, we can define a simple Pydantic model as seen below.\n\n```python\nimport enum\n\n\nclass Labels(str, enum.Enum):\n    \"\"\"Enumeration for single-label text classification.\"\"\"\n\n    SPAM = \"spam\"\n    NOT_SPAM = \"not_spam\"\n\n\nclass SinglePrediction(BaseModel):\n    \"\"\"\n    Class for a single class label prediction.\n    \"\"\"\n\n    class_label: Labels\n```\n\nWe can then use this in a generic instructor function as seen below that simply asks the model to classify text and return it in the form of a `SinglePrediction` Pydantic object.\n\nLogfire can help us to log this entire function, and what's happening inside it, even down to the model validation level by using their `logfire.instrument` decorator.\n\n```python\n@logfire.instrument(\"classification\", extract_args=True)  # (1)!\ndef classify(data: str) -> SinglePrediction:\n    \"\"\"Perform single-label classification on the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=SinglePrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: {data}\",\n            },\n        ],\n    )\n```\n\n1. Logfire allows us to use the `logfire.instrument` decorator and tag a function to a specific name.\n\nLet's see what happens when we run this against a list of different emails\n\n```python\nemails = [\n    \"Hello there I'm a Nigerian prince and I want to give you money\",\n    \"Meeting with Thomas has been set at Friday next week\",\n    \"Here are some weekly product updates from our marketing team\",\n]\n\nfor email in emails:\n    classify(email)\n```\n\nThere are a few important things here that the logs immediately give us\n\n1. The duration that each individual portion of our code took to run\n2. The payload that we sent over to OpenAI\n3. The exact arguments and results that were passed to each individual portion of our code at each step\n\n![Logfire Classification](img/classification-logfire.png)\n\n## LLM Validators\n\nFor our second example, we'll use the inbuilt `llm_validator` that instructor provides out of the box to validate that our statements don't contain unsafe content that we might not want to serve to users. Let's start by defining a simple Pydantic Model that can do so and configure our logfire integration.\n\n```python\nfrom typing import Annotated\nfrom pydantic import BaseModel\nfrom pydantic.functional_validators import AfterValidator\nfrom instructor import llm_validator\nimport logfire\nimport instructor\nfrom openai import OpenAI\n\nopenai_client = OpenAI()\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_openai(openai_client)\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\n\nclass Statement(BaseModel):\n    message: Annotated[\n        str,\n        AfterValidator(\n            llm_validator(\"Don't allow any objectionable content\", client=client)\n        ),\n    ]\n```\n\nWe can then test out our new validator with a few sample statements to see how our validator is working in practice.\n\n```python\nmessages = [\n    \"I think we should always treat violence as the best solution\",\n    \"There are some great pastries down the road at this bakery I know\",\n]\n\nfor message in messages:\n    try:\n        Statement(message=message)\n    except ValidationError as e:\n        print(e)\n```\n\nWith Logfire, we can capture the entirety of the validation process. As seen below, we have access to not only the original input data, but also the schema that was being used, the errors that were thrown and even the exact field that threw the error.\n\n![Logfire Validation](img/validation-logfire.png)\n\n## Vision Models\n\nFor our last example, let's see how we can use Logfire to extract structured data from an image using GPT-4V with OpenAI. We'll be using a simple bar graph here and using `GPT4V` to extract the data from the image from statista below and convert it into a markdown format.\n\n![Reference Image](img/statista-image.jpeg)\n\nWhat we want is an output of the combined numbers as seen below\n\n| Country       | Total Skier Visits (M) |\n| :------------ | ---------------------: |\n| United States |                   55.5 |\n| Austria       |                   43.6 |\n| France        |                   40.7 |\n| Japan         |                   26.6 |\n| Italy         |                   22.3 |\n| Switzerland   |                     22 |\n| Canada        |                   18.5 |\n| China         |                   17.9 |\n| Sweden        |                    9.2 |\n| Germany       |                      7 |\n\nThis is relatively simple with Pydantic. What we need to do is to define a custom type which will handle the conversion process as seen below\n\n```python\nfrom pydantic import BeforeValidator, InstanceOf, WithJsonSchema\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],  # (1)!\n    BeforeValidator(md_to_df),  # (2)!\n    WithJsonSchema(  # (3)!\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be separate\",\n        }\n    ),\n]\n```\n\n1. We indicate that the type of this type should be a pandas dataframe\n2. We run a validation step to ensure that we can convert the input into a valid pandas dataframe and return a new pandas Dataframe for our model to use\n3. We then override the type of the schema so that when we pass it to OpenAI, it knows to generate a table in a markdown format.\n\nWe can then use this in a normal instructor call\n\n```python\nimport instructor\nimport logfire\n\n\nclient = instructor.from_provider(\"openai/gpt-4o\", mode=instructor.Mode.MD_JSON)\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_openai(client._client)\n\n\n@logfire.instrument(\"extract-table\", extract_args=True)\ndef extract_table_from_image(url: str) -> Iterable[Table]:\n    return client.create(\n        model=\"gpt-4-vision-preview\",\n        response_model=Iterable[Table],\n        max_tokens=1800,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract out a table from the image. Only extract out the total number of skiiers.\",\n                    },\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": url}},\n                ],\n            }\n        ],\n    )\n```\n\nWe can then call it as seen below\n\n```python\nurl = \"https://cdn.statcdn.com/Infographic/images/normal/16330.jpeg\"\ntables = extract_table_from_image(url)\nfor table in tables:\n    print(table.caption, end=\"\\n\")\n    print(table.dataframe.to_markdown())\n```\n\nLogfire is able to capture the stack track of the entire call as seen below, profile each part of our application and most importantly capture the raw inputs of the OpenAI call alongside any potential errors.\n\n![Logfire Image](img/image-logfire.png)"
  },
  {
    "path": "docs/blog/posts/lseg-market-surveillance.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Production\n- Financial Services\ncomments: true\ndate: 2025-09-11\ndescription: London Stock Exchange Group uses Instructor in production for AI-powered market surveillance, achieving 100% precision in detecting price-sensitive news\ndraft: false\ntags:\n- Production\n- Finance\n- Amazon Bedrock\n- Market Surveillance\n- Anthropic\n---\n\n# London Stock Exchange Group Powers Market Surveillance with Instructor\n\nLondon Stock Exchange Group (LSEG) has deployed Instructor in production to power their AI-driven market surveillance system, demonstrating the library's capability in mission-critical financial applications.\n\n<!-- more -->\n\n## Production Impact at Scale\n\nLSEG processes over £1 trillion of securities annually from 400 members, requiring sophisticated market abuse detection systems. Their new AI-powered \"Surveillance Guide\" uses Instructor to integrate with Anthropic's Claude Sonnet 3.5 model through Amazon Bedrock.\n\n## Remarkable Results\n\nThe system achieved exceptional performance metrics:\n- **100% precision** in identifying non-sensitive news\n- **100% recall** for detecting price-sensitive content\n- Automated analysis of 250,000+ regulatory news articles\n- Significant reduction in manual analyst workload\n\n## Technical Architecture\n\nLSEG's implementation leverages Instructor's structured output capabilities in their technical stack:\n\n- **Instructor library**: Seamless integration with Claude Sonnet 3.5\n- **Amazon Bedrock**: Scalable foundation model infrastructure\n- **Custom Python pipelines**: Data processing and analysis\n\nThe system processes regulatory news through a two-step classification approach, using Instructor to ensure reliable, structured responses from the LLM for downstream analysis.\n\n## Why This Matters\n\nThis production deployment showcases Instructor being used where accuracy and reliability are paramount - financial regulatory compliance. The system helps analysts efficiently review trades flagged for potential market abuse by automatically analyzing news sensitivity and market impact.\n\nAs Charles Kellaway from LSEG noted, the solution transforms market surveillance operations by reducing manual review time while improving consistency in price-sensitivity assessment.\n\n## Learn More\n\nRead the full case study: [How London Stock Exchange Group is detecting market abuse with their AI-powered Surveillance Guide on Amazon Bedrock](https://aws.amazon.com/blogs/machine-learning/how-london-stock-exchange-group-is-detecting-market-abuse-with-their-ai-powered-surveillance-guide-on-amazon-bedrock/)\n\nReady to build your own production-ready structured output applications? [Get started with Instructor](../../getting-started.md).\n"
  },
  {
    "path": "docs/blog/posts/matching-language.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2024-03-28\ndescription: Explore techniques to ensure language models generate summaries that\n  match the source text's language using Pydantic and langdetect.\ndraft: false\nslug: matching-language-summaries\ntags:\n- multilingual summarization\n- language detection\n- Pydantic\n- langdetect\n- language models\n---\n\n# Matching Language in Multilingual Summarization Tasks\n\nWhen asking language models to summarize text, there's a risk that the generated summary ends up in English, even if the source text is in another language. This is likely due to the instructions being provided in English, biasing the model towards English output.\n\nIn this post, we explore techniques to ensure the language of the generated summary matches the language of the source text. We leverage Pydantic for data validation and the `langdetect` library for language identification.\n\n<!-- more -->\n\n## The Problem\n\nConsider the following example where we ask a language model to summarize text in various languages:\n\n```txt\nԼեզվական մոդելները վերջին տարիներին դարձել են ավելի հարուստ եւ կատարյալ, հնարավորություն ընձեռելով ստեղծել սահուն եւ բնական տեքստեր, ինչպես նաեւ գերազանց արդյունքներ ցուցաբերել մեքենայական թարգմանության, հարցերի պատասխանման եւ ստեղծագործ տեքստերի ստեղծման նման տարբեր առաջադրանքներում։ Այս մոդելները մշակվում են հսկայական տեքստային տվյալների հիման վրա եւ կարող են բռնել բնական լեզվի կառուցվածքն ու նրբությունները՝ հեղափոխություն առաջացնելով համակարգիչների եւ մարդկանց միջեւ հաղորդակցության ոլորտում։\n\n---\n\nMga modelo ng wika ay naging mas sopistikado sa nagdaang mga taon, na nagbibigay-daan sa pagbuo ng mga natural at madaling basahing teksto, at nagpapakita ng mahusay na pagganap sa iba't ibang gawain tulad ng awtomatikong pagsasalin, pagsagot sa mga tanong, at pagbuo ng malikhain na teksto. Ang mga modelo na ito ay sinanay sa napakalaking mga dataset ng teksto at kayang hulihin ang istruktura at mga nuances ng natural na wika. Ang mga pagpapabuti sa mga modelo ng wika ay maaaring magdulot ng rebolusyon sa komunikasyon sa pagitan ng mga computer at tao, at inaasahan ang higit pang pag-unlad sa hinaharap.\n\n---\n\nNgaahi motuʻa lea kuo nau hoko ʻo fakaʻofoʻofa ange ʻi he ngaahi taʻu fakamuimui ni, ʻo fakafaingofuaʻi e fakatupu ʻo e ngaahi konga tohi ʻoku lelei mo fakanatula pea ʻoku nau fakahaaʻi ʻa e ngaahi ola lelei ʻi he ngaahi ngāue kehekehe ʻo hangē ko e liliu fakaʻētita, tali fehuʻi, mo e fakatupu ʻo e konga tohi fakaʻatamai. Ko e ako ʻa e ngaahi motuʻa ni ʻi he ngaahi seti ʻo e fakamatala tohi lahi pea ʻoku nau malava ʻo puke ʻa e fakafuofua mo e ngaahi meʻa iiki ʻo e lea fakanatula. ʻE lava ke fakatupu ʻe he ngaahi fakaleleiʻi ki he ngaahi motuʻa lea ha liliu lahi ʻi he fetu'utaki ʻi he vahaʻa ʻo e ngaahi komipiuta mo e kakai, pea ʻoku ʻamanaki ʻe toe fakalakalaka ange ia ʻi he kahaʻu.\n```\n\nIf we use a simple instructor prompt, even when we ask for the language to be correct, we oftentimes will get English instead.\n\n??? note \"Expand to see documents examples\"\n\n    Լեզվական մոդելները վերջին տարիներին դարձել են ավելի հարուստ եւ կատարյալ, հնարավորություն ընձեռելով ստեղծել սահուն եւ բնական տեքստեր, ինչպես նաեւ գերազանց արդյունքներ ցուցաբերել մեքենայական թարգմանության, հարցերի պատասխանման եւ ստեղծագործ տեքստերի ստեղծման նման տարբեր առաջադրանքներում։ Այս մոդելները մշակվում են հսկայական տեքստային տվյալների հիման վրա եւ կարող են բռնել բնական լեզվի կառուցվածքն ու նրբությունները՝ հեղափոխություն առաջացնելով համակարգիչների եւ մարդկանց միջեւ հաղորդակցության ոլորտում։\n\n    ---\n\n    Mga modelo ng wika ay naging mas sopistikado sa nagdaang mga taon, na nagbibigay-daan sa pagbuo ng mga natural at madaling basahing teksto, at nagpapakita ng mahusay na pagganap sa iba't ibang gawain tulad ng awtomatikong pagsasalin, pagsagot sa mga tanong, at pagbuo ng malikhain na teksto. Ang mga modelo na ito ay sinanay sa napakalaking mga dataset ng teksto at kayang hulihin ang istruktura at mga nuances ng natural na wika. Ang mga pagpapabuti sa mga modelo ng wika ay maaaring magdulot ng rebolusyon sa komunikasyon sa pagitan ng mga computer at tao, at inaasahan ang higit pang pag-unlad sa hinaharap.\n\n    ---\n\n    Ngaahi motuʻa lea kuo nau hoko ʻo fakaʻofoʻofa ange ʻi he ngaahi taʻu fakamuimui ni, ʻo fakafaingofuaʻi e fakatupu ʻo e ngaahi konga tohi ʻoku lelei mo fakanatula pea ʻoku nau fakahaaʻi ʻa e ngaahi ola lelei ʻi he ngaahi ngāue kehekehe ʻo hangē ko e liliu fakaʻētita, tali fehuʻi, mo e fakatupu ʻo e konga tohi fakaʻatamai. Ko e ako ʻa e ngaahi motuʻa ni ʻi he ngaahi seti ʻo e fakamatala tohi lahi pea ʻoku nau malava ʻo puke ʻa e fakafuofua mo e ngaahi meʻa iiki ʻo e lea fakanatula. ʻE lava ke fakatupu ʻe he ngaahi fakaleleiʻi ki he ngaahi motuʻa lea ha liliu lahi ʻi he fetu'utaki ʻi he vahaʻa ʻo e ngaahi komipiuta mo e kakai, pea ʻoku ʻamanaki ʻe toe fakalakalaka ange ia ʻi he kahaʻu.\n\n    ---\n\n    Dil modelleri son yıllarda daha da gelişti, akıcı ve doğal metinler üretmeyi mümkün kılıyor ve makine çevirisi, soru cevaplama ve yaratıcı metin oluşturma gibi çeşitli görevlerde mükemmel performans gösteriyor. Bu modeller, devasa metin veri setlerinde eğitilir ve doğal dilin yapısını ve nüanslarını yakalayabilir. Dil modellerindeki iyileştirmeler, bilgisayarlar ve insanlar arasındaki iletişimde devrim yaratabilir ve gelecekte daha da ilerleme bekleniyor.\n\n    ---\n\n    Mô hình ngôn ngữ đã trở nên tinh vi hơn trong những năm gần đây, cho phép tạo ra các văn bản trôi chảy và tự nhiên, đồng thời thể hiện hiệu suất xuất sắc trong các nhiệm vụ khác nhau như dịch máy, trả lời câu hỏi và tạo văn bản sáng tạo. Các mô hình này được huấn luyện trên các tập dữ liệu văn bản khổng lồ và có thể nắm bắt cấu trúc và sắc thái của ngôn ngữ tự nhiên. Những cải tiến trong mô hình ngôn ngữ có thể mang lại cuộc cách mạng trong giao tiếp giữa máy tính và con người, và người ta kỳ vọng sẽ có những tiến bộ hơn nữa trong tương lai.\n\n    ---\n\n    Les modèles de langage sont devenus de plus en plus sophistiqués ces dernières années, permettant de générer des textes fluides et naturels, et de performer dans une variété de tâches telles que la traduction automatique, la réponse aux questions et la génération de texte créatif. Entraînés sur d'immenses ensembles de données textuelles, ces modèles sont capables de capturer la structure et les nuances du langage naturel, ouvrant la voie à une révolution dans la communication entre les ordinateurs et les humains.\n\n    ---\n\n    近年来,语言模型变得越来越复杂,能够生成流畅自然的文本,并在机器翻译、问答和创意文本生成等各种任务中表现出色。这些模型在海量文本数据集上训练,可以捕捉自然语言的结构和细微差别。语言模型的改进有望彻底改变计算机和人类之间的交流方式,未来有望实现更大的突破。\n\n    ---\n\n    In den letzten Jahren sind Sprachmodelle immer ausgefeilter geworden und können flüssige, natürlich klingende Texte generieren und in verschiedenen Aufgaben wie maschineller Übersetzung, Beantwortung von Fragen und Generierung kreativer Texte hervorragende Leistungen erbringen. Diese Modelle werden auf riesigen Textdatensätzen trainiert und können die Struktur und Nuancen natürlicher Sprache erfassen, was zu einer Revolution in der Kommunikation zwischen Computern und Menschen führen könnte.\n\n    ---\n\n    पिछले कुछ वर्षों में भाषा मॉडल बहुत अधिक परिष्कृत हो गए हैं, जो प्राकृतिक और प्रवाहमय पाठ उत्पन्न कर सकते हैं, और मशीन अनुवाद, प्रश्नोत्तर, और रचनात्मक पाठ उत्पादन जैसे विभिन्न कार्यों में उत्कृष्ट प्रदर्शन कर सकते हैं। ये मॉडल विशाल पाठ डेटासेट पर प्रशिक्षित होते हैं और प्राकृतिक भाषा की संरचना और बारीकियों को समझ सकते हैं। भाषा मॉडल में सुधार कंप्यूटर और मानव के बीच संवाद में क्रांति ला सकता है, और भविष्य में और प्रगति की उम्मीद है।\n\n    ---\n\n    近年、言語モデルは非常に洗練され、自然で流暢なテキストを生成できるようになり、機械翻訳、質問応答、クリエイティブなテキスト生成など、様々なタスクで優れたパフォーマンスを発揮しています。これらのモデルは膨大なテキストデータセットで学習され、自然言語の構造とニュアンスを捉えることができます。言語モデルの改善により、コンピューターと人間のコミュニケーションに革命が起こる可能性があり、将来のさらなる進歩が期待されています。\n\n\nIn this example, we'll do something very simple, asking for the language to be correct. And generating a base model that only asks for a summary. To test we will use the library `langdetect` to detect the language of the text. To challenge us even more, we'll limit ourselves using 3.5 rather than 4 in order to use a 'dumber' model.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom instructor import patch\nfrom openai import AsyncOpenAI\nfrom langdetect import detect\n\ndocs = # To see the text, expand the notes above.\n\n# Patch the OpenAI client to enable response_model\nclient = patch(AsyncOpenAI())\n\n\nclass GeneratedSummary(BaseModel):\n    summary: str\n\nasync def summarize_text(text: str):\n    response = await client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=GeneratedSummary,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Generate a concise summary in the language of the article. \",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Summarize the following text in a concise way:\\n{text}\",\n            },\n        ],\n    )  # type: ignore\n    return response.summary, text\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    async def main():\n        results = await asyncio.gather(*[summarize_text(doc) for doc in docs])\n        for summary, doc in results:\n            source_lang = detect(doc)\n            target_lang = detect(summary)\n            print(\n                f\"Source: {source_lang}, Summary: {target_lang}, Match: {source_lang == target_lang}\"\n            )\n\n    asyncio.run(main())\n    \"\"\"\n    Source: et, Summary: en, Match: False\n    Source: tl, Summary: tl, Match: True\n    Source: sw, Summary: en, Match: False\n    Source: tr, Summary: tr, Match: True\n    Source: vi, Summary: en, Match: False\n    Source: fr, Summary: fr, Match: True\n    Source: zh-cn, Summary: en, Match: False\n    Source: de, Summary: de, Match: True\n    Source: hi, Summary: en, Match: False\n    Source: ja, Summary: en, Match: False\n    \"\"\"\n```\n\nIn this example, you'll notice that not all the languages are matching. Many of them respond in English, and so we get pretty terrible results. Only 3 out of 9 passed!\n\n## Reiterating instructions\n\nA simple trick that I found to work very well is to add a language detection attribute before the summary.\n\n```python hl_lines=\"2\"\nclass GeneratedSummary(BaseModel):\n    detected_language: str = Field(\n        description=\"The language code of the original article. The summary must be generated in this same language.\",\n    )\n    summary: str\n```\n\nJust by adding this single attribute, we end up getting 100% correctness on language matches. If you want to see for yourself, checkout the complete script below\n\n```python\nfrom pydantic import BaseModel, Field\nfrom instructor import patch\nfrom openai import AsyncOpenAI\nfrom langdetect import detect\n\ndocs = map(\n    lambda x: x.strip(),\n    \"\"\"\nԼեզվական մոդելները վերջին տարիներին դարձել են ավելի հարուստ եւ կատարյալ, հնարավորություն ընձեռելով ստեղծել սահուն եւ բնական տեքստեր, ինչպես նաեւ գերազանց արդյունքներ ցուցաբերել մեքենայական թարգմանության, հարցերի պատասխանման եւ ստեղծագործ տեքստերի ստեղծման նման տարբեր առաջադրանքներում։ Այս մոդելները մշակվում են հսկայական տեքստային տվյալների հիման վրա եւ կարող են բռնել բնական լեզվի կառուցվածքն ու նրբությունները՝ հեղափոխություն առաջացնելով համակարգիչների եւ մարդկանց միջեւ հաղորդակցության ոլորտում։\n\n---\n\nMga modelo ng wika ay naging mas sopistikado sa nagdaang mga taon, na nagbibigay-daan sa pagbuo ng mga natural at madaling basahing teksto, at nagpapakita ng mahusay na pagganap sa iba't ibang gawain tulad ng awtomatikong pagsasalin, pagsagot sa mga tanong, at pagbuo ng malikhain na teksto. Ang mga modelo na ito ay sinanay sa napakalaking mga dataset ng teksto at kayang hulihin ang istruktura at mga nuances ng natural na wika. Ang mga pagpapabuti sa mga modelo ng wika ay maaaring magdulot ng rebolusyon sa komunikasyon sa pagitan ng mga computer at tao, at inaasahan ang higit pang pag-unlad sa hinaharap.\n\n---\n\nNgaahi motuʻa lea kuo nau hoko ʻo fakaʻofoʻofa ange ʻi he ngaahi taʻu fakamuimui ni, ʻo fakafaingofuaʻi e fakatupu ʻo e ngaahi konga tohi ʻoku lelei mo fakanatula pea ʻoku nau fakahaaʻi ʻa e ngaahi ola lelei ʻi he ngaahi ngāue kehekehe ʻo hangē ko e liliu fakaʻētita, tali fehuʻi, mo e fakatupu ʻo e konga tohi fakaʻatamai. Ko e ako ʻa e ngaahi motuʻa ni ʻi he ngaahi seti ʻo e fakamatala tohi lahi pea ʻoku nau malava ʻo puke ʻa e fakafuofua mo e ngaahi meʻa iiki ʻo e lea fakanatula. ʻE lava ke fakatupu ʻe he ngaahi fakaleleiʻi ki he ngaahi motuʻa lea ha liliu lahi ʻi he fetu'utaki ʻi he vahaʻa ʻo e ngaahi komipiuta mo e kakai, pea ʻoku ʻamanaki ʻe toe fakalakalaka ange ia ʻi he kahaʻu.\n\n---\n\nDil modelleri son yıllarda daha da gelişti, akıcı ve doğal metinler üretmeyi mümkün kılıyor ve makine çevirisi, soru cevaplama ve yaratıcı metin oluşturma gibi çeşitli görevlerde mükemmel performans gösteriyor. Bu modeller, devasa metin veri setlerinde eğitilir ve doğal dilin yapısını ve nüanslarını yakalayabilir. Dil modellerindeki iyileştirmeler, bilgisayarlar ve insanlar arasındaki iletişimde devrim yaratabilir ve gelecekte daha da ilerleme bekleniyor.\n\n---\n\nMô hình ngôn ngữ đã trở nên tinh vi hơn trong những năm gần đây, cho phép tạo ra các văn bản trôi chảy và tự nhiên, đồng thời thể hiện hiệu suất xuất sắc trong các nhiệm vụ khác nhau như dịch máy, trả lời câu hỏi và tạo văn bản sáng tạo. Các mô hình này được huấn luyện trên các tập dữ liệu văn bản khổng lồ và có thể nắm bắt cấu trúc và sắc thái của ngôn ngữ tự nhiên. Những cải tiến trong mô hình ngôn ngữ có thể mang lại cuộc cách mạng trong giao tiếp giữa máy tính và con người, và người ta kỳ vọng sẽ có những tiến bộ hơn nữa trong tương lai.\n\n---\n\nLes modèles de langage sont devenus de plus en plus sophistiqués ces dernières années, permettant de générer des textes fluides et naturels, et de performer dans une variété de tâches telles que la traduction automatique, la réponse aux questions et la génération de texte créatif. Entraînés sur d'immenses ensembles de données textuelles, ces modèles sont capables de capturer la structure et les nuances du langage naturel, ouvrant la voie à une révolution dans la communication entre les ordinateurs et les humains.\n\n---\n\n近年来,语言模型变得越来越复杂,能够生成流畅自然的文本,并在机器翻译、问答和创意文本生成等各种任务中表现出色。这些模型在海量文本数据集上训练,可以捕捉自然语言的结构和细微差别。语言模型的改进有望彻底改变计算机和人类之间的交流方式,未来有望实现更大的突破。\n\n---\n\nIn den letzten Jahren sind Sprachmodelle immer ausgefeilter geworden und können flüssige, natürlich klingende Texte generieren und in verschiedenen Aufgaben wie maschineller Übersetzung, Beantwortung von Fragen und Generierung kreativer Texte hervorragende Leistungen erbringen. Diese Modelle werden auf riesigen Textdatensätzen trainiert und können die Struktur und Nuancen natürlicher Sprache erfassen, was zu einer Revolution in der Kommunikation zwischen Computern und Menschen führen könnte.\n\n---\n\nपिछले कुछ वर्षों में भाषा मॉडल बहुत अधिक परिष्कृत हो गए हैं, जो प्राकृतिक और प्रवाहमय पाठ उत्पन्न कर सकते हैं, और मशीन अनुवाद, प्रश्नोत्तर, और रचनात्मक पाठ उत्पादन जैसे विभिन्न कार्यों में उत्कृष्ट प्रदर्शन कर सकते हैं। ये मॉडल विशाल पाठ डेटासेट पर प्रशिक्षित होते हैं और प्राकृतिक भाषा की संरचना और बारीकियों को समझ सकते हैं। भाषा मॉडल में सुधार कंप्यूटर और मानव के बीच संवाद में क्रांति ला सकता है, और भविष्य में और प्रगति की उम्मीद है।\n\n---\n\n近年、言語モデルは非常に洗練され、自然で流暢なテキストを生成できるようになり、機械翻訳、質問応答、クリエイティブなテキスト生成など、様々なタスクで優れたパフォーマンスを発揮しています。これらのモデルは膨大なテキストデータセットで学習され、自然言語の構造とニュアンスを捉えることができます。言語モデルの改善により、コンピューターと人間のコミュニケーションに革命が起こる可能性があり、将来のさらなる進歩が期待されています。\n\"\"\".split(\n        \"---\"\n    ),\n)\n\n# Patch the OpenAI client to enable response_model\nclient = patch(AsyncOpenAI())\n\n\nclass GeneratedSummary(BaseModel):\n    detected_language: str = Field(\n        description=\"The language code of the original article. The summary must be generated in this same language.\",\n    )\n    summary: str\n\n\nasync def summarize_text(text: str):\n    response = await client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=GeneratedSummary,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Generate a concise summary in the language of the article. \",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Summarize the following text in a concise way:\\n{text}\",\n            },\n        ],\n    )  # type: ignore\n    return response.summary, text\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    async def main():\n        results = await asyncio.gather(*[summarize_text(doc) for doc in docs])\n        for summary, doc in results:\n            source_lang = detect(doc)\n            target_lang = detect(summary)\n            print(\n                f\"Source: {source_lang}, Summary: {target_lang}, Match: {source_lang == target_lang}\"\n            )\n\n    asyncio.run(main())\n    \"\"\"\n    Source: et, Summary: et, Match: True\n    Source: tl, Summary: tl, Match: True\n    Source: sw, Summary: sw, Match: True\n    Source: tr, Summary: tr, Match: True\n    Source: vi, Summary: vi, Match: True\n    Source: fr, Summary: fr, Match: True\n    Source: zh-cn, Summary: zh-cn, Match: True\n    Source: de, Summary: de, Match: True\n    Source: hi, Summary: hi, Match: True\n    Source: ja, Summary: ja, Match: True\n    \"\"\"\n```"
  },
  {
    "path": "docs/blog/posts/migrating-to-uv.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - UV\ncomments: true\ndate: 2024-12-26\ndescription: How we migrated from poetry to uv\ndraft: false\ntags:\n  - Migrations\n---\n\n## Why we migrated to uv\n\nWe recently migrated to uv from poetry because we wanted to benefit from it's many features such as\n\n- Easier dependency management with automatic caching built in\n- Significantly faster CI/CD compared to poetry, especially when we use the `caching` functionality provided by the Astral team\n- Cargo-style lockfile that makes it easier to adopt new PEP features as they come out\n\nWe took around 1-2 days to handle the migration and we're happy with the results. On average, for CI/CD, we've seen a huge speed up for our jobs.\n\nHere are some timings for jobs that I took from our CI/CD runs.\n\nIn general I'd say that we saw a ~3x speedup with approximately 67% reduction in time needed for the jobs once we implemented caching for the individual `uv` github actions.\n\n<!-- more -->\n\n| Job              | Time (Poetry)                                                                                 | Time (UV)                                                                                            |\n| ---------------- | --------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |\n| Ruff Formatting  | [1m16s](https://github.com/instructor-ai/instructor/actions/runs/12386936314)                 | [28s](https://github.com/instructor-ai/instructor/actions/runs/12501982235) (-63%)                   |\n| Type checking    | [3m3s](https://github.com/instructor-ai/instructor/actions/runs/12488572568)                  | [39s](https://github.com/instructor-ai/instructor/actions/runs/12501974285) (-79%)                   |\n| Test Python 3.9  | [1m21s](https://github.com/instructor-ai/instructor/actions/runs/12251767751/job/34177033359) | [32s](https://github.com/instructor-ai/instructor/actions/runs/12501974279/job/34880278051) (-61%)   |\n| Test Python 3.10 | [1m32s](https://github.com/instructor-ai/instructor/actions/runs/12251767751/job/34177033359) | [33s](https://github.com/instructor-ai/instructor/actions/runs/12501974279/job/34880278299) (-64%)   |\n| Test Python 3.11 | [3m19](https://github.com/instructor-ai/instructor/actions/runs/12251767751/job/34177034094)  | [2m48s](https://github.com/instructor-ai/instructor/actions/runs/12501974279/job/34880278480) (-16%) |\n\n- Note that for 3.11 I subtracted 1m12 from the time because we added ~60 more tests for gemini so to make it a fair comparison I subtracted the time it took to run the gemini tests.\n\nMost of our heavier jobs like the `Test Python` jobs are running multiple LLM calls in parallel and so the caching speedups of UV have some reduced benefit there.\n\n## How we migrated\n\nThe first thing we did was to use an automated tool to convert our poetry lockfile to a uv compatible lockfile. For this, I followed [this thread](https://x.com/tiangolo/status/1839686030007361803) by Sebastian Ramirez on how to do the conversions.\n\n**Step 1** : Use `uv` to run a `pdm` which will migrate your pyproject.toml and make sure to remove all of the `tool.poetry` sections. You can see the initial `pyproject.toml` [here](https://github.com/instructor-ai/instructor/blob/ad046fbca335b9133a704bed1900cda846caaf7c/pyproject.toml).\n\n```\nuvx pdm import pyproject.toml\n```\n\nNote that since you're using `uv`, make sure to also delete the `pdm` sections too and your optional groups\n\n```toml\n# dependency versions for extras\nfastapi = { version = \">=0.109.2,<0.116.0\", optional = true }\nredis = { version = \"^5.0.1\", optional = true }\ndiskcache = { version = \"^5.6.3\", optional = true }\n...\n\n\n[tool.poetry.extras]\nanthropic = [\"anthropic\", \"xmltodict\"]\ngroq = [\"groq\"]\ncohere = [\"cohere\"]\n...\n\n\n[tool.pdm.build]\nincludes = [\"instructor\"]\n[build-system]\nrequires = [\"pdm-backend\"]\nbuild-backend = \"pdm.backend\"\n```\n\n**Step 2** : Once you've done so, since you're no longer using `poetry`, you need to update the build system. If you just delete it, you'll end up using `setuptools` by default and that will throw an error if you've declared your license using `license = {text = \"MIT\"}`. So you need to add the following to your `pyproject.toml`.\n\nThis is documented in this UV issue [here](https://github.com/astral-sh/uv/issues/9513) which documents a bug with setuptools not being able to handle Metadata 2.4 keys and so you need to use `hatchling` as your build backend.\n\n```toml\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n```\n\n**Step 3** : Once you've done so, run uv sync to generate your `uv.lock` file to make sure you don't have any dependency issues.\n\n### New Commands to know\n\nNow that we migrated over from `poetry` to `uv`, there are a few new commands that you'll need to use.\n\n1. `uv sync --all-extras --group <dependency groups you'd like to install>`: This should install all the dependencies for the project using `uv`, make sure to install the specific dependencies that you'd like to install. If you're writing docs for instance, you would run `uv sync --all-extras --group docs`\n\n2. `uv run <command>` : This runs the specific command using the virtual environment you've created. When running our CI pipeline, we use this to ensure we're using the right environment for our commands.\n\n## Migrating Your Workflows\n\nWe had a few workflows that were using `poetry` and so we needed to update them to use `uv` instead. As seen below there are a few main changes you'll need to make to your relevant workflow\n\n```yaml\nname: Test\non:\n  pull_request:\n  push:\n    branches:\n      - main\n\njobs:\n  release:\n    runs-on: ubuntu-latest\n\n    strategy:\n      matrix:\n        python-version: [\"3.9\", \"3.10\", \"3.11\"]\n\n    steps:\n      - uses: actions/checkout@v2\n\n      - name: Set up Python\n        uses: actions/setup-python@v4\n        with:\n          python-version: ${{ matrix.python-version }} # (1)!\n\n      - name: Cache Poetry virtualenv\n        uses: actions/cache@v2\n        with:\n          path: ~/.cache/pypoetry/virtualenvs\n          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}\n          restore-keys: |\n            ${{ runner.os }}-poetry-\n\n      - name: Install Poetry\n        uses: snok/install-poetry@v1.3.1 # (2)!\n\n      - name: Install dependencies\n        run: poetry install --with dev,anthropic # (3)!\n\n      - name: Run tests\n        if: matrix.python-version != '3.11'\n        run: poetry run pytest tests/ -k 'not llm and not openai and not gemini and not anthropic and not cohere and not vertexai' && poetry run pytest tests/llm/test_cohere\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n          COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n\n      - name: Run Gemini Tests\n        run: poetry run pytest tests/llm/test_gemini # (4)!\n        env:\n          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n\n      - name: Generate coverage report\n        if: matrix.python-version == '3.11'\n        run: |\n          poetry run coverage run -m pytest tests/ -k \"not docs and not anthropic and not gemini and not cohere and not vertexai and not fireworks\"\n          poetry run coverage report\n          poetry run coverage html\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n```\n\n1.  We switched over to using `uv` to install python\n\n2.  We switch over to using astral's `astral-sh/setup-uv@v4` action to install `uv`\n\n3.  Using `uv sync` was significantly faster than poetry install and with the cache I imagine it was even faster\n\n4.  Instead of using `poetry run`, we use `uv run` which will start up the python virtual environment with the deps and then run the command you pass in.\n\nWe then modified the workflow to the following yml config\n\n```yaml\nname: Test\non:\n  pull_request:\n  push:\n    branches:\n      - main\n\njobs:\n  release:\n    runs-on: ubuntu-latest\n\n    strategy:\n      matrix:\n        python-version: [\"3.9\", \"3.10\", \"3.11\"]\n\n    steps:\n      - uses: actions/checkout@v2\n      - name: Install uv\n        uses: astral-sh/setup-uv@v4\n        with:\n          enable-cache: true # (1)!\n\n      - name: Set up Python\n        run: uv python install ${{ matrix.python-version }}\n\n      - name: Install the project\n        run: uv sync --all-extras\n      - name: Run tests\n        if: matrix.python-version != '3.11'\n        run: uv run pytest tests/ -k 'not llm and not openai and not gemini and not anthropic and not cohere and not vertexai' # (2)!\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n          COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}\n\n      - name: Run Gemini Tests\n        if: matrix.python-version == '3.11'\n        run: uv run pytest tests/llm/test_gemini\n        env:\n          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}\n\n      - name: Generate coverage report\n        if: matrix.python-version == '3.11'\n        run: |\n          uv run coverage run -m pytest tests/ -k \"not docs and not anthropic and not gemini and not cohere and not vertexai and not fireworks\"\n          uv run coverage report\n          uv run coverage html\n        env:\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}\n```\n\n1.  Don't forget to enable the cache so that your jobs are faster\n\n2.  Using `uv run` here is important because if you just run `pytest` it won't run the tests in your virtual environment causing them to fail.\n\nAnd that was basically it! Most of the migration work was really trying to figure out what was causing the tests to fail and then slowly fixing them. We were able to easily upgrade many of our existing dependencies and make sure that everything was working as expected.\n\nWe also just did our first release with uv and it was a success!\n\n## Conclusion\n\nWe're happy with the results and we're glad to have migrated to uv. It's been a smooth transition and we've been able to see a significant speedup in our CI/CD jobs. We're looking forward to continue using uv moving forward\n"
  },
  {
    "path": "docs/blog/posts/mkdocs-llmstxt-plugin-integration.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - Technical\n  - Documentation\ncomments: true\ndate: 2025-08-29\ndescription:\n  Deep dive into how we integrated the mkdocs-llmstxt plugin to automatically generate llms.txt files for better AI documentation consumption.\ndraft: false\nslug: mkdocs-llmstxt-plugin-integration\ntags:\n  - MkDocs\n  - Plugins\n  - Documentation\n  - AI\n  - Automation\n---\n\n# Automating llms.txt Generation with mkdocs-llmstxt Plugin\n\nToday we integrated the `mkdocs-llmstxt` plugin into Instructor's documentation pipeline. This powerful plugin automatically generates `llms.txt` files from our MkDocs documentation, making our comprehensive guides instantly accessible to AI language models.\n\n<!-- more -->\n\n## About the mkdocs-llmstxt Plugin\n\nThe [`mkdocs-llmstxt` plugin](https://github.com/pawamoy/mkdocs-llmstxt) by Timothée Mazzucotelli is a brilliant solution to a common problem: how do you keep an `llms.txt` file synchronized with your evolving documentation?\n\n### Key Features\n\n**Automatic Generation**: The plugin generates `llms.txt` files directly from your MkDocs source files during the build process. No manual maintenance required.\n\n**Flexible Section Control**: You can specify exactly which parts of your documentation to include:\n\n```yaml\nplugins:\n  - llmstxt:\n      sections:\n        Getting Started:\n          - index.md: Introduction to structured outputs\n          - installation.md: Setup instructions\n        Core Concepts:\n          - concepts/*.md\n```\n\n**Clean Markdown Conversion**: The plugin converts your documentation to clean, LLM-friendly markdown format, removing HTML artifacts and navigation elements.\n\n**Customizable Descriptions**: You can provide both short and long descriptions of your project, giving AI models the context they need.\n\n## Our Implementation\n\nHere's how we configured the plugin for Instructor:\n\n```yaml\nplugins:\n  - llmstxt:\n      markdown_description: >\n        Instructor is a Python library that makes it easy to work with structured outputs \n        from large language models (LLMs). Built on top of Pydantic, it provides a simple, \n        type-safe way to extract structured data from LLM responses across multiple providers \n        including OpenAI, Anthropic, Google, and many others.\n      sections:\n        Getting Started:\n          - index.md: Introduction to structured outputs with LLMs\n          - getting-started.md: Quick start guide\n          - installation.md: Installation instructions\n        Core Concepts:\n          - concepts/*.md\n        Integrations:\n          - integrations/*.md\n```\n\n### Why These Sections?\n\nWe carefully selected these sections because they provide AI models with the essential information needed to understand and use Instructor:\n\n- **Getting Started**: Core concepts and installation\n- **Core Concepts**: Deep dive into features like validation, streaming, and patterns\n- **Integrations**: Provider-specific guidance for OpenAI, Anthropic, Google, and others\n\n## Technical Benefits\n\n### Build Integration\n\nThe plugin seamlessly integrates into our existing MkDocs build pipeline. Every time we deploy documentation updates, the `llms.txt` file is automatically regenerated with the latest content.\n\n### Content Freshness\n\nUnlike manually maintained `llms.txt` files, our generated version is always up-to-date. When we add new integration guides or update existing concepts, the changes are automatically reflected.\n\n### Glob Pattern Support\n\nThe plugin supports glob patterns like `concepts/*.md`, making it easy to include entire directories without manually listing each file.\n\n## Plugin Architecture\n\nThe `mkdocs-llmstxt` plugin works by:\n\n1. **Parsing Configuration**: Reading your `sections` configuration during the MkDocs build\n2. **File Processing**: Converting specified markdown files to clean, LLM-friendly format\n3. **Content Assembly**: Combining sections with metadata into the standard llms.txt format\n4. **Output Generation**: Writing the final `llms.txt` file to your site root\n\n## Installation and Setup\n\nAdding the plugin to your own MkDocs project is straightforward:\n\n```bash\npip install mkdocs-llmstxt\n```\n\nThen add it to your `mkdocs.yml`:\n\n```yaml\nsite_url: https://your-site.com/  # Required for the plugin\n\nplugins:\n  - llmstxt:\n      markdown_description: Description of your project\n      sections:\n        Documentation:\n          - docs/*.md\n```\n\n## Resources\n\n- [mkdocs-llmstxt Plugin](https://github.com/pawamoy/mkdocs-llmstxt)\n- [llms.txt Specification](https://github.com/AnswerDotAI/llms-txt)\n- [Instructor Documentation](https://python.useinstructor.com/)\n\nSpecial thanks to Timothée Mazzucotelli for creating this excellent plugin!\n"
  },
  {
    "path": "docs/blog/posts/multimodal-gemini.md",
    "content": "---\nauthors:\n  - ivanleomk\ncategories:\n  - Gemini\n  - Multimodal\ncomments: true\ndate: 2024-10-23\ndescription: Learn how to use Google's Gemini model for multimodal structured extraction of YouTube videos, extracting structured recommendations for tourist destinations.\ndraft: false\ntags:\n  - Gemini\n  - Multimodal AI\n  - Travel Recommendations\n  - Pydantic\n  - Python\n---\n\n# Structured Outputs with Multimodal Gemini\n\nIn this post, we'll explore how to use Google's Gemini model with Instructor to analyze [travel videos](https://www.youtube.com/watch?v=_R8yhW_H9NQ) and extract structured recommendations. This powerful combination allows us to process multimodal inputs (video) and generate structured outputs using Pydantic models. This post was done in collaboration with [Kino.ai](https://kino.ai), a company that uses instructor to do structured extraction from multimodal inputs to improve search for film makers.\n\n## Setting Up the Environment\n\nFirst, let's set up our environment with the necessary libraries:\n\n```python\n```\n\n<!-- more -->\n\n## Defining Our Data Models\n\nWe'll use Pydantic to define our data models for tourist destinations and recommendations:\n\n```python\nclass TouristDestination(BaseModel):\n    name: str\n    description: str\n    location: str\n\n\nclass Recommendations(BaseModel):\n    chain_of_thought: str\n    description: str\n    destinations: list[TouristDestination]\n```\n\n## Initializing the Gemini Client\n\nNext, we'll set up our Gemini client using Instructor:\n\n```python\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n)\n```\n\n## Uploading and Processing the Video\n\nTo analyze a video, we first need to upload it:\n\n```python\nfile = genai.upload_file(\"./takayama.mp4\")\n```\n\nThen, we can process the video and extract recommendations:\n\n```python\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\"What places do they recommend in this video?\", file],\n        }\n    ],\n    response_model=Recommendations,\n)\n\nprint(resp)\n```\n\n??? note \"Expand to see Raw Results\"\n\n    ```python\n    Recomendations(\n        chain_of_thought='The video recommends visiting Takayama city, in the Hida Region, Gifu Prefecture. The\n    video suggests visiting the Miyagawa Morning Market, to try the Sarubobo good luck charms, and to enjoy the\n    cookie cup espresso, made by Koma Coffee. Then, the video suggests visiting a traditional Japanese Cafe,\n    called Kissako Katsure, and try their matcha and sweets. Afterwards, the video suggests to visit the Sanmachi\n    Historic District, where you can find local crafts and delicious foods. The video recommends trying Hida Wagyu\n    beef, at the Kin no Kotte Ushi shop, or to have a sit-down meal at the Kitchen Hida. Finally, the video\n    recommends visiting Shirakawa-go, a World Heritage Site in Gifu Prefecture.',\n        description='This video recommends a number of places to visit in Takayama city, in the Hida Region, Gifu\n    Prefecture. It shows some of the local street food and highlights some of the unique shops and restaurants in\n    the area.',\n        destinations=[\n            TouristDestination(\n                name='Takayama',\n                description='Takayama is a city at the base of the Japan Alps, located in the Hida Region of\n    Gifu.',\n                location='Hida Region, Gifu Prefecture'\n            ),\n            TouristDestination(\n                name='Miyagawa Morning Market',\n                description=\"The Miyagawa Morning Market, or the Miyagawa Asai-chi in Japanese, is a market that\n    has existed officially since the Edo Period, more than 100 years ago. It's open every single day, rain or\n    shine, from 7am to noon.\",\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Nakaya - Handmade Hida Sarubobo',\n                description='The Nakaya shop sells handcrafted Sarubobo good luck charms.',\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Koma Coffee',\n                description=\"Koma Coffee is a shop that has been in business for about 50 or 60 years, and they\n    serve coffee in a cookie cup. They've been serving coffee for about 10 years.\",\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Kissako Katsure',\n                description='Kissako Katsure is a traditional Japanese style cafe, called Kissako, and the name\n    means would you like to have some tea. They have a variety of teas and sweets.',\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Sanmachi Historic District',\n                description='Sanmachi Dori is a Historic Merchant District in Takayama, all of the buildings here\n    have been preserved to look as they did in the Edo Period.',\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Suwa Orchard',\n                description='The Suwa Orchard has been in business for more than 50 years.',\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Kitchen HIDA',\n                description='Kitchen HIDA is a restaurant with a 50 year history, known for their Hida Beef dishes\n    and for using a lot of local ingredients.',\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Kin no Kotte Ushi',\n                description='Kin no Kotte Ushi is a shop known for selling Beef Sushi, especially Hida Wagyu Beef\n    Sushi. Their sushi is medium rare.',\n                location='Hida Takayama'\n            ),\n            TouristDestination(\n                name='Shirakawa-go',\n                description='Shirakawa-go is a World Heritage Site in Gifu Prefecture.',\n                location='Gifu Prefecture'\n            )\n        ]\n    )\n    ```\n\nThe Gemini model analyzes the video and provides structured recommendations. Here's a summary of the extracted information:\n\n1. **Takayama City**: The main destination, located in the Hida Region of Gifu Prefecture.\n2. **Miyagawa Morning Market**: A historic market open daily from 7am to noon.\n3. **Nakaya Shop**: Sells handcrafted Sarubobo good luck charms.\n4. **Koma Coffee**: A 50-60 year old shop famous for serving coffee in cookie cups.\n5. **Kissako Katsure**: A traditional Japanese cafe offering various teas and sweets.\n6. **Sanmachi Historic District**: A preserved merchant district from the Edo Period.\n7. **Suwa Orchard**: A 50+ year old orchard business.\n8. **Kitchen HIDA**: A restaurant with a 50-year history, known for Hida Beef dishes.\n9. **Kin no Kotte Ushi**: A shop specializing in Hida Wagyu Beef Sushi.\n10. **Shirakawa-go**: A World Heritage Site in Gifu Prefecture.\n\n## Limitations, Challenges, and Future Directions\n\nWhile the current approach demonstrates the power of multimodal AI for video analysis, there are several limitations and challenges to consider:\n\n1. **Lack of Temporal Information**: Our current method extracts overall recommendations but doesn't provide timestamps for specific mentions. This limits the ability to link recommendations to exact moments in the video.\n\n2. **Speaker Diarization**: The model doesn't distinguish between different speakers in the video. Implementing speaker diarization could provide valuable context about who is making specific recommendations.\n\n3. **Content Density**: Longer or more complex videos might overwhelm the model, potentially leading to missed information or less accurate extractions.\n\n### Future Explorations\n\nTo address these limitations and expand the capabilities of our video analysis system, here are some promising areas to explore:\n\n1. **Timestamp Extraction**: Enhance the model to provide timestamps for each recommendation or point of interest mentioned in the video. This could be achieved by:\n\n   ```python\n   class TimestampedRecommendation(BaseModel):\n       timestamp: str\n       timestamp_format: Literal[\"HH:MM\", \"HH:MM:SS\"]  # Helps with parsing\n       recommendation: str\n\n\n   class EnhancedRecommendations(BaseModel):\n       destinations: list[TouristDestination]\n       timestamped_mentions: list[TimestampedRecommendation]\n   ```\n\n2. **Speaker Diarization**: Implement speaker recognition to attribute recommendations to specific individuals. This could be particularly useful for videos featuring multiple hosts or interviewees.\n\n3. **Segment-based Analysis**: Process longer videos in segments to maintain accuracy and capture all relevant information. This approach could involve:\n\n   - Splitting the video into smaller chunks\n   - Analyzing each chunk separately\n   - Aggregating and deduplicating results\n\n4. **Multi-language Support**: Extend the model's capabilities to accurately analyze videos in various languages and capture culturally specific recommendations.\n\n5. **Visual Element Analysis**: Enhance the model to recognize and describe visual elements like landmarks, food dishes, or activities shown in the video, even if not explicitly mentioned in the audio.\n\n6. **Sentiment Analysis**: Incorporate sentiment analysis to gauge the speaker's enthusiasm or reservations about specific recommendations.\n\nBy addressing these challenges and exploring these new directions, we can create a more comprehensive and nuanced video analysis system, opening up even more possibilities for applications in travel, education, and beyond.\n\n## Related Documentation\n- [Multimodal Concepts](../../concepts/multimodal.md) - Working with images, video, and audio\n- [Google Integration](../../integrations/google.md) - Complete Gemini setup guide\n\n## See Also\n- [OpenAI Multimodal](openai-multimodal.md) - Compare multimodal approaches\n- [Anthropic Structured Output](structured-output-anthropic.md) - Alternative provider\n- [Chat with PDFs using Gemini](chat-with-your-pdf-with-gemini.md) - Practical PDF processing\n"
  },
  {
    "path": "docs/blog/posts/native_caching.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Performance Optimization\n- Cost Reduction\n- API Efficiency\n- Python Development\ncomments: true\ndate: 2025-01-08\ndescription: Instructor v1.9.1 introduces native caching support for all providers. Learn how to drastically reduce API costs and improve response times with built-in cache adapters.\ndraft: false\nslug: native-caching-v1-9-1\ntags:\n- Python\n- Caching\n- Performance Optimization\n- API Cost Optimization\n- LLM Applications\n- Production Scaling\n- from_provider\n---\n\n# Native Caching in Instructor v1.9.1: Zero-Configuration Performance Boost\n\n> **New in v1.9.1**: Instructor now ships with built-in caching support for all providers. Simply pass a cache adapter when creating your client to dramatically reduce API costs and improve response times.\n\nStarting with Instructor v1.9.1, we've introduced native caching support that makes optimization effortless. Instead of implementing complex caching decorators or wrapper functions, you can now pass a cache adapter directly to `from_provider()` and automatically cache all your structured LLM calls.\n\n## The Game Changer: Built-in Caching\n\nBefore v1.9.1, caching required custom decorators and manual implementation. Now, it's as simple as:\n\n```python\nfrom instructor import from_provider\nfrom instructor.cache import AutoCache\n\n# Works with any provider - caching flows through automatically\nclient = from_provider(\"openai/gpt-4o\", cache=AutoCache(maxsize=1000))\n\n# Your normal calls are now cached automatically\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nfirst = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 25\"}], response_model=User\n)\n\nsecond = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 25\"}], response_model=User\n)\n\n# second call was served from cache - same result, zero cost!\nassert first.name == second.name\n```\n\n## Universal Provider Support\n\nThe beauty of native caching is that it works with **every provider** through the same simple API:\n\n```python\nfrom instructor.cache import AutoCache, DiskCache\n\n# Works with OpenAI\nopenai_client = from_provider(\"openai/gpt-5-nano\", cache=AutoCache())\n\n# Works with Anthropic\nanthropic_client = from_provider(\"anthropic/claude-3-haiku\", cache=AutoCache())\n\n# Works with Google\ngoogle_client = from_provider(\"google/gemini-pro\", cache=DiskCache())\n\n# Works with any provider in the ecosystem\ngroq_client = from_provider(\"groq/llama-3.1-8b\", cache=AutoCache())\n```\n\nNo provider-specific configuration needed. The cache parameter flows through `**kwargs` to all underlying implementations automatically.\n\n## Built-in Cache Adapters\n\nInstructor v1.9.1 ships with two production-ready cache implementations:\n\n### 1. AutoCache - In-Process LRU Cache\n\nPerfect for single-process applications and development:\n\n```python\nfrom instructor.cache import AutoCache\n\n# Thread-safe in-memory cache with LRU eviction\ncache = AutoCache(maxsize=1000)\nclient = from_provider(\"openai/gpt-4o\", cache=cache)\n```\n\n**When to use**:\n- Development and testing\n- Single-process applications\n- When you need maximum speed (200,000x+ faster cache hits)\n- Applications where cache persistence isn't required\n\n### 2. DiskCache - Persistent Storage\n\nIdeal when you need cache persistence across sessions:\n\n```python\nfrom instructor.cache import DiskCache\n\n# Persistent disk-based cache\ncache = DiskCache(directory=\".instructor_cache\")\nclient = from_provider(\"anthropic/claude-3-sonnet\", cache=cache)\n```\n\n**When to use**:\n- Applications that restart frequently\n- Development workflows where you want to preserve cache between sessions\n- When working with expensive or time-intensive API calls\n- Local applications with moderate performance requirements\n\n## Smart Cache Key Generation\n\nInstructor automatically generates intelligent cache keys that include:\n\n- **Provider/model name** - Different models get different cache entries\n- **Complete message history** - Full conversation context is hashed\n- **Response model schema** - Any changes to your Pydantic model automatically bust the cache\n- **Mode configuration** - JSON vs Tools mode changes are tracked\n\nThis means when you update your Pydantic model (adding fields, changing descriptions, etc.), the cache automatically invalidates old entries - no stale data!\n\n```python\nfrom instructor.cache import make_cache_key\n\n# Generate deterministic cache key\nkey = make_cache_key(\n    messages=[{\"role\": \"user\", \"content\": \"hello\"}],\n    model=\"gpt-3.5-turbo\",\n    response_model=User,\n    mode=\"TOOLS\",\n)\nprint(key)  # SHA-256 hash: 9b8f5e2c8c9e...\n```\n\n## Custom Cache Implementations\n\nWant Redis, Memcached, or a custom backend? Simply inherit from `BaseCache`:\n\n```python\nfrom instructor.cache import BaseCache\nimport redis\n\n\nclass RedisCache(BaseCache):\n    def __init__(self, host=\"localhost\", port=6379, **kwargs):\n        self.redis = redis.Redis(host=host, port=port, **kwargs)\n\n    def get(self, key: str):\n        value = self.redis.get(key)\n        return value.decode() if value else None\n\n    def set(self, key: str, value, ttl: int | None = None):\n        if ttl:\n            self.redis.setex(key, ttl, value)\n        else:\n            self.redis.set(key, value)\n\n\n# Use your custom cache\nredis_cache = RedisCache(host=\"my-redis-server\")\nclient = from_provider(\"openai/gpt-4o\", cache=redis_cache)\n```\n\nThe `BaseCache` interface is intentionally minimal - just implement `get()` and `set()` methods and you're ready to go.\n\n## Time-to-Live (TTL) Support\n\nControl cache expiration with per-call TTL overrides:\n\n```python\n# Cache this result for 1 hour\nresult = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Generate daily report\"}],\n    response_model=Report,\n    cache_ttl=3600,  # 1 hour in seconds\n)\n```\n\nTTL support depends on your cache backend:\n- **AutoCache**: TTL is ignored (no expiration)\n- **DiskCache**: Full TTL support with automatic expiration\n- **Custom backends**: Implement TTL handling in your `set()` method\n\n## Migration from Manual Caching\n\nIf you were using custom caching decorators, migrating is straightforward:\n\n**Before v1.9.1**:\n```python\n@functools.cache\ndef extract_user(text: str) -> User:\n    return client.create(\n        messages=[{\"role\": \"user\", \"content\": text}], response_model=User\n    )\n```\n\n**With v1.9.1**:\n```python\n# Remove decorator, add cache to client\nclient = from_provider(\"openai/gpt-4o\", cache=AutoCache())\n\n\ndef extract_user(text: str) -> User:\n    return client.create(\n        messages=[{\"role\": \"user\", \"content\": text}], response_model=User\n    )\n```\n\nNo more function-level caching logic - just create your client with caching enabled and all calls benefit automatically.\n\n## Real-World Performance Impact\n\nNative caching delivers the same dramatic performance improvements you'd expect:\n\n- **AutoCache**: 200,000x+ speed improvement for cache hits\n- **DiskCache**: 5-10x improvement with persistence benefits\n- **Cost Reduction**: 50-90% API cost savings depending on cache hit rate\n\nFor a comprehensive deep-dive into caching strategies and performance analysis, check out our [complete caching guide](caching.md).\n\n## Getting Started\n\nReady to enable native caching? Here's your quick start:\n\n1. **Upgrade to v1.9.1+**:\n   ```bash\n   pip install \"instructor>=1.9.1\"\n   ```\n\n2. **Choose your cache backend**:\n   ```python\n   from instructor.cache import AutoCache, DiskCache\n   \n   # For development/single-process\n   cache = AutoCache(maxsize=1000)\n   \n   # For persistence\n   cache = DiskCache(directory=\".cache\")\n   ```\n\n3. **Add cache to your client**:\n   ```python\n   from instructor import from_provider\n   \n   client = from_provider(\"your/favorite/model\", cache=cache)\n   ```\n\n4. **Use normally - caching happens automatically**:\n   ```python\n   result = client.create(\n       messages=[{\"role\": \"user\", \"content\": \"your prompt\"}], response_model=YourModel\n   )\n   ```\n\n## Learn More\n\nFor detailed information about cache design, custom implementations, and advanced patterns, visit our [Caching Concepts](../../concepts/caching.md) documentation.\n\nThe native caching feature represents our commitment to making high-performance LLM applications simple and accessible. No more complex caching logic - just fast, cost-effective structured outputs out of the box.\n\n---\n\n*Have questions about native caching or want to share your use case? Join the discussion in our [GitHub repository](https://github.com/jxnl/instructor) or check out the [complete documentation](../../concepts/caching.md).*"
  },
  {
    "path": "docs/blog/posts/open_source.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- API Development\ncomments: true\ndate: 2024-03-07\ndescription: Discover how Instructor integrates with OpenAI and local LLMs for structured\n  outputs using Pydantic and JSON schema.\ndraft: false\nslug: open-source-local-structured-output-pydantic-json-openai\ntags:\n- OpenAI\n- Pydantic\n- LLMs\n- Structured Outputs\n- API Integration\n---\n\n# Structured Output for Open Source and Local LLMs\n\nInstructor has expanded its capabilities for language models. It started with API interactions via the OpenAI SDK, using [Pydantic](https://pydantic-docs.helpmanual.io/) for structured data validation. Now, Instructor supports multiple models and platforms.\n\nThe integration of [JSON mode](../../concepts/patching.md#json-mode) improved adaptability to vision models and open source alternatives. This allows support for models from [GPT](https://openai.com/api/) and [Mistral](https://mistral.ai) to models on [Ollama](https://ollama.ai) and [Hugging Face](https://huggingface.co/models), using [llama-cpp-python](../../integrations/llama-cpp-python.md).\n\nInstructor now works with cloud-based APIs and local models for structured data extraction. Developers can refer to our guide on [Patching](../../concepts/patching.md) for information on using JSON mode with different models.\n\nFor learning about Instructor and Pydantic, we offer a course on [Steering language models towards structured outputs](https://www.wandb.courses/courses/steering-language-models).\n\nThe following sections show examples of Instructor's integration with platforms and local setups for structured outputs in AI projects.\n\n<!-- more -->\n\n\n## Exploring Different OpenAI Clients with Instructor\n\nOpenAI clients offer functionalities for different needs. We explore clients integrated with Instructor, providing structured outputs and capabilities. Examples show how to initialize and patch each client.\n\n## Local Models\n\n### Ollama: A New Frontier for Local Models\n\nOllama enables structured outputs with local models using JSON schema. See our [Ollama documentation](../../integrations/ollama.md) for details.\n\nFor setup and features, refer to the documentation. The [Ollama website](https://ollama.ai/download) provides resources, models, and support.\n\n```\nollama run llama2\n```\n\n```python\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport instructor\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n# enables `response_model` in create call\nclient = instructor.from_openai(\n    OpenAI(\n        base_url=\"http://localhost:11434/v1\",\n        api_key=\"ollama\",  # required, but unused\n    ),\n    mode=instructor.Mode.JSON,\n)\n\n\nuser = client.create(\n    model=\"llama2\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Jason is 30 years old\",\n        }\n    ],\n    response_model=UserDetail,\n)\n\nprint(user)\n#> name='Jason' age=30\n```\n\n### llama-cpp-python\n\nllama-cpp-python provides the `llama-cpp` model for structured outputs using JSON schema. It uses [constrained sampling](https://llama-cpp-python.readthedocs.io/en/latest/#json-schema-mode) and [speculative decoding](https://llama-cpp-python.readthedocs.io/en/latest/#speculative-decoding). An [OpenAI compatible client](https://llama-cpp-python.readthedocs.io/en/latest/#openai-compatible-web-server) allows in-process structured output without network dependency.\n\nExample of using llama-cpp-python for structured outputs:\n\n\n```python\nimport llama_cpp\nimport instructor\nfrom llama_cpp.llama_speculative import LlamaPromptLookupDecoding\nfrom pydantic import BaseModel\n\n\nllama = llama_cpp.Llama(\n    model_path=\"../../models/OpenHermes-2.5-Mistral-7B-GGUF/openhermes-2.5-mistral-7b.Q4_K_M.gguf\",\n    n_gpu_layers=-1,\n    chat_format=\"chatml\",\n    n_ctx=2048,\n    draft_model=LlamaPromptLookupDecoding(num_pred_tokens=2),\n    logits_all=True,\n    verbose=False,\n)\n\n\ncreate = instructor.patch(\n    create=llama.create_chat_completion_openai_v1,\n    mode=instructor.Mode.JSON_SCHEMA,\n)\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\nuser = create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract `Jason is 30 years old`\",\n        }\n    ],\n    response_model=UserDetail,\n)\n\nprint(user)\n#> name='Jason' age=30\n```\n\n## Alternative Providers\n\n### Groq\n\nGroq's platform, detailed further in our [Groq documentation](../../integrations/groq.md) and on [Groq's official documentation](https://groq.com/), offers a unique approach to processing with its tensor architecture. This innovation significantly enhances the performance of structured output processing.\n\n```bash\nexport GROQ_API_KEY=\"your-api-key\"\n```\n\n```python\nimport os\nfrom pydantic import BaseModel\n\nimport groq\nimport instructor\n\n\nclient = groq.Groq(\n    api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\n# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.create methods\n# to support the response_model parameter\nclient = instructor.from_openai(client, mode=instructor.Mode.MD_JSON)\n\n\n# Now, we can use the response_model parameter using only a base model\n# rather than having to use the OpenAISchema class\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.create(\n    model=\"mixtral-8x7b-32768\",\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nassert isinstance(user, UserExtract), \"Should be instance of UserExtract\"\n\nprint(user)\n#> name='jason' age=25\n```\n\n### Together AI\n\nTogether AI, when combined with Instructor, offers a seamless experience for developers looking to leverage structured outputs in their applications. For more details, refer to our [Together AI documentation](../../integrations/together.md) and explore the [patching guide](../../concepts/patching.md) to enhance your applications.\n\n```bash\nexport TOGETHER_API_KEY=\"your-api-key\"\n```\n\n```python\nimport os\nfrom pydantic import BaseModel\n\nimport instructor\nimport openai\n\n\nclient = openai.OpenAI(\n    base_url=\"https://api.together.xyz/v1\",\n    api_key=os.environ[\"TOGETHER_API_KEY\"],\n)\n\nclient = instructor.from_openai(client, mode=instructor.Mode.TOOLS)\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.create(\n    model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nassert isinstance(user, UserExtract), \"Should be instance of UserExtract\"\n\nprint(user)\n#> name='jason' age=25\n```\n\n### Mistral\n\nFor those interested in exploring the capabilities of Mistral Large with Instructor, we highly recommend checking out our comprehensive guide on [Mistral Large](../../integrations/mistral.md).\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom mistralai.client import MistralClient\n\n\nclient = MistralClient()\n\npatched_chat = instructor.from_openai(\n    create=client.chat, mode=instructor.Mode.TOOLS\n)\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\nresp = patched_chat(\n    model=\"mistral-large-latest\",\n    response_model=UserDetails,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f'Extract the following entities: \"Jason is 20\"',\n        },\n    ],\n)\n\nprint(resp)\n#> name='Jason' age=20\n```\n"
  },
  {
    "path": "docs/blog/posts/openai-distilation-store.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- OpenAI\ncomments: true\ndate: 2024-10-02\ndescription: Learn how to use OpenAI's API Model Distillation with Instructor to create\n  efficient, tailored models for your applications.\ndraft: false\ntags:\n- OpenAI\n- API Model Distillation\n- Instructor\n- Machine Learning\n- Data Processing\n---\n\n# OpenAI API Model Distillation with Instructor\n\nOpenAI has recently introduced a new feature called [API Model Distillation](https://openai.com/index/api-model-distillation/), which allows developers to create custom models tailored to their specific use cases. This feature is particularly powerful when combined with Instructor's structured output capabilities. In this post, we'll explore how to leverage API Model Distillation with Instructor to create more efficient and specialized models.\n\n<!-- more -->\n\n## What is API Model Distillation?\n\nAPI Model Distillation is a process that allows you to create a smaller, more focused model based on the inputs and outputs of a larger model. This distilled model can be more efficient and cost-effective for specific tasks while maintaining high performance.\n\n## Using Instructor with API Model Distillation\n\nInstructor's integration with OpenAI's API makes it seamless to use API Model Distillation. Here's how you can get started, make sure you have the latest version of OpenAI!\n\n```\npip install -U openai\n```\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Enable response_model and API Model Distillation\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n    def introduce(self):\n        return f\"Hello, I'm {self.name} and I'm {self.age} years old\"\n\n\n# Use the store parameter to enable API Model Distillation\nuser: UserDetail = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"},\n    ],\n    store=True,  # Enable API Model Distillation\n)\n```\n\nIn this example, we've added the `store=True` parameter to the `chat.completions.create` method. This enables API Model Distillation for this specific call.\n\n## Metadata and Proxy Kwargs\n\nOne of the great advantages of using Instructor with API Model Distillation is that it automatically handles metadata and proxies kwargs to the underlying OpenAI API. This means you can use additional parameters supported by the [OpenAI API](https://platform.openai.com/docs/api-reference) without any extra configuration.\n\nFor example, you can add metadata to your API calls:\n\n```python\nuser: UserDetail = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"},\n    ],\n    store=True,\n    metadata={\"task\": \"user_extraction\", \"source\": \"customer_support_chat\"},\n)\n```\n\nThe `metadata` parameter will be automatically passed to the OpenAI API, allowing you to track and organize your API calls for distillation purposes.\n\n\n## Completions Dashboard\n\nTo better understand how API Model Distillation works with Instructor, let's take a look at the following diagram:\n\n![API Model Distillation with Instructor](./img/distil_openai.png)\n\nThis image illustrates the process of API Model Distillation when using Instructor with OpenAI's API. It shows how the structured output from Instructor, combined with metadata and other parameters, feeds into the distillation process to create a specialized model tailored to your specific use case.\n\nThe diagram highlights:\n\n1. The initial request with structured output using Instructor\n2. The inclusion of metadata and additional parameters\n3. The distillation process that creates a specialized model\n4. The resulting distilled model that can be used for faster, more efficient responses\n\nThis visual representation helps to clarify the flow and benefits of using API Model Distillation in conjunction with Instructor's capabilities.\n\n\n## Benefits of Using Instructor with API Model Distillation\n\n1. **Structured Output**: Instructor's use of [Pydantic](https://docs.pydantic.dev/) models ensures that your distilled model produces structured, validated output.\n2. **Simplified Integration**: The proxy kwargs feature means you can use all OpenAI API parameters without additional configuration.\n3. **Improved Efficiency**: By distilling models for specific tasks, you can reduce latency and costs for your applications.\n4. **Consistency**: Distilled models can provide more consistent outputs for specialized tasks.\n\n## Conclusion\n\nAPI Model Distillation with Instructor's structured output creates efficient, specialized models. Instructor's integration with OpenAI's API allows you to incorporate this feature into workflows, improving performance and cost-effectiveness of AI applications.\n\nRemember to check [OpenAI's documentation](https://platform.openai.com/docs) for the latest information on API Model Distillation and best practices for creating and using distilled models.\n\nFor more information on using Instructor, visit the [Instructor GitHub repository](https://github.com/jxnl/instructor) and give it a star if you find it helpful!"
  },
  {
    "path": "docs/blog/posts/openai-multimodal.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - OpenAI\n  - Audio\ncomments: true\ndate: 2024-10-17\ndescription: Explore the new audio capabilities in OpenAI's Chat Completions API using the gpt-4o-audio-preview model.\ndraft: false\ntags:\n  - OpenAI\n  - Audio Processing\n  - API\n  - Machine Learning\n---\n\n# Audio Support in OpenAI's Chat Completions API\n\nOpenAI has recently introduced audio support in their Chat Completions API, opening up exciting new possibilities for developers working with audio and text interactions. This feature is powered by the new `gpt-4o-audio-preview` model, which brings advanced voice capabilities to the familiar Chat Completions API interface.\n\n<!-- more -->\n\n## Key Features\n\nThe new audio support in the Chat Completions API offers several compelling features:\n\n1. **Flexible Input Handling**: The API can now process any combination of text and audio inputs, allowing for more versatile applications.\n\n2. **Natural, Steerable Voices**: Similar to the Realtime API, developers can use prompting to shape various aspects of the generated audio, including language, pronunciation, and emotional range.\n\n3. **Tool Calling Integration**: The audio support seamlessly integrates with existing tool calling functionality, enabling complex workflows that combine audio, text, and external tools.\n\n## Practical Example\n\nTo demonstrate how to use this new functionality, let's look at a simple example using the `instructor` library:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor.processing.multimodal import Audio\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    model=\"gpt-4o-audio-preview\",\n    response_model=Person,\n    modalities=[\"text\"],\n    audio={\"voice\": \"alloy\", \"format\": \"wav\"},\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract the following information from the audio\",\n                Audio.from_path(\"./output.wav\"),\n            ],\n        },\n    ],\n)\n\nprint(resp)\n# Expected output: Person(name='Jason', age=20)\n```\n\nIn this example, we're using the `gpt-4o-audio-preview` model to extract information from an audio file. The API processes the audio input and returns structured data (a Person object with name and age) based on the content of the audio.\n\n## Use Cases\n\nThe addition of audio support to the Chat Completions API enables a wide range of applications:\n\n1. **Voice-based Personal Assistants**: Create more natural and context-aware voice interfaces for various applications.\n\n2. **Audio Content Analysis**: Automatically extract information, sentiments, or key points from audio recordings or podcasts.\n\n3. **Language Learning Tools**: Develop interactive language learning applications that can process and respond to spoken language.\n\n4. **Accessibility Features**: Improve accessibility in applications by providing audio-based interactions and text-to-speech capabilities.\n\n## Considerations\n\nWhile this new feature is exciting, it's important to note that it's best suited for asynchronous use cases that don't require extremely low latencies. For more dynamic and real-time interactions, OpenAI recommends using their Realtime API.\n\nAs with any AI-powered feature, it's crucial to consider ethical implications and potential biases in audio processing and generation. Always test thoroughly and consider the diversity of your user base when implementing these features.\n\n## Related Documentation\n- [Multimodal Guide](../../concepts/multimodal.md) - Comprehensive multimodal reference\n- [OpenAI Integration](../../integrations/openai.md) - Full OpenAI setup\n\n## See Also\n- [Gemini Multimodal](multimodal-gemini.md) - Alternative multimodal approach\n- [Prompt Caching](anthropic-prompt-caching.md) - Cache large audio files\n- [Monitoring with Logfire](logfire.md) - Track multimodal processing\n"
  },
  {
    "path": "docs/blog/posts/pairwise-llm-judge.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - LLM\n  - Pydantic\ncomments: true\ndate: 2024-10-17\ndescription: Explore how to use Instructor and Pydantic to create a pairwise LLM judge for evaluating text relevance.\ndraft: false\ntags:\n  - LLM\n  - Pydantic\n  - Instructor\n  - Text Relevance\n  - AI Evaluation\n---\n\n# Building a Pairwise LLM Judge with Instructor and Pydantic\n\nIn this blog post, we'll explore how to create a pairwise LLM judge using Instructor and Pydantic. This judge will evaluate the relevance between a question and a piece of text, demonstrating a practical application of structured outputs in language model interactions.\n\n## Introduction\n\nEvaluating text relevance is a common task in natural language processing and information retrieval. By leveraging large language models (LLMs) and structured outputs, we can create a system that judges the similarity or relevance between a question and a given text.\n\n<!-- more -->\n\n## Setting Up the Environment\n\nFirst, let's set up our environment with the necessary imports:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n```\n\nHere, we're using the `instructor` library, which integrates seamlessly with OpenAI's API and Pydantic for structured outputs.\n\n## Defining the Judgment Model\n\nWe'll use Pydantic to define a `Judgment` model that structures the output of our LLM:\n\n```python\nclass Judgment(BaseModel):\n    thought: str = Field(\n        description=\"The step-by-step reasoning process used to analyze the question and text\"\n    )\n    justification: str = Field(\n        description=\"Explanation for the similarity judgment, detailing key factors that led to the conclusion\"\n    )\n    similarity: bool = Field(\n        description=\"Boolean judgment indicating whether the question and text are similar or relevant (True) or not (False)\"\n    )\n```\n\nThis model ensures that our LLM's output is structured and includes a thought process, justification, and a boolean similarity judgment.\n\n## Creating the Judge Function\n\nNext, we'll create a function that uses our LLM to judge the relevance between a question and a text:\n\n```python\ndef judge_relevance(question: str, text: str) -> Judgment:\n    return client.chat.create(\n        model=\"gpt-4\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                    You are tasked with comparing a question and a piece of text to determine if they are relevant to each other or similar in some way. Your goal is to analyze the content, context, and potential connections between the two.\n\n                    To determine if the question and text are relevant or similar, please follow these steps:\n\n                    1. Carefully read and understand both the question and the text.\n                    2. Identify the main topic, keywords, and concepts in the question.\n                    3. Analyze the text for any mention of these topics, keywords, or concepts.\n                    4. Consider any potential indirect connections or implications that might link the question and text.\n                    5. Evaluate the overall context and purpose of both the question and the text.\n\n                    As you go through this process, please use a chain of thought approach. Write out your reasoning for each step inside <thought> tags.\n\n                    After your analysis, provide a boolean judgment on whether the question and text are similar or relevant to each other. Use \"true\" if they are similar or relevant, and \"false\" if they are not.\n\n                    Before giving your final judgment, provide a justification for your decision. Explain the key factors that led to your conclusion.\n\n                    Please ensure your analysis is thorough, impartial, and based on the content provided.\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n                    Here is the question:\n\n                    <question>\n                    {{question}}\n                    </question>\n\n                    Here is the text:\n                    <text>\n                    {{text}}\n                    </text>\n                \"\"\",\n            },\n        ],\n        response_model=Judgment,\n        context={\"question\": question, \"text\": text},\n    )\n```\n\nThis function takes a question and a text as input, sends them to the LLM with a predefined prompt, and returns a structured `Judgment` object.\n\n## Testing the Judge\n\nTo test our pairwise LLM judge, we can create a set of test pairs and evaluate the judge's performance:\n\n```python\nif __name__ == \"__main__\":\n    test_pairs = [\n        {\n            \"question\": \"What are the main causes of climate change?\",\n            \"text\": \"Global warming is primarily caused by human activities, such as burning fossil fuels, deforestation, and industrial processes. These activities release greenhouse gases into the atmosphere, trapping heat and leading to a rise in global temperatures.\",\n            \"is_similar\": True,\n        },\n        # ... (other test pairs)\n    ]\n\n    score = 0\n    for pair in test_pairs:\n        result = judge_relevance(pair[\"question\"], pair[\"text\"])\n        if result.similarity == pair[\"is_similar\"]:\n            score += 1\n\n    print(f\"Score: {score}/{len(test_pairs)}\")\n    #> Score 9/10\n```\n\nThis test loop runs the judge on each pair and compares the result to a predetermined similarity value, calculating an overall score.\n\n## Conclusion\n\nBy combining Instructor, Pydantic, and OpenAI's language models, we've created a powerful tool for judging text relevance. This approach demonstrates the flexibility and power of structured outputs in LLM applications.\n\nThe pairwise LLM judge we've built can be used in various scenarios, such as:\n\n1. Improving search relevance in information retrieval systems\n2. Evaluating the quality of question-answering systems\n3. Assisting in content recommendation algorithms\n4. Automating parts of the content moderation process\n\nAs you explore this technique, consider how you might extend or adapt it for your specific use cases. The combination of structured outputs and large language models opens up a world of possibilities for creating intelligent, interpretable AI systems.\n"
  },
  {
    "path": "docs/blog/posts/parea.md",
    "content": "---\nauthors:\n  - jxnl\n  - joschkabraun\ncategories:\n  - LLM Observability\ncomments: true\ndate: 2024-07-17\ndescription:\n  Explore how Parea enhances the OpenAI instructor, enabling better monitoring,\n  collaboration, and error tracking for LLM applications.\ndraft: false\ntags:\n  - Parea\n  - OpenAI\n  - LLM\n  - instructor\n  - validation\n---\n\n# Parea for Observing, Testing & Fine-tuning of Instructor\n\n[Parea](https://www.parea.ai) is a platform that enables teams to monitor, collaborate, test & label for LLM applications. In this blog we will explore how Parea can be used to enhance the OpenAI client alongside `instructor` and debug + improve `instructor` calls. Parea has some features which makes it particularly useful for `instructor`:\n\n- it automatically groups any LLM calls due to reties under a single trace\n- it automatically tracks any validation error counts & fields that occur when using `instructor`\n- it provides a UI to label JSON responses by filling out a form instead of editing JSON objects\n\n??? info \"Configure Parea\"\n\n    Before starting this tutorial, make sure that you've registered for a [Parea](https://www.parea.ai) account. You'll also need to create an [API key](https://docs.parea.ai/api-reference/authentication).\n\n## Example: Writing Emails with URLs from Instructor Docs\n\nWe will demonstrate Parea by using `instructor` to write emails which only contain URLs from the `instructor` docs. We'll need to install our dependencies before proceeding so simply run the command below.\n\n<!-- more -->\n\n```bash\npip install -U parea-ai instructor\n```\n\nParea is dead simple to integrate - all it takes is 2 lines of code, and we have it setup.\n\n```python hl_lines=\"9 15-16\"\nimport os\n\nimport instructor\nfrom dotenv import load_dotenv\nfrom openai import OpenAI\nfrom parea import Parea  # (1)!\n\nload_dotenv()\n\nclient = OpenAI()\n\np = Parea(api_key=os.getenv(\"PAREA_API_KEY\"))  # (2)!\np.wrap_openai_client(client, \"instructor\")\n\nclient = instructor.from_provider(\"openai/gpt-4o\")\n```\n\n1. Import `Parea` from the `parea` module\n2. Setup tracing using their native integration with `instructor`\n\nIn this example, we'll be looking at writing emails which only contain links to the instructor docs. To do so, we can define a simple Pydantic model as seen below.\n\n```python\nclass Email(BaseModel):\n    subject: str\n    body: str = Field(\n        ...,\n        description=\"Email body, Should contain links to instructor documentation. \",\n    )\n\n    @field_validator(\"body\")\n    def check_urls(cls, v):\n        urls = re.findall(r\"https?://(?:[-\\w.]|(?:%[\\da-fA-F]{2}))+\", v)\n        errors = []\n        for url in urls:\n            if not url.startswith(\"https://python.useinstructor.com\"):\n                errors.append(\n                    f\"URL {url} is not from useinstructor.com, Only include URLs that include use instructor.com. \"\n                )\n            response = requests.get(url)\n            if response.status_code != 200:\n                errors.append(\n                    f\"URL {url} returned status code {response.status_code}. Only include valid URLs that exist.\"\n                )\n            elif \"404\" in response.text:\n                errors.append(\n                    f\"URL {url} contained '404' in the body. Only include valid URLs that exist.\"\n                )\n        if errors:\n            raise ValueError(\"\\n\".join(errors))\n        return\n```\n\nNow we can proceed to create an email using above Pydantic model.\n\n```python hl_lines=\"5-14\"\nemail = client.messages.create(\n    model=\"gpt-3.5-turbo\",\n    max_tokens=1024,\n    max_retries=3,\n    messages=[  # (1)!\n        {\n            \"role\": \"user\",\n            \"content\": \"I'm responding to a student's question. Here is the link to the documentation: {{doc_link1}} and {{doc_link2}}\",\n        }\n    ],\n    template_inputs={\n        \"doc_link1\": \"https://python.useinstructor.com/docs/tutorial/tutorial-1\",\n        \"doc_link2\": \"https://jxnl.github.io/docs/tutorial/tutorial-2\",\n    },\n    response_model=Email,\n)\nprint(email)\n```\n\n1. Parea supports templated prompts via `{{...}}` syntax in the `messages` parameter. We can pass the template inputs as a dictionary to the `template_inputs` parameter.\n\nIf you follow what we've done, Parea has wrapped the client, and we wrote an email with links from the instructor docs.\n\n## Validation Error Tracking\n\nTo take a look at trace of this execution checkout the screenshot below. Noticeable:\n\n- left sidebar: all related LLM calls are grouped under a trace called `instructor`\n- middle section: the root trace visualizes the `templated_inputs` as inputs and the created `Email` object as output\n- bottom of right sidebar: any validation errors are captured and tracked as score for the trace which enables visualizing them in dashboards and filtering by them on tables\n\n![](./img/parea/trace.png)\n\nAbove we can see that while the email was successfully created, there was a validation error which meant that additional cost & latency were introduced because of the initially failed validation.\nBelow we can see a visualization of the average validation error count for our instructor usage over time.\n\n![](./img/parea/validation-error-chart.png)\n\n## Label Responses for Fine-Tuning\n\nSometimes you may want to let subject-matter experts (SMEs) label responses to use them for fine-tuning. Parea provides a way to do this via an annotation queue. Editing raw JSON objects to correct tool use & function calling responses can be error-prone, esp. for non-devs. For that purpose, Parea has a so-called [Form Mode](https://docs.parea.ai/manual-review/overview#labeling-function-calling-tool-use-responses) which allows the user to safely fill-out a form instead of editing the JSON object. The labeled data can then be exported and used for fine-tuning.\n\n![Form Mode](img/parea/form-mode.gif)\n\n??? info \"Export Labeled Data & Fine-Tune\"\n\n    After labeling the data, you can export them as JSONL file:\n\n    ```python hl_lines=\"5 6\"\n    from parea import Parea\n\n    p = Parea(api_key=os.getenv(\"PAREA_API_KEY\"))\n\n    dataset = p.get_collection(DATASET_ID)  # (1)!\n    dataset.write_to_finetune_jsonl(\"finetune.jsonl\")  # (2)!\n    ```\n\n    1. Replace `DATASET_ID` with the actual dataset ID\n    2. Writes the dataset to a JSONL file\n\n    Now we can use `instructor` to fine-tune the model:\n\n    ```bash\n    instructor jobs create-from-file finetune.jsonl\n    ```\n"
  },
  {
    "path": "docs/blog/posts/pydantic-is-still-all-you-need.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2024-09-07\ndescription: Explore how Pydantic enhances structured outputs in LLM applications,\n  ensuring reliability and improved data management.\ndraft: false\nslug: pydantic-is-still-all-you-need\ntags:\n- Pydantic\n- Structured Outputs\n- Data Validation\n- LLM Techniques\n- Performance Optimization\n---\n\n# Pydantic is Still All You Need: Reflections on a Year of Structured Outputs\n\nA year ago, I gave a talk titled \"Pydantic: All You Need\" that kickstarted my Twitter career. Today, I'm back to reaffirm that message and share what I've learned in the past year about using structured outputs with language models.\n\n[Watch the youtube video](https://www.youtube.com/watch?v=pZ4DIH2BVqg){ .md-button .md-button--primary }\n\n<!-- more -->\n\n## The Problem with Unstructured Outputs\n\nImagine hiring an intern to write an API that returns a string you have to JSON load into a dictionary and pray the data is still there. You'd probably fire them and replace them with GPT. Yet, many of us are content using LLMs in the same haphazard way.\n\nBy not using schemas and structured responses, we lose compatibility, composability, and reliability when building tools that interact with external systems. But there's a better way.\n\n## The Power of Pydantic\n\nPydantic, combined with function calling, offers a superior alternative for structured outputs. It allows for:\n\n- Nested objects and models for modular structures\n- Validators to improve system reliability\n- Cleaner, more maintainable code\n\nFor more details on how Pydantic enhances data validation, check out our [Data Validation with Pydantic](../../concepts/models.md) guide.\n\nAnd here's the kicker: nothing's really changed in the past year. The core API is still just:\n\n```python\nfrom instructor import from_openai\n\nclient = from_openai(OpenAI())\n\nresponse = client.create(model=\"gpt-3.5-turbo\", response_model=User, messages=[...])\n```\n\n## What's New in Pydantic?\n\nSince last year:\n\n- We've released version 1.0\n- Launched in 5 languages (Python, TypeScript, Ruby, Go, Elixir)\n- Built a version in Rust\n- Seen 40% month-over-month growth in the Python library\n\nWe now support [Ollama](../../integrations/ollama.md), [llama-cpp-python](../../integrations/llama-cpp-python.md), [Anthropic](../../integrations/anthropic.md), [Cohere](../../integrations/cohere.md), [Google](../../integrations/google.md), [Vertex AI](../../integrations/vertex.md), and more. As long as language models support function calling capabilities, this API will remain standard.\n\n## Key Features\n\n1. **Streaming with Structure**: Get objects as they return, improving latency while maintaining structured output. Learn more about this in our [Streaming Support](../../concepts/partial.md) guide.\n\n2. **Partials**: Validate entire objects, enabling real-time rendering for generative UI without complex JSON parsing. See our [Partial](../../concepts/partial.md) documentation for implementation details.\n\n3. **Validators**: Add custom logic to ensure correct outputs, with the ability to retry on errors. Dive deeper into this topic in our [Reasking and Validation](../../concepts/reask_validation.md) guide.\n\n## Real-World Applications\n\n### Generation and Extraction\n\nStructured outputs shine in tasks like:\n\n- Generating follow-up questions in RAG applications\n- Validating URLs in generated content\n- Extracting structured data from transcripts or images\n\nFor a practical example, see our [Structured Data Extraction from Images](../../examples/image_to_ad_copy.md) case study.\n\n### Search Queries\n\nFor complex search scenarios:\n\n```python\nclass Search(BaseModel):\n    query: str\n    start_date: Optional[datetime]\n    end_date: Optional[datetime]\n    limit: Optional[int]\n    source: Literal[\"news\", \"social\", \"blog\"]\n```\n\nThis structure allows for more sophisticated search capabilities, handling queries like \"What is the latest news from X?\" that embeddings alone can't handle.\n\n## Lessons Learned\n\n1. Validation errors are crucial for improving system performance.\n2. Not all language models support retry logic effectively yet.\n3. Structured outputs benefit vision, text, RAG, and agent applications alike.\n\n## The Future of Programming with LLMs\n\nWe're not changing the language of programming; we're relearning how to program with data structures. Structured outputs allow us to:\n\n- Own the objects we define\n- Control the functions we implement\n- Manage the control flow\n- Own the prompts\n\nThis approach makes Software 3.0 backwards compatible with existing software, demystifying language models and returning us to a more classical programming structure.\n\n## Wrapping Up\n\nPydantic is still all you need for effective structured outputs with LLMs. It's not just about generating accurate responses; it's about doing so in a way that's compatible with our existing programming paradigms and tools.\n\nAs we continue to refine AI language models, keeping these principles in mind will lead to more robust, maintainable, and powerful applications. The future of AI isn't just about what the models can do, but how seamlessly we can integrate them into our existing software ecosystems.\n\nFor more advanced use cases and integrations, check out our [examples](../../examples/index.md) section, which covers various LLM providers and specialized implementations.\n\n## Related Documentation\n- [Instructor Philosophy](../../concepts/philosophy.md) - Why we chose Pydantic\n- [Validation Guide](../../concepts/validation.md) - Practical validation techniques\n\n## See Also\n- [Validation Deep Dive](validation-part1.md) - Advanced validation patterns\n- [Best Framework Comparison](best_framework.md) - Why Instructor stands out\n- [Introduction to Instructor](introduction.md) - Getting started guide\n"
  },
  {
    "path": "docs/blog/posts/rag-and-beyond.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- LLM Techniques\ncomments: true\ndate: 2023-09-17\ndescription: 'Explore how to enhance Retrieval Augmented Generation (RAG) with query\n  understanding for smarter search solutions. '\ndraft: false\ntags:\n- RAG\n- query understanding\n- LLMs\n- data modeling\n- Pydantic\n---\n\n# RAG is more than just embedding search\n\nWith the advent of large language models (LLM), retrieval augmented generation (RAG) has become a hot topic. However throughout the past year of [helping startups](https://jxnl.co) integrate LLMs into their stack I've noticed that the pattern of taking user queries, embedding them, and directly searching a vector store is effectively demoware.\n\n!!! note \"What is RAG?\"\n\n    Retrieval augmented generation (RAG) is a technique that uses an LLM to generate responses, but uses a search backend to augment the generation. In the past year using text embeddings with a vector databases has been the most popular approach I've seen being socialized.\n\n<figure markdown>\n  ![RAG](img/dumb_rag.png)\n  <figcaption>Simple RAG that embedded the user query and makes a search.</figcaption>\n</figure>\n\nSo let's kick things off by examining what I like to call the 'Dumb' RAG Model-a basic setup that's more common than you'd think.\n\n<!-- more -->\n\n## The 'Dumb' RAG Model\n\nWhen you ask a question like, \"what is the capital of France?\" The RAG 'dumb' model embeds the query and searches in some unopinionated search endpoint. Limited to a single method API like `search(query: str) -> List[str]`. This is fine for simple queries, since you'd expect words like 'paris is the capital of france' to be in the top results of say, your wikipedia embeddings.\n\n### Why is this a problem?\n\n- **Query-Document Mismatch**: This model assumes that query embedding and the content embedding are similar in the embedding space, which is not always true based on the text you're trying to search over. Only using queries that are semantically similar to the content is a huge limitation!\n\n- **Monolithic Search Backend**: Assumes a single search backend, which is not always the case. You may have multiple search backends, each with their own API, and you want to route the query to vector stores, search clients, sql databases, and more.\n- **Limitation of text search**: Restricts complex queries to a single string (`{query: str}`), sacrificing expressiveness, in using keywords, filters, and other advanced features. For example, asking `what problems did we fix last week` cannot be answered by a simple text search since documents that contain `problem, last week` are going to be present at every week.\n\n- **Limited ability to plan**: Assumes that the query is the only input to the search backend, but you may want to use other information to improve the search, like the user's location, or the time of day using the context to rewrite the query. For example, if you present the language model of more context it is able to plan a suite of queries to execute to return the best results.\n\nNow let's dive into how we can make it smarter with query understanding. This is where things get interesting.\n\n## Improving the RAG Model with Query Understanding\n\n!!! note \"Shoutouts\"\nMuch of this work has been inspired by / done in collab with a few of my clients at [new.computer](https://new.computer), [Metaphor Systems](https://metaphor.systems), and [Naro](https://narohq.com), go check them out!\n\nUltimately what you want to deploy is a [system that understands](https://en.wikipedia.org/wiki/Query_understanding) how to take the query and rewrite it to improve precision and recall.\n\n<figure markdown>\n  ![RAG](img/query_understanding.png)\n  <figcaption>Query Understanding system routes to multiple search backends.</figcaption>\n</figure>\n\nNot convinced? Let's move from theory to practice with a real-world example. First up, Metaphor Systems.\n\n## Whats instructor?\n\nInstructor uses Pydantic to simplify the interaction between the programmer and language models via the function calling API.\n\n- **Widespread Adoption**: Pydantic is a popular tool among Python developers.\n- **Simplicity**: Pydantic allows model definition in Python.\n- **Framework Compatibility**: Many Python frameworks already use Pydantic.\n\n## Case Study 1: Metaphor Systems\n\nTake [Metaphor Systems](https://metaphor.systems), which turns natural language queries into their custom search-optimized query. If you take a look web UI you'll notice that they have an auto-prompt option, which uses function calls to further optimize your query using a language model, and turns it into a fully specified metaphor systems query.\n\n<figure markdown>\n![Metaphor Systems](img/meta.png)\n<figcaption>Metaphor Systems UI</figcaption>\n</figure>\n\nIf we peek under the hood, we can see that the query is actually a complex object, with a date range, and a list of domains to search in. It's actually more complex than this but this is a good start. We can model this structured output in Pydantic using the instructor library\n\n```python\nclass DateRange(BaseModel):\n    start: datetime.date\n    end: datetime.date\n\n\nclass MetaphorQuery(BaseModel):\n    rewritten_query: str\n    published_daterange: DateRange\n    domains_allow_list: List[str]\n\n    async def execute():\n        return await metaphor.search(...)\n```\n\nNote how we model a rewritten query, range of published dates, and a list of domains to search in. This powerful pattern allows the user query to be restructured for better performance without the user having to know the details of how the search backend works.\n\n```python\nimport instructor\n\n# Enables response_model in the openai client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nquery = client.create(\n    model=\"gpt-4\",\n    response_model=MetaphorQuery,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You're a query understanding system for the Metafor Systems search engine. Here are some tips: ...\",\n        },\n        {\"role\": \"user\", \"content\": \"What are some recent developments in AI?\"},\n    ],\n)\n```\n\n**Example Output**\n\n```json\n{\n  \"rewritten_query\": \"novel developments advancements ai artificial intelligence machine learning\",\n  \"published_daterange\": {\n    \"start\": \"2023-09-17\",\n    \"end\": \"2021-06-17\"\n  },\n  \"domains_allow_list\": [\"arxiv.org\"]\n}\n```\n\nThis isn't just about adding some date ranges. It's about nuanced, tailored searches, that are deeply integrated with the backend. Metaphor Systems has a whole suite of other filters and options that you can use to build a powerful search query. They can even use some chain of thought prompting to improve how they use some of these advanced features.\n\n```python\nclass DateRange(BaseModel):\n    start: datetime.date\n    end: datetime.date\n    chain_of_thought: str = Field(\n        None,\n        description=\"Think step by step to plan what is the best time range to search in\",\n    )\n```\n\nNow, let's see how this approach can help model an agent like personal assistant.\n\n## Case Study 2: Personal Assistant\n\nAnother great example of this multiple dispatch pattern is a personal assistant. You might ask, \"What do I have today?\", from a vague query you might want events, emails, reminders etc. That data will likely exist in multiple backends, but what you want is one unified summary of results. Here you can't assume that text of those documents are all embedded in a search backend. There might be a calendar client, email client, across personal and profession accounts.\n\n```python\nclass ClientSource(enum.Enum):\n    GMAIL = \"gmail\"\n    CALENDAR = \"calendar\"\n\n\nclass SearchClient(BaseModel):\n    query: str\n    keywords: List[str]\n    email: str\n    source: ClientSource\n    start_date: datetime.date\n    end_date: datetime.date\n\n    async def execute(self) -> str:\n        if self.source == ClientSource.GMAIL:\n            ...\n        elif self.source == ClientSource.CALENDAR:\n            ...\n\n\nclass Retrieval(BaseModel):\n    queries: List[SearchClient]\n\n    async def execute(self) -> str:\n        return await asyncio.gather(*[query.execute() for query in self.queries])\n```\n\nNow we can call this with a simple query like \"What do I have today?\" and it will try to async dispatch to the correct backend. It's still important to prompt the language model well, but we'll leave that for another day.\n\n```python\nimport instructor\n\n# Enables response_model in the openai client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nretrieval = client.create(\n    model=\"gpt-4\",\n    response_model=Retrieval,\n    messages=[\n        {\"role\": \"system\", \"content\": \"You are Jason's personal assistant.\"},\n        {\"role\": \"user\", \"content\": \"What do I have today?\"},\n    ],\n)\n```\n\n**Example Output**\n\n```json\n{\n    \"queries\": [\n        {\n            \"query\": None,\n            \"keywords\": None,\n            \"email\": \"jason@example.com\",\n            \"source\": \"gmail\",\n            \"start_date\": \"2023-09-17\",\n            \"end_date\": None\n        },\n        {\n            \"query\": None,\n            \"keywords\": [\"meeting\", \"call\", \"zoom\"]]],\n            \"email\": \"jason@example.com\",\n            \"source\": \"calendar\",\n            \"start_date\": \"2023-09-17\",\n            \"end_date\": None\n\n        }\n    ]\n}\n```\n\nNotice that we have a list of queries that route to different search backends (email and calendar). We can even dispatch them async to be as performance as possible. Not only do we dispatch to different backends (that we have no control over), but you are likely going to render them to the user differently as well. Perhaps you want to summarize the emails in text, but you want to render the calendar events as a list that they can scroll across on a mobile app.\n\n!!! Note \"Can I used framework X?\"\nI get this question a lot, but it's just code. Within these dispatches you can do whatever you want. You can use `input()` to ask the user for more information, make a post request, call a Langchain agent or LLamaindex query engine to get more information. The sky is the limit.\n\nBoth of these examples showcase how both search providers and consumers can use `instructor` to model their systems. This is a powerful pattern that allows you to build a system that can be used by anyone, and can be used to build an LLM layer, from scratch, in front of any arbitrary backend.\n\n## Conclusion\n\nThis is not about fancy embedding tricks, it's just plain old information retrieval and query understanding. The beauty of instructor is that it simplifies modeling the complex and lets you define the output of the language model, the prompts, and the payload we send to the backend in a single place.\n\n## What's Next?\n\nHere I want to show that `instructor` isn’t just about data extraction. It’s a powerful framework for building a data model and integrating it with your LLM. Structured output is just the beginning - the untapped goldmine is skilled use of tools and APIs.\n\n## Related Documentation\n- [Validation Concepts](../../concepts/validation.md) - Validate RAG outputs\n\n## See Also\n- [LLM as Reranker](llm-as-reranker.md) - Improve search relevance\n- [Citation Extraction](citations.md) - Verify sources\n- [PDF Processing](chat-with-your-pdf-with-gemini.md) - Document handling\n\nIf you enjoy the content or want to try out `instructor` please check out the [github](https://github.com/jxnl/instructor) and give us a star!"
  },
  {
    "path": "docs/blog/posts/rag-timelines.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - LLM Techniques\ncomments: true\ndate: 2024-06-06\ndescription:\n  Explore enhancing RAG systems with time filters using Instructor and\n  Pydantic for accurate, relevant data retrieval.\ndraft: false\ntags:\n  - RAG\n  - Time Filters\n  - Pydantic\n  - Instructor\n  - LLM Techniques\n---\n\n# Enhancing RAG with Time Filters Using Instructor\n\nRetrieval-augmented generation (RAG) systems often need to handle queries with time-based constraints, like \"What new features were released last quarter?\" or \"Show me support tickets from the past week.\" Effective time filtering is crucial for providing accurate, relevant responses.\n\nInstructor is a Python library that simplifies integrating large language models (LLMs) with data sources and APIs. It allows defining structured output models using Pydantic, which can be used as prompts or to parse LLM outputs.\n\n<!-- more -->\n\n## Modeling Time Filters\n\nTo handle time filters, we can define a Pydantic model representing a time range:\n\n```python\nfrom datetime import datetime\nfrom typing import Optional\nfrom pydantic import BaseModel\n\n\nclass TimeFilter(BaseModel):\n    start_date: Optional[datetime] = None\n    end_date: Optional[datetime] = None\n```\n\nThe `TimeFilter` model can represent an absolute date range or a relative time range like \"last week\" or \"previous month.\"\n\nWe can then combine this with a search query string:\n\n```python\nclass SearchQuery(BaseModel):\n    query: str\n    time_filter: TimeFilter\n```\n\n## Prompting the LLM\n\nUsing Instructor, we can prompt the LLM to generate a `SearchQuery` object based on the user's query:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nresponse = client.create(\n    model=\"gpt-4o\",\n    response_model=SearchQuery,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a query generator for customer support tickets. The current date is 2024-02-17\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": \"Show me customer support tickets opened in the past week.\",\n        },\n    ],\n)\n\n# Example response:\n{\n    \"query\": \"Show me customer support tickets opened in the past week.\",\n    \"time_filter\": {\n        \"start_date\": \"2024-02-10T00:00:00\",\n        \"end_date\": \"2024-02-17T00:00:00\",\n    },\n}\n```\n\n## Nuances in dates and timezones\n\nWhen working with time-based queries, it's important to consider the nuances of dates, timezones, and publication times. Depending on the data source, the user's location, and when the content was originally published, the definition of \"past week\" or \"last month\" may vary.\n\nTo handle this, you'll want to design your `TimeFilter` model to intelligently reason about these relative time periods. This could involve:\n\n- Defaulting to the user's local timezone if available, or using a consistent default like UTC\n- Defining clear rules for how to calculate the start and end of relative periods like \"week\" or \"month\"\n  - e.g. does \"past week\" mean the last 7 days or the previous Sunday-Saturday range?\n- Allowing for flexibility in how users specify dates (exact datetimes, just dates, natural language phrases)\n- Validating and normalizing user input to fit the expected `TimeFilter` format\n- Considering the original publication timestamp of the content, not just the current date\n  - e.g. \"articles published in the last month\" should look at the publish date, not the query date\n\nBy building this logic into the `TimeFilter` model, you can abstract away the complexity and provide a consistent interface for the rest of your RAG system to work with standardized absolute datetime ranges\n\nOf course, there may be edge cases or ambiguities that are hard to resolve programmatically. In these situations, you may need to prompt the user for clarification or make a best guess based on the available information. The key is to strive for a balance of flexibility and consistency in how you handle time-based queries, factoring in publication dates when relevant.\n\nBy modeling time filters with Pydantic and leveraging Instructor, RAG systems can effectively handle time-based queries. Clear prompts, careful model design, and appropriate parsing strategies enable accurate retrieval of information within specific time frames, enhancing the system's overall relevance and accuracy.\n"
  },
  {
    "path": "docs/blog/posts/semantic-validation-structured-outputs.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Validation\n- Pydantic\n- LLMs\ncomments: true\ndate: 2025-05-20\ndescription: Learn how semantic validation with LLMs can ensure your structured outputs meet complex, subjective, and contextual criteria beyond what traditional rule-based validation can achieve.\ndraft: false\ntags:\n- Semantic Validation\n- Structured Outputs\n- LLM Validator\n- Pydantic\n- Data Quality\n---\n\n# Understanding Semantic Validation with Structured Outputs\n\n> Semantic validation uses LLMs to evaluate content against complex, subjective, and contextual criteria that would be difficult to implement with traditional rule-based validation approaches.\n\nAs LLMs become increasingly integrated into production systems, ensuring the quality and safety of their outputs is paramount. Traditional validation methods relying on explicit rules can't keep up with the complexity and nuance of natural language. With the release of Instructor's semantic validation capabilities, we now have a powerful way to validate structured outputs against sophisticated criteria.\n\n<!-- more -->\n\n## Beyond Rule-Based Validation\n\nTraditional validation approaches focus on verifying that data conforms to certain rules-ensuring that:\n\n- A field has the correct type (`int`, `str`, etc.)\n- A value falls within predefined ranges (e.g., `age >= 0`)\n- A pattern matches expected formats (e.g., email regex)\n\nThese approaches work well for structured data with clear constraints but fall short when validating natural language against less precise criteria like:\n\n- \"Content must be family-friendly\"\n- \"Description must be professional and free of hyperbole\"\n- \"Criticism must be constructive and respectful\"\n- \"Message must adhere to community guidelines\"\n\nThis is where semantic validation with LLMs comes in.\n\n## What is Semantic Validation?\n\nSemantic validation uses an LLM to interpret and evaluate text against natural language criteria. Instead of writing explicit rules, you express validation requirements in plain language, and the LLM determines whether content meets those requirements.\n\nLet's see how this works with Instructor's `llm_validator`:\n\n```python\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nimport instructor\nfrom instructor import llm_validator\n\n# Initialize client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ProductDescription(BaseModel):\n    name: str\n    description: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"\"\"The description must be:\n                1. Professional and factual\n                2. Free of excessive hyperbole or unsubstantiated claims\n                3. Between 50-200 words in length\n                4. Written in third person (no \"you\" or \"your\")\n                5. Free of spelling and grammar errors\"\"\",\n                client=client,\n            )\n        ),\n    ]\n```\n\nWhat makes this approach powerful is that we're leveraging the LLM's understanding of language and context to perform validation that would be extremely difficult to implement with traditional approaches.\n\n## When to Use Semantic Validation\n\nSemantic validation shines in situations where:\n\n1. **Criteria is complex or subjective**: \"Ensure this content is respectful\" requires understanding nuance that's difficult to capture in rules.\n\n2. **Context matters**: \"The summary must accurately reflect the key findings\" requires comparing multiple pieces of content.\n\n3. **The rules are constantly evolving**: Harmful content strategies change as bad actors adapt, making static rules obsolete quickly.\n\n4. **Human-like judgment is required**: \"This product description should be compelling without being misleading\" requires nuanced evaluation.\n\n## Real-World Examples\n\n### Content Moderation\n\nOne of the most obvious applications is content moderation. Companies need to ensure user-generated content meets community guidelines without being overly restrictive:\n\n```python\nclass UserComment(BaseModel):\n    user_id: str\n    content: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"\"\"Content must comply with community guidelines:\n                - No hate speech, harassment, or discrimination\n                - No explicit sexual or violent content\n                - No promotion of illegal activities\n                - No sharing of personal information\n                - No spamming or excessive self-promotion\"\"\",\n                client=client,\n            )\n        ),\n    ]\n```\n\n### Tone and Style Enforcement\n\nOrganizations often need to maintain a consistent tone and style in their communications:\n\n```python\nclass CompanyAnnouncement(BaseModel):\n    title: str\n    content: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"The announcement must maintain a professional, positive tone without being overly informal or using slang\",\n                client=client,\n            )\n        ),\n    ]\n```\n\n### Fact-Checking\n\nFor applications where factual accuracy is critical:\n\n```python\nclass FactCheckedClaim(BaseModel):\n    claim: str\n    is_accurate: bool\n    supporting_evidence: list[str]\n\n    @classmethod\n    def validate_claim(cls, text: str) -> \"FactCheckedClaim\":\n        return client.create(\n            response_model=cls,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a fact-checking system. Assess the factual accuracy of the claim.\",\n                },\n                {\"role\": \"user\", \"content\": \"Fact check this claim: {{ claim }}\"},\n            ],\n            context={\"claim\": text},\n        )\n```\n\n## Beyond Field Validation: Model-Level Semantic Validation\n\nWhile field-level validation is powerful, sometimes we need to validate relationships between fields. This is where model-level semantic validation becomes useful:\n\n```python\nclass Report(BaseModel):\n    title: str\n    summary: str\n    key_findings: list[str]\n\n    @model_validator(mode='after')\n    def validate_consistency(self):\n        # Semantic validation at the model level using Jinja templating\n        validation_result = client.create(\n            response_model=Validator,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"Validate that the summary accurately reflects the key findings.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": \"\"\"\n                        Please validate if this summary accurately reflects the key findings:\n\n                        Title: {{ title }}\n                        Summary: {{ summary }}\n\n                        Key findings:\n                        {% for finding in findings %}\n                        - {{ finding }}\n                        {% endfor %}\n\n                        Evaluate for consistency, completeness, and accuracy.\n                    \"\"\",\n                },\n            ],\n            context={\n                \"title\": self.title,\n                \"summary\": self.summary,\n                \"findings\": self.key_findings,\n            },\n        )\n\n        if not validation_result.is_valid:\n            raise ValueError(f\"Consistency error: {validation_result.reason}\")\n\n        return self\n```\n\n## Technical Implementation\n\nUnder the hood, the `llm_validator` uses a special `Validator` model that determines whether content meets the criteria and provides detailed error messages when it doesn't:\n\n```python\nclass Validator(BaseModel):\n    is_valid: bool\n    reason: Optional[str] = None\n    fixed_value: Optional[str] = None\n```\n\nWhen validation fails, the reason field contains a detailed explanation, which is perfect for both developers debugging issues and for automatic retry mechanisms.\n\n## Self-Healing with Retries\n\nOne of the most powerful features of Instructor's validation system is its ability to automatically retry with error context:\n\n```python\ntry:\n    product = client.create(\n        response_model=ProductDescription,\n        messages=[\n            {\"role\": \"system\", \"content\": \"Generate a product description.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a description for UltraClean 9000 Washing Machine\",\n            },\n        ],\n        max_retries=2,  # Automatically retry up to 2 times with error context\n    )\n    print(\"Success:\", product.model_dump_json(indent=2))\nexcept Exception as e:\n    print(f\"Failed after retries: {e}\")\n    #> Failed after retries: name 'client' is not defined\n```\n\nWith `max_retries` set, if the initial response fails validation, Instructor will automatically send the error context back to the LLM, giving it a chance to correct the issue. This creates a self-healing system that can recover from validation failures without developer intervention.\n\n## Performance and Cost Considerations\n\nSemantic validation adds an additional API call for each validation, which impacts:\n\n1. **Latency**: Each validation requires an LLM inference\n2. **Cost**: More API calls mean higher usage costs\n3. **Reliability**: Depends on LLM API availability\n\nFor high-throughput applications, consider these strategies:\n\n- **Batch validations**: Validate multiple items in a single call where possible\n- **Strategic placement**: Apply semantic validation at critical points rather than everywhere\n- **Caching**: Cache validation results for identical or similar content\n- **Use the right model**: `gpt-4o-mini` or similar models offer a good balance of capability and cost for many validation scenarios\n\n## Building a Layered Validation Strategy\n\nThe most robust approach combines traditional validation with semantic validation:\n\n1. **Type validation**: Use Pydantic's built-in type validation as your first defense\n2. **Rule-based validation**: Apply explicit rules where they make sense\n3. **Semantic validation**: Reserve LLM-based validation for complex criteria\n\nThis layered approach ensures you get the benefits of semantic validation without unnecessary API calls for simple validations.\n\n## Advanced Applications\n\n### Custom Guardrails Framework\n\nYou can build a comprehensive guardrails framework by combining semantic validators:\n\n```python\ndef create_guarded_model(base_class, guardrails):\n    \"\"\"Create a model with multiple semantic guardrails applied.\"\"\"\n    validators = {}\n\n    for field_name, criteria in guardrails.items():\n        validators[field_name] = Annotated[\n            str, BeforeValidator(llm_validator(criteria, client=client))\n        ]\n\n    return create_model(\n        f\"Guarded{base_class.__name__}\", __base__=base_class, **validators\n    )\n\n\n# Usage\nguardrails = {\n    \"title\": \"Must be concise, descriptive, and free of clickbait\",\n    \"content\": \"Must follow community guidelines and be respectful\",\n}\n\nGuardedPost = create_guarded_model(Post, guardrails)\n```\n\n### Contextual Validation with External References\n\nFor validations that require external knowledge:\n\n```python\nclass LegalCompliance(BaseModel):\n    document: str\n    compliance_status: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"\"\"Check if this document complies with the provided guidelines.\n                Guidelines: {{ guidelines }}\"\"\",\n                client=client,\n            )\n        ),\n    ]\n\n\n# Usage\nresult = client.create(\n    response_model=LegalCompliance,\n    messages=[{\"role\": \"user\", \"content\": \"Check this document: \" + document_text}],\n    context={\"guidelines\": company_legal_guidelines},\n)\n```\n\n## Conclusion\n\nSemantic validation represents a significant advancement in ensuring the quality and safety of LLM outputs. By combining the flexibility of natural language criteria with the structured validation of Pydantic, we can build systems that are both powerful and safe.\n\nAs these techniques mature, we can expect to see semantic validation become a standard part of AI application development, especially in regulated industries where output quality is critical.\n\nTo get started with semantic validation in your projects, check out the [Semantic Validation documentation](https://python.useinstructor.com../../concepts/semantic_validation/.md) and explore the various examples and patterns.\n\nThis approach isn't just a technical improvement-it's a fundamental shift in how we think about validation, moving from rigid rules to intelligent understanding of content and context.\n\n## Related Documentation\n- [Validation Fundamentals](../../concepts/validation.md) - Core validation concepts\n- [Semantic Validation](../../concepts/semantic_validation.md) - Using LLMs for validation\n\n## See Also\n- [Validation Deep Dive](validation-part1.md) - Foundation validation concepts\n- [Anthropic Prompt Caching](anthropic-prompt-caching.md) - Optimize validation costs\n- [Monitoring with Logfire](logfire.md) - Track validation performance"
  },
  {
    "path": "docs/blog/posts/situate-context.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - Anthropic\n  - LLM Techniques\n  - Python\ncomments: true\ndate: 2024-09-26\ndescription:\n  Learn to implement Anthropic's Contextual Retrieval with async processing\n  to enhance RAG systems and preserve crucial context efficiently.\ndraft: false\ntags:\n  - Contextual Retrieval\n  - Async Processing\n  - RAG Systems\n  - Performance Optimization\n  - Document Chunking\n---\n\n# Implementing Anthropic's Contextual Retrieval with Async Processing\n\nAnthropic's [Contextual Retrieval](https://www.anthropic.com/blog/contextual-retrieval-for-rag) technique enhances RAG systems by preserving crucial context.\n\nThis post examines the method and demonstrates an efficient implementation using async processing. We'll explore how to optimize your RAG applications with this approach, building on concepts from our [async processing guide](./learn-async.md).\n\n<!-- more -->\n\n## Background: The Context Problem in RAG\n\nAnthropic identifies a key issue in traditional RAG systems: loss of context when documents are split into chunks. They provide an example:\n\n\"Imagine you had a collection of financial information (say, U.S. SEC filings) embedded in your knowledge base, and you received the following question: 'What was the revenue growth for ACME Corp in Q2 2023?'\n\nA relevant chunk might contain the text: 'The company's revenue grew by 3% over the previous quarter.' However, this chunk on its own doesn't specify which company it's referring to or the relevant time period.\"\n\n## Anthropic's Solution: Contextual Retrieval\n\nContextual Retrieval solves this by adding chunk-specific explanatory context before embedding. Anthropic's example:\n\n```\noriginal_chunk = \"The company's revenue grew by 3% over the previous quarter.\"\n\ncontextualized_chunk = \"This chunk is from an SEC filing on ACME corp's performance in Q2 2023; the previous quarter's revenue was $314 million. The company's revenue grew by 3% over the previous quarter.\"\n```\n\n## Implementing Contextual Retrieval\n\nAnthropic uses Claude to generate context. They provide this prompt:\n\n```\n<document>\n{{WHOLE_DOCUMENT}}\n</document>\nHere is the chunk we want to situate within the whole document\n<chunk>\n{{CHUNK_CONTENT}}\n</chunk>\nPlease give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk. Answer only with the succinct context and nothing else.\n```\n\n## Performance Improvements\n\nAnthropic reports significant improvements:\n\n- Contextual Embeddings reduced top-20-chunk retrieval failure rate by 35% (5.7% → 3.7%).\n- Combining Contextual Embeddings and Contextual BM25 reduced failure rate by 49% (5.7% → 2.9%).\n- Adding reranking further reduced failure rate by 67% (5.7% → 1.9%).\n\n## Instructor implementation of Contextual Retrieval with Async Processing\n\nWe can implement Anthropic's technique using async processing for improved efficiency:\n\n```python\nfrom instructor import AsyncInstructor, Mode, patch\nfrom anthropic import AsyncAnthropic\nfrom pydantic import BaseModel, Field\nimport asyncio\nfrom typing import List, Dict\n\n\nclass SituatedContext(BaseModel):\n    title: str = Field(..., description=\"The title of the document.\")\n    context: str = Field(\n        ..., description=\"The context to situate the chunk within the document.\"\n    )\n\n\nclient = AsyncInstructor(\n    create=patch(\n        create=AsyncAnthropic().beta.prompt_caching.messages.create,\n        mode=Mode.TOOLS,\n    ),\n    mode=Mode.TOOLS,\n)\n\n\nasync def situate_context(doc: str, chunk: str) -> str:\n    response = await client.create(\n        model=\"claude-3-haiku-20240307\",\n        max_tokens=1024,\n        temperature=0.0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"<document>{{doc}}</document>\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Here is the chunk we want to situate within the whole document\\n<chunk>{{chunk}}</chunk>\\nPlease give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk.\\nAnswer only with the succinct context and nothing else.\",\n                    },\n                ],\n            }\n        ],\n        response_model=SituatedContext,\n        context={\"doc\": doc, \"chunk\": chunk},\n    )\n    return response.context\n\n\ndef chunking_function(doc: str) -> List[str]:\n    chunk_size = 1000\n    overlap = 200\n    chunks = []\n    start = 0\n    while start < len(doc):\n        end = start + chunk_size\n        chunks.append(doc[start:end])\n        start += chunk_size - overlap\n    return chunks\n\n\nasync def process_chunk(doc: str, chunk: str) -> Dict[str, str]:\n    context = await situate_context(doc, chunk)\n    return {\"chunk\": chunk, \"context\": context}\n\n\nasync def process(doc: str) -> List[Dict[str, str]]:\n    chunks = chunking_function(doc)\n    tasks = [process_chunk(doc, chunk) for chunk in chunks]\n    results = await asyncio.gather(*tasks)\n    return results\n\n\n# Example usage\nasync def main():\n    document = \"Your full document text here...\"\n    processed_chunks = await process(document)\n    for i, item in enumerate(processed_chunks):\n        print(f\"Chunk {i + 1}:\")\n        print(f\"Text: {item['chunk'][:50]}...\")\n        print(f\"Context: {item['context']}\")\n        print()\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n```\n\n## Key Features of This Implementation\n\n1. Async Processing: Uses `asyncio` for concurrent chunk processing.\n2. Structured Output: Uses Pydantic models for type-safe responses.\n3. Prompt Caching: Utilizes Anthropic's prompt caching for efficiency.\n4. Chunking: Implements a basic chunking strategy with overlap.\n5. Jinja2 templating: Uses Jinja2 templating to inject variables into the prompt.\n\n## Considerations from Anthropic's Article\n\nAnthropic mentions several implementation considerations:\n\n1. Chunk boundaries: Experiment with chunk size, boundary, and overlap.\n2. Embedding model: They found Gemini and Voyage embeddings effective.\n3. Custom contextualizer prompts: Consider domain-specific prompts.\n4. Number of chunks: They found using 20 chunks most effective.\n5. Evaluation: Always run evaluations on your specific use case.\n\n## Further Enhancements\n\nBased on Anthropic's suggestions:\n\n1. Implement dynamic chunk sizing based on content complexity.\n2. Integrate with vector databases for efficient storage and retrieval.\n3. Add error handling and retry mechanisms.\n4. Experiment with different embedding models and prompts.\n5. Implement a reranking step for further performance improvements.\n\nThis implementation provides a starting point for leveraging Anthropic's Contextual Retrieval technique with the added efficiency of async processing.\n"
  },
  {
    "path": "docs/blog/posts/string-based-init.md",
    "content": "---\ndraft: false\ndate: 2024-04-20\nauthors:\n  - jxnl\ncategories:\n  - Tutorial\n---\n\n# Unified Provider Interface with String-Based Initialization\n\nInstructor now offers a simplified way to initialize any supported LLM provider with a single consistent interface. This approach makes it easier than ever to switch between different LLM providers while maintaining the same structured output functionality you rely on.\n\n## The Problem\n\nAs the number of LLM providers grows, so does the complexity of initializing and working with different client libraries. Each provider has its own initialization patterns, API structures, and quirks. This leads to code that isn't portable between providers and requires significant refactoring when you want to try a new model.\n\n## The Solution: String-Based Initialization\n\nWe've introduced a new unified interface that allows you to initialize any supported provider with a simple string format:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n\n# Initialize any provider with a single consistent interface\nclient = instructor.from_provider(\"openai/gpt-4\")\nclient = instructor.from_provider(\"anthropic/claude-3-sonnet\")\nclient = instructor.from_provider(\"google/gemini-pro\")\nclient = instructor.from_provider(\"mistral/mistral-large\")\n```\n\nThe `from_provider` function takes a string in the format `\"provider/model-name\"` and handles all the details of setting up the appropriate client with the right model. This provides several key benefits:\n\n- **Simplified Initialization**: No need to manually create provider-specific clients\n- **Consistent Interface**: Same syntax works across all providers\n- **Reduced Dependency Exposure**: You don't need to import specific provider libraries in your application code\n- **Easy Experimentation**: Switch between providers with a single line change\n\n## Supported Providers\n\nThe string-based initialization currently supports all major providers in the ecosystem:\n\n- OpenAI: `\"openai/gpt-4\"`, `\"openai/gpt-4o\"`, `\"openai/gpt-5-nano\"`\n- Anthropic: `\"anthropic/claude-3-opus-20240229\"`, `\"anthropic/claude-3-sonnet-20240229\"`, `\"anthropic/claude-3-5-haiku-latest\"`\n- Google Gemini: `\"google/gemini-pro\"`, `\"google/gemini-pro-vision\"`\n- Mistral: `\"mistral/mistral-small-latest\"`, `\"mistral/mistral-medium-latest\"`, `\"mistral/mistral-large-latest\"`\n- Cohere: `\"cohere/command\"`, `\"cohere/command-r\"`, `\"cohere/command-light\"`\n- Perplexity: `\"perplexity/sonar-small-online\"`, `\"perplexity/sonar-medium-online\"`\n- Groq: `\"groq/llama2-70b-4096\"`, `\"groq/mixtral-8x7b-32768\"`, `\"groq/gemma-7b-it\"`\n- Writer: `\"writer/palmyra-instruct\"`, `\"writer/palmyra-instruct-v2\"`\n- AWS Bedrock: `\"bedrock/anthropic.claude-v2\"`, `\"bedrock/amazon.titan-text-express-v1\"`\n- Cerebras: `\"cerebras/cerebras-gpt\"`, `\"cerebras/cerebras-gpt-2.7b\"`\n- Fireworks: `\"fireworks/llama-v2-70b\"`, `\"fireworks/firellama-13b\"`\n- Vertex AI: `\"vertexai/gemini-pro\"`, `\"vertexai/text-bison\"`\n- Google GenAI: `\"genai/gemini-pro\"`, `\"genai/gemini-pro-vision\"`\n\nEach provider will be initialized with sensible defaults, but you can also pass additional keyword arguments to customize the configuration. For model-specific details, consult each provider's documentation.\n\n## Async Support\n\nThe unified interface fully supports both synchronous and asynchronous clients:\n\n```python\n# Synchronous client (default)\nclient = instructor.from_provider(\"openai/gpt-4\")\n\n# Asynchronous client\nasync_client = instructor.from_provider(\"anthropic/claude-3-sonnet\", async_client=True)\n\n# Use like any other async client\nresponse = await async_client.create(\n    response_model=UserInfo,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract information about John who is 30 years old\",\n        }\n    ],\n)\n```\n\n## Mode Selection\n\nYou can also specify which structured output mode to use with the provider:\n\n```python\nimport instructor\nfrom instructor import Mode\n\n# Override the default mode for a provider\nclient = instructor.from_provider(\n    \"anthropic/claude-3-sonnet\", mode=Mode.TOOLS\n)\n\n# Use JSON mode instead of the default tools mode\nclient = instructor.from_provider(\n    \"mistral/mistral-large\", mode=Mode.JSON_SCHEMA\n)\n\n# Use reasoning tools instead of regular tools for Anthropic\nclient = instructor.from_provider(\n    \"anthropic/claude-3-opus\", mode=Mode.TOOLS\n)\n```\n\nIf not specified, each provider will use its recommended default mode:\n\n- OpenAI: `Mode.OPENAI_FUNCTIONS`\n- Anthropic: `Mode.TOOLS`\n- Google Gemini: `Mode.MD_JSON`\n- Mistral: `Mode.TOOLS`\n- Cohere: `Mode.TOOLS`\n- Perplexity: `Mode.JSON`\n- Groq: `Mode.GROQ_TOOLS`\n- Writer: `Mode.MD_JSON`\n- Bedrock: `Mode.TOOLS` (for Claude on Bedrock)\n- Vertex AI: `Mode.TOOLS`\n\nYou can always customize this based on your specific needs and model capabilities.\n\n## Error Handling\n\nThe `from_provider` function includes robust error handling to help you quickly identify and fix issues:\n\n```python\n# Missing dependency\ntry:\n    client = instructor.from_provider(\"anthropic/claude-3-sonnet\")\nexcept ImportError as e:\n    print(\"Error: Install the anthropic package first\")\n    # pip install anthropic\n\n# Invalid provider format\ntry:\n    client = instructor.from_provider(\"invalid-format\")\nexcept ValueError as e:\n    print(e)  # Model string must be in format \"provider/model-name\"\n\n# Unsupported provider\ntry:\n    client = instructor.from_provider(\"unknown/model\")\nexcept ValueError as e:\n    print(e)  # Unsupported provider: unknown. Supported providers are: ...\n```\n\nThe function validates the provider string format, checks if the provider is supported, and ensures the necessary packages are installed.\n\n## Environment Variables\n\nLike the native client libraries, `from_provider` respects environment variables set for each provider:\n\n```python\n# Set environment variables\nimport os\n\nos.environ[\"OPENAI_API_KEY\"] = \"your-openai-key\"\nos.environ[\"ANTHROPIC_API_KEY\"] = \"your-anthropic-key\"\nos.environ[\"MISTRAL_API_KEY\"] = \"your-mistral-key\"\n\n# No need to pass API keys directly\nclient = instructor.from_provider(\"openai/gpt-4\")\n```\n\n## Troubleshooting\n\nHere are some common issues and solutions when using the unified provider interface:\n\n### Model Not Found Errors\n\nIf you receive a 404 error, check that you're using the correct model name format:\n\n```\nError code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-3-haiku'}}\n```\n\nFor Anthropic models, always include the version date:\n- ✅ Correct: `anthropic/claude-3-haiku-20240307`\n- ❌ Incorrect: `anthropic/claude-3-haiku`\n\n### Provider-Specific Parameters\n\nSome providers require specific parameters for API calls:\n\n```python\n# Anthropic requires max_tokens\nanthropic_client = instructor.from_provider(\n    \"anthropic/claude-3-5-haiku-latest\", max_tokens=400  # Required for Anthropic\n)\n\n# Use models with vision capabilities for multimodal content\ngemini_client = instructor.from_provider(\n    \"google/gemini-pro-vision\"  # Required for image processing\n)\n```\n\n### Working Example\n\nHere's a complete example that demonstrates the automodel functionality with multiple providers:\n\n```python\nimport os\nimport asyncio\nimport instructor\nfrom pydantic import BaseModel, Field\n\n\nclass UserInfo(BaseModel):\n    \"\"\"User information extraction model.\"\"\"\n\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n    occupation: str = Field(description=\"The user's job or profession\")\n\n\nasync def main():\n    # Test OpenAI\n    openai_client = instructor.from_provider(\"openai/gpt-5-nano\")\n    openai_result = openai_client.create(\n        response_model=UserInfo,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Jane Doe is a 28-year-old data scientist.\"}\n        ],\n    )\n    print(f\"OpenAI result: {openai_result.model_dump()}\")\n\n    # Test Anthropic with async client\n    if os.environ.get(\"ANTHROPIC_API_KEY\"):\n        anthropic_client = instructor.from_provider(\n            model=\"anthropic/claude-3-5-haiku-latest\",\n            async_client=True,\n            max_tokens=400,  # Required for Anthropic\n        )\n        anthropic_result = await anthropic_client.create(\n            response_model=UserInfo,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"John Smith is a 35-year-old software engineer.\",\n                }\n            ],\n        )\n        print(f\"Anthropic result: {anthropic_result.model_dump()}\")\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n```\n\n## Conclusion\n\nString-based initialization is a significant step toward making Instructor even more user-friendly and flexible. It reduces the learning curve for working with multiple providers and makes it easier than ever to experiment with different models.\n\nBenefits include:\n- Simplified initialization with a consistent interface\n- Automatic selection of appropriate default modes\n- Support for both synchronous and asynchronous clients\n- Clear error messages to quickly identify issues\n- Respect for provider-specific environment variables\n- Comprehensive model selection across the entire LLM ecosystem\n\nWhether you're building a new application or migrating an existing one, the unified provider interface offers a cleaner, more maintainable way to work with structured outputs across the LLM ecosystem.\n\nTry it today with `instructor.from_provider()` and check out the [complete example code](https://github.com/instructor-ai/instructor/tree/main/examples/automodel) in our repository!"
  },
  {
    "path": "docs/blog/posts/structured-output-anthropic.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - Anthropic\ncomments: true\ndate: 2024-10-23\ndescription: Learn how to leverage Anthropic's Claude with Instructor for structured outputs and prompt caching, enhancing AI application development.\ndraft: false\ntags:\n  - Anthropic\n  - API Development\n  - Pydantic\n  - Python\n  - LLM Techniques\n  - Prompt Caching\n---\n\n# Structured Outputs and Prompt Caching with Anthropic\n\nAnthropic's ecosystem now offers two powerful features for AI developers: structured outputs and prompt caching. These advancements enable more efficient use of large language models (LLMs). This guide demonstrates how to leverage these features with the Instructor library to enhance your AI applications.\n\n## Structured Outputs with Anthropic and Instructor\n\nInstructor now offers seamless integration with Anthropic's powerful language models, allowing developers to easily create structured outputs using Pydantic models. This integration simplifies the process of extracting specific information from AI-generated responses.\n\n<!-- more -->\n\nTo get started, you'll need to install Instructor with Anthropic support:\n\n```bash\npip install instructor[anthropic]\n```\n\nHere's a basic example of how to use Instructor with Anthropic:\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\nimport anthropic\nimport instructor\n\n# Patch the Anthropic client with Instructor\nanthropic_client = instructor.from_anthropic(create=anthropic.Anthropic())\n\n\n# Define your Pydantic models\nclass Properties(BaseModel):\n    name: str\n    value: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    properties: List[Properties]\n\n\n# Use the patched client to generate structured output\nuser_response = anthropic_client(\n    model=\"claude-3-7-sonnet-latest\",\n    max_tokens=1024,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Create a user for a model with a name, age, and properties.\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(user_response.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"John Doe\",\n  \"age\": 30,\n  \"properties\": [\n    { \"name\": \"favorite_color\", \"value\": \"blue\" }\n  ]\n}\n\"\"\"\n```\n\nThis approach allows you to easily extract structured data from Claude's responses, making it simpler to integrate AI-generated content into your applications.\n\n## Prompt Caching: Boosting Performance and Reducing Costs\n\nAnthropic has introduced a new prompt caching feature that can significantly improve response times and reduce costs for applications dealing with large context windows. This feature is particularly useful when making multiple calls with similar large contexts over time.\n\nHere's how you can implement prompt caching with Instructor and Anthropic:\n\n```python\nfrom pydantic import BaseModel\n\n# Set up the client with prompt caching\nclient = instructor.from_provider(\"anthropic/claude-3-5-haiku-latest\")\n\n\n# Define your Pydantic model\nclass Character(BaseModel):\n    name: str\n    description: str\n\n\n# Load your large context\nwith open(\"./book.txt\") as f:\n    book = f.read()\n\n# Make multiple calls using the cached context\nfor _ in range(2):\n    resp, completion = client.create_with_completion(\n        model=\"claude-3-7-sonnet-latest\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"<book>\" + book + \"</book>\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract a character from the text given above\",\n                    },\n                ],\n            },\n        ],\n        response_model=Character,\n        max_tokens=1000,\n    )\n```\n\nIn this example, the large context (the book content) is cached after the first request and reused in subsequent requests. This can lead to significant time and cost savings, especially when working with extensive context windows.\n\n## Conclusion\n\nBy combining Anthropic's Claude with Instructor's structured output capabilities and leveraging prompt caching, developers can create more efficient, cost-effective, and powerful AI applications. These features open up new possibilities for building sophisticated AI systems that can handle complex tasks with ease.\n\nAs the AI landscape continues to evolve, staying up-to-date with the latest tools and techniques is crucial. We encourage you to explore these features and share your experiences with the community. Happy coding!\n\n## Related Documentation\n- [How Patching Works](../../concepts/patching.md) - Understand provider integration\n- [Anthropic Integration](../../integrations/anthropic.md) - Complete setup guide\n\n## See Also\n- [Anthropic Prompt Caching](anthropic-prompt-caching.md) - Optimize Anthropic costs\n- [Unified Provider Interface](announcing-unified-provider-interface.md) - Switch providers easily\n- [Framework Comparison](best_framework.md) - Why Instructor excels\n"
  },
  {
    "path": "docs/blog/posts/tidy-data-from-messy-tables.md",
    "content": "---\ntitle: Using Structured Outputs to convert messy tables into tidy data\ndescription: With instructor, converting messy tables into tidy data is easy and fast\ncategories:\n  - Data Analysis\n  - Structured Outputs\ndate: 2024-11-21\ndraft: false\n---\n\n# Using Structured Outputs to convert messy tables into tidy data\n\n## Why is this a problem?\n\nMessy data exports are a common problem. Whether it's multiple headers in the table, implicit relationships that make analysis a pain or even just merged cells, using `instructor` with structured outputs makes it easy to convert messy tables into tidy data, even if all you have is just an image of the table as we'll see below.\n\nLet's look at the following table as an example. It makes analysis unnecessarily difficult because it hides data relationships through empty cells and implicit repetition. If we were using it for data analysis, cleaning it manually would be a huge nightmare.\n\n<!-- more -->\n\n![](./img/untidy_table.png)\n\nFor example, the subject ID (321) and GTT date only appear in the first row, with blank cells below implying these values apply to the following rows. This format breaks most pandas operations - you can't simply group by subject ID or merge with other datasets without complex preprocessing to fill in these missing values.\n\nInstead, we have time series measurements spread across multiple rows, mixed data types in the insulin column (numbers and \"lo off curve\"), and repeated subject information hidden through empty cells. This means even simple operations like calculating mean glucose levels by time point or plotting glucose curves require data reshaping and careful handling of missing/special values.\n\n## Using Structured Outputs\n\n### Defining a custom type\n\nUsing tools like instructor to automatically convert untidy data into tidy format can save hours of preprocessing and reduce errors in your analysis pipeline.\n\nLet's start by first defining a custom type that can parse the markdown table into a pandas dataframe.\n\n```python\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import BeforeValidator, PlainSerializer, InstanceOf, WithJsonSchema\nimport pandas as pd\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda df: df.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be separate\",\n        }\n    ),\n]\n```\n\n### Extracting the table\n\nThen with this new custom data type, it becomes easy to just pass the image to the LLM and get a tidy dataframe in response.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame  # Custom type for handling tables\n\n\nclass TidyTables(BaseModel):\n    tables: list[Table]\n\n\n# Patch the OpenAI client with instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef extract_table(image_path: str) -> TidyTables:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"Convert this untidy table to tidy format\",\n                    instructor.Image.from_path(image_path),\n                ],\n            }\n        ],\n        response_model=TidyTables,\n    )\n\n\nextracted_tables = extract_table(\"./untidy_table.png\")\n```\n\nThis then returns the following output for us as a single pandas dataframe which we can easily plot and do any sort of data analysis on.\n\n| ID  | GTT date | GTT weight | time | glucose mg/dl | insulin ng/ml | Comment      |\n| --- | -------- | ---------- | ---- | ------------- | ------------- | ------------ |\n| 321 | 2/9/15   | 24.5       | 0    | 99.2          |               | lo off curve |\n| 321 | 2/9/15   | 24.5       | 5    | 349.3         | 0.205         |              |\n| 321 | 2/9/15   | 24.5       | 15   | 286.1         | 0.129         |              |\n| 321 | 2/9/15   | 24.5       | 30   | 312           | 0.175         |              |\n| 321 | 2/9/15   | 24.5       | 60   | 99.9          | 0.122         |              |\n| 321 | 2/9/15   | 24.5       | 120  | 217.9         |               | lo off curve |\n| 322 | 2/9/15   | 18.9       | 0    | 185.8         | 0.251         |              |\n| 322 | 2/9/15   | 18.9       | 5    | 297.4         | 2.228         |              |\n| 322 | 2/9/15   | 18.9       | 15   | 439           | 2.078         |              |\n| 322 | 2/9/15   | 18.9       | 30   | 362.3         | 0.775         |              |\n| 322 | 2/9/15   | 18.9       | 60   | 232.7         | 0.5           |              |\n| 322 | 2/9/15   | 18.9       | 120  | 260.7         | 0.523         |              |\n| 323 | 2/9/15   | 24.7       | 0    | 198.5         | 0.151         |              |\n| 323 | 2/9/15   | 24.7       | 5    | 530.6         |               | off curve lo |\n\nMore importantly, we can also extract multiple tables from a single image. This would be useful in helping to segment and identify different sections of a messy report. With tidy data, we get the benefits of\n\n1. Each variable being its own column\n2. Each observation being its own row\n3. Each value having its own cell\n4. Seamlessly working with pandas/numpy operations\n5. Visualization libraries \"just working\"\n\n## Conclusion\n\nWe can actually go one step further and make this even tidier by converting things like weight, glucose and insulin into a specific column called metric which would allow us to add arbitrary metrics to the table without having to change the schema or our plotting code. This is a huge productivity boost when doing complex data analysis.\n\nNo more wrestling with complex data cleaning pipelines. Let the model handle the heavy lifting while you focus on analysis. With instructor, getting to that step just became a whole lot easier.\n\nGive `instructor` a try today and see how you can build reliable applications. Just run `pip install instructor` or check out our [Getting Started Guide](../../index.md)\n"
  },
  {
    "path": "docs/blog/posts/timestamp.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Pydantic\ncomments: true\ndate: 2024-09-26\ndescription: Learn how to ensure consistent timestamp formats in video content using\n  Pydantic for effective parsing and validation.\ndraft: false\nslug: consistent-timestamp-formats\ntags:\n- timestamp\n- Pydantic\n- data validation\n- video processing\n- NLP\n---\n\n# Ensuring Consistent Timestamp Formats with Language Models\n\nGemini can Understand timestamps in language model outputs, but they can be inconsistent. Video content timestamps vary between HH:MM:SS and MM:SS formats, causing parsing errors and calculations. This post presents a technique to handle timestamps for clips and films without formatting issues.\n\nWe combine Pydantic's data validation with custom parsing for consistent timestamp handling. You'll learn to process timestamps in any format, reducing errors in video content workflows. Kinda like how we ensured [matching language in multilingal summarization](./matching-language.md) by adding a simple field.\n\nThe post provides a solution using Pydantic to improve timestamp handling in language model projects. This method addresses format inconsistencies and enables timestamp processing.\n\n<!-- more -->\n\n## The Problem\n\nConsider a scenario where we're using a language model to generate timestamps for video segments. For shorter videos, timestamps might be in MM:SS format, while longer videos require HH:MM:SS. This inconsistency can lead to parsing errors and incorrect time calculations.\n\nHere's a simple example of how this problem might manifest:\n\n```python\nclass Segment(BaseModel):\n    title: str = Field(..., description=\"The title of the segment\")\n    timestamp: str = Field(..., description=\"The timestamp of the event as HH:MM:SS\")\n\n\n# This might work for some cases, but fails for others:\n# \"2:00\" could be interpreted as 2 minutes or 2 hours\n# \"1:30:00\" doesn't fit the expected format\n```\n\nThis approach doesn't account for the variability in timestamp formats and can lead to misinterpretations.\n\n## The Solution\n\nTo address this issue, we can use a combination of Pydantic for data validation and a custom parser to handle different timestamp formats. Here's how we can implement this:\n\n1. Define the expected time formats\n2. Use a custom validator to parse and normalize the timestamps\n3. Ensure the output is always in a consistent format\n\nLet's look at the improved implementation:\n\n```python\nfrom pydantic import BaseModel, Field, model_validator\nfrom typing import Literal\n\n\nclass SegmentWithTimestamp(BaseModel):\n    title: str = Field(..., description=\"The title of the segment\")\n    time_format: Literal[\"HH:MM:SS\", \"MM:SS\"] = Field(\n        ..., description=\"The format of the timestamp\"\n    )\n    timestamp: str = Field(\n        ..., description=\"The timestamp of the event as either HH:MM:SS or MM:SS\"\n    )\n\n    @model_validator(mode=\"after\")\n    def parse_timestamp(self):\n        if self.time_format == \"HH:MM:SS\":\n            hours, minutes, seconds = map(int, self.timestamp.split(\":\"))\n        elif self.time_format == \"MM:SS\":\n            hours, minutes, seconds = 0, *map(int, self.timestamp.split(\":\"))\n        else:\n            raise ValueError(\"Invalid time format, must be HH:MM:SS or MM:SS\")\n\n        # Normalize seconds and minutes\n        total_seconds = hours * 3600 + minutes * 60 + seconds\n        hours, remainder = divmod(total_seconds, 3600)\n        minutes, seconds = divmod(remainder, 60)\n\n        if hours > 0:\n            self.timestamp = f\"{hours:02d}:{minutes:02d}:{seconds:02d}\"\n        else:\n            self.timestamp = f\"00:{minutes:02d}:{seconds:02d}\"\n\n        return self\n```\n\nThis implementation offers several advantages:\n\n1. It explicitly defines the expected time format, reducing ambiguity.\n2. The custom validator parses the input based on the specified format.\n3. It normalizes all timestamps to a consistent HH:MM:SS format.\n4. It handles edge cases, such as when minutes or seconds exceed 59.\n\n## Why This Works Better Than Alternatives\n\nYou might wonder why we can't solve this problem with constrained sampling methods or JSON schema alone. The reason is that timestamp parsing often requires context-aware processing that goes beyond simple pattern matching.\n\n1. **Constrained sampling** might enforce a specific format, but it doesn't handle the conversion between different formats or normalization of times.\n\n2. **JSON schema** can validate the structure of the data, but it can't perform the complex parsing and normalization required for timestamps.\n\nOur approach combines the strengths of schema validation (using Pydantic) with custom logic to handle the intricacies of timestamp formatting.\n\n## Testing the Solution\n\nTo ensure our implementation works as expected, we can create some test cases:\n\n```python\nif __name__ == \"__main__\":\n    # Test cases for SegmentWithTimestamp\n    test_cases = [\n        (\n            SegmentWithTimestamp(\n                title=\"Introduction\", time_format=\"MM:SS\", timestamp=\"00:30\"\n            ),\n            \"00:00:30\",\n        ),\n        (\n            SegmentWithTimestamp(\n                title=\"Main Topic\", time_format=\"HH:MM:SS\", timestamp=\"00:15:45\"\n            ),\n            \"00:15:45\",\n        ),\n        (\n            SegmentWithTimestamp(\n                title=\"Conclusion\", time_format=\"MM:SS\", timestamp=\"65:00\"\n            ),\n            \"01:05:00\",\n        ),\n    ]\n\n    for input_data, expected_output in test_cases:\n        try:\n            assert input_data.timestamp == expected_output\n            print(f\"Test passed: {input_data.timestamp} == {expected_output}\")\n        except AssertionError:\n            print(f\"Test failed: {input_data.timestamp} != {expected_output}\")\n\n    # Output:\n    # Test passed: 00:00:30 == 00:00:30\n    # Test passed: 00:15:45 == 00:15:45\n    # Test passed: 01:05:00 == 01:05:00\n```\n\nThese test cases demonstrate that our solution correctly handles different input formats and normalizes them to a consistent output format.\n\n## Conclusion\n\nParsing and validation are needed when handling language model outputs. Its not about coercing language models, but building valid inputs into downstream systems. Combining Pydantic's validation with logic ensures handling across formats. This approach solves timestamp inconsistency and provides a framework for challenges in NLP tasks.\n\nWhen dealing with time-based data in language models, account for format variability and implement validation and normalization to maintain consistency."
  },
  {
    "path": "docs/blog/posts/using_json.md",
    "content": "---\nauthors:\n  - jxnl\ncategories:\n  - LLM Techniques\ncomments: true\ndate: 2024-06-15\ndescription:\n  Learn how to easily get structured JSON data from LLMs using the Instructor\n  library with Pydantic models in Python.\ndraft: false\nslug: zero-cost-abstractions\ntags:\n  - Instructor\n  - JSON\n  - LLM\n  - Pydantic\n  - Python\n---\n\n# Why Instructor is the best way to get JSON from LLMs\n\nLarge Language Models (LLMs) like GPT are incredibly powerful, but getting them to return well-formatted JSON can be challenging. This is where the Instructor library shines. Instructor allows you to easily map LLM outputs to JSON data using Python type annotations and Pydantic models.\n\nInstructor makes it easy to get structured data like JSON from LLMs like GPT-3.5, GPT-4, GPT-4-Vision, and open-source models including [Mistral/Mixtral](../../integrations/together.md), [Ollama](../../integrations/ollama.md), and [llama-cpp-python](../../integrations/llama-cpp-python.md).\n\nIt stands out for its simplicity, transparency, and user-centric design, built on top of Pydantic. Instructor helps you manage [validation context](../../concepts/reask_validation.md), retries with [Tenacity](../../concepts/retrying.md), and streaming [Lists](../../concepts/lists.md) and [Partial](../../concepts/partial.md) responses.\n\n- Instructor provides support for a wide range of programming languages, including:\n  - [Python](https://python.useinstructor.com)\n  - [TypeScript](https://js.useinstructor.com)\n  - [Ruby](https://ruby.useinstructor.com)\n  - [Go](https://go.useinstructor.com)\n  - [Elixir](https://hex.pm/packages/instructor)\n\n<!-- more -->\n\n## The Simple Patch for JSON LLM Outputs\n\nInstructor works as a lightweight patch over the OpenAI Python SDK. To use it, you simply apply the patch to your OpenAI client:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n```\n\nThen, you can pass a `response_model` parameter to the `completions.create` or `chat.completions.create` methods. This parameter takes in a Pydantic model class that defines the JSON structure you want the LLM output mapped to. Just like `response_model` when using FastAPI.\n\nHere's an example of a `response_model` for a simple user profile:\n\n```python\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nuser = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the user's name, age, and email from this: John Doe is 25 years old. His email is john@example.com\",\n        }\n    ],\n)\n\nprint(user.model_dump())\n#> {\n#     \"name\": \"John Doe\",\n#     \"age\": 25,\n#     \"email\": \"john@example.com\"\n#   }\n```\n\nInstructor extracts the JSON data from the LLM output and returns an instance of your specified Pydantic model. You can then use the `model_dump()` method to serialize the model instance to a JSON string.\n\nSome key benefits of Instructor:\n\n- Zero new syntax to learn - it builds on standard Python type hints\n- Seamless integration with existing OpenAI SDK code\n- Incremental, zero-overhead adoption path\n- Direct access to the `messages` parameter for flexible prompt engineering\n- Broad compatibility with any OpenAI SDK-compatible platform or provider\n\n## Pydantic: More Powerful than Plain Dictionaries\n\nYou might be wondering, why use Pydantic models instead of just returning a dictionary of key-value pairs? While a dictionary could hold JSON data, Pydantic models provide several powerful advantages:\n\n1. Type validation: Pydantic models enforce the types of the fields. If the LLM returns an incorrect type (e.g. a string for an int field), it will raise a validation error.\n\n2. Field requirements: You can mark fields as required or optional. Pydantic will raise an error if a required field is missing.\n\n3. Default values: You can specify default values for fields that aren't always present.\n\n4. Advanced types: Pydantic supports more advanced field types like dates, UUIDs, URLs, lists, nested models, and more.\n\n5. Serialization: Pydantic models can be easily serialized to JSON, which is helpful for saving results or passing them to other systems.\n\n6. IDE support: Because Pydantic models are defined as classes, IDEs can provide autocompletion, type checking, and other helpful features when working with the JSON data.\n\nSo while dictionaries can work for very simple JSON structures, Pydantic models are far more powerful for working with complex, validated JSON in a maintainable way.\n\n## JSON from LLMs Made Easy\n\nInstructor and Pydantic together provide a fantastic way to extract and work with JSON data from LLMs. The lightweight patching of Instructor combined with the powerful validation and typing of Pydantic models makes it easy to integrate JSON outputs into your LLM-powered applications. Give Instructor a try and see how much easier it makes getting JSON from LLMs!\n"
  },
  {
    "path": "docs/blog/posts/validation-part1.md",
    "content": "---\nauthors:\n- jxnl\n- ivanleomk\ncategories:\n- Pydantic\n- Data Validation\n- Python\ncomments: true\ndate: 2023-10-23\ndescription: Explore dynamic, machine learning-driven validation using Python's Pydantic\n  and Instructor to enhance software reliability.\ndraft: false\ntags:\n- LLM Validation\n- Pydantic\n- Python\n- Machine Learning\n- Software Development\n---\n\n# Good LLM Validation is Just Good Validation\n\n> What if your validation logic could learn and adapt like a human, but operate at the speed of software? This is the future of validation and it's already here.\n\nValidation is the backbone of reliable software. But traditional methods are static, rule-based, and can't adapt to new challenges. This post looks at how to bring dynamic, machine learning-driven validation into your software stack using Python libraries like `Pydantic` and `Instructor`. We validate these outputs using a validation function which conforms to the structure seen below.\n\n```python\ndef validation_function(value):\n    if condition(value):\n        raise ValueError(\"Value is not valid\")\n    return mutation(value)\n```\n\n<!-- more -->\n\n## What is Instructor?\n\n`Instructor` helps to ensure you get the exact response type you're looking for when using openai's function call api. Once you've defined the `Pydantic` model for your desired response, `Instructor` handles all the complicated logic in-between - from the parsing/validation of the response to the automatic retries for invalid responses. This means that we can build in validators 'for free' and have a clear separation of concerns between the prompt and the code that calls openai.\n\n```python\nimport instructor  # pip install instructor\nfrom pydantic import BaseModel\n\n# This enables response_model keyword\n# from client.chat.completions.create\nclient = instructor.from_provider(\"openai/gpt-5-nano\")  # (1)!\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserDetail = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"},\n    ],\n    max_retries=3,  # (2)!\n)\n\nassert user.name == \"Jason\"  # (3)!\nassert user.age == 25\n```\n\n1.  To simplify your work with OpenAI models and streamline the extraction of Pydantic objects from prompts, we\n    offer a patching mechanism for the `ChatCompletion` class.\n\n2.  Invalid responses that fail to be validated successfully will trigger up to as many reattempts as you define.\n\n3.  As long as you pass in a `response_model` parameter to the `ChatCompletion` api call, the returned object will always\n    be a validated `Pydantic` object.\n\nIn this post, we'll explore how to evolve from static, rule-based validation methods to dynamic, machine learning-driven ones. You'll learn to use `Pydantic` and `Instructor` to leverage language models and dive into advanced topics like content moderation, validating chain of thought reasoning, and contextual validation.\n\nLet's examine how these approaches with an example. Imagine that you run a software company that wants to ensure you never serve hateful and racist content. This isn't an easy job since the language around these topics change very quickly and frequently.\n\n## Software 1.0: Introduction to Validations in Pydantic\n\nA simple method could be to compile a list of different words that are often associated with hate speech. For simplicity, let's assume that we've found that the words `Steal` and `Rob` are good predictors of hateful speech from our database. We can modify our validation structure above to accommodate this.\n\nThis will throw an error if we pass in a string like `Let's rob the bank!` or `We should steal from the supermarkets`.\n\nPydantic offers two approaches for this validation: using the `field_validator` decorator or the `Annotated` hints.\n\n### Using `field_validator` decorator\n\nWe can use the `field_validator` decorator to define a validator for a field in Pydantic. Here's a quick example of how we might be able to do so.\n\n```python\nfrom pydantic import BaseModel, ValidationError, field_validator\n\n\nclass UserMessage(BaseModel):\n    message: str\n\n    @field_validator('message')\n    def message_cannot_have_blacklisted_words(cls, v: str) -> str:\n        for word in v.split():  # (1)!\n            if word.lower() in {'rob', 'steal'}:\n                raise ValueError(f\"`{word}` was found in the message `{v}`\")\n        return v\n\n\ntry:\n    UserMessage(message=\"This is a lovely day\")\n    UserMessage(message=\"We should go and rob a bank\")\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserMessage\n    message\n      Value error, `rob` was found in the message `We should go and rob a bank` [type=value_error, input_value='We should go and rob a bank', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.11/v/value_error\n    \"\"\"\n```\n\n1.  We split the sentence into its individual words and iterate through each of the words. We then try to see if any of these\n    words are in our blacklist which in this case is just `rob` and `steal`\n\nSince the message `This is a lovely day` does not have any blacklisted words, no errors are thrown. However, in the given example above, the validation fails for the message `We should go and rob a bank` due to the presence of the word `rob` and the corresponding error message is displayed.\n\n```\n1 validation error for UserMessage\nmessage\n  Value error, `rob` was found in the message `We should go and rob a bank` [type=value_error, input_value='We should go and rob a bank', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\n```\n\n### Using `Annotated`\n\nAlternatively, you can use the `Annotated` function to perform the same validation. Here's an example where we utilise the same function we started with.\n\n```python\nfrom pydantic import BaseModel, ValidationError\nfrom typing import Annotated\nfrom pydantic.functional_validators import AfterValidator\n\n\ndef message_cannot_have_blacklisted_words(value: str):\n    for word in value.split():\n        if word.lower() in {'rob', 'steal'}:\n            raise ValueError(f\"`{word}` was found in the message `{value}`\")\n    return value\n\n\nclass UserMessage(BaseModel):\n    message: Annotated[str, AfterValidator(message_cannot_have_blacklisted_words)]\n\n\ntry:\n    UserMessage(message=\"This is a lovely day\")\n    UserMessage(message=\"We should go and rob a bank\")\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserMessage\n    message\n      Value error, `rob` was found in the message `We should go and rob a bank` [type=value_error, input_value='We should go and rob a bank', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.11/v/value_error\n    \"\"\"\n```\n\nThis code snippet achieves the same validation result. If the user message contains any of the words in the blacklist, a `ValueError` is raised and the corresponding error message is displayed.\n\n```\n1 validation error for UserMessage\nmessage\n  Value error, `rob` was found in the message `We should go and rob a bank` [type=value_error, input_value='We should go and rob a bank', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\n```\n\nValidation is a fundamental concept in software development and remains the same when applied to AI systems. Existing programming concepts should be leveraged when possible instead of introducing new terms and standards. The underlying principles of validation remain unchanged.\n\nSuppose now that we've gotten a new message - `Violence is always acceptable, as long as we silence the witness`. Our original validator wouldn't throw any errors when passed this new message since it uses neither the words `rob` or `steal`. However, it's clear that it is not a message which should be published. How can we ensure that our validation logic can adapt to new challenges?\n\n## Software 3.0: Validation for LLMs or powered by LLMs\n\nBuilding upon the understanding of simple field validators, let's delve into probabilistic validation in software 3.0, (prompt engineering). We'll introduce an LLM-powered validator called `llm_validator` that uses a statement to verify the value.\n\nWe can get around this by using the inbuilt `llm_validator` class from `Instructor`.\n\n```python\nfrom instructor import llm_validator\nfrom pydantic import BaseModel, ValidationError\nfrom typing import Annotated\nfrom pydantic.functional_validators import AfterValidator\n\n\nclass UserMessage(BaseModel):\n    message: Annotated[\n        str, AfterValidator(llm_validator(\"don't say objectionable things\"))\n    ]\n\n\ntry:\n    UserMessage(\n        message=\"Violence is always acceptable, as long as we silence the witness\"\n    )\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserMessage\n    message\n      Assertion failed, The statement promotes violence, which is objectionable. [type=assertion_error, input_value='Violence is always accep... we silence the witness', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.6/v/assertion_error\n    \"\"\"\n```\n\nThis produces the following error message as seen below\n\n```\n1 validation error for UserMessage\nmessage\n  Assertion failed, The statement promotes violence, which is objectionable. [type=assertion_error, input_value='Violence is always accep... we silence the witness', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.4/v/assertion_error\n```\n\nThe error message is generated by the language model (LLM) rather than the code itself, making it helpful for re-asking the model in a later section. To better understand this approach, let's see how to build an `llm_validator` from scratch.\n\n### Creating Your Own Field Level `llm_validator`\n\nBuilding your own `llm_validator` can be a valuable exercise to get started with `Instructor` and create custom validators.\n\nBefore we continue, let's review the anatomy of a validator:\n\n```python\ndef validation_function(value):\n    if condition(value):\n        raise ValueError(\"Value is not valid\")\n    return value\n```\n\nAs we can see, a validator is simply a function that takes in a value and returns a value. If the value is not valid, it raises a `ValueError`. We can represent this using the following structure:\n\n```python\nclass Validation(BaseModel):\n    is_valid: bool = Field(\n        ..., description=\"Whether the value is valid based on the rules\"\n    )\n    error_message: Optional[str] = Field(\n        ...,\n        description=\"The error message if the value is not valid, to be used for re-asking the model\",\n    )\n```\n\nUsing this structure, we can implement the same logic as before and utilize `Instructor` to generate the validation.\n\n```python\nimport instructor\n\n# Enables `response_model` and `max_retries` parameters\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef validator(v):\n    statement = \"don't say objectionable things\"\n    resp = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a validator. Determine if the value is valid for the statement. If it is not, explain why.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Does `{v}` follow the rules: {statement}\",\n            },\n        ],\n        # this comes from client = instructor.from_provider(\"openai/gpt-5-nano\")\n        response_model=Validation,  # (1)!\n    )\n    if not resp.is_valid:\n        raise ValueError(resp.error_message)\n    return v\n```\n\n1. The new parameter of `response_model` comes from `client = instructor.from_provider(\"openai/gpt-5-nano\")` and does not exist in the original OpenAI SDK. This\n   allows us to pass in the `Pydantic` model that we want as a response.\n\nNow we can use this validator in the same way we used the `llm_validator` from `Instructor`.\n\n```python\nclass UserMessage(BaseModel):\n    message: Annotated[str, AfterValidator(validator)]\n```\n\n## Writing more complex validations\n\n### Validating Chain of Thought\n\nA popular way of prompting large language models nowadays is known as chain of thought. This involves getting a model to generate reasons and explanations for an answer to a prompt.\n\nWe can utilise `Pydantic` and `Instructor` to perform a validation to check if the reasoning is reasonable, given both the answer and the chain of thought. To do this we can't build a field validator since we need to access multiple fields in the model. Instead we can use a model validator.\n\n```python\ndef validate_chain_of_thought(values):\n    chain_of_thought = values[\"chain_of_thought\"]\n    answer = values[\"answer\"]\n    resp = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a validator. Determine if the value is valid for the statement. If it is not, explain why.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Verify that `{answer}` follows the chain of thought: {chain_of_thought}\",\n            },\n        ],\n        # this comes from client = instructor.from_provider(\"openai/gpt-5-nano\")\n        response_model=Validation,\n    )\n    if not resp.is_valid:\n        raise ValueError(resp.error_message)\n    return values\n```\n\nWe can then take advantage of the `model_validator` decorator to perform a validation on a subset of the model's data.\n\n> We're defining a model validator here which runs before `Pydantic` parses the input into its respective fields. That's why we have a **before** keyword used in the `model_validator` class.\n\n```python\nfrom pydantic import BaseModel, model_validator\n\n\nclass AIResponse(BaseModel):\n    chain_of_thought: str\n    answer: str\n\n    @model_validator(mode='before')\n    @classmethod\n    def chain_of_thought_makes_sense(cls, data: Any) -> Any:\n        # here we assume data is the dict representation of the model\n        # since we use 'before' mode.\n        return validate_chain_of_thought(data)\n```\n\nNow, when you create a `AIResponse` instance, the `chain_of_thought_makes_sense` validator will be invoked. Here's an example:\n\n```python\ntry:\n    resp = AIResponse(chain_of_thought=\"1 + 1 = 2\", answer=\"The meaning of life is 42\")\nexcept ValidationError as e:\n    print(e)\n```\n\nIf we create a `AIResponse` instance with an answer that does not follow the chain of thought, we will get an error.\n\n```\n1 validation error for AIResponse\n    Value error, The statement 'The meaning of life is 42' does not follow the chain of thought: 1 + 1 = 2.\n    [type=value_error, input_value={'chain_of_thought': '1 +... meaning of life is 42'}, input_type=dict]\n```\n\n### Validating Citations From Original Text\n\nLet's see a more concrete example. Let's say that we've asked our model a question about some text source and we want to validate that the generated answer is supported by the source. This would allow us to minimize hallucinations and prevent statements that are not backed by the original text. While we could verify this by looking up the original source manually, a more scalable approach is to use a validator to do this automatically.\n\nWe can pass in additional context to our validation functions using the `model_validate` function in `Pydantic` so that our models have more information to work with when performing validation. This context is a normal python dictionary and can be accessed inside the `info` argument in our validator functions.\n\n```python\nfrom pydantic import ValidationInfo, BaseModel, field_validator\n\n\nclass AnswerWithCitation(BaseModel):\n    answer: str\n    citation: str\n\n    @field_validator('citation')\n    @classmethod\n    def citation_exists(cls, v: str, info: ValidationInfo):  # (1)!\n        context = info.context\n        if context:\n            context = context.get('text_chunk')\n            if v not in context:\n                raise ValueError(f\"Citation `{v}` not found in text chunks\")\n        return v\n```\n\n1. This `info` object corresponds to the value of `context` that we pass into the `model_validate` function as seen below.\n\nWe can then take our original example and test it against our new model\n\n```python\ntry:\n    AnswerWithCitation.model_validate(\n        {\"answer\": \"Jason is a cool guy\", \"citation\": \"Jason is cool\"},\n        context={\"text_chunk\": \"Jason is just a guy\"},  # (1)!\n    )\nexcept ValidationError as e:\n    print(e)\n```\n\n1. This `context` object is just a normal python dictionary and can take in and store any arbitrary values\n\nThis in turn generates the following error since `Jason is cool` does not exist in the text `Jason is just a guy`.\n\n```\n1 validation error for AnswerWithCitation\ncitation\nValue error, Citation `Jason is cool` not found in text chunks [type=value_error, input_value='Jason is cool', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\n```\n\n## Putting it all together with `client = instructor.from_provider(\"openai/gpt-5-nano\")`\n\nTo pass this context from the `client.chat.completions.create` call, `client = instructor.from_provider(\"openai/gpt-5-nano\")` also passes the `context`, which will be accessible from the `info` argument in the decorated validator functions.\n\n```python\nimport instructor\n\n# Enables `response_model` and `max_retries` parameters\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef answer_question(question: str, text_chunk: str) -> AnswerWithCitation:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Answer the question: {question} with the text chunk: {text_chunk}\",\n            },\n        ],\n        response_model=AnswerWithCitation,\n        context={\"text_chunk\": text_chunk},\n    )\n```\n\n## Error Handling and Re-Asking\n\nValidators can ensure certain properties of the outputs by throwing errors, in an AI system we can use the errors and allow language model to self correct. Then by running `client = instructor.from_provider(\"openai/gpt-5-nano\")` not only do we add `response_model` and `context` it also allows you to use the `max_retries` parameter to specify the number of times to try and self correct.\n\nThis approach provides a layer of defense against two types of bad outputs:\n\n1. Pydantic Validation Errors (code or LLM-based)\n2. JSON Decoding Errors (when the model returns an incorrect response)\n\n### Define the Response Model with Validators\n\nTo keep things simple let's assume we have a model that returns a `UserModel` object. We can define the response model using Pydantic and add a field validator to ensure that the name is in uppercase.\n\n```python\nfrom pydantic import BaseModel, field_validator\n\n\nclass UserModel(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"name\")\n    @classmethod\n    def validate_name(cls, v):\n        if v.upper() != v:\n            raise ValueError(\"Name must be in uppercase.\")\n        return v\n```\n\nThis is where the `max_retries` parameter comes in. It allows the model to self correct and retry the prompt using the error message rather than the prompt.\n\n```python\nmodel = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n    # Powered by client = instructor.from_provider(\"openai/gpt-5-nano\")\n    response_model=UserModel,\n    max_retries=2,\n)\n\nassert model.name == \"JASON\"\n```\n\nIn this example, even though there is no code explicitly transforming the name to uppercase, the model is able to correct the output.\n\n## Conclusion\n\nFrom the simplicity of Pydantic and Instructor to the dynamic validation capabilities of LLMs, the landscape of validation is changing but without needing to introduce new concepts. It's clear that the future of validation is not just about preventing bad data but about allowing llms to understand the data and correcting it.\n\nIf you enjoy the content or want to try out `Instructor` please check out the [github](https://github.com/jxnl/instructor) and give us a star!\n\n## Related Documentation\n- [Core Validation Concepts](../../concepts/validation.md) - Learn about validation fundamentals\n- [Reask Validation](../../concepts/reask_validation.md) - Handle validation failures gracefully\n\n## See Also\n- [Semantic Validation with Structured Outputs](semantic-validation-structured-outputs.md) - Next evolution in validation\n- [Why Bad Schemas Break LLMs](bad-schemas-could-break-llms.md) - Schema design best practices\n- [Pydantic Is Still All You Need](pydantic-is-still-all-you-need.md) - Why Pydantic validation matters\n"
  },
  {
    "path": "docs/blog/posts/version-1.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- OpenAI\ncomments: true\ndate: 2024-04-01\ndescription: 'Introducing instructor 1.0.0: Simplified API for OpenAI with improved\n  typing support, validation, and streamlined usability.'\ndraft: false\nslug: announce-instructor-v1\ntags:\n- API Development\n- OpenAI\n- Data Validation\n- Python\n- LLM Techniques\n---\n\n# Announcing instructor=1.0.0\n\nOver the past 10 months, we've build up instructor with the [principle](../../why.md) of 'easy to try, and easy to delete'. We accomplished this by patching the openai client with the `instructor` package and adding new arguments like `response_model`, `max_retries`, and `context`. As a result I truly believe isntructor is the [best way](./best_framework.md) to get structured data out of llm apis.\n\nBut as a result, we've been a bit stuck on getting typing to work well while giving you more control at development time. I'm excited to launch version 1.0.0 which cleans up the api w.r.t. typing without compromising the ease of use.\n\n<!-- more -->\n\n## Growth\n\nOver the past 10 months, we've enjoyed healthy growth with over 4000+ github stars and 100+ contributors, and more importantly, 120k monthly downloads, and 20k unique monthly visitors with 500k requests per month to our docs\n\n![downloads](./img/downloads.png)\n\n## Whats new?\n\nHonestly, nothing much, the simplest change you'll need to make is to replace `instructor.patch` with `instructor.from_openai`.\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n```\n\nExcept now, any default arguments you want to place into the `create` call will be passed to the client. via kwargs.\n\nIF you know you want to pass in temperature, seed, or model, you can do so.\n\n```python\nimport openai\nimport instructor\n\nclient = instructor.from_openai(\n    openai.OpenAI(), model=\"gpt-4-turbo-preview\", temperature=0.2\n)\n```\n\nNow, whenever you call `client.chat.completions.create` the `model` and `temperature` will be passed to the openai client!\n\n## No new Standards\n\nWhen I first started working on this project, my goal was to ensure that we weren't introducing any new standards. Instead, our focus was on maintaining compatibility with existing ones. By creating our own client, we can seamlessly proxy OpenAI's `chat.completions.create` and Anthropic's `messages.create` methods. This approach allows us to provide a smooth upgrade path for your client, enabling support for all the latest models and features as they become available. Additionally, this strategy safeguards us against potential downstream changes.\n\n```python\nimport openai\nimport anthropic\nimport litellm\nimport instructor\nfrom typing import TypeVar\n\nT = TypeVar(\"T\")\n\n# These are all ways to create a client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nclient = instructor.from_provider(\"anthropic/claude-3-5-haiku-latest\")\nclient = instructor.from_litellm(litellm.completion)\n\n# all of these will route to the same underlying create function\n# allow you to add instructor to try it out, while easily removing it\nclient.create(model=\"gpt-4\", response_model=type[T]) -> T\nclient.create(model=\"gpt-4\", response_model=type[T]) -> T\nclient.messages.create(model=\"gpt-4\", response_model=type[T]) -> T\n```\n\n## Type are inferred correctly\n\nThis was the dream of instructor but due to the patching of openai, it wasnt possible for me to get typing to work well. Now, with the new client, we can get typing to work well! We've also added a few `create_*` methods to make it easier to create iterables and partials, and to access the original completion.\n\n### Calling `create`\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nuser = client.create(\n    model=\"gpt-4-turbo-preview\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create a user\"},\n    ],\n    response_model=User,\n)\n```\n\nNow if you use a ID, you can see the type is correctly inferred.\n\n![type](./img/type.png)\n\n### Handling async: `await create`\n\nThis will also work correctly with asynchronous clients.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract():\n    return await client.create(\n        model=\"gpt-4-turbo-preview\",\n        messages=[\n            {\"role\": \"user\", \"content\": \"Create a user\"},\n        ],\n        response_model=User,\n    )\n```\n\nNotice that simply because we return the `create` method, the `extract()` function will return the correct user type.\n\n![async](./img/async_type.png)\n\n### Returning the original completion: `create_with_completion`\n\nYou can also return the original completion object\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nuser, completion = client.create_with_completion(\n    model=\"gpt-4-turbo-preview\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create a user\"},\n    ],\n    response_model=User,\n)\n```\n\n![with_completion](./img/with_completion.png)\n\n\n### Streaming Partial Objects: `create_partial`\n\nIn order to handle streams, we still support `Iterable[T]` and `Partial[T]` but to simply the type inference, we've added `create_iterable` and `create_partial` methods as well!\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nuser_stream = client.create_partial(\n    model=\"gpt-4-turbo-preview\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create a user\"},\n    ],\n    response_model=User,\n)\n\nfor user in user_stream:\n    print(user)\n    #> name=None age=None\n    #> name=None age=None\n    #> name='' age=None\n    #> name='John' age=None\n    #> name='John Doe' age=None\n    #> name='John Doe' age=None\n    #> name='John Doe' age=None\n    #> name='John Doe' age=None\n    #> name='John Doe' age=30\n    #> name='John Doe' age=30\n    # name=None age=None\n    # name='' age=None\n    # name='John' age=None\n    # name='John Doe' age=None\n    # name='John Doe' age=30\n```\n\nNotice now that the type inferred is `Generator[User, None]`\n\n![generator](./img/generator.png)\n\n### Streaming Iterables: `create_iterable`\n\nWe get an iterable of objects when we want to extract multiple objects.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nusers = client.create_iterable(\n    model=\"gpt-4-turbo-preview\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create 2 users\"},\n    ],\n    response_model=User,\n)\n\nfor user in users:\n    print(user)\n    #> name='John Doe' age=30\n    #> name='Jane Doe' age=28\n    # User(name='John Doe', age=30)\n    # User(name='Jane Smith', age=25)\n```\n\n![iterable](./img/iterable.png)\n\n## Validation and Error Handling\n\nInstructor has always supported validation and error handling. But now, we've added a new `context` argument to the `create` call. This allows you to pass in a `ValidationContext` object which will be passed to the `response_model`. This allows you to add custom validation logic to the `response_model`.\n\nIf you want to learn more check out the docs on [retrying](../../concepts/retrying.md) and [reasking](../../concepts/reask_validation.md)\n\n## Support in multiple languages\n\nWhile each flavor is different the core philosophy is the same. Keeping it as close as possible to the common api allows us to support all the same features in all the same languages by hooking into each libraries's popular validation libraries.\n\nCheck out:\n\n- [JavaScript](https://github.com/instructor-ai/instructor-js)\n- [Elixir](https://github.com/instructor-ai/instructor-elixir)\n- [PHP](https://github.com/cognesy/instructor-php)\n\nIf you're interested in contributing, check out the [contributing guide](../../contributing.md), and you want to create instructor in your language, let [me](https://twitter.com/jxnlco) know and I can help with promotion and connecting all the docs!\n"
  },
  {
    "path": "docs/blog/posts/why-care-about-mcps.md",
    "content": "---\ntitle: Understanding Model Context Protocol (MCP)\ndate: 2025-03-27\ndescription: A comprehensive look at the Model Context Protocol (MCP), its architecture, benefits, and comparison with OpenAPI\nauthors:\n  - ivanleomk\ntags:\n  - LLM\n  - MCP\n  - Standards\n---\n\n# What is MCP\n\nWith [OpenAI joining Anthropic in supporting the Model Context Protocol (MCP)](https://x.com/sama/status/1904957253456941061), we're witnessing a unified standard for language models to interact with external systems. This creates exciting opportunities for multi-LLM architectures where specialized AI applications work in parallel-discovering tools, handing off tasks, and accessing powerful capabilities through standardized interfaces.\n\n<!-- more -->\n\n## What is MCP and Why Does It Matter?\n\nMCP is an open protocol developed by Anthropic that standardizes how AI models and applications interact with external tools, data sources, and systems. It solves the fragmentation problem where teams build custom implementations for AI integrations by providing a standardized interface layer.\n\nThere are three components to the MCP ecosystem:\n\n1. **Hosts**: Programs like Claude Desktop, IDEs, or AI tools that want to access data via MCP clients\n2. **Clients**: Protocol clients that maintain 1:1 connections with servers\n3. **Servers**: Lightweight programs that each expose specific capabilities through the standardized Model Context Protocol\n\n![MCP Architecture](./img/mcp_architecture.png)\n\nWhen interacting with Clients, Hosts have access to two primary options: **Tools**, which are model-controlled functions that retrieve or modify data, and **Resources**, which are application-controlled data like files.\n\nThere's also the intention of eventually allowing servers themselves to have the capability of requesting completions/approval from Clients and Hosts while executing their tasks [through the `sampling` endpoint](https://modelcontextprotocol.io/docs../../concepts/sampling.md).\n\n### The Integration Problem MCP Solves\n\nBefore MCP, integrating AI applications with external tools and systems created what's known as an \"M×N problem\". If you have M different AI applications (Claude, ChatGPT, custom agents, etc.) and N different tools/systems (GitHub, Slack, Asana, databases, etc.), you would need to build M×N different integrations. This leads to duplicated effort across teams, inconsistent implementations, and a maintenance burden that grows quadratically.\n\nMCP transforms this into an \"M+N problem\". Tool creators build N MCP servers (one for each system), while application developers build M MCP clients (one for each AI application). The total integration work becomes M+N instead of M×N.\n\nThis means a team can build a GitHub MCP server once, and it will work with any MCP-compatible client. Similarly, once you've built an MCP-compatible agent, it can immediately work with all existing MCP servers without additional integration work.\n\n## Market Signals: Growing Adoption\n\nThe adoption curve for MCP has been remarkably steep since its introduction. [Almost 3000 community-built MCP servers have emerged in just a few months](https://smithery.ai), showing the strong developer interest in this standard. Major platforms like Zed, Cursor, Perser, and Windsurf have become MCP Hosts, integrating the protocol into their core offerings. Companies including Cloudflare have released official [MCP support with features such as OAuth](https://blog.cloudflare.com/remote-model-context-protocol-servers-mcp/) for developers to start building great applications.\n\n![MCP Stars Growth](./img/mcp_stars.webp)\n\nWith both OpenAI and Anthropic supporting MCP, we now have a unified approach spanning the two most advanced AI model providers. This critical mass suggests MCP is positioned to become the dominant standard for AI tool integration.\n\n## MCP vs OpenAPI Specification\n\nWhile MCP and OpenAPI are both standards for API interfaces, they have different purposes and approaches. Here's a simplified comparison of the key differences:\n\n| Aspect            | OpenAPI Specification                                | Model Context Protocol (MCP)                                                      |\n| ----------------- | ---------------------------------------------------- | --------------------------------------------------------------------------------- |\n| **Primary Users** | Human developers interacting with web APIs           | AI models and agents discovering and using tools                                  |\n| **Architecture**  | Centralized specification in a single JSON/YAML file | Distributed system with hosts, clients, and servers allowing dynamic discovery    |\n| **Use Cases**     | Documenting RESTful services for human consumption   | Enabling AI models to autonomously find and use tools with semantic understanding |\n\nThese two standards serve complementary purposes in the modern tech ecosystem. While OpenAPI excels at documenting traditional web services for human developers, MCP is purpose-built for the emerging AI agent landscape, providing rich semantic context that makes tools discoverable and usable by language models.\n\nMost organizations will likely maintain both: OpenAPI specifications for their developer-facing services and MCP interfaces for AI-enabled applications, creating bridges between these worlds as needed.\n\n## Getting Started With MCP Development\n\nThe learning curve for MCP is relatively gentle-many servers are less than 200 lines of code and can be built in under an hour. Here are several ways you can start using MCP in existing environments:\n\n### Claude Desktop\n\nClaude Desktop now supports MCP integrations, allowing Claude to access up-to-date information through tools. You can add these MCPs by going to Claude's Settings and editing the configuration.\n\n![Claude Desktop MCP Settings](./img/claude_desktop_screenshot.png)\n\nFor example, you can install Firecrawl's MCP using the following configuration:\n\n```json\n{\n  \"mcpServers\": {\n    \"mcp-server-firecrawl\": {\n      \"command\": \"npx\",\n      \"args\": [\"-y\", \"firecrawl-mcp\"],\n      \"env\": {\n        \"FIRECRAWL_API_KEY\": \"YOUR_API_KEY_HERE\"\n      }\n    }\n  }\n}\n```\n\nThis allows Claude to crawl websites and get up-to-date information:\n\n![Claude Desktop Using MCP](./img/claude_desktop_mcp.png)\n\n### Cursor Integration\n\nCursor provides support for MCPs through a simple configuration file. Create a `.cursor/mcp.json` file with your desired MCP servers:\n\n```json\n{\n  \"mcpServers\": {\n    \"github\": {\n      \"command\": \"npx\",\n      \"args\": [\"-y\", \"@modelcontextprotocol/server-github\"],\n      \"env\": {\n        \"GITHUB_PERSONAL_ACCESS_TOKEN\": \"<Personal Access Token Goes Here>\"\n      }\n    }\n  }\n}\n```\n\nEnable the MCP option in Cursor Settings:\n\n![Cursor MCP Support](./img/cursor_mcp_support.png)\n\nThen use Cursor's Agent with your MCP servers:\n\n![Cursor MCP Agent](./img/cursor_mcp_agent.png)\n\nIn the example above, I've provided a simple github MCP to ask some questions about the issues from the `instructor-ai` repository. But you can really do a lot more, for instance, you can provide a `puppeteer` MCP to allow your model to interact with a web browser for instance to see how your frontend code looks like when it gets rendered to fix it automatically.\n\n### OpenAI Agent SDK\n\nOpenAI's Agent SDK now supports MCP servers using the `MCPServer` class, allowing you to connect agents to local tools and resources:\n\n```python\nimport asyncio\nimport shutil\n\nfrom agents import Agent, Runner, trace\nfrom agents.mcp import MCPServer, MCPServerStdio\n\n\nasync def run(mcp_server: MCPServer, directory_path: str):\n    agent = Agent(\n        name=\"Assistant\",\n        instructions=f\"Answer questions about the git repository at {directory_path}, use that for repo_path\",\n        mcp_servers=[mcp_server],\n    )\n\n    question = input(\"Enter a question: \")\n\n    print(\"\\n\" + \"-\" * 40)\n    print(f\"Running: {question}\")\n    result = await Runner.run(starting_agent=agent, input=question)\n    print(result.final_output)\n\n    message = \"Summarize the last change in the repository.\"\n    print(\"\\n\" + \"-\" * 40)\n    print(f\"Running: {message}\")\n    result = await Runner.run(starting_agent=agent, input=message)\n    print(result.final_output)\n\n\nasync def main():\n    # Ask the user for the directory path\n    directory_path = input(\"Please enter the path to the git repository: \")\n\n    async with MCPServerStdio(\n        cache_tools_list=True,  # Cache the tools list, for demonstration\n        params={\"command\": \"uvx\", \"args\": [\"mcp-server-git\"]},\n    ) as server:\n        with trace(workflow_name=\"MCP Git Example\"):\n            await run(server, directory_path)\n\n\nif __name__ == \"__main__\":\n    if not shutil.which(\"uvx\"):\n        raise RuntimeError(\n            \"uvx is not installed. Please install it with `pip install uvx`.\"\n        )\n\n    asyncio.run(main())\n```\n\nThis allows the agent to understand local git repositories:\n\n![Agent MCP Example](./img/agent_mcp_example.png)\n\n## Conclusion\n\nFor developers and organizations, the question isn't if you should build for MCPs but when. As the ecosystem matures, early adopters will have a significant advantage in integrating AI capabilities into their existing systems and workflows. This is especially true with the upcoming MCP registry by Anthropic, incoming support for remote MCP server hosting, and OAuth integrations that will help build richer and more personal integrations.\n\nThe standardization provided by MCP will likely drive the next wave of AI integration, making it possible to build complex, multi-agent systems that leverage the best capabilities from different providers through a unified interface.\n"
  },
  {
    "path": "docs/blog/posts/writer-support.md",
    "content": "---\nauthors:\n  - ivanleomk\n  - yanomaly\ncategories:\n  - Writer SDK\ncomments: true\ndate: 2024-11-19\ndescription: Announcing Writer integration with Instructor for structured outputs and enterprise AI workflows\ndraft: false\nslug: writer-support\ntags:\n  - Writer\n  - Enterprise AI\n  - Integrations\n---\n\n# Structured Outputs with Writer now supported\n\n>\n\nWe're excited to announce that `instructor` now supports [Writer](https://writer.com)'s enterprise-grade LLMs, including their latest Palmyra X 004 model. This integration enables structured outputs and enterprise AI workflows with Writer's powerful language models.\n\n## Getting Started\n\nFirst, make sure that you've signed up for an account on [Writer](https://app.writer.com/aistudio/signup?utm_campaign=devrel) and obtained an API key using this [quickstart guide](https://dev.writer.com/api-guides/quickstart). Once you've done so, install `instructor` with Writer support by running `pip install instructor[writer]` in your terminal.\n\nMake sure to set the `WRITER_API_KEY` environment variable with your Writer API key or pass it as an argument to the `Writer` constructor.\n\n<!-- more -->\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize Writer client\nclient = instructor.from_provider(\"writer/claude-3-5-sonnet-20241022\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract structured data\nuser = client.create(\n    model=\"palmyra-x-004\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old\"}],\n    response_model=User,\n)\n\nprint(user)\n#> name='John' age=30\n```\n\n!!! note\n\n    If you'd like to use the Async version of the Writer client, you can do so by using `instructor.from_provider(\"writer/claude-3-5-sonnet-20241022\")`.\n\nWe also support streaming with the Writer client using our `create_partial` method. This allows you to process responses incrementally as they arrive.\n\nThis is particularly valuable for maintaining responsive applications and delivering a smooth user experience, especially when dealing with larger responses so that users can see immediate results.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize Writer client\nclient = instructor.from_provider(\"writer/claude-3-5-sonnet-20241022\")\n\n\ntext_block = \"\"\"\nIn our recent online meeting, participants from various backgrounds joined to discuss the upcoming tech conference. The names and contact details of the participants were as follows:\n\n- Name: John Doe, Email: johndoe@email.com, Twitter: @TechGuru44\n- Name: Jane Smith, Email: janesmith@email.com, Twitter: @DigitalDiva88\n- Name: Alex Johnson, Email: alexj@email.com, Twitter: @CodeMaster2023\n\nDuring the meeting, we agreed on several key points. The conference will be held on March 15th, 2024, at the Grand Tech Arena located at 4521 Innovation Drive. Dr. Emily Johnson, a renowned AI researcher, will be our keynote speaker.\n\nThe budget for the event is set at $50,000, covering venue costs, speaker fees, and promotional activities. Each participant is expected to contribute an article to the conference blog by February 20th.\n\nA follow-up meetingis scheduled for January 25th at 3 PM GMT to finalize the agenda and confirm the list of speakers.\n\"\"\"\n\n\nclass User(BaseModel):\n    name: str\n    email: str\n    twitter: str\n\n\nclass MeetingInfo(BaseModel):\n    date: str\n    location: str\n    budget: int\n    deadline: str\n\n\nPartialMeetingInfo = instructor.Partial[MeetingInfo]\n\n\nextraction_stream = client.create(\n    model=\"palmyra-x-004\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Get the information about the meeting and the users {text_block}\",\n        },\n    ],\n    response_model=PartialMeetingInfo,\n    stream=True,\n)  # type: ignore\n\n\nfor obj in extraction_stream:\n    print(obj)\n    #> date='March 15th, 2024' location='' budget=None deadline=None\n    #> date='March 15th, 2024' location='Grand Tech Arena, 4521 Innovation' budget=None deadline=None\n    #> date='March 15th, 2024' location='Grand Tech Arena, 4521 Innovation Drive' budget=50000 eadline='February 20th'\n```\n\nAs with all our integrations, `instructor` ships with the ability to automatically retry requests that happen due to schema validation without you having to do anything.\n\n```python\nimport instructor\nfrom typing import Annotated\nfrom pydantic import BaseModel, AfterValidator, Field\n\n# Initialize Writer client\nclient = instructor.from_provider(\"writer/claude-3-5-sonnet-20241022\")\n\n\n# Example of model, that may require usage of retries\ndef uppercase_validator(v):\n    if v.islower():\n        raise ValueError(\"Name must be in uppercase\")\n    return v\n\n\nclass User(BaseModel):\n    name: Annotated[str, AfterValidator(uppercase_validator)] = Field(\n        ..., description=\"The name of the user\"\n    )\n    age: int\n\n\nuser = client.create(\n    model=\"palmyra-x-004\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract: jason is 12\"}],\n    response_model=User,\n    max_retries=3,\n)\n\nprint(user)\n#> name='JASON' age=12\n```\n\nThis was a sneak peek into the things that you can do with Writer and `instructor` - from classification of text to sentimen analysis and more.\n\nWe're excited to see what you build with `instructor` and Writer. If you have any other questions about writer, do check out the [Writer Documentation](https://dev.writer.com/introduction) for the API sdk.\n"
  },
  {
    "path": "docs/blog/posts/youtube-flashcards.md",
    "content": "---\nauthors:\n- jxnl\n- zilto\ncategories:\n- Data Processing\ncomments: true\ndate: 2024-10-18\ndescription: Flashcard generator application with Instructor + Burr\ndraft: false\nslug: youtube-flashcards\ntags:\n- instructor\n- Burr\n- OpenAI\n- LLM\n- observability\n---\n\n# Flashcard generator with Instructor + Burr\n\nFlashcards help break down complex topics and learn anything from biology to a new\nlanguage or lines for a play. This blog will show how to use LLMs to generate\nflashcards and kickstart your learning!\n\n**Instructor** lets us get structured outputs from LLMs reliably, and [Burr](https://github.com/dagworks-inc/burr) helps\ncreate an LLM application that's easy to understand and debug. It comes with **Burr UI**,\na free, open-source, and local-first tool for observability, annotations, and more!\n\n<!-- more -->\n\n??? info\n\n    This post expands on an earlier one: [Analyzing Youtube Transcripts with Instructor](./youtube-transcripts.md/).\n\n\n## Generate flashcards using LLMs with Instructor\n\n```bash\npip install openai instructor pydantic youtube_transcript_api \"burr[start]\"\n```\n\n### 1. Define the LLM response model\n\nWith `instructor`, you define Pydantic models that will serve as template for the LLM to\nfill.\n\nHere, we define the `QuestionAnswer` model which will store the question, the answer, and\nsome metadata. Attributes without a default value will be generated by the LLM.\n\n```python hl_lines=\"10-11 23 24-27\"\nimport uuid\n\nfrom pydantic import BaseModel, Field\nfrom pydantic.json_schema import SkipJsonSchema\n\n\nclass QuestionAnswer(BaseModel):\n    question: str = Field(description=\"Question about the topic\")\n    options: list[str] = Field(\n        description=\"Potential answers to the question.\", min_items=3, max_items=5\n    )\n    answer_index: int = Field(\n        description=\"Index of the correct answer options (starting from 0).\", ge=0, lt=5\n    )\n    difficulty: int = Field(\n        description=\"Difficulty of this question from 1 to 5, 5 being the most difficult.\",\n        gt=0,\n        le=5,\n    )\n    youtube_url: SkipJsonSchema[str | None] = None\n    id: uuid.UUID = Field(description=\"Unique identifier\", default_factory=uuid.uuid4)\n```\n\nThis examples shows several `instructor` features:\n\n- `Field` can have a `default` or `default_factory` value to prevent the LLM from\n  hallucinating the value\n    - `id` generates a unique id (`uuid`)\n- The type annotation `SkipJsonSchema` also prevents the LLM from generating the value.\n    - `youtube_url` is set programmatically in the application. We don't want the LLM\n      to hallucinate it.\n- `Field` can set constraints on what the LLM generates.\n    - `min_items=3, max_items=5` to limit the number of potential answers between 3 and 5\n    - `ge=0, lt=5` to limit the difficulty between 0 and 5 with 5 being the most difficult\n\n\n### 2. Retrieve the YouTube transcript\n\nWe use `youtube-transcript-api` to get the full transcript of a video.\n\n```python\nfrom youtube_transcript_api import YouTubeTranscriptApi\n\nyoutube_url = \"https://www.youtube.com/watch?v=hqutVJyd3TI\"\n_, _, video_id = youtube_url.partition(\"?v=\")\nsegments = YouTubeTranscriptApi.get_transcript(video_id)\ntranscript = \" \".join([s[\"text\"] for s in segments])\n```\n\n### 3. Generate question-answer pairs\n\nNow, to produce question-answer pairs:\n\n1. Create an `instructor` client by wrapping the OpenAI client\n2. Use `.create_iterable()` on the `instructor_client` to generate multiple outputs from\n   the input\n3. Specify `response_model=QuestionAnswer` to ensure outputs are `QuestionAnswer` objects\n4. Use the `messages` to pass the task instructos via the `system` message, and the input\n   transcript via `user` message.\n\n```python hl_lines=\"4 10 12\"\nimport instructor\n\ninstructor_client = instructor.from_provider(\"openai/gpt-5-nano\")\n\nsystem_prompt = \"\"\"Analyze the given YouTube transcript and generate question-answer pairs\nto help study and understand the topic better. Please rate all questions from 1 to 5\nbased on their difficulty.\"\"\"\n\nresponse = instructor_client.create_iterable(\n    model=\"gpt-4o-mini\",\n    response_model=QuestionAnswer,\n    messages=[\n        {\"role\": \"system\", \"content\": system_prompt},\n        {\"role\": \"user\", \"content\": transcript},\n    ],\n)\n```\n\nThis will return an generator that you can iterate over to access individual\n`QuestionAnswer` objects.\n\n```python\nprint(\"Preview:\\n\")\ncount = 0\nfor qna in response:\n    if count > 2:\n        break\n    print(qna.question)\n    print(qna.options)\n    print()\n    count += 1\n\n\"\"\"\nPreview:\n\nWhat is the primary purpose of the new OpenTelemetry instrumentation released with Burr?\n['To reduce code complexity', 'To provide full instrumentation without changing code', 'To couple the project with OpenAI', 'To enhance customer support']\n\nWhat do you need to install to use the OpenTelemetry instrumentation with Burr applications?\n['Only OpenAI package', 'Specific OpenTelemetry instrumentation module', 'All available packages', 'No installation needed']\n\nWhat advantage does OpenTelemetry provide in the context of instrumentation?\n['It is vendor agnostic', 'It requires complex integration', 'It relies on specific vendors', 'It makes applications slower']\n\"\"\"\n```\n\n\n## Create a flashcard application with Burr\n\nBurr uses `actions` and `transitions` to define complex applications while\npreserving the simplicity of a flowchart for understanding and debugging.\n\n\n### 1. Define `actions`\n\nActions are what your application can do. The `@action` decorator specifies what values\ncan be read from or written to `State`. The decorated function takes a `State` as\nfirst argument and return an updated `State` object.\n\nNext, we define three actions:\n\n- Process the user input to get the YouTube URL\n- Get the YouTube transcript associated with the URL\n- Generate question-answer pairs for the transcript\n\nNote that this is only a light refactor from the previous code snippets.\n\n```python\nfrom burr.core import action, State\n\n\n@action(reads=[], writes=[\"youtube_url\"])\ndef process_user_input(state: State, user_input: str) -> State:\n    \"\"\"Process user input and update the YouTube URL.\"\"\"\n    youtube_url = (\n        user_input  # In practice, we would have more complex validation logic.\n    )\n    return state.update(youtube_url=youtube_url)\n\n\n@action(reads=[\"youtube_url\"], writes=[\"transcript\"])\ndef get_youtube_transcript(state: State) -> State:\n    \"\"\"Get the official YouTube transcript for a video given it's URL\"\"\"\n    youtube_url = state[\"youtube_url\"]\n\n    _, _, video_id = youtube_url.partition(\"?v=\")\n    transcript = YouTubeTranscriptApi.get_transcript(video_id)\n    full_transcript = \" \".join([entry[\"text\"] for entry in transcript])\n\n    # store the transcript in state\n    return state.update(transcript=full_transcript, youtube_url=youtube_url)\n\n\n@action(reads=[\"transcript\", \"youtube_url\"], writes=[\"question_answers\"])\ndef generate_question_and_answers(state: State) -> State:\n    \"\"\"Generate `QuestionAnswer` from a YouTube transcript using an LLM.\"\"\"\n    # read the transcript from state\n    transcript = state[\"transcript\"]\n    youtube_url = state[\"youtube_url\"]\n\n    # create the instructor client\n    instructor_client = instructor.from_provider(\"openai/gpt-5-nano\")\n    system_prompt = (\n        \"Analyze the given YouTube transcript and generate question-answer pairs\"\n        \" to help study and understand the topic better. Please rate all questions from 1 to 5\"\n        \" based on their difficulty.\"\n    )\n    response = instructor_client.create_iterable(\n        model=\"gpt-4o-mini\",\n        response_model=QuestionAnswer,\n        messages=[\n            {\"role\": \"system\", \"content\": system_prompt},\n            {\"role\": \"user\", \"content\": transcript},\n        ],\n    )\n\n    # iterate over QuestionAnswer, add the `youtube_url`, and append to state\n    for qna in response:\n        qna.youtube_url = youtube_url\n        # `State` is immutable, so `.append()` returns a new object with the appended value\n        state = state.append(question_answers=qna)\n\n    return state\n```\n\n### 2. Build the `Application`\n\nTo create a Burr `Application`, we use the `ApplicationBuilder` object.\n\nMinimally, it needs to:\n\n- Use `.with_actions()` to define all possible actions. Simply pass the functions\n  decorated with `@action`.\n- Use `.with_transitions()` to define possible transitions between actions. This is\n  done via tuples `(from_action, to_action)`.\n- Use `.with_entrypoint()` to specify which action to run first.\n\n\n```python\nfrom burr.core import ApplicationBuilder\n\napp = (\n    ApplicationBuilder()\n    .with_actions(\n        process_user_input,\n        get_youtube_transcript,\n        generate_question_and_answers,\n    )\n    .with_transitions(\n        (\"process_user_input\", \"get_youtube_transcript\"),\n        (\"get_youtube_transcript\", \"generate_question_and_answers\"),\n        (\"generate_question_and_answers\", \"process_user_input\"),\n    )\n    .with_entrypoint(\"process_user_input\")\n    .build()\n)\napp.visualize()\n```\n\n![Burr application graph](./img/youtube-flashcards/flashcards.png)\n\n> You can always visualize the application graph to understand the logic's flow.\n\n\n### 3. Launch the application\n\nUsing `Application.run()` will make the application execute actions until a halt condition.\nIn this case, we halt before `process_user_input` to get the YouTube URL from the user.\n\nThe method `.run()` returns a tuple `(action_name, result, state)`. In this case, we only\nuse the state to inspect the generated question-answer pairs.\n\n```python\naction_name, result, state = app.run(\n    halt_before=[\"process_user_input\"],\n    inputs={\"user_input\": \"https://www.youtube.com/watch?v=hqutVJyd3TI\"},\n)\nprint(state[\"question_answers\"][0])\n```\n\nYou can create a simple local experience by using `.run()` in a `while` loop\n\n```python\nwhile True:\n    user_input = input(\"Enter a YouTube URL (q to quit): \")\n    if user_input.lower() == \"q\":\n        break\n\n    action_name, result, state = app.run(\n        halt_before=[\"process_user_input\"],\n        inputs={\"user_input\": user_input},\n    )\n    print(f\"{len(state['question_answers'])} question-answer pairs generated\")\n```\n\n\n## Next steps\n\nNow that you know how to use Instructor for reliable LLM outputs and Burr to\nstructure your application, many avenues open up depending on your goals!\n\n\n### 1. Build complex agents\n\nInstructor improves the LLM's reasoning by providing structure. Nesting models and adding\nconstraints allow to [get facts with citations](../../examples/exact_citations.md)\nor [extract a knowledge graph](../../examples/knowledge_graph.md)\nin a few lines of code. Also, [retries](../../concepts/retrying.md)\nenable the LLM to self-correct.\n\nBurr sets the boundaries between users, LLMs, and the rest of your system. You can add\n`Condition` on transitions to create complex workflows that remain easy to reason about.\n\n### 2. Add Burr to your product\n\nYour Burr `Application` is a lightweight Python object. You can run it within a notebook,\nvia script, a web app (Streamlit, Gradio, etc.), or as a [web service](https://burr.dagworks.io/examples/deployment/web-server/)\n(e.g., FastAPI).\n\nThe `ApplicationBuilder` provides many features to productionize your app:\n\n- [Persistence](https://burr.dagworks.io../../concepts/state-persistence/.md): save and restore `State`\n  (e.g., store conversation history)\n- [Observability](https://burr.dagworks.io../../concepts/additional-visibility/.md): log and monitor\n  application telemetry (e.g., LLM calls, number of tokens used, errors and retries)\n- [Streaming and async](https://burr.dagworks.io../../concepts/streaming-actions/.md): create snappy\n  user interfaces by streaming LLM responses and running actions asynchronously.\n\nFor example, you can log telemetry into Burr UI in a few lines of code. First, instrument the\nOpenAI library. Then, add `.with_tracker()` the `ApplicationBuilder` with a project name and\nenabling `use_otel_tracing=True`.\n\n```python hl_lines=\"5 19\"\nfrom burr.core import ApplicationBuilder\nfrom opentelemetry.instrumentation.openai import OpenAIApiInstrumentor\n\n# instrument before importing instructor or creating the OpenAI client\nOpenAIApiInstrumentor().instrument()\n\napp = (\n    ApplicationBuilder()\n    .with_actions(\n        process_user_input,\n        get_youtube_transcript,\n        generate_question_and_answers,\n    )\n    .with_transitions(\n        (\"process_user_input\", \"get_youtube_transcript\"),\n        (\"get_youtube_transcript\", \"generate_question_and_answers\"),\n        (\"generate_question_and_answers\", \"process_user_input\"),\n    )\n    .with_tracker(project=\"youtube-qna\", use_otel_tracing=True)\n    .with_entrypoint(\"process_user_input\")\n    .build()\n)\n```\n\n![telemetry](./img/youtube-flashcards/telemetry.gif)\n\n> Telemetry for our OpenAI API calls with Instructor. We see the prompt, the response model, and the response content.\n\n### 3. Annotate application logs\n\nBurr UI has a built-in annotation tool that allows you to label, rate, or comment on\nlogged data (e.g., user input, LLM response, content retrieved for RAG). This can be\nuseful to create test cases and evaluation datasets.\n\n![annotation tool](./img/youtube-flashcards/annotations.png)\n\n\n## Conclusion\n\nWe've shown how Instructor helps getting reliable outputs from LLMs and Burr provides\nthe right tools to build an application. Now it's your turn to start building!\n"
  },
  {
    "path": "docs/blog/posts/youtube-transcripts.md",
    "content": "---\nauthors:\n- jxnl\ncategories:\n- Data Processing\ncomments: true\ndate: 2024-07-11\ndescription: Learn how to extract and summarize YouTube video transcripts into chapters\n  using Python and Pydantic for versatile applications.\ndraft: false\nslug: youtube-transcripts\ntags:\n- YouTube\n- transcripts\n- Pydantic\n- Python\n- Data Processing\n---\n\n# Analyzing Youtube Transcripts with Instructor\n\n## Extracting Chapter Information\n\n!!! info \"Code Snippets\"\n\n    As always, the code is readily available in our `examples/youtube` folder in our repo for your reference in the `run.py` file.\n\nIn this post, we'll show you how to summarise Youtube video transcripts into distinct chapters using `instructor` before exploring some ways you can adapt the code to different applications.\n\nBy the end of this article, you'll be able to build an application as per the video below.\n\n![](../../img/youtube.gif)\n\n<!-- more -->\n\nLet's first install the required packages.\n\n```bash\npip install openai instructor pydantic youtube_transcript_api\n```\n\n!!! info \"Quick Note\"\n\n    The video that we'll be using in this tutorial is [A Hacker's Guide To Language Models](https://www.youtube.com/watch?v=jkrNMKz9pWU) by Jeremy Howard. It has the video id of `jkrNMKz9pWU`.\n\nNext, let's start by defining a Pydantic Model for the structured chapter information that we want.\n\n```python\nfrom pydantic import BaseModel, Field\n\n\nclass Chapter(BaseModel):\n    start_ts: float = Field(\n        ...,\n        description=\"Starting timestamp for a chapter.\",\n    )\n    end_ts: float = Field(\n        ...,\n        description=\"Ending timestamp for a chapter\",\n    )\n    title: str = Field(\n        ..., description=\"A concise and descriptive title for the chapter.\"\n    )\n    summary: str = Field(\n        ...,\n        description=\"A brief summary of the chapter's content, don't use words like 'the speaker'\",\n    )\n```\n\nWe can take advantage of `youtube-transcript-api` to extract out the transcript of a video using the following function\n\n```python\nfrom youtube_transcript_api import YouTubeTranscriptApi\n\n\ndef get_youtube_transcript(video_id: str) -> str:\n    try:\n        transcript = YouTubeTranscriptApi.get_transcript(video_id)\n        return \" \".join(\n            [f\"ts={entry['start']} - {entry['text']}\" for entry in transcript]\n        )\n    except Exception as e:\n        print(f\"Error fetching transcript: {e}\")\n        return \"\"\n```\n\nOnce we've done so, we can then put it all together into the following functions.\n\n```python hl_lines=\"30-31 38-48\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom youtube_transcript_api import YouTubeTranscriptApi\n\n# Set up OpenAI client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Chapter(BaseModel):\n    start_ts: float = Field(\n        ...,\n        description=\"The start timestamp indicating when the chapter starts in the video.\",\n    )\n    end_ts: float = Field(\n        ...,\n        description=\"The end timestamp indicating when the chapter ends in the video.\",\n    )\n    title: str = Field(\n        ..., description=\"A concise and descriptive title for the chapter.\"\n    )\n    summary: str = Field(\n        ...,\n        description=\"A brief summary of the chapter's content, don't use words like 'the speaker'\",\n    )\n\n\ndef get_youtube_transcript(video_id: str) -> str:\n    try:\n        transcript = YouTubeTranscriptApi.get_transcript(video_id)\n        return [f\"ts={entry['start']} - {entry['text']}\" for entry in transcript]\n    except Exception as e:\n        print(f\"Error fetching transcript: {e}\")\n        \"\"\"\n        Error fetching transcript: type object 'YouTubeTranscriptApi' has no attribute 'get_transcript'\n        \"\"\"\n        return \"\"\n\n\ndef extract_chapters(transcript: str):\n    return client.create_iterable(\n        model=\"gpt-4o\",  # You can experiment with different models\n        response_model=Chapter,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Analyze the given YouTube transcript and extract chapters. For each chapter, provide a start timestamp, end timestamp, title, and summary.\",\n            },\n            {\"role\": \"user\", \"content\": transcript},\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    transcripts = get_youtube_transcript(\"jkrNMKz9pWU\")\n\n    for transcript in transcripts[:2]:\n        print(transcript)\n        #> ts=0.539 - hi I am Jeremy Howard from fast.ai and\n        #> ts=4.62 - this is a hacker's guide to language\n\n    formatted_transcripts = ''.join(transcripts)\n    chapters = extract_chapters(formatted_transcripts)\n\n    for chapter in chapters:\n        print(chapter.model_dump_json(indent=2))\n        \"\"\"\n        {\n          \"start_ts\": 0.0,\n          \"end_ts\": 30.0,\n          \"title\": \"Introduction and Topic Overview\",\n          \"summary\": \"Introduction to the video, outlining the main topic of discussion.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 31.0,\n          \"end_ts\": 60.0,\n          \"title\": \"Background Information\",\n          \"summary\": \"Background information relevant to the topic.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 61.0,\n          \"end_ts\": 120.0,\n          \"title\": \"Key Concept Explanation\",\n          \"summary\": \"Detailed explanation of the key concepts.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 121.0,\n          \"end_ts\": 165.0,\n          \"title\": \"Critical Analysis\",\n          \"summary\": \"Analysis and discussion of the critical aspects of the topic.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 166.0,\n          \"end_ts\": 210.0,\n          \"title\": \"Examples and Case Studies\",\n          \"summary\": \"Presentation of examples and case studies related to the topic.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 211.0,\n          \"end_ts\": 240.0,\n          \"title\": \"Conclusion and Final Thoughts\",\n          \"summary\": \"Conclusion of the video with final thoughts on the topic.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 9.72,\n          \"end_ts\": 65.6,\n          \"title\": \"Understanding Language Models\",\n          \"summary\": \"Explains the code-first approach to using language models, suggesting prerequisites such as prior deep learning knowledge and recommends the course.fast.ai for in-depth learning.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 65.6,\n          \"end_ts\": 250.68,\n          \"title\": \"Basics of Language Models\",\n          \"summary\": \"Covers the concept of language models, demonstrating how they predict the next word in a sentence, and showcases OpenAI's text DaVinci for creative brainstorming with examples.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"start_ts\": 250.68,\n          \"end_ts\": 459.199,\n          \"title\": \"How Language Models Work\",\n          \"summary\": \"Dives deeper into how language models like ULMfit and others were developed, their training on datasets like Wikipedia, and the importance of learning various aspects of the world to predict the next word effectively.\"\n        }\n        \"\"\"\n        # ... other chapters\n```\n\n## Alternative Ideas\n\nNow that we've seen a complete example of chapter extraction, let's explore some alternative ideas using different Pydantic models. These models can be used to adapt our YouTube transcript analysis for various applications.\n\n### 1. Study Notes Generator\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Concept(BaseModel):\n    term: str = Field(..., description=\"A key term or concept mentioned in the video\")\n    definition: str = Field(\n        ..., description=\"A brief definition or explanation of the term\"\n    )\n\n\nclass StudyNote(BaseModel):\n    timestamp: float = Field(\n        ..., description=\"The timestamp where this note starts in the video\"\n    )\n    topic: str = Field(..., description=\"The main topic being discussed at this point\")\n    key_points: List[str] = Field(..., description=\"A list of key points discussed\")\n    concepts: List[Concept] = Field(\n        ..., description=\"Important concepts mentioned in this section\"\n    )\n```\n\nThis model structures the video content into clear topics, key points, and important concepts, making it ideal for revision and study purposes.\n\n### 2. Content Summarization\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass ContentSummary(BaseModel):\n    title: str = Field(..., description=\"The title of the video\")\n    duration: float = Field(\n        ..., description=\"The total duration of the video in seconds\"\n    )\n    main_topics: List[str] = Field(\n        ..., description=\"A list of main topics covered in the video\"\n    )\n    key_takeaways: List[str] = Field(\n        ..., description=\"The most important points from the entire video\"\n    )\n    target_audience: str = Field(\n        ..., description=\"The intended audience for this content\"\n    )\n```\n\nThis model provides a high-level overview of the entire video, perfect for quick content analysis or deciding whether a video is worth watching in full.\n\n### 3. Quiz Generator\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass QuizQuestion(BaseModel):\n    question: str = Field(..., description=\"The quiz question\")\n    options: List[str] = Field(\n        ..., min_items=2, max_items=4, description=\"Possible answers to the question\"\n    )\n    correct_answer: int = Field(\n        ...,\n        ge=0,\n        lt=4,\n        description=\"The index of the correct answer in the options list\",\n    )\n    explanation: str = Field(\n        ..., description=\"An explanation of why the correct answer is correct\"\n    )\n\n\nclass VideoQuiz(BaseModel):\n    title: str = Field(\n        ..., description=\"The title of the quiz, based on the video content\"\n    )\n    questions: List[QuizQuestion] = Field(\n        ...,\n        min_items=5,\n        max_items=20,\n        description=\"A list of quiz questions based on the video content\",\n    )\n```\n\nThis model transforms video content into an interactive quiz, perfect for testing comprehension or creating engaging content for social media.\n\nTo use these alternative models, you would replace the `Chapter` model in our original code with one of these alternatives and adjust the system prompt in the `extract_chapters` function accordingly.\n\n## Conclusion\n\nThe power of this approach lies in its flexibility. By defining the result of our function calls as Pydantic Models, we're able to quickly adapt code for a wide variety of applications whether it be generating quizzes, creating study materials or just optimizing for simple SEO."
  },
  {
    "path": "docs/cli/batch.md",
    "content": "---\ntitle: Managing Batch Jobs with Multi-Provider CLI\ndescription: Learn how to create, list, cancel, and delete batch jobs using the unified Command Line Interface (CLI) across OpenAI and Anthropic providers.\n---\n\n# Using the Command Line Interface for Batch Jobs\n\nThe instructor CLI provides comprehensive functionalities for managing batch jobs across multiple providers with a unified interface. This multi-provider support allows users to leverage the strengths of different AI providers for their batch processing needs.\n\n## Supported Providers\n\n- **OpenAI**: Utilizes OpenAI's robust batch processing capabilities with metadata support\n- **Anthropic**: Leverages Anthropic's advanced language models with cancel/delete operations\n\nThe CLI uses a unified `--provider` flag for all commands, with backward compatibility for legacy flags.\n\n```bash\n$ instructor batch --help\n\n Usage: instructor batch [OPTIONS] COMMAND [ARGS]...\n\n Manage OpenAI Batch jobs\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ --help          Show this message and exit.                                  │\n╰──────────────────────────────────────────────────────────────────────────────╯\n╭─ Commands ───────────────────────────────────────────────────────────────────╮\n│ cancel             Cancel a batch job                                        │\n│ create             Create batch job using BatchProcessor                     │\n│ create-from-file   Create a batch job from a file                            │\n│ delete             Delete a completed batch job                              │\n│ download-file      Download the file associated with a batch job             │\n│ list               See all existing batch jobs                               │\n│ results            Retrieve results from a batch job                         │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\n## Creating a Batch Job\n\n### List Jobs with Enhanced Display\n\n```bash\n$ instructor batch list --help\n\n Usage: instructor batch list [OPTIONS]\n\n See all existing batch jobs\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ --limit                                  INTEGER  Total number of batch jobs │\n│                                                   to show                    │\n│                                                   [default: 10]              │\n│ --poll                                   INTEGER  Time in seconds to wait    │\n│                                                   for the batch job to       │\n│                                                   complete                   │\n│                                                   [default: 10]              │\n│ --screen           --no-screen                    Enable or disable screen   │\n│                                                   output                     │\n│                                                   [default: no-screen]       │\n│ --live             --no-live                      Enable live polling to     │\n│                                                   continuously update the    │\n│                                                   table                      │\n│                                                   [default: no-live]         │\n│ --provider                               TEXT     Provider to use (e.g.,     │\n│                                                   'openai', 'anthropic')     │\n│                                                   [default: openai]          │\n│ --use-anthropic    --no-use-anthropic             [DEPRECATED] Use --model   │\n│                                                   instead. Use Anthropic API │\n│                                                   instead of OpenAI          │\n│                                                   [default:                  │\n│                                                   no-use-anthropic]          │\n│ --help                                            Show this message and      │\n│                                                   exit.                      │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\nThe enhanced list command now shows rich information including timestamps, duration, and provider-specific metrics:\n\n```bash\n$ instructor batch list --provider openai --limit 3\n\n                                         Openai Batch Jobs\n┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━┓\n┃ Batch ID           ┃ Status     ┃ Created    ┃ Started    ┃ Duration┃ Completed┃ Failed ┃ Total ┃\n┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━┩\n│ batch_abc123...    │ completed  │ 07/07      │ 07/07      │ 2m      │ 15       │ 0      │ 15    │\n│                    │            │ 23:48      │ 23:48      │         │          │        │       │\n│ batch_def456...    │ processing │ 07/07      │ 07/07      │ 45m     │ 8        │ 0      │ 10    │\n│                    │            │ 22:30      │ 22:31      │         │          │        │       │\n│ batch_ghi789...    │ failed     │ 07/07      │ N/A        │ N/A     │ 0        │ 5      │ 5     │\n│                    │            │ 21:15      │            │         │          │        │       │\n└────────────────────┴────────────┴────────────┴────────────┴─────────┴──────────┴────────┴───────┘\n\n$ instructor batch list --provider anthropic --limit 2\n\n                                           Anthropic Batch Jobs\n┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┓\n┃ Batch ID             ┃ Status     ┃ Created    ┃ Started    ┃ Duration┃ Succeeded┃ Errored ┃ Processing  ┃\n┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━┩\n│ msgbatch_abc123...   │ completed  │ 07/08      │ 07/08      │ 1m      │ 20       │ 0       │ 0           │\n│                      │            │ 03:47      │ 03:47      │         │          │         │             │\n│ msgbatch_def456...   │ processing │ 07/08      │ 07/08      │ 15m     │ 5        │ 0       │ 10          │\n│                      │            │ 03:30      │ 03:30      │         │          │         │             │\n└──────────────────────┴────────────┴────────────┴────────────┴─────────┴──────────┴─────────┴─────────────┘\n```\n\n### Create From File with Metadata Support\n\nYou can create batch jobs directly from pre-formatted .jsonl files with enhanced metadata support:\n\n```bash\n$ instructor batch create-from-file --help\n\n Usage: instructor batch create-from-file [OPTIONS]\n\n Create a batch job from a file\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ *  --file-path                                  TEXT  File containing the    │\n│                                                       batch job requests     │\n│                                                       [default: None]        │\n│                                                       [required]             │\n│    --model                                      TEXT  Model in format        │\n│                                                       'provider/model-name'  │\n│                                                       (e.g., 'openai/gpt-4', │\n│                                                       'anthropic/claude-3-s… │\n│                                                       [default:              │\n│                                                       openai/gpt-4o-mini]    │\n│    --description                                TEXT  Description/metadata   │\n│                                                       for the batch job      │\n│                                                       [default: Instructor   │\n│                                                       batch job]             │\n│    --completion-window                          TEXT  Completion window for  │\n│                                                       the batch job (OpenAI  │\n│                                                       only)                  │\n│                                                       [default: 24h]         │\n│    --use-anthropic        --no-use-anthropic          [DEPRECATED] Use       │\n│                                                       --model instead. Use   │\n│                                                       Anthropic API instead  │\n│                                                       of OpenAI              │\n│                                                       [default:              │\n│                                                       no-use-anthropic]      │\n│    --help                                             Show this message and  │\n│                                                       exit.                  │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\nExample usage with metadata:\n\n```bash\n# OpenAI batch with custom metadata\ninstructor batch create-from-file \\\n    --file-path batch_requests.jsonl \\\n    --model \"openai/gpt-5-nano\" \\\n    --description \"Email classification batch - production v2.1\" \\\n    --completion-window \"24h\"\n\n# Anthropic batch\ninstructor batch create-from-file \\\n    --file-path batch_requests.jsonl \\\n    --model \"anthropic/claude-3-5-sonnet-20241022\" \\\n    --description \"Text analysis batch\"\n```\n\nFor creating .jsonl files, you can use the enhanced `BatchProcessor`:\n\n```python\nfrom instructor.batch import BatchProcessor\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\n\nclass Classification(BaseModel):\n    label: Literal[\"SPAM\", \"NOT_SPAM\"] = Field(\n        ..., description=\"Whether the email is spam or not\"\n    )\n\n# Create processor\nprocessor = BatchProcessor(\"openai/gpt-5-nano\", Classification)\n\n# Prepare message conversations\nmessages_list = [\n    [\n        {\"role\": \"system\", \"content\": \"Classify the following email\"},\n        {\"role\": \"user\", \"content\": \"Hello there I'm a Nigerian prince and I want to give you money\"}\n    ],\n    [\n        {\"role\": \"system\", \"content\": \"Classify the following email\"},\n        {\"role\": \"user\", \"content\": \"Meeting with Thomas has been set at Friday next week\"}\n    ]\n]\n\n# Create batch file\nprocessor.create_batch_from_messages(\n    messages_list=messages_list,\n    file_path=\"batch_requests.jsonl\",\n    max_tokens=100,\n    temperature=0.1\n)\n```\n\n## Job Management Operations\n\n### Cancelling a Batch Job\n\nCancel running batch jobs across all providers:\n\n```bash\n$ instructor batch cancel --help\n\n Usage: instructor batch cancel [OPTIONS]\n\n Cancel a batch job\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ *  --batch-id                               TEXT  Batch job ID to cancel     │\n│                                                   [default: None]            │\n│                                                   [required]                 │\n│    --provider                               TEXT  Provider to use (e.g.,     │\n│                                                   'openai', 'anthropic')     │\n│                                                   [default: openai]          │\n│    --use-anthropic    --no-use-anthropic          [DEPRECATED] Use           │\n│                                                   --provider 'anthropic'     │\n│                                                   instead. Use Anthropic API │\n│                                                   instead of OpenAI          │\n│                                                   [default:                  │\n│                                                   no-use-anthropic]          │\n│    --help                                         Show this message and      │\n│                                                   exit.                      │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\nExamples:\n\n```bash\n# Cancel OpenAI batch\ninstructor batch cancel --batch-id batch_abc123 --provider openai\n\n# Cancel Anthropic batch\ninstructor batch cancel --batch-id msgbatch_def456 --provider anthropic\n```\n\n### Deleting a Batch Job\n\nDelete completed batch jobs (supported by Anthropic):\n\n```bash\n$ instructor batch delete --help\n\n Usage: instructor batch delete [OPTIONS]\n\n Delete a completed batch job\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ *  --batch-id        TEXT  Batch job ID to delete [default: None] [required] │\n│    --provider        TEXT  Provider to use (e.g., 'openai', 'anthropic')     │\n│                            [default: openai]                                 │\n│    --help                  Show this message and exit.                       │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\nExamples:\n\n```bash\n# Delete Anthropic batch (supported)\ninstructor batch delete --batch-id msgbatch_abc123 --provider anthropic\n\n# Try to delete OpenAI batch (shows helpful message)\ninstructor batch delete --batch-id batch_ghi789 --provider openai\n# Note: OpenAI does not support batch deletion via API\n```\n\n### Retrieving Batch Results\n\nGet structured results from completed batch jobs:\n\n```bash\n$ instructor batch results --help\n\n Usage: instructor batch results [OPTIONS]\n\n Retrieve results from a batch job\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ *  --batch-id           TEXT  Batch job ID to get results from               │\n│                               [default: None]                                │\n│                               [required]                                     │\n│ *  --output-file        TEXT  File to save the results to [default: None]    │\n│                               [required]                                     │\n│    --model              TEXT  Model in format 'provider/model-name' (e.g.,   │\n│                               'openai/gpt-4', 'anthropic/claude-3-sonnet')   │\n│                               [default: openai/gpt-4o-mini]                  │\n│    --help                     Show this message and exit.                    │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\nExamples:\n\n```bash\n# Get OpenAI batch results\ninstructor batch results \\\n    --batch-id batch_abc123 \\\n    --output-file openai_results.jsonl \\\n    --model \"openai/gpt-5-nano\"\n\n# Get Anthropic batch results\ninstructor batch results \\\n    --batch-id msgbatch_def456 \\\n    --output-file anthropic_results.jsonl \\\n    --model \"anthropic/claude-3-5-sonnet-20241022\"\n```\n\n### Downloading Raw Files (Legacy)\n\nFor compatibility, the download-file command is still available:\n\n```bash\n$ instructor batch download-file --help\n\n Usage: instructor batch download-file [OPTIONS]\n\n Download the file associated with a batch job\n\n╭─ Options ────────────────────────────────────────────────────────────────────╮\n│ *  --batch-id                  TEXT  Batch job ID to download                │\n│                                      [default: None]                         │\n│                                      [required]                              │\n│ *  --download-file-path        TEXT  Path to download file to                │\n│                                      [default: None]                         │\n│                                      [required]                              │\n│    --provider                  TEXT  Provider to use (e.g., 'openai',        │\n│                                      'anthropic')                            │\n│                                      [default: openai]                       │\n│    --help                            Show this message and exit.             │\n╰──────────────────────────────────────────────────────────────────────────────╯\n```\n\n## Provider Support Matrix\n\n| Operation | OpenAI | Anthropic |\n|-----------|--------|-----------|\n| **List**  | ✅ Enhanced table | ✅ Enhanced table |\n| **Create** | ✅ With metadata | ✅ File-based |\n| **Cancel** | ✅ Standard API | ✅ Standard API |\n| **Delete** | ❌ Not supported | ✅ Standard API |\n| **Results** | ✅ Structured parsing | ✅ Structured parsing |\n\n## Enhanced Features\n\n- **Rich CLI Tables**: Color-coded status, timestamps, duration calculations\n- **Metadata Support**: Add descriptions and custom fields to organize batches\n- **Unified Commands**: Same interface works across all providers\n- **Provider Detection**: Automatic provider detection from model strings\n- **Error Handling**: Clear error messages and helpful notes for unsupported operations\n- **Backward Compatibility**: Legacy flags still work with deprecation warnings\n\nThis comprehensive CLI interface provides efficient batch job management across all supported providers with enhanced monitoring and control capabilities.\n"
  },
  {
    "path": "docs/cli/finetune.md",
    "content": "---\ntitle: Managing Fine-Tuning Jobs with the Instructor CLI\ndescription: Learn how to create, view, and manage fine-tuning jobs on OpenAI using the Instructor CLI, with essential commands and options.\n---\n\n# Using the Command Line Interface\n\nThe instructor CLI provides functionalities for managing fine-tuning jobs on OpenAI.\n\n!!! warning \"Incomplete API\"\nThe CLI is still under development and does not yet support all features of the API. If you would like to use a feature that is not yet supported, please consider using the contributing to our library [jxnl/instructor](https://www.github.com/jxnl/instructor) instead.\n\n    !!! note \"Low hanging fruit\"\n\n        If you want to contribute we're looking for a few things:\n\n        1. Adding filenames on upload\n\n## Creating a Fine-Tuning Job\n\n### View Jobs Options\n\n```sh\n$ instructor jobs --help\n\n Usage: instructor jobs [OPTIONS] COMMAND [ARGS]...\n\n Monitor and create fine tuning jobs\n\n╭─ Options ───────────────────────────────────────────────────────────────────────────────╮\n│ --help                            Display the help message.                             │\n╰─────────────────────────────────────────────────────────────────────────────────────────╯\n╭─ Commands ──────────────────────────────────────────────────────────────────────────────────────────────────╮\n│ cancel                    Cancel a fine-tuning job.                                                         │\n│ create-from-file          Create a fine-tuning job from a file.                                             │\n│ create-from-id            Create a fine-tuning job from an existing ID.                                     │\n│ list                      Monitor the status of the most recent fine-tuning jobs.                           │\n╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\n\n```\n\n### Create from File\n\nThe create-from-file command uploads and trains a model in a single step.\n\n```sh\n❯ instructor jobs create-from-file --help\n\nUsage: instructor jobs create-from-file [OPTIONS] FILE\n\n Create a fine-tuning job from a file.\n\n╭─ Arguments ───────────────────────────────────────────────────────────────────────────────────────╮\n│ *    file      TEXT  Path to the file for fine-tuning [default: None] [required]                  │\n╰───────────────────────────────────────────────────────────────────────────────────────────────────╯\n╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────╮\n│ --model                           TEXT     Model to use for fine-tuning [default: gpt-3.5-turbo]  │\n│ --poll                            INTEGER  Polling interval in seconds [default: 2]               │\n│ --n-epochs                        INTEGER  Number of epochs for fine-tuning                       │\n│ --batch-size                      TEXT     Batch size for fine-tuning                             │\n│ --learning-rate-multiplier        TEXT     Learning rate multiplier for fine-tuning               │\n│ --validation-file                 TEXT     Path to the validation file [default: None]            │\n│ --model-suffix                    TEXT     Suffix to identify the model [default: None]           │\n│ --help                                     Show this message and exit.                            │\n╰────────────────────────────────────────────────────────────────────────────────\n```\n\n#### Usage\n\n```sh\n$ instructor jobs create-from-file transformed_data.jsonl --validation_file validation_data.jsonl --n_epochs 3 --batch_size 16 --learning_rate_multiplier 0.5\n```\n\n### Create from ID\n\nThe create-from-id command uses an uploaded file and trains a model\n\n```sh\n❯ instructor jobs create-from-id --help\n\n Usage: instructor jobs create-from-id [OPTIONS] ID\n\n Create a fine-tuning job from an existing ID.\n\n╭─ Arguments ───────────────────────────────────────────────────────────────────────────╮\n│ *    id      TEXT  ID of the existing fine-tuning job [default: None] [required]      │\n╰───────────────────────────────────────────────────────────────────────────────────────╯\n╭─ Options ─────────────────────────────────────────────────────────────────────────────╮\n│ --model                           TEXT     Model to use for fine-tuning               │\n│                                            [default: gpt-3.5-turbo]                   │\n│ --n-epochs                        INTEGER  Number of epochs for fine-tuning           │\n│ --batch-size                      TEXT     Batch size for fine-tuning                 │\n│ --learning-rate-multiplier        TEXT     Learning rate multiplier for fine-tuning   │\n│ --validation-file-id              TEXT     ID of the uploaded validation file         │\n│                                            [default: None]                            │\n│ --help                                     Show this message and exit.                │\n╰───────────────────────────────────────────────────────────────────────────────────────╯\n```\n\n#### Usage\n\n```sh\n$ instructor files upload transformed_data.jsonl\n$ instructor files upload validation_data.jsonl\n$ instructor files list\n...\n$ instructor jobs create_from_id <file_id> --validation_file <validation_file_id> --n_epochs 3 --batch_size 16 --learning_rate_multiplier 0.5\n```\n\n### Viewing Files and Jobs\n\n#### Viewing Jobs\n\n```sh\n$ instructor jobs list\n\nOpenAI Fine Tuning Job Monitoring\n┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━┓\n┃                ┃              ┃                ┃     Completion ┃                 ┃                ┃        ┃                 ┃\n┃ Job ID         ┃ Status       ┃  Creation Time ┃           Time ┃ Model Name      ┃ File ID        ┃ Epochs ┃ Base Model      ┃\n┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━┩\n│ ftjob-PWo6uwk... │ 🚫 cancelled │     2023-08-23 │            N/A │                 │ file-F7lJg6Z4... │ 3      │ gpt-3.5-turbo-... │\n│                │              │       23:10:54 │                │                 │                │        │                 │\n│ ftjob-1whjva8... │ 🚫 cancelled │     2023-08-23 │            N/A │                 │ file-F7lJg6Z4... │ 3      │ gpt-3.5-turbo-... │\n│                │              │       22:47:05 │                │                 │                │        │                 │\n│ ftjob-wGoBDld... │ 🚫 cancelled │     2023-08-23 │            N/A │                 │ file-F7lJg6Z4... │ 3      │ gpt-3.5-turbo-... │\n│                │              │       22:44:12 │                │                 │                │        │                 │\n│ ftjob-yd5aRTc... │ ✅ succeeded │     2023-08-23 │     2023-08-23 │ ft:gpt-3.5-tur... │ file-IQxAUDqX... │ 3      │ gpt-3.5-turbo-... │\n│                │              │       14:26:03 │       15:02:29 │                 │                │        │                 │\n└────────────────┴──────────────┴────────────────┴────────────────┴─────────────────┴────────────────┴────────┴─────────────────┘\n                                    Automatically refreshes every 5 seconds, press Ctrl+C to exit\n```\n\n#### Viewing Files\n\n```sh\n$ instructor files list\n\nOpenAI Files\n┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━┓\n┃ File ID                       ┃ Size (bytes) ┃ Creation Time       ┃ Filename ┃ Purpose   ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━┩\n│ file-0lw2BSNRUlXZXRRu2beCCWjl │       369523 │ 2023-08-23 23:31:57 │ file     │ fine-tune │\n│ file-IHaUXcMEykmFUp1kt2puCDEq │       369523 │ 2023-08-23 23:09:35 │ file     │ fine-tune │\n│ file-ja9vRBf0FydEOTolaa3BMqES │       369523 │ 2023-08-23 22:42:29 │ file     │ fine-tune │\n│ file-F7lJg6Z47CREvmx4kyvyZ6Sn │       369523 │ 2023-08-23 22:42:03 │ file     │ fine-tune │\n│ file-YUxqZPyJRl5GJCUTw3cNmA46 │       369523 │ 2023-08-23 22:29:10 │ file     │ fine-tune │\n└───────────────────────────────┴──────────────┴─────────────────────┴──────────┴───────────┘\n```\n\n# Contributions\n\nWe aim to provide a light wrapper around the API rather than offering a complete CLI. Contributions are welcome! Please feel free to make an issue at [jxnl/instructor/issues](https://github.com/jxnl/instructor/issues) or submit a pull request.\n"
  },
  {
    "path": "docs/cli/index.md",
    "content": "---\ntitle: Instructor CLI Tools\ndescription: Command-line utilities for monitoring API usage, fine-tuning models, and accessing documentation.\n---\n\n# Instructor CLI Tools\n\n<div class=\"grid cards\" markdown>\n\n- :material-console: **Command Line Utilities**\n\n    Powerful tools to enhance your Instructor workflow\n\n    [:octicons-arrow-right-16: View Commands](#available-commands)\n\n- :material-chart-line: **Usage Monitoring**\n\n    Track API usage, costs, and token consumption\n\n    [:octicons-arrow-right-16: Usage Guide](usage.md)\n\n- :material-tune-vertical: **Model Fine-Tuning**\n\n    Create and manage custom model versions\n\n    [:octicons-arrow-right-16: Fine-Tuning Guide](finetune.md)\n\n- :material-book-open-variant: **Documentation Access**\n\n    Quickly access docs from your terminal\n\n    [:octicons-arrow-right-16: Docs Command](#documentation-command)\n\n</div>\n\n## Getting Started\n\n### Installation\n\nThe CLI tools are included with the Instructor package:\n\n```bash\npip install instructor\n```\n\n### API Setup\n\nSet your OpenAI API key as an environment variable:\n\n```bash\nexport OPENAI_API_KEY=\"your-api-key-here\"\n```\n\n## Available Commands\n\nInstructor provides several command-line utilities:\n\n| Command | Description | Guide |\n|---------|-------------|-------|\n| `instructor usage` | Track API usage and costs | [Usage Guide](usage.md) |\n| `instructor finetune` | Create and manage fine-tuned models | [Fine-Tuning Guide](finetune.md) |\n| `instructor docs` | Quick access to documentation | [See below](#documentation-command) |\n\n## Usage Command\n\nMonitor your OpenAI API usage directly from the terminal:\n\n```bash\n# View total usage for the current month\ninstructor usage\n\n# View usage breakdown by day\ninstructor usage --by-day\n\n# Calculate cost for a specific model\ninstructor usage --model gpt-4\n```\n\nFor detailed usage statistics and options, see the [Usage Guide](usage.md).\n\n## Fine-Tuning Command\n\nCreate and manage fine-tuned models with an interactive interface:\n\n```bash\n# Start the fine-tuning interface\ninstructor finetune\n```\n\nThis launches an interactive application that guides you through the fine-tuning process. Learn more in the [Fine-Tuning Guide](finetune.md).\n\n## Documentation Command\n\nQuickly access Instructor documentation from your terminal:\n\n```bash\n# Open main documentation\ninstructor docs\n\n# Search for specific topic\ninstructor docs validation\n\n# Open specific page\ninstructor docs concepts/models\n```\n\nThis command opens the Instructor documentation in your default web browser, making it easy to find information when you need it.\n\n## Support & Contribution\n\n- **GitHub**: Visit our [GitHub Repository](https://github.com/jxnl/instructor)\n- **Issues**: Report bugs or request features on our [Issue Tracker](https://github.com/jxnl/instructor/issues)\n- **Discord**: Join our [Discord Community](https://discord.gg/bD9YE9JArw) for support\n"
  },
  {
    "path": "docs/cli/usage.md",
    "content": "---\ntitle: OpenAI API Usage CLI Guide\ndescription: Learn how to monitor OpenAI API usage with the CLI tool, including commands for viewing data by model, date, and cost.\n---\n\n# Using the OpenAI API Usage CLI\n\nThe OpenAI API Usage CLI tool provides functionalities for monitoring your OpenAI API usage, breaking it down by model, date, and cost.\n\n## Monitoring API Usage\n\n### View Usage Options\n\n```sh\n$ instructor usage --help\n\n Usage: instructor usage [OPTIONS] COMMAND [ARGS]...\n\n Check OpenAI API usage data\n\n╭─ Options ───────────────────────────────────────────────────────╮\n│ --help          Show this message and exit.                     │\n╰─────────────────────────────────────────────────────────────────╯\n╭─ Commands ──────────────────────────────────────────────────────╮\n│ list       Displays OpenAI API usage data for the past N days.  │\n╰─────────────────────────────────────────────────────────────────╯\n```\n\n### List Usage for Specific Number of Days\n\nTo display API usage for the past 3 days, use the following command:\n\n```sh\n$ instructor usage list --n 3\n```\n\nThis will output a table similar to:\n\n```plaintext\n                 Usage Summary by Date, Snapshot, and Cost\n┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓\n┃ Date       ┃ Snapshot ID               ┃ Total Requests ┃ Total Cost ($) ┃\n┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩\n│ 2023-09-04 │ gpt-4-0613                │             44 │           0.68 │\n│ 2023-09-04 │ gpt-3.5-turbo-16k-0613    │            195 │           0.84 │\n│ 2023-09-04 │ text-embedding-ada-002-v2 │            276 │           0.00 │\n│ 2023-09-04 │ gpt-4-32k-0613            │            328 │          49.45 │\n└────────────┴───────────────────────────┴────────────────┴────────────────┘\n```\n\n### List Usage for Today\n\nTo display the API usage for today, simply run:\n\n```sh\n$ instructor usage list\n```\n\n# Contributions\n\nWe aim to provide a light wrapper around the API rather than offering a complete CLI. Contributions are welcome! Please feel free to make an issue at [jxnl/instructor/issues](https://github.com/jxnl/instructor/issues) or submit a pull request.\n"
  },
  {
    "path": "docs/concepts/alias.md",
    "content": "---\ntitle: Pydantic Aliases Overview\ndescription: Explore the concept of aliases in Pydantic. Discover the latest documentation and features for better data validation.\n---\n\n## See Also\n\n- [Fields](./fields.md) - Customizing field metadata\n- [Response Models](./models.md) - Working with Pydantic models\n- [Types](./types.md) - Working with different data types\n- [Prompting](./prompting.md) - Prompt engineering techniques\n\n!!! warning \"This page is a work in progress\"\n\n    This page is a work in progress. Check out [Pydantic's documentation](https://docs.pydantic.dev/latest/concepts/alias/)\n"
  },
  {
    "path": "docs/concepts/batch.md",
    "content": "---\ntitle: Batch Processing\ndescription: Process multiple LLM requests efficiently using batch processing for 50% cost savings.\n---\n\n# Batch Processing\n\nBatch processing lets you send multiple requests in a single operation, saving up to 50% on costs. Instructor supports batch processing across multiple providers.\n\n## Supported Providers\n\n| Provider | Models | Cost Savings |\n|----------|--------|--------------|\n| OpenAI | gpt-4o, gpt-4.1-mini, gpt-4-turbo | 50% |\n| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-haiku | 50% |\n| Google GenAI | gemini-2.5-flash, gemini-2.0-flash, gemini-pro | 50% |\n\n## Basic Usage\n\n```python\nfrom instructor.batch import BatchProcessor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nprocessor = BatchProcessor(\"openai/gpt-4.1-mini\", User)\n\nmessages_list = [\n    [\n        {\"role\": \"system\", \"content\": \"Extract user information from text.\"},\n        {\"role\": \"user\", \"content\": \"Hi, I'm Alice and I'm 28 years old.\"},\n    ],\n    [\n        {\"role\": \"system\", \"content\": \"Extract user information from text.\"},\n        {\"role\": \"user\", \"content\": \"Hello, I'm Bob, 35 years old.\"},\n    ],\n]\n\n# Create batch file\nprocessor.create_batch_from_messages(\n    file_path=\"batch_requests.jsonl\",\n    messages_list=messages_list,\n    max_tokens=200,\n    temperature=0.1,\n)\n\n# Submit batch job\nbatch_id = processor.submit_batch(\"batch_requests.jsonl\")\nprint(f\"Batch job submitted: {batch_id}\")\n\n# Check status and retrieve results\nstatus = processor.get_batch_status(batch_id)\nif status['status'] in ['completed', 'ended', 'JOB_STATE_SUCCEEDED']:\n    from instructor.batch import filter_successful, extract_results\n\n    all_results = processor.retrieve_results(batch_id)\n    for user in extract_results(all_results):\n        print(f\"Name: {user.name}, Age: {user.age}\")\n```\n\n## In-Memory Processing\n\nFor serverless deployments, use in-memory mode by setting `file_path=None`:\n\n```python\nimport time\nfrom instructor.batch import BatchProcessor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nprocessor = BatchProcessor(\"openai/gpt-4.1-mini\", User)\n\nmessages_list = [\n    [{\"role\": \"user\", \"content\": \"Extract: John is 25 years old\"}],\n    [{\"role\": \"user\", \"content\": \"Extract: Jane is 30 years old\"}],\n]\n\n# Create in-memory buffer (no file_path)\nbuffer = processor.create_batch_from_messages(\n    messages_list,\n    file_path=None,\n    max_tokens=150,\n)\n\n# Submit and poll for results\nbatch_id = processor.submit_batch(buffer)\n\nwhile True:\n    status = processor.get_batch_status(batch_id)\n    if status.get(\"status\") in [\"completed\", \"failed\", \"cancelled\"]:\n        break\n    time.sleep(10)\n\nif status.get(\"status\") == \"completed\":\n    results = processor.get_results(batch_id)\n    for r in results:\n        if hasattr(r, \"result\"):\n            print(f\"{r.result.name}, {r.result.age}\")\n```\n\n### When to Use Each Approach\n\n| Use Case | Approach |\n|----------|----------|\n| Serverless (Lambda, Cloud Functions) | In-memory |\n| Large batch jobs | File-based |\n| Security-sensitive environments | In-memory |\n| Debugging/audit requirements | File-based |\n\n## Provider Setup\n\n### OpenAI\n\n```bash\nexport OPENAI_API_KEY=\"your-openai-key\"\n```\n\n```python\nprocessor = BatchProcessor(\"openai/gpt-4.1-mini\", User)\n```\n\n### Anthropic\n\n```bash\nexport ANTHROPIC_API_KEY=\"your-anthropic-key\"\n```\n\n```python\nprocessor = BatchProcessor(\"anthropic/claude-3-5-sonnet-20241022\", User)\n```\n\n### Google GenAI\n\n```bash\nexport GOOGLE_CLOUD_PROJECT=\"your-project-id\"\nexport GCS_BUCKET=\"your-gcs-bucket-name\"\nexport GOOGLE_APPLICATION_CREDENTIALS=\"/path/to/service-account.json\"\n```\n\n```python\nprocessor = BatchProcessor(\"google/gemini-2.5-flash\", User)\n```\n\nRequired permissions: `roles/aiplatform.user` and `roles/storage.objectUser`.\n\n## Processing Results\n\nResults use a Maybe/Result pattern for type-safe handling:\n\n```python\nfrom instructor.batch import (\n    BatchProcessor,\n    filter_successful,\n    filter_errors,\n    extract_results,\n    get_results_by_custom_id,\n)\n\nall_results = processor.retrieve_results(batch_id)\n\n# Filter by type\nsuccessful = filter_successful(all_results)  # List[BatchSuccess[T]]\nerrors = filter_errors(all_results)           # List[BatchError]\nobjects = extract_results(all_results)        # List[T]\n\n# Access by custom_id\nby_id = get_results_by_custom_id(all_results)\nif \"request-1\" in by_id:\n    result = by_id[\"request-1\"]\n    if result.success:\n        print(f\"Success: {result.result}\")\n    else:\n        print(f\"Error: {result.error_message}\")\n```\n\n## API Reference\n\n| Method | Description |\n|--------|-------------|\n| `create_batch_from_messages(messages_list, file_path=None, ...)` | Create batch file or buffer |\n| `submit_batch(file_path_or_buffer, metadata=None)` | Submit batch job, returns job ID |\n| `get_batch_status(batch_id)` | Get job status |\n| `retrieve_results(batch_id)` | Download and parse results |\n| `parse_results(content)` | Parse raw results content |\n\n## CLI Commands\n\n```bash\n# List batch jobs\ninstructor batch list --model \"openai/gpt-4.1-mini\"\n\n# Create batch from file\ninstructor batch create-from-file --file-path batch.jsonl --model \"openai/gpt-4.1-mini\"\n\n# Get batch results\ninstructor batch results --batch-id \"batch_abc123\" --output-file results.jsonl\n```\n\n## Best Practices\n\n1. **Batch size**: Include at least 25,000 requests per job for optimal efficiency\n2. **Cost optimization**: Use batch processing for non-urgent workloads\n3. **Error handling**: Always check both successful and error results\n4. **Timeouts**: Batch jobs have execution limits (24 hours for Google)\n5. **Storage**: For Google, ensure GCS bucket is in the same region as your batch job\n\n## Troubleshooting\n\n| Issue | Solution |\n|-------|----------|\n| Missing GCS_BUCKET (Google) | Set the `GCS_BUCKET` environment variable |\n| Permission Denied (Google) | Add `aiplatform.user` and `storage.objectUser` roles |\n| Invalid Model Name | Use format `provider/model-name` |\n| Authentication Error | Verify API keys are set correctly |\n"
  },
  {
    "path": "docs/concepts/caching.md",
    "content": "## See Also\n\n- [Prompt Caching](./prompt_caching.md) - Cache prompts for cost optimization\n- [Performance Optimization](../examples/sqlmodel.md#performance-optimization) - Performance best practices\n- [Cost Optimization](../examples/batch_job_oai.md) - Reduce API costs\n- [Hooks](./hooks.md) - Monitor cache hits and misses\n\n---\ntitle: Caching Strategies with Instructor\ndescription: Learn how to use caching with Instructor to reduce API costs and improve performance.\n---\n\nFor more details on caching concepts, see our [blog](../blog/posts/caching.md).\n\n## Built-in Caching (v1.9.1 and later)\n\nInstructor supports caching for every client. Pass a cache adapter when you create the client. The cache parameter flows through to all provider implementations via **kwargs:\n\n```python\nfrom instructor import from_provider\nfrom instructor.cache import AutoCache, DiskCache\n\n# Works with any provider - cache flows through **kwargs automatically\nclient = from_provider(\"openai/gpt-4.1-mini\", cache=AutoCache(maxsize=1000))\nclient = from_provider(\"anthropic/claude-3-haiku\", cache=AutoCache(maxsize=1000))\nclient = from_provider(\"google/gemini-2.5-flash\", cache=DiskCache(directory=\".cache\"))\n\n# Your normal calls are now cached automatically\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n\n\nfirst = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Hi.\"}], response_model=User\n)\nsecond = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Hi.\"}], response_model=User\n)\nassert first.name == second.name  # second call was served from cache\n```\n\n### `cache_ttl` per-call override\n\nPass `cache_ttl=<seconds>` alongside `cache=` if you want a result to\nexpire automatically:\n\n```python\nfrom instructor import from_provider\nfrom instructor.cache import DiskCache\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n\n\ncache = DiskCache(directory=\".cache\")\nclient = from_provider(\"openai/gpt-4.1-mini\")\n\nclient.create(\n    messages=[{\"role\": \"user\", \"content\": \"Hi\"}],\n    response_model=User,\n    cache=cache,\n    cache_ttl=3600,  # 1 hour\n)\n```\n\nIf the underlying cache backend supports TTL (e.g. `DiskCache` does), the\nentry will be evicted after the specified duration.  For `AutoCache` the\nparameter is ignored.\n\n### Cache-key design\n\nUnder the hood Instructor generates a **deterministic** key for every\n call using `instructor.cache.make_cache_key`.\n\nComponents that influence the key:\n\n| Part                        | Why it matters                               |\n|-----------------------------|----------------------------------------------|\n| `model`                     | Different model names can yield different answers |\n| `messages` / `contents`     | The full chat history is hashed              |\n| `mode`                      | JSON vs. TOOLS vs. RESPONSES changes formatting |\n| `response_model` schema     | The entire `model_json_schema()` is included so **any** change in field names, types or *descriptions* busts the cache automatically |\n\nThe function returns a SHA-256 hex digest; its length is constant regardless\nof prompt size, so it is safe to use as a Redis key, file path, etc.\n\n```python\nfrom instructor.cache import make_cache_key\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n\n\nkey = make_cache_key(\n    messages=[{\"role\": \"user\", \"content\": \"hello\"}],\n    model=\"gpt-4.1-mini\",\n    response_model=User,\n    mode=\"TOOLS\",\n)\nprint(key)  # → 9b8f5e2c8c9e…\n#> 2e2a9521bd269d62ee9a8559d7deacba0025c1f6da0ec1fc63d472788be096fe\n```\n\nIf you need custom behaviour (e.g. ignoring certain prompt fields) you can\nwrite your own helper and pass a derived key into a bespoke cache adapter.\n\n### Raw Response Reconstruction\n\nFor raw completion objects (used with `create_with_completion`), we use a `SimpleNamespace` trick to reconstruct the original object structure:\n\n```python\nfrom pydantic import BaseModel\n\n\nclass Completion(BaseModel):\n    content: str\n    usage: dict\n\n\n# Example completion object\ncompletion = Completion(content=\"Hello\", usage={\"tokens\": 10})\n\n# When caching:\nraw_json = completion.model_dump_json()  # Serialize to JSON\n\n# When restoring from cache:\nimport json\nfrom types import SimpleNamespace\n\nrestored = json.loads(raw_json, object_hook=lambda d: SimpleNamespace(**d))\n```\n\nThis approach allows us to restore the original dot-notation access patterns (e.g., `completion.usage.total_tokens`) without requiring the original class definitions. The `SimpleNamespace` objects behave identically to the original completion objects for attribute access while being much simpler to reconstruct from JSON.\n\n## 1. `functools.cache` for Simple In-Memory Caching\n\n**When to Use**: Good for functions with immutable arguments, called repeatedly with the same parameters in small to medium-sized applications. Use this when reusing the same data within a single session.\n\n```python\nimport time\nimport functools\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@functools.cache\ndef extract(data) -> UserDetail:\n    return client.create(\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\nstart = time.perf_counter()  # (1)\nmodel = extract(\"Extract jason is 25 years old\")\nprint(f\"Time taken: {time.perf_counter() - start}\")\n#> Time taken: 0.43337099999189377\n\nstart = time.perf_counter()\nmodel = extract(\"Extract jason is 25 years old\")  # (2)\nprint(f\"Time taken: {time.perf_counter() - start}\")\n#> Time taken: 1.166015863418579e-06\n```\n\n1. Using `time.perf_counter()` to measure the time taken to run the function is better than using `time.time()` because it's more accurate and less susceptible to system clock changes.\n2. The second time we call `extract`, the result is returned from the cache, and the function is not called.\n\n!!! warning \"Changing the Model does not Invalidate the Cache\"\n\n    Note that changing the model does not invalidate the cache. This is because the cache key is based on the function's name and arguments, not the model. This means that if we change the model, the cache will still return the old result.\n\nCall `extract` multiple times with the same argument, and the result will be cached in memory for faster access.\n\n**Benefits**: Easy to implement, fast access due to in-memory storage, and requires no additional libraries.\n\n??? question \"What is a decorator?\"\n\n    A decorator is a function that takes another function and extends the behavior of the latter function without explicitly modifying it. In Python, decorators are functions that take a function as an argument and return a closure.\n\n    ```python hl_lines=\"3-5 9\"\n    def decorator(func):\n        def wrapper(*args, **kwargs):\n            print(\"Do something before\")  # (1)\n            #> Do something before\n            result = func(*args, **kwargs)\n            print(\"Do something after\")  # (2)\n            #> Do something after\n            return result\n\n        return wrapper\n\n\n    @decorator\n    def say_hello():\n        #> Hello!\n        print(\"Hello!\")\n        #> Hello!\n\n\n    say_hello()\n    #> \"Do something before\"\n    #> \"Hello!\"\n    #> \"Do something after\"\n    ```\n\n    1. The code is executed before the function is called\n    2. The code is executed after the function is called\n\n## 2. `diskcache` for Persistent, Large Data Caching\n\n??? note \"Copy Caching Code\"\n\n    The same `instructor_cache` decorator works for both `diskcache` and `redis` caching. Copy the code below and use it for both examples.\n\n    ```python\n    import functools\n    import inspect\n    import diskcache\n\n    cache = diskcache.Cache('./my_cache_directory')  # (1)\n\n\n    def instructor_cache(func):\n        \"\"\"Cache a function that returns a Pydantic model\"\"\"\n        return_type = inspect.signature(func).return_annotation\n        if not issubclass(return_type, BaseModel):  # (2)\n            raise ValueError(\"The return type must be a Pydantic model\")\n\n        @functools.wraps(func)\n        def wrapper(*args, **kwargs):\n            key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"\n            # Check if the result is already cached\n            if (cached := cache.get(key)) is not None:\n                # Deserialize from JSON based on the return type\n                return return_type.model_validate_json(cached)\n\n            # Call the function and cache its result\n            result = func(*args, **kwargs)\n            serialized_result = result.model_dump_json()\n            cache.set(key, serialized_result)\n\n            return result\n\n        return wrapper\n    ```\n\n    1. We create a new `diskcache.Cache` instance to store the cached data. This will create a new directory called `my_cache_directory` in the current working directory.\n    2. We only want to cache functions that return a Pydantic model to simplify serialization and deserialization logic in this example code\n\n    Remember that you can change this code to support non-Pydantic models, or to use a different caching backend. More over, don't forget that this cache does not invalidate when the model changes, so you might want to encode the `Model.model_json_schema()` as part of the key.\n\n**When to Use**: Good for applications that need cache persistence between sessions or deal with large datasets. Use this when you want to reuse the same data across multiple sessions or store large amounts of data.\n\n```python hl_lines=\"10\"\nimport functools\nimport inspect\nimport instructor\nimport diskcache\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\ncache = diskcache.Cache('./my_cache_directory')\n\n\ndef instructor_cache(func):\n    \"\"\"Cache a function that returns a Pydantic model\"\"\"\n    return_type = inspect.signature(func).return_annotation  # (4)\n    if not issubclass(return_type, BaseModel):  # (1)\n        raise ValueError(\"The return type must be a Pydantic model\")\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        key = (\n            f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"  #  (2)\n        )\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type (3)\n            return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    return wrapper\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@instructor_cache\ndef extract(data) -> UserDetail:\n    return client.create(\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\n1. We only want to cache functions that return a Pydantic model to simplify serialization and deserialization logic\n2. We use functool's `_make_key` to generate a unique key based on the function's name and arguments. This is important because we want to cache the result of each function call separately.\n3. We use Pydantic's `model_validate_json` to deserialize the cached result into a Pydantic model.\n4. We use `inspect.signature` to get the function's return type annotation, which we use to validate the cached result.\n\n**Benefits**: Reduces computation time for heavy data processing and provides disk-based caching for persistence.\n\n## 3. Redis Caching Decorator for Distributed Systems\n\n??? note \"Copy Caching Code\"\n\n    The same `instructor_cache` decorator works for both `diskcache` and `redis` caching. Copy the code below and use it for both examples.\n\n    ```python\n    import functools\n    import inspect\n    import redis\n\n    cache = redis.Redis(\"localhost\")\n\n\n    def instructor_cache(func):\n        \"\"\"Cache a function that returns a Pydantic model\"\"\"\n        return_type = inspect.signature(func).return_annotation\n        if not issubclass(return_type, BaseModel):\n            raise ValueError(\"The return type must be a Pydantic model\")\n\n        @functools.wraps(func)\n        def wrapper(*args, **kwargs):\n            key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"\n            # Check if the result is already cached\n            if (cached := cache.get(key)) is not None:\n                # Deserialize from JSON based on the return type\n                return return_type.model_validate_json(cached)\n\n            # Call the function and cache its result\n            result = func(*args, **kwargs)\n            serialized_result = result.model_dump_json()\n            cache.set(key, serialized_result)\n\n            return result\n\n        return wrapper\n    ```\n\n    Remember that you can change this code to support non-Pydantic models, or to use a different caching backend. More over, don't forget that this cache does not invalidate when the model changes, so you might want to encode the `Model.model_json_schema()` as part of the key.\n\n**When to Use**: Good for distributed systems where multiple processes need to access cached data, or for applications that need fast read/write access and handle complex data structures.\n\n```python\nimport redis\nimport functools\nimport inspect\nimport instructor\n\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\ncache = redis.Redis(\"localhost\")\n\n\ndef instructor_cache(func):\n    \"\"\"Cache a function that returns a Pydantic model\"\"\"\n    return_type = inspect.signature(func).return_annotation\n    if not issubclass(return_type, BaseModel):  # (1)\n        raise ValueError(\"The return type must be a Pydantic model\")\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"  # (2)\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type\n            return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    return wrapper\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@instructor_cache\ndef extract(data) -> UserDetail:\n    # Assuming client.chat.completions.create returns a UserDetail instance\n    return client.create(\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n```\n\n1. We only want to cache functions that return a Pydantic model to simplify serialization and deserialization logic\n2. We use functool's `_make_key` to generate a unique key based on the function's name and arguments. This is important because we want to cache the result of each function call separately.\n\n**Benefits**: Scalable for large-scale systems, supports fast in-memory data storage and retrieval, and works with various data types.\n\n!!! note \"Same Decorator, Different Backend\"\n\n    The code above uses the same `instructor_cache` decorator as before. The implementation is the same, but it uses a different caching backend.\n"
  },
  {
    "path": "docs/concepts/citation.md",
    "content": "---\ntitle: Citation Extraction with CitationMixin\ndescription: Learn how to extract and validate citations from source text using CitationMixin to prevent hallucinations.\n---\n\n# Citation Extraction with CitationMixin\n\nCitationMixin is a Pydantic mixin that helps extract and validate citations from source text. It ensures that quotes used in your extracted data actually exist in the source context, preventing hallucinations.\n\n## What is CitationMixin?\n\nCitationMixin adds citation validation to your Pydantic models. When you use it, your model gets a `substring_quotes` field that contains quotes from the source text. The mixin automatically validates that these quotes exist in the source and corrects them to match exact spans.\n\n## Basic Usage\n\nInherit from CitationMixin to add citation support to your model:\n\n```python\nfrom pydantic import BaseModel, Field\nfrom instructor import CitationMixin\nimport instructor\n\n\nclass User(CitationMixin, BaseModel):\n    name: str = Field(description=\"The name of the person\")\n    age: int = Field(description=\"The age of the person\")\n    role: str = Field(description=\"The role of the person\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\ncontext = \"Betty was a student. Jason was a student. Jason is 20 years old\"\n\nuser = client.create(\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Extract information about Jason from: {context}\",\n        },\n    ],\n    context={\"context\": context},\n)\n\n# Verify quotes exist in context\nfor quote in user.substring_quotes:\n    assert quote in context\n\nprint(user.model_dump())\n# {\n#     \"name\": \"Jason\",\n#     \"age\": 20,\n#     \"role\": \"student\",\n#     \"substring_quotes\": [\n#         \"Jason was a student\",\n#         \"Jason is 20 years old\",\n#     ]\n# }\n```\n\n## How It Works\n\nCitationMixin works in three steps:\n\n1. **Extraction**: The LLM extracts data and provides quotes in the `substring_quotes` field\n2. **Validation**: The mixin checks if each quote exists in the source context using fuzzy matching\n3. **Correction**: Quotes are corrected to match exact spans in the source text\n\nThe validation happens automatically when you pass `context={\"context\": source_text}` to your `create()` call.\n\n## Using with Validation Context\n\nCitationMixin uses Pydantic's validation context to access the source text. Pass the source text in the `context` parameter:\n\n```python\nfrom pydantic import BaseModel, Field\nfrom instructor import CitationMixin\nimport instructor\n\n\nclass Fact(CitationMixin, BaseModel):\n    statement: str = Field(description=\"A factual statement\")\n    # substring_quotes is added automatically by CitationMixin\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\nsource_text = \"\"\"\nThe Eiffel Tower was completed in 1889 and stands 330 meters tall.\nIt was designed by Gustave Eiffel and is located in Paris, France.\n\"\"\"\n\nfact = client.create(\n    response_model=Fact,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Extract facts about the Eiffel Tower from: {source_text}\",\n        },\n    ],\n    context={\"context\": source_text},\n)\n\n# All quotes are validated and corrected to exact spans\nfor quote in fact.substring_quotes:\n    print(f\"Quote: {quote}\")\n    assert quote in source_text\n```\n\n## Fuzzy Matching\n\nCitationMixin uses fuzzy matching to find quotes even if they don't match exactly. This handles minor differences like:\n- Extra whitespace\n- Slight wording variations\n- Punctuation differences\n\nThe matching allows up to 5 character errors by default, which helps handle cases where the LLM paraphrases slightly.\n\n## Advanced Example: Question Answering with Citations\n\nUse CitationMixin to build question-answering systems that cite sources:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nfrom instructor import CitationMixin\nimport instructor\n\n\nclass Fact(CitationMixin, BaseModel):\n    statement: str = Field(description=\"A factual statement\")\n\n\nclass Answer(CitationMixin, BaseModel):\n    question: str\n    facts: List[Fact] = Field(description=\"List of facts that answer the question\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\nsource_text = \"\"\"\nJason Liu grew up in Toronto, Canada but was born in China.\nHe went to an arts high school but studied Computational Mathematics and Physics in university.\nHe worked at Stitchfix and Facebook as part of coop programs.\nHe started the Data Science club at the University of Waterloo and was president for 2 years.\n\"\"\"\n\nanswer = client.create(\n    response_model=Answer,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"Answer questions with exact citations from the source text.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"Source: {source_text}\\n\\nQuestion: What did Jason do during college?\",\n        },\n    ],\n    context={\"context\": source_text},\n)\n\n# Verify all citations exist\nfor fact in answer.facts:\n    for quote in fact.substring_quotes:\n        assert quote in source_text\n        print(f\"Verified: {quote}\")\n```\n\n## When to Use CitationMixin\n\nUse CitationMixin when:\n\n- You need to verify that extracted information comes from source text\n- You're building RAG (Retrieval Augmented Generation) systems\n- You want to prevent hallucinations by validating citations\n- You need exact quote spans for highlighting or display\n\n## Limitations\n\n- Requires passing source text in `context={\"context\": ...}`\n- Uses fuzzy matching which may not catch all paraphrasing\n- Only validates quotes, not the accuracy of extracted facts themselves\n\n## See Also\n\n- [Validation](./validation.md) - Learn about validation in Instructor\n- [Context-Based Validation](./validation.md#context-based-validation) - Using context for validation\n- [Citation Examples](../examples/exact_citations.md) - More citation examples\n- [RAG Patterns](../blog/posts/rag-and-beyond.md) - Building RAG systems with Instructor\n"
  },
  {
    "path": "docs/concepts/dictionary_operations.md",
    "content": "## See Also\n\n- [Types](./types.md) - Working with different data types\n- [Response Models](./models.md) - Working with Pydantic models\n- [Fields](./fields.md) - Customizing field metadata\n- [Union Types](./unions.md) - Handle multiple possible types\n\n---\ntitle: Dictionary Operations Optimization in Instructor\ndescription: Learn about performance optimizations for dictionary operations in Instructor, including message extraction and configuration parameter handling.\n---\n\n# Dictionary Operations Optimization\n\nThis document explains the dictionary operations optimizations implemented in Instructor.\n\n## Overview\n\nDictionary operations are one of the most common operations in the Instructor codebase, especially when handling message passing between different LLM providers and managing configuration parameters. Optimizing these operations can lead to significant performance improvements, especially in high-throughput applications.\n\n## Optimized Areas\n\n### Message Extraction\n\nThe `extract_messages` function was optimized to use direct key lookups instead of nested `get()` calls, which reduces the overhead of function calls and improves performance.\n\n**Before:**\n```python\nfrom typing import Any\n\n\ndef extract_messages(kwargs: dict[str, Any]) -> Any:\n    return kwargs.get(\n        \"messages\", kwargs.get(\"contents\", kwargs.get(\"chat_history\", []))\n    )\n```\n\n**After:**\n```python\nfrom typing import Any\n\n\ndef extract_messages(kwargs: dict[str, Any]) -> Any:\n    if \"messages\" in kwargs:\n        return kwargs[\"messages\"]\n    if \"contents\" in kwargs:\n        return kwargs[\"contents\"]\n    if \"chat_history\" in kwargs:\n        return kwargs[\"chat_history\"]\n    return []\n```\n\n### Response Processing Functions\n\nThe response processing functions were optimized to:\n1. Pre-extract commonly used variables to avoid repeated dictionary lookups\n2. Use the optimized `extract_messages` function instead of nested get operations\n3. Reduce redundant dictionary operations in error handling\n\n### Message Handler Selection\n\nThe `handle_reask_kwargs` function was optimized to use direct conditional checks instead of creating a large mapping dictionary, which reduces memory overhead and improves lookup performance.\n\n**Before:**\n```python\ndef handle_reask_kwargs(kwargs, mode, response, exception):\n    kwargs = kwargs.copy()\n    functions = {\n        Mode.TOOLS: reask_anthropic_tools,\n        Mode.JSON: reask_anthropic_json,\n        # ... many more mappings\n    }\n    reask_function = functions.get(mode, reask_default)\n    return reask_function(kwargs=kwargs, response=response, exception=exception)\n```\n\n**After:**\n```python\ndef handle_reask_kwargs(kwargs, mode, response, exception):\n    kwargs_copy = kwargs.copy()\n\n    if mode in {Mode.TOOLS, Mode.ANTHROPIC_REASONING_TOOLS}:\n        return reask_anthropic_tools(kwargs_copy, response, exception)\n    elif mode == Mode.JSON:\n        return reask_anthropic_json(kwargs_copy, response, exception)\n    # ... optimized conditional checks with grouped modes\n    else:\n        return reask_default(kwargs_copy, response, exception)\n```\n\n### System Message Handling\n\nThe `combine_system_messages` function in `utils.py` was optimized to:\n1. Cache type checks to avoid repeated calls\n2. Use more efficient list operations to avoid creating intermediate lists\n3. Optimize type conversion scenarios\n\n## Benchmarks\n\nBenchmarks show significant improvements in dictionary operation performance:\n\n| Operation | Before (ms) | After (ms) | Improvement |\n|-----------|-------------|------------|-------------|\n| extract_messages | ~0.08 | ~0.03 | ~62% |\n| handle_reask_kwargs | ~0.09 | ~0.05 | ~44% |\n| combine_system_messages | ~0.12 | ~0.07 | ~42% |\n\nThe exact improvement depends on the specific use case and data patterns.\n\n## Testing\n\nTwo types of tests were created to ensure the optimizations were safe:\n\n1. **Validation Tests** - Ensure the optimized functions return the same results as before\n2. **Benchmark Tests** - Measure and verify the performance improvements\n\nThese tests help ensure that the optimizations improve performance without changing behavior.\n\n## Conclusion\n\nDictionary operations optimization is a key part of making Instructor more efficient, especially for high-throughput applications. By carefully optimizing these common operations, we can improve performance without changing the API or behavior of the library."
  },
  {
    "path": "docs/concepts/distillation.md",
    "content": "---\ntitle: Seamless Fine-Tuning of Python Functions Using Instructor's Distillation\ndescription: Learn how to fine-tune language models with Python functions using Instructor's `Instructions` for efficient data preparation and logging.\n---\n\n## See Also\n\n- [Response Models](./models.md) - Working with Pydantic models\n- [Validation](./validation.md) - Ensuring output quality\n- [Types](./types.md) - Working with different data types\n- [Custom Validators](../learning/validation/custom_validators.md) - Build custom validation logic\n\n# Distilling python functions into LLM\n\n`Instructions` from the `Instructor` library offers a seamless way to make language models backward compatible with existing Python functions. By employing Pydantic type hints, it not only ensures compatibility but also facilitates fine-tuning `gpt-4.1-mini` to emulate these functions end-to-end.\n\nIf you want to see the full example checkout [examples/distillation](https://github.com/jxnl/instructor/tree/main/examples/distilations)\n\n## The Challenges in Function-Level Fine-Tuning\n\nReplicating the behavior of a Python function in a language model involves intricate data preparation. For instance, teaching a model to execute three-digit multiplication is not as trivial as implementing `def f(a, b): return a * b`. OpenAI's fine-tuning script coupled with their function calling utility provides a structured output, thereby simplifying the data collection process. Additionally, this eliminates the need for passing the schema to the model, thus conserving tokens.\n\n## The Role of `Instructions` in Simplifying the Fine-Tuning Process\n\nBy using `Instructions`, you can annotate a Python function that returns a Pydantic object, thereby automating the dataset creation for fine-tuning. A handler for logging is all that's needed to build this dataset.\n\n## How to Implement `Instructions` in Your Code\n\n## Quick Start: How to Use Instructor's Distillation Feature\n\nBefore we dig into the nitty-gritty, let's look at how easy it is to use Instructor's distillation feature to use function calling finetuning to export the data to a JSONL file.\n\n```python\nimport logging\nimport random\nfrom pydantic import BaseModel\n\n# Logging setup\nlogging.basicConfig(level=logging.INFO)\n\nfrom instructor import Instructions, FinetuneFormat  # pip install instructor\n\ninstructions = Instructions(\n    name=\"three_digit_multiply\",\n    finetune_format=FinetuneFormat.MESSAGES,  # or FinetuneFormat.RAW\n    # log handler is used to save the data to a file\n    # you can imagine saving it to a database or other storage\n    # based on your needs!\n    log_handlers=[logging.FileHandler(\"math_finetunes.jsonl\")],\n)\n\n\nclass Multiply(BaseModel):\n    a: int\n    b: int\n    result: int\n\n\n# Define a function with distillation\n# The decorator will automatically generate a dataset for fine-tuning\n# They must return a pydantic model to leverage function calling\n@instructions.distil\ndef fn(a: int, b: int) -> Multiply:\n    resp = a * b\n    return Multiply(a=a, b=b, result=resp)\n\n\n# Generate some data\nfor _ in range(10):\n    random.seed(42)\n    a = random.randint(100, 999)\n    b = random.randint(100, 999)\n    print(fn(a, b))\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n    #> a=754 b=214 result=161356\n```\n\n## The Intricacies of Fine-tuning Language Models\n\nFine-tuning isn't just about writing a function like `def f(a, b): return a * b`. It requires detailed data preparation and logging. However, Instructor provides a built-in logging feature and structured outputs to simplify this.\n\n## Why Instructor and Distillation are Game Changers\n\nThe library offers two main benefits:\n\n1. **Efficiency**: Streamlines functions, distilling requirements into model weights and a few lines of code.\n2. **Integration**: Eases combining classical machine learning and language models by providing a simple interface that wraps existing functions.\n\n## Role of Instructor in Simplifying Fine-Tuning\n\nThe `from instructor import Instructions` feature is a time saver. It auto-generates a fine-tuning dataset, making it a breeze to imitate a function's behavior.\n\n## FinetuneFormat Options\n\nThe `finetune_format` parameter controls how the fine-tuning data is structured. There are two options:\n\n### MESSAGES Format (Default)\n\nThe `MESSAGES` format creates data in OpenAI's chat completion format with messages and function calls. This is the recommended format for most use cases as it matches OpenAI's fine-tuning API format.\n\n```python\nfrom instructor import Instructions, FinetuneFormat\n\ninstructions = Instructions(\n    name=\"my_function\",\n    finetune_format=FinetuneFormat.MESSAGES,\n    log_handlers=[logging.FileHandler(\"output.jsonl\")],\n)\n```\n\n### RAW Format\n\nThe `RAW` format creates a simpler format with function metadata, arguments, and response. Use this format if you need more control over the data structure or are using a custom fine-tuning pipeline.\n\n```python\nfrom instructor import Instructions, FinetuneFormat\n\ninstructions = Instructions(\n    name=\"my_function\",\n    finetune_format=FinetuneFormat.RAW,\n    log_handlers=[logging.FileHandler(\"output.jsonl\")],\n)\n```\n\n## Logging Output and Running a Finetune\n\nHere's how the logging output would look for MESSAGES format:\n\n```python\n{\n    \"messages\": [\n        {\"role\": \"system\", \"content\": 'Predict the results of this function: ...'},\n        {\"role\": \"user\", \"content\": 'Return fn(133, b=539)'},\n        {\n            \"role\": \"assistant\",\n            \"function_call\": {\n                \"name\": \"Multiply\",\n                \"arguments\": '{\"a\":133,\"b\":539,\"result\":89509}',\n            },\n        },\n    ],\n    \"functions\": [\n        {\"name\": \"Multiply\", \"description\": \"Correctly extracted `Multiply`...\"}\n    ],\n}\n```\n\nFor RAW format, the output would look like:\n\n```python\n{\n    \"fn_name\": \"three_digit_multiply\",\n    \"fn_repr\": \"def fn(a: int, b: int) -> Multiply:\\n    ...\",\n    \"args\": [133],\n    \"kwargs\": {\"b\": 539},\n    \"response\": {\"a\": 133, \"b\": 539, \"result\": 89509}\n}\n```\n\nRun a finetune like this:\n\n```bash\ninstructor jobs create-from-file math_finetunes.jsonl\n```\n\nOnce a model is trained you can simply change `mode` to `dispatch` and it will use the model to run the function!\n\n```python\nfrom instructor import Instructions\nfrom pydantic import BaseModel\n\n\nclass Multiply(BaseModel):\n    a: int\n    b: int\n    result: int\n\n\ninstructions = Instructions(\n    name=\"three_digit_multiply\",\n)\n\n\n@instructions.distil(model='gpt-4.1-mini:finetuned-123', mode=\"dispatch\")\ndef fn(a: int, b: int) -> Multiply:\n    # now this code will be short circuited and the model will be used instead.\n    resp = a + b\n    return Multiply(a=a, b=b, result=resp)\n```\n\nWith this, you can swap the function implementation, making it backward compatible. You can even imagine using the different models for different tasks or validating and runnign evals by using the original function and comparing it to the distillation.\n"
  },
  {
    "path": "docs/concepts/enums.md",
    "content": "---\ntitle: Using Enums and Literals in Pydantic for Role Management\ndescription: Learn how to implement Enums and Literals in Pydantic to manage standardized user roles with a fallback option.\n---\n\nTo prevent data misalignment, we can use Enums for standardized fields. Always include an \"Other\" option as a fallback so the model can signal uncertainty.\n\n```python hl_lines=\"7 12\"\nfrom pydantic import BaseModel, Field\nfrom enum import Enum\n\n\nclass Role(Enum):\n    PRINCIPAL = \"PRINCIPAL\"\n    TEACHER = \"TEACHER\"\n    STUDENT = \"STUDENT\"\n    OTHER = \"OTHER\"\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Role = Field(\n        description=\"Correctly assign one of the predefined roles to the user.\"\n    )\n```\n\nIf you're having a hard time with `Enum` an alternative is to use `Literal` instead.\n\n```python hl_lines=\"4\"\nfrom typing import Literal\nfrom pydantic import BaseModel\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Literal[\"PRINCIPAL\", \"TEACHER\", \"STUDENT\", \"OTHER\"]\n```\n\n## See Also\n\n- [Types](./types.md) - Working with different data types including Literal\n- [Union Types](./unions.md) - Using unions with enums for multiple choices\n- [Response Models](./models.md) - Using enums in Pydantic models\n- [Fields](./fields.md) - Customizing enum fields with Field metadata\n"
  },
  {
    "path": "docs/concepts/error_handling.md",
    "content": "---\ntitle: Error Handling\ndescription: Learn how to handle errors and exceptions when using Instructor for structured outputs.\n---\n\n# Error Handling\n\nInstructor provides a comprehensive exception hierarchy to help you handle errors gracefully. All Instructor exceptions inherit from `InstructorError`.\n\n## Exception Reference\n\n| Exception | Description | Key Attributes |\n|-----------|-------------|----------------|\n| `InstructorError` | Base exception for all Instructor errors | - |\n| `IncompleteOutputException` | Output truncated due to token limit | `last_completion` |\n| `InstructorRetryException` | All retry attempts exhausted | `n_attempts`, `failed_attempts`, `total_usage` |\n| `ValidationError` | Response validation failed | - |\n| `ResponseParsingError` | Cannot parse LLM response | `mode`, `raw_response` |\n| `ProviderError` | Provider-specific error | `provider` |\n| `ConfigurationError` | Invalid configuration | - |\n| `ModeError` | Invalid mode for provider | `mode`, `provider`, `valid_modes` |\n| `ClientError` | Client initialization failed | - |\n| `MultimodalError` | Processing image/audio/PDF failed | `content_type`, `file_path` |\n| `AsyncValidationError` | Async validation failed | `errors` |\n\n## Common Exceptions\n\n### Incomplete Output\n\nRaised when the LLM output is truncated due to reaching the token limit:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.core.exceptions import IncompleteOutputException, InstructorRetryException\n\n\nclass Report(BaseModel):\n    content: str\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\", mode=instructor.Mode.JSON)\n\ntry:\n    response = client.create(\n        response_model=Report,\n        messages=[{\"role\": \"user\", \"content\": \"Write a long report...\"}],\n        max_tokens=50,\n        max_retries=0,\n    )\nexcept (IncompleteOutputException, InstructorRetryException) as e:\n    print(f\"Output truncated: {e}\")\n    print(f\"Last completion: {e.last_completion}\")\n```\n\n### Retry Exhausted\n\nRaised when all retry attempts fail:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.core.exceptions import InstructorRetryException\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\ntry:\n    response = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract user info...\"}],\n        max_retries=3,\n    )\nexcept InstructorRetryException as e:\n    print(f\"Failed after {e.n_attempts} attempts\")\n    for attempt in e.failed_attempts:\n        print(f\"  Attempt {attempt.attempt_number}: {attempt.exception}\")\n```\n\n### Validation Error\n\nRaised when the response fails validation:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, field_validator\nfrom instructor.core.exceptions import ValidationError\n\n\nclass StrictModel(BaseModel):\n    value: int\n\n    @field_validator(\"value\")\n    @classmethod\n    def validate_value(cls, v: int) -> int:\n        if v < 0:\n            raise ValueError(\"Value must be positive\")\n        return v\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\ntry:\n    response = client.create(\n        response_model=StrictModel,\n        messages=[{\"role\": \"user\", \"content\": \"Extract data...\"}],\n    )\nexcept ValidationError as e:\n    print(f\"Validation failed: {e}\")\n```\n\n### Provider and Configuration Errors\n\nRaised for provider-specific issues or invalid configuration:\n\n```python\nimport instructor\nfrom instructor.core.exceptions import ConfigurationError, ModeError\n\n# Invalid provider format\ntry:\n    client = instructor.from_provider(\"invalid-format\")\nexcept ConfigurationError as e:\n    print(f\"Configuration error: {e}\")\n\n# Wrong mode for provider\ntry:\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.TOOLS,\n    )\nexcept ModeError as e:\n    print(f\"Invalid mode. Valid modes: {e.valid_modes}\")\n```\n\n## Best Practices\n\n### Catch Specific Exceptions\n\n```python\nimport logging\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.core.exceptions import (\n    IncompleteOutputException,\n    InstructorRetryException,\n    ValidationError,\n)\n\nlogger = logging.getLogger(__name__)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\ntry:\n    response = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Sam is 34\"}],\n    )\nexcept IncompleteOutputException:\n    logger.warning(\"Output truncated, retrying with more tokens\")\n    response = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Sam is 34\"}],\n        max_tokens=2000,\n    )\nexcept InstructorRetryException as e:\n    logger.error(f\"Failed after {e.n_attempts} attempts\")\n    response = None\nexcept ValidationError as e:\n    logger.error(f\"Validation failed: {e}\")\n    raise\n```\n\n### Use Base Exception for General Handling\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.core.exceptions import InstructorError\n\n\nclass Data(BaseModel):\n    value: str\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\ntry:\n    response = client.create(\n        response_model=Data,\n        messages=[{\"role\": \"user\", \"content\": \"Extract data\"}],\n    )\nexcept InstructorError as e:\n    # Catches any Instructor-specific error\n    print(f\"Instructor error: {type(e).__name__}: {e}\")\n```\n\n### Graceful Degradation\n\n```python\nimport instructor\nfrom pydantic import BaseModel, field_validator\nfrom instructor.core.exceptions import ValidationError, InstructorRetryException\n\n\nclass StrictData(BaseModel):\n    value: int\n\n    @field_validator(\"value\")\n    @classmethod\n    def validate_value(cls, v: int) -> int:\n        if v < 0:\n            raise ValueError(\"Value must be positive\")\n        return v\n\n\nclass RelaxedData(BaseModel):\n    value: str\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\ndef extract_with_fallback(content: str):\n    try:\n        return client.create(\n            response_model=StrictData,\n            messages=[{\"role\": \"user\", \"content\": content}],\n        )\n    except ValidationError:\n        # Fall back to less strict model\n        return client.create(\n            response_model=RelaxedData,\n            messages=[{\"role\": \"user\", \"content\": content}],\n        )\n    except InstructorRetryException:\n        return None\n```\n\n## Backwards Compatibility\n\nNew exceptions inherit from both `ValueError` and `InstructorError`, so existing code continues to work:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.core.exceptions import ResponseParsingError\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n# Old code still works\ntry:\n    response = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Kai is 41\"}],\n    )\nexcept ValueError as e:\n    print(f\"Error: {e}\")\n\n# New code can access additional context\ntry:\n    response = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Kai is 41\"}],\n    )\nexcept ResponseParsingError as e:\n    print(f\"Mode: {e.mode}, Raw: {e.raw_response}\")\n```\n\n## Integration with Hooks\n\nMonitor errors using the hooks system:\n\n```python\nimport instructor\nfrom instructor.core.exceptions import ValidationError\n\n\ndef on_parse_error(error: Exception):\n    if isinstance(error, ValidationError):\n        print(f\"Validation error: {error}\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\nclient.hooks.on(\"parse:error\", on_parse_error)\n```\n\n## See Also\n\n- [Retrying](./retrying.md) - Retry strategies with Tenacity\n- [Validation](./validation.md) - Validation patterns\n- [Hooks](./hooks.md) - Error monitoring with hooks\n"
  },
  {
    "path": "docs/concepts/fastapi.md",
    "content": "---\ntitle: FastAPI Integration with Instructor - API Development Guide\ndescription: Build production-ready APIs with FastAPI and Instructor. Create type-safe endpoints for structured LLM outputs with automatic validation and documentation.\n---\n\n# Integrating Pydantic Models with FastAPI\n\n[FastAPI](https://fastapi.tiangolo.com/) is an enjoyable tool for building web applications in Python. It is well known for its integration with `Pydantic` models, which makes defining and validating data structures straightforward and efficient. In this guide, we explore how simple functions that return `Pydantic` models can seamlessly integrate with `FastAPI`.\n\n## Why Choose FastAPI and Pydantic?\n\n- FastAPI is a modern, high-performance web framework for building APIs with Python.\n- Supports OpenAPI and JSON Schema for automatic documentation and validation.\n- Supports AsyncIO for asynchronous programming leveraging the AsyncOpenAI() client\n\n## Code Example: Starting a FastAPI App with a POST Request\n\nThe following code snippet demonstrates how to start a `FastAPI` app with a POST endpoint. This endpoint accepts and returns data defined by a `Pydantic` model.\n\n```python\nimport instructor\n\nfrom fastapi import FastAPI\nfrom pydantic import BaseModel\n\n# Enables response_model\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\",\n    async_client=True,\n)\napp = FastAPI()\n\n\nclass UserData(BaseModel):\n    # This can be the model for the input data\n    query: str\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@app.post(\"/endpoint\", response_model=UserDetail)\nasync def endpoint_function(data: UserData) -> UserDetail:\n    user_detail = await client.create(\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Extract: `{data.query}`\"},\n        ],\n    )\n    return user_detail\n```\n\n## Streaming Responses with FastAPI\n\n`FastAPI` supports streaming responses, which is useful for returning large amounts of data. This feature is particularly useful when working with large language models (LLMs) that generate a large amount of data.\n\n```python hl_lines=\"6-7\"\nfrom fastapi import FastAPI\nfrom fastapi.responses import StreamingResponse\nfrom typing import Iterable\nfrom pydantic import BaseModel\n\napp = FastAPI()\n\n\nclass UserData(BaseModel):\n    query: str\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n# Route to handle SSE events and return users\n@app.post(\"/extract\", response_class=StreamingResponse)\nasync def extract(data: UserData):\n    users = await client.create(\n        response_model=Iterable[UserDetail],\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": data.query},\n        ],\n    )\n\n    async def generate():\n        async for user in users:\n            resp_json = user.model_dump_json()\n            yield f\"data: {resp_json}\"\n        yield \"data: [DONE]\"\n\n    return StreamingResponse(generate(), media_type=\"text/event-stream\")\n```\n\n## Automatic Documentation with FastAPI\n\nFastAPI leverages the OpenAPI specification to automatically generate a dynamic and interactive documentation page, commonly referred to as the `/docs` page. This feature is incredibly useful for developers, as it offers a live environment to test API endpoints directly through the browser.\n\nTo explore the capabilities of your API, follow these steps:\n\n1. Run the API using the Uvicorn command: `uvicorn main:app --reload`.\n2. Open your web browser and navigate to `http://127.0.0.1:8000/docs`.\n3. You will find an interactive UI where you can send different requests to your API and see the responses in real-time.\n\n![Screenshot of FastAPI /docs page](response.png)\n"
  },
  {
    "path": "docs/concepts/fields.md",
    "content": "---\ntitle: Customizing Pydantic Models with Field Metadata\ndescription: Learn how to enhance Pydantic models with metadata using Field, including default values, JSON schema customization, and more.\n---\n\nThe `pydantic.Field` function is used to customize and add metadata to fields of models. To learn more, check out the Pydantic [documentation](https://docs.pydantic.dev/latest/concepts/fields/) as this is a near replica of that documentation that is relevant to prompting.\n\n## Default values\n\nThe `default` parameter is used to define a default value for a field.\n\n```py\nfrom pydantic import BaseModel, Field\n\n\nclass User(BaseModel):\n    name: str = Field(default='John Doe')\n\n\nuser = User()\nprint(user)\n#> name='John Doe'\n```\n\nYou can also use `default_factory` to define a callable that will be called to generate a default value.\n\n```py\nfrom uuid import uuid4\n\nfrom pydantic import BaseModel, Field\n\n\nclass User(BaseModel):\n    id: str = Field(default_factory=lambda: uuid4().hex)\n```\n\n!!! info\n\n    The `default` and `default_factory` parameters are mutually exclusive.\n\n!!! note\n\n    If you use `typing.Optional`, it doesn't mean that the field has a default value of `None` you must use `default` or `default_factory` to define a default value. Then it will be considered `not required` when sent to the language model.\n\n## Using `Annotated`\n\nThe `Field` function can also be used together with `Annotated`.\n\n```py\nfrom uuid import uuid4\nfrom typing_extensions import Annotated\nfrom pydantic import BaseModel, Field\n\n\nclass User(BaseModel):\n    id: Annotated[str, Field(default_factory=lambda: uuid4().hex)]\n```\n\n## Exclude\n\nThe `exclude` parameter can be used to control which fields should be excluded from the\nmodel when exporting the model. This is helpful when you want to exclude fields that are not relevant to the model\ngeneration like `scratch_pad` or `chain_of_thought`\n\nSee the following example:\n\n```py\nfrom pydantic import BaseModel, Field\nfrom datetime import date\n\n\nclass DateRange(BaseModel):\n    chain_of_thought: str = Field(\n        description=\"Reasoning behind the date range.\", exclude=True\n    )\n    start_date: date\n    end_date: date\n\n\ndate_range = DateRange(\n    chain_of_thought=\"\"\"\n        I want to find the date range for the last 30 days.\n        Today is 2021-01-30 therefore the start date\n        should be 2021-01-01 and the end date is 2021-01-30\"\"\",\n    start_date=date(2021, 1, 1),\n    end_date=date(2021, 1, 30),\n)\nprint(date_range.model_dump_json())\n#> {\"start_date\":\"2021-01-01\",\"end_date\":\"2021-01-30\"}\n```\n\n## Omitting fields from schema sent to the language model\n\nIn some cases, you may wish to have the language model ignore certain fields in your model. You can do this by using Pydantic's `SkipJsonSchema` annotation. This omits a field from the JSON schema emitted by Pydantic (which `instructor` uses for constructing its prompts and tool definitions). For example:\n\n```py\nfrom pydantic import BaseModel\nfrom pydantic.json_schema import SkipJsonSchema\nfrom typing import Union\n\n\nclass Response(BaseModel):\n    question: str\n    answer: str\n    private_field: SkipJsonSchema[Union[str, None]] = None\n\n\nassert \"private_field\" not in Response.model_json_schema()[\"properties\"]\n```\n\nNote that because the language model will never return a value for `private_field`, you'll need a default value (this can be a generator via a declared Pydantic `Field`).\n\n## Customizing JSON Schema\n\nThere are some fields that are exclusively used to customise the generated JSON Schema:\n\n- `title`: The title of the field.\n- `description`: The description of the field.\n- `examples`: The examples of the field.\n- `json_schema_extra`: Extra JSON Schema properties to be added to the field.\n\nThese all work as great opportunities to add more information to the JSON schema as part of your prompt engineering.\n\nHere's an example:\n\n```py\nfrom pydantic import BaseModel, Field, SecretStr\n\n\nclass User(BaseModel):\n    age: int = Field(description='Age of the user')\n    name: str = Field(title='Username')\n    password: SecretStr = Field(\n        json_schema_extra={\n            'title': 'Password',\n            'description': 'Password of the user',\n            'examples': ['123456'],\n        }\n    )\n\n\nprint(User.model_json_schema())\n\"\"\"\n{\n    'properties': {\n        'age': {'description': 'Age of the user', 'title': 'Age', 'type': 'integer'},\n        'name': {'title': 'Username', 'type': 'string'},\n        'password': {\n            'description': 'Password of the user',\n            'examples': ['123456'],\n            'format': 'password',\n            'title': 'Password',\n            'type': 'string',\n            'writeOnly': True,\n        },\n    },\n    'required': ['age', 'name', 'password'],\n    'title': 'User',\n    'type': 'object',\n}\n\"\"\"\n```\n\n## See Also\n\n- [Response Models](./models.md) - Using Pydantic models with Instructor\n- [Fields Tutorial](../learning/patterns/field_validation.md) - Field-level validation patterns\n- [Types](./types.md) - Working with different field types\n- [Pydantic Fields Documentation](https://docs.pydantic.dev/latest/concepts/fields/) - Complete Field reference\n\n# General notes on JSON schema generation\n\n- The JSON schema for Optional fields indicates that the value null is allowed.\n- The Decimal type is exposed in JSON schema (and serialized) as a string.\n- The JSON schema does not preserve namedtuples as namedtuples.\n- When they differ, you can specify whether you want the JSON schema to represent the inputs to validation or the outputs from serialization.\n- Sub-models used are added to the `$defs` JSON attribute and referenced, as per the spec.\n- Sub-models with modifications (via the Field class) like a custom title, description, or default value, are recursively included instead of referenced.\n- The description for models is taken from either the docstring of the class or the argument description to the Field class.\n"
  },
  {
    "path": "docs/concepts/from_provider.md",
    "content": "---\ntitle: Using from_provider for Unified Client Creation\ndescription: Learn how to use from_provider to create Instructor clients for any LLM provider.\n---\n\n# Using from_provider\n\nThe `from_provider` function creates Instructor clients for any LLM provider. It uses the same interface across all providers, making it easy to switch between models.\n\n!!! note \"V2 Preview\"\n\n    `from_provider` routes to the v2 implementation by default for supported providers. Legacy provider-specific modes are deprecated, emit warnings, and map to generic modes (`Mode.TOOLS`, `Mode.JSON`, `Mode.JSON_SCHEMA`, `Mode.MD_JSON`).\n\n## Why Use from_provider?\n\n`from_provider` provides:\n\n- Simple syntax: One function works for all providers\n- Automatic setup: Handles provider-specific configuration automatically\n- Consistent interface: Same code works across different providers\n- Type safety: Full IDE support with proper type inference\n- Easy switching: Change providers with a single string change\n\n## Basic Usage\n\nThe basic syntax is simple: `instructor.from_provider(\"provider/model-name\")`\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create a client for any provider\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n# Or: instructor.from_provider(\"anthropic/claude-3-5-sonnet\")\n# Or: instructor.from_provider(\"google/gemini-2.5-flash\")\n\n# Use the client as usual\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old\"}],\n)\n```\n\n## Supported Providers\n\n`from_provider` supports all major LLM providers:\n\n### Cloud Providers\n\n- OpenAI: `\"openai/gpt-4o\"`, `\"openai/gpt-4o-mini\"`, `\"openai/gpt-4-turbo\"`\n- Anthropic: `\"anthropic/claude-3-5-sonnet\"`, `\"anthropic/claude-3-opus\"`\n- Google: `\"google/gemini-2.5-flash\"`, `\"google/gemini-pro\"`\n- Azure OpenAI: `\"azure_openai/gpt-4o\"`\n- AWS Bedrock: `\"bedrock/claude-3-5-sonnet\"`\n- Vertex AI: `\"vertexai/gemini-pro\"` (or use `\"google/gemini-pro\"` with `vertexai=True`)\n\n### Fast Inference Providers\n\n- Groq: `\"groq/llama-3.1-70b\"`\n- Fireworks: `\"fireworks/mixtral-8x7b\"`\n- Together: `\"together/meta-llama/Llama-3-70b\"`\n- Anyscale: `\"anyscale/meta-llama/Llama-3-70b\"`\n\n### Other Providers\n\n- Mistral: `\"mistral/mistral-large\"`\n- Cohere: `\"cohere/command-r-plus\"`\n- Perplexity: `\"perplexity/llama-3.1-sonar\"`\n- DeepSeek: `\"deepseek/deepseek-chat\"`\n- xAI: `\"xai/grok-beta\"`\n- OpenRouter: `\"openrouter/meta-llama/llama-3.1-70b\"`\n- Ollama: `\"ollama/llama3\"` (local models)\n- LiteLLM: `\"litellm/gpt-4o\"` (meta-provider)\n\nSee the [Integrations](../integrations/index.md) section for complete provider documentation.\n\n## Provider String Format\n\nThe provider string follows the format: `\"provider/model-name\"`\n\n```python\n# Correct formats\n\"openai/gpt-4o\"\n\"anthropic/claude-3-5-sonnet-20241022\"\n\"google/gemini-2.5-flash\"\n\n# Incorrect formats (will raise errors)\n\"gpt-4o\"  # Missing provider prefix\n\"openai\"  # Missing model name\n\"openai/gpt-4o/mini\"  # Too many slashes\n```\n\n## Async Clients\n\nCreate async clients by setting `async_client=True`:\n\n```python\nimport asyncio\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def main() -> None:\n    # Create async client\n    async_client = instructor.from_provider(\"openai/gpt-4o-mini\", async_client=True)\n\n    # Use with await\n    await async_client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Alice is 25\"}],\n    )\n\n\nasyncio.run(main())\n```\n\n## Advanced Configuration\n\n### Custom API Keys\n\nPass API keys directly or use environment variables:\n\n```python\nimport instructor\n\n# Pass API key directly\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", api_key=\"sk-your-key-here\")\n\n# Or use environment variables (recommended)\n# export OPENAI_API_KEY=sk-your-key-here\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n```\n\n### Mode Overrides\n\nOverride the default mode for a provider:\n\n```python\nimport instructor\n\n# OpenAI defaults to TOOLS mode, but you can override\nclient = instructor.from_provider(\n    \"openai/gpt-4o-mini\", mode=instructor.Mode.JSON  # Use JSON mode instead\n)\n```\n\n### Caching\n\nEnable response caching:\n\n```python\nfrom instructor.cache import AutoCache\nimport instructor\n\ncache = AutoCache(maxsize=1000)\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", cache=cache)\n```\n\n### Provider-Specific Options\n\nPass provider-specific options through `**kwargs`:\n\n```python\nimport os\nimport instructor\n\n# For OpenAI\nclient = instructor.from_provider(\n    \"openai/gpt-4o-mini\", organization=\"org-your-org-id\", timeout=30.0\n)\n\n# For Anthropic\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet\", max_tokens=4096)\n\n# For Google with Vertex AI\ngoogle_api_key = os.environ.pop(\"GOOGLE_API_KEY\", None)\n\nclient = instructor.from_provider(\n    \"google/gemini-pro\",\n    vertexai=True,\n    project=\"your-project-id\",\n    location=\"us-central1\",\n)\n\nif google_api_key is not None:\n    os.environ[\"GOOGLE_API_KEY\"] = google_api_key\n```\n\n## Default Modes\n\nEach provider uses a recommended default mode:\n\n- OpenAI: `Mode.TOOLS`\n- Anthropic: `Mode.TOOLS`\n- Google: `Mode.TOOLS` or `Mode.JSON` based on the model\n- Ollama: `Mode.TOOLS` (if supported) or `Mode.JSON`\n- Others: `Mode.TOOLS` or `Mode.MD_JSON` depending on capability\n\nLegacy provider-specific modes still work but are deprecated. See the [Mode Migration Guide](./mode-migration.md) for details.\n\nOverride these defaults with the `mode` parameter.\n\n## Error Handling\n\n`from_provider` raises clear errors for common issues:\n\n```python\nimport instructor\nfrom instructor.core.exceptions import ConfigurationError\n\ntry:\n    # Invalid provider format\n    client = instructor.from_provider(\"invalid-format\")\nexcept ConfigurationError as e:\n    print(f\"Configuration error: {e}\")\n    \"\"\"\n    Configuration error: Model string must be in format \"provider/model-name\" (e.g. \"openai/gpt-4\" or \"anthropic/claude-3-sonnet\")\n    \"\"\"\n\ntry:\n    # Unsupported provider\n    client = instructor.from_provider(\"unsupported/provider\")\nexcept ConfigurationError as e:\n    print(f\"Unsupported provider: {e}\")\n    \"\"\"\n    Unsupported provider: Unsupported provider: unsupported. Supported providers are: ['openai', 'azure_openai', 'databricks', 'anthropic', 'google', 'generative-ai', 'vertexai', 'mistral', 'cohere', 'perplexity', 'groq', 'writer', 'bedrock', 'cerebras', 'deepseek', 'fireworks', 'ollama', 'openrouter', 'xai', 'litellm']\n    \"\"\"\n\ntry:\n    # Missing required package\n    client = instructor.from_provider(\"anthropic/claude-3\")\nexcept ImportError as e:\n    print(f\"Missing package: {e}\")\n    # Install with: pip install anthropic\n```\n\n## Environment Variables\n\nMost providers support environment variables for configuration:\n\n```bash\n# OpenAI\nexport OPENAI_API_KEY=sk-your-key\n\n# Anthropic\nexport ANTHROPIC_API_KEY=sk-ant-your-key\n\n# Google\nexport GOOGLE_API_KEY=your-key\n\n# Azure OpenAI\nexport AZURE_OPENAI_API_KEY=your-key\nexport AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/\n\n# AWS Bedrock\nexport AWS_DEFAULT_REGION=us-east-1\nexport AWS_ACCESS_KEY_ID=your-key\nexport AWS_SECRET_ACCESS_KEY=your-secret\n\n# Others\nexport MISTRAL_API_KEY=your-key\nexport COHERE_API_KEY=your-key\nexport GROQ_API_KEY=your-key\nexport DEEPSEEK_API_KEY=your-key\nexport OPENROUTER_API_KEY=your-key\n```\n\n## Switching Between Providers\n\nOne of the biggest advantages of `from_provider` is easy provider switching:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Easy to switch providers\nPROVIDER = \"openai/gpt-4o-mini\"  # Change this to switch\n# PROVIDER = \"anthropic/claude-3-5-sonnet\"\n# PROVIDER = \"google/gemini-2.5-flash\"\n\nclient = instructor.from_provider(PROVIDER)\n\n# Same code works for all providers\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: Bob is 40\"}],\n)\n```\n\n## Best Practices\n\n1. Use environment variables: Store API keys in environment variables, not in code\n2. Use type hints: Let your IDE help with autocomplete and type checking\n3. Handle errors: Wrap provider creation in try-except blocks\n4. Cache when appropriate: Use caching for repeated requests\n5. Choose the right mode: Let defaults work, but override when needed\n\n## Comparison with Other Methods\n\n### from_provider vs. Manual Patching\n\n```python\n# Old way (still works, but more verbose)\nimport openai\nimport instructor\n\nopenai_client = openai.OpenAI()\nclient = instructor.patch(openai_client)\n\n# New way (recommended)\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n```\n\n### from_provider vs. Provider-Specific Functions\n\nProvider-specific helpers were removed. Use `from_provider` for all clients:\n\n```python\nimport instructor\n\nopenai_client = instructor.from_provider(\"openai/gpt-4o-mini\")\nanthropic_client = instructor.from_provider(\"anthropic/claude-3-5-sonnet\")\n```\n\n## Troubleshooting\n\n### Provider Not Found\n\nIf you get an error about an unsupported provider:\n\n1. Check the provider name spelling\n2. Verify the provider is in the supported list\n3. Check if you need to install an extra package: `uv pip install \"instructor[provider-name]\"`\n\n### Import Errors\n\nIf you get import errors:\n\n```bash\n# Install the required package\n# For Anthropic\nuv pip install anthropic\n\n# For Google\nuv pip install google-genai\n\n# For others, see integration docs\n```\n\n### Invalid Model String\n\nThe model string must be in format `\"provider/model-name\"`:\n\n```python\n# Correct\n\"openai/gpt-4o\"\n\n# Incorrect\n\"gpt-4o\"  # Missing provider\n\"openai\"  # Missing model\n```\n\n## Related Documentation\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [Patching](./patching.md) - How Instructor enhances clients\n- [Integrations](../integrations/index.md) - Provider-specific documentation\n- [Migration Guide](./migration.md) - Migrating from old patterns\n"
  },
  {
    "path": "docs/concepts/hooks.md",
    "content": "---\ntitle: Hooks\ndescription: Learn how to use hooks for event handling, logging, and error handling in Instructor.\n---\n\n# Hooks\n\nHooks let you intercept and handle events during the completion and parsing process. Use them to add logging, monitoring, or error handling at different stages of API interactions.\n\n## Hook Events\n\n| Event | Description | Handler Signature |\n|-------|-------------|-------------------|\n| `completion:kwargs` | Arguments passed to completion | `def handler(*args, **kwargs)` |\n| `completion:response` | Raw API response received | `def handler(response)` |\n| `completion:error` | Error before retries | `def handler(error)` |\n| `parse:error` | Pydantic validation failed | `def handler(error)` |\n| `completion:last_attempt` | Last retry attempt | `def handler(error)` |\n\n## Registering and Removing Hooks\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\ndef log_kwargs(*args, **kwargs):\n    print(f\"Model: {kwargs.get('model')}\")\n\n\ndef log_response(response):\n    print(f\"Response received: {response.id}\")\n\n\n# Register hooks\nclient.on(\"completion:kwargs\", log_kwargs)\nclient.on(\"completion:response\", log_response)\n\n# Make a request\nresp = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Hello, world!\"}],\n    response_model=str,\n)\n\n# Remove a specific hook\nclient.off(\"completion:kwargs\", log_kwargs)\n\n# Clear all hooks for an event\nclient.clear(\"completion:kwargs\")\n\n# Clear all hooks\nclient.clear()\n```\n\nYou can use enum values or strings for hook names:\n\n```python\nfrom instructor.hooks import HookName\n\nclient.on(HookName.COMPLETION_KWARGS, log_kwargs)  # Using enum\nclient.on(\"completion:kwargs\", log_kwargs)          # Using string\n```\n\n## Practical Example: Logging\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass ErrorCounter:\n    def __init__(self):\n        self.count = 0\n\n    def handle_error(self, error: Exception):\n        self.count += 1\n        print(f\"Error #{self.count}: {type(error).__name__}: {error}\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\ncounter = ErrorCounter()\n\nclient.on(\"completion:error\", counter.handle_error)\nclient.on(\"parse:error\", counter.handle_error)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\ntry:\n    user = client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Extract: John is twenty\"}],\n        response_model=User,\n    )\n    print(f\"Extracted: {user}\")\nexcept Exception as e:\n    print(f\"Final error: {e}\")\n\nprint(f\"Total errors: {counter.count}\")\n```\n\n## Error Handling\n\nMonitor errors by type using Instructor's exception hierarchy:\n\n```python\nimport logging\nimport instructor\nfrom instructor.core.exceptions import (\n    IncompleteOutputException,\n    InstructorRetryException,\n    ValidationError,\n    ProviderError,\n)\n\nlogger = logging.getLogger(__name__)\n\n\ndef handle_error(error: Exception):\n    if isinstance(error, IncompleteOutputException):\n        logger.warning(f\"Incomplete output: {error}\")\n    elif isinstance(error, ValidationError):\n        logger.error(f\"Validation failed: {error}\")\n    elif isinstance(error, ProviderError):\n        logger.error(f\"Provider error ({error.provider}): {error}\")\n    elif isinstance(error, InstructorRetryException):\n        logger.critical(f\"Retries exhausted after {error.n_attempts} attempts\")\n    else:\n        logger.error(f\"Unexpected error: {error}\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\nclient.on(\"completion:error\", handle_error)\nclient.on(\"parse:error\", handle_error)\n```\n\n## Hook Combination\n\nCombine different hook sets using the `+` operator:\n\n```python\nimport instructor\nfrom instructor.core.hooks import Hooks\n\n# Create specialized hook sets\nlogging_hooks = Hooks()\nlogging_hooks.on(\"completion:kwargs\", lambda **kw: print(\"Logging kwargs\"))\n\nmetrics_hooks = Hooks()\nmetrics_hooks.on(\"completion:response\", lambda resp: print(\"Recording metrics\"))\n\n# Combine hooks\ncombined = logging_hooks + metrics_hooks\n\n# Or combine multiple at once\nall_hooks = Hooks.combine(logging_hooks, metrics_hooks)\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\", hooks=combined)\n```\n\n## Per-Call Hooks\n\nSpecify hooks for individual API calls:\n\n```python\nimport instructor\nfrom instructor.core.hooks import Hooks\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Client with standard hooks\nclient_hooks = Hooks()\nclient_hooks.on(\"completion:kwargs\", lambda **kw: print(\"Standard logging\"))\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\", hooks=client_hooks)\n\n# Debug hooks for specific calls\ndebug_hooks = Hooks()\ndebug_hooks.on(\"parse:error\", lambda err: print(f\"Debug: {err}\"))\n\n# Per-call hooks combine with client hooks\nuser = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: Alice is 25\"}],\n    response_model=User,\n    hooks=debug_hooks,  # Both client and debug hooks run\n)\n```\n\n## Testing with Hooks\n\nUse hooks to inspect requests and responses in tests:\n\n```python\nimport unittest\nfrom unittest.mock import Mock\nimport instructor\n\n\nclass TestMyApp(unittest.TestCase):\n    def test_completion(self):\n        client = instructor.from_provider(\"openai/gpt-4.1-mini\")\n        mock_handler = Mock()\n\n        client.on(\"completion:response\", mock_handler)\n\n        result = client.create(\n            messages=[{\"role\": \"user\", \"content\": \"Hello\"}],\n            response_model=str,\n        )\n\n        mock_handler.assert_called_once()\n        response = mock_handler.call_args[0][0]\n        self.assertEqual(response.model, \"gpt-4.1-mini\")\n```\n\n## Custom Hooks\n\nCreate custom hook systems by extending the base pattern:\n\n```python\nfrom enum import Enum\nfrom instructor.hooks import HookName\n\n\nclass CustomHookName(str, Enum):\n    CUSTOM_EVENT = \"custom:event\"\n    # Include base hooks for compatibility\n    COMPLETION_KWARGS = HookName.COMPLETION_KWARGS.value\n\n\nclass CustomHooks:\n    def __init__(self):\n        self._handlers: dict[str, list] = {}\n\n    def on(self, hook_name: CustomHookName, handler):\n        self._handlers.setdefault(hook_name.value, []).append(handler)\n\n    def emit(self, hook_name: CustomHookName, payload):\n        for handler in self._handlers.get(hook_name.value, []):\n            handler(payload)\n\n\nhooks = CustomHooks()\nhooks.on(CustomHookName.CUSTOM_EVENT, lambda data: print(f\"Custom: {data}\"))\nhooks.emit(CustomHookName.CUSTOM_EVENT, {\"key\": \"value\"})\n```\n\n## See Also\n\n- [Debugging](../debugging.md) - Practical debugging techniques\n- [Retrying](./retrying.md) - Monitor retry attempts\n- [Error Handling](./error_handling.md) - Exception handling patterns\n"
  },
  {
    "path": "docs/concepts/index.md",
    "content": "---\ntitle: Instructor Concepts - Core Features and Patterns\ndescription: Explore core concepts and features of the Instructor library. Learn about structured outputs, validation, streaming, and advanced patterns.\n---\n\n# Instructor Concepts\n\nThis section explains the core concepts and features of the Instructor library, organized by category to help you find what you need.\n\n## Core Concepts\n\nThese are the fundamental concepts you need to understand to use Instructor effectively:\n\n- [Models](./models.md) - Using Pydantic models to define output structures\n- [Patching](./patching.md) - How Instructor patches LLM clients\n- [from_provider](./from_provider.md) - Unified interface for creating clients across all providers\n- [Migration Guide](./migration.md) - Migrating from older patterns to from_provider\n- [Types](./types.md) - Working with different data types in your models\n- [Validation](./validation.md) - Validating LLM outputs against your models\n- [Prompting](./prompting.md) - Creating effective prompts for structured output extraction\n- [Multimodal](./multimodal.md) - Working with Audio Files, Images and PDFs\n\n## Data Handling and Structures\n\nThese concepts relate to defining and working with different data structures:\n\n- [Fields](./fields.md) - Working with Pydantic fields and attributes\n- [Lists and Arrays](./lists.md) - Handling lists and arrays in your models\n- [TypedDicts](./typeddicts.md) - Using TypedDict for flexible typing\n- [Union Types](./unions.md) - Working with union types\n- [Enums](./enums.md) - Using enumerated types in your models\n- [Missing](./maybe.md) - Handling missing or optional values\n- [Alias](./alias.md) - Create field aliases\n- [Citation](./citation.md) - Extract and validate citations from source text\n\n## Streaming Features\n\nThese features help you work with streaming responses:\n\n- [Stream Partial](./partial.md) - Stream partially completed responses\n- [Stream Iterable](./iterable.md) - Stream collections of completed objects\n- [Raw Response](./raw_response.md) - Access the raw LLM response\n\n## Error Handling and Validation\n\nThese features help you ensure data quality:\n\n- [Retrying](./retrying.md) - Configure automatic retry behavior\n- [Validators](./reask_validation.md) - Define custom validation logic\n- [Hooks](./hooks.md) - Add callbacks for monitoring and debugging\n\n## Performance Optimization\n\nThese features help you optimize performance:\n\n- [Caching](./caching.md) - Cache responses to improve performance\n- [Prompt Caching](./prompt_caching.md) - Cache prompts to reduce token usage\n- [Usage Tokens](./usage.md) - Track token usage\n- [Parallel Tools](./parallel.md) - Run multiple tools in parallel\n- [Dictionary Operations](./dictionary_operations.md) - Performance optimizations for dictionary operations\n\n## Integration Features\n\nThese features help you integrate with other technologies:\n\n- [FastAPI](./fastapi.md) - Integrate with FastAPI\n- [Type Adapter](./typeadapter.md) - Use TypeAdapter with Instructor\n- [Templating](./templating.md) - Use templates for dynamic prompts\n- [Distillation](./distillation.md) - Optimize models for production\n\n## Philosophy\n\n- [Philosophy](./philosophy.md) - The guiding principles behind Instructor\n\n## How These Concepts Work Together\n\nInstructor is built around a few key ideas that work together:\n\n1. **Define Structure with Pydantic**: Use Pydantic models to define exactly what data you want.\n2. **Create Clients with from_provider**: Use the unified interface to create clients for any provider.\n3. **Validate and Retry**: Automatically validate responses and retry if necessary.\n4. **Process Streams**: Handle streaming responses for real-time updates.\n\n### Typical Workflow\n\n```mermaid\nsequenceDiagram\n    participant User as Your Code\n    participant Instructor\n    participant LLM as LLM Provider\n\n    User->>Instructor: Define Pydantic model\n    User->>Instructor: Create client with from_provider\n    User->>Instructor: Call create() with response_model\n    Instructor->>LLM: Send structured request\n    LLM->>Instructor: Return LLM response\n    Instructor->>Instructor: Validate against model\n\n    alt Validation Success\n        Instructor->>User: Return validated Pydantic object\n    else Validation Failure\n        Instructor->>LLM: Retry with error context\n        LLM->>Instructor: Return new response\n        Instructor->>Instructor: Validate again\n        Instructor->>User: Return validated object or error\n    end\n```\n\n## What to Read Next\n\n- If you're new to Instructor, start with [Models](./models.md) and [from_provider](./from_provider.md)\n- If you're migrating from older patterns, see the [Migration Guide](./migration.md)\n- If you're having validation issues, check out [Validators](./reask_validation.md) and [Retrying](./retrying.md)\n- For streaming applications, read [Stream Partial](./partial.md) and [Stream Iterable](./iterable.md)\n- To optimize your application, look at [Caching](./caching.md) and [Usage Tokens](./usage.md)\n\nFor practical examples of these concepts, visit the [Cookbook](../examples/index.md) section.\n\n!!! see-also \"See Also\"\n    - [Getting Started Guide](../getting-started.md) - Begin your journey with Instructor\n    - [Examples](../examples/index.md) - Practical implementations of these concepts\n    - [Integrations](../integrations/index.md) - Connect with different LLM providers\n"
  },
  {
    "path": "docs/concepts/iterable.md",
    "content": "---\ntitle: Iterable Extraction with Instructor - Stream Multiple Objects\ndescription: Use Iterable types to extract and stream multiple structured objects from LLM responses. Perfect for entity extraction and multi-task outputs.\n---\n\n# Multi-Task and Streaming\n\nUsing an `Iterable` lets you extract multiple structured objects from a single LLM call, streaming them as they arrive. This is useful for entity extraction, multi-task outputs, and more.\n\n**We recommend using the `create_iterable` method for most use cases.** It's simpler and less error-prone than manually specifying `Iterable[...]` and `stream=True`.\n\nHere's a simple example showing how to extract multiple users from a single sentence. You can use either the recommended `create_iterable` method or the `create` method with `Iterable[User]`:\n\n=== \"Using `create_iterable` (recommended)\"\n    ```python\n    import instructor\n    from pydantic import BaseModel\n\n    client = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n\n    resp = client.create_iterable(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Ivan is 28, lives in Moscow and his friends are Alex, John and Mary who are 25, 30 and 27 respectively\",\n            }\n        ],\n        response_model=User,\n    )\n\n    for user in resp:\n        print(user)\n        #> name='Ivan' age=28\n        #> name='Alex' age=25\n        #> name='John' age=30\n        #> name='Mary' age=27\n    ```\n    _Recommended for most use cases. Handles streaming and iteration for you._\n\n=== \"Using `create` with `Iterable[User]`\"\n    ```python\n    import instructor\n    from pydantic import BaseModel\n    from typing import Iterable\n\n    client = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n\n    resp = client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Ivan is 28, lives in Moscow and his friends are Alex, John and Mary who are 25, 30 and 27 respectively\",\n            }\n        ],\n        response_model=Iterable[User],\n    )\n\n    for user in resp:\n        print(user)\n        #> name='Ivan' age=28\n        #> name='Alex' age=25\n        #> name='John' age=30\n        #> name='Mary' age=27\n    ```\n    _Use this if you need more manual control or compatibility with legacy code._\n\n---\n\n\nWe also support more complex extraction patterns such as Unions as you'll see below out of the box.\n\n???+ warning\n\n    Unions don't work with Gemini because the AnyOf is not supported in the current response schema.\n\n## Synchronous Usage\n\n=== \"Using `create`\"\n\n    ```python\n    import instructor\n    from typing import Iterable, Union, Literal\n    from pydantic import BaseModel\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    client = instructor.from_provider(\"openai/gpt-4.1-mini\", mode=instructor.Mode.TOOLS)\n\n    results = client.create(\n        messages=[\n            {\"role\": \"system\", \"content\": \"You must always use tools\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n            },\n        ],\n        response_model=Iterable[Union[Weather, GoogleSearch]],\n        stream=True,\n    )\n\n    for item in results:\n        print(item)\n        #> location='Toronto' units='metric'\n        #> location='Dallas' units='imperial'\n        #> query='Super Bowl winner'\n    ```\n\n=== \"Using `create_iterable` (recommended)\"\n\n    ```python\n    import instructor\n    from typing import Union, Literal\n    from pydantic import BaseModel\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    client = instructor.from_provider(\"openai/gpt-4.1-mini\", mode=instructor.Mode.TOOLS)\n\n    results = client.create_iterable(\n        messages=[\n            {\"role\": \"system\", \"content\": \"You must always use tools\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n            },\n        ],\n        response_model=Union[Weather, GoogleSearch],\n    )\n\n    for item in results:\n        print(item)\n        #> location='Toronto' units='metric'\n        #> location='Dallas' units='imperial'\n        #> query='Super Bowl winner'\n    ```\n\n---\n\n## See Also\n\n- [Streaming Lists](./lists.md) - Similar functionality with different API\n- [Streaming Partial](./partial.md) - Stream partially completed objects\n- [List Extraction Tutorial](../learning/patterns/list_extraction.md) - Step-by-step guide\n- [Streaming Basics](../learning/streaming/basics.md) - Introduction to streaming\n\n## Asynchronous Usage\n\n=== \"Using `create`\"\n\n    ```python\n    import instructor\n    from typing import Iterable, Union, Literal\n    from pydantic import BaseModel\n    import asyncio\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    aclient = instructor.from_provider(\n        \"openai/gpt-4.1-mini\", async_client=True, mode=instructor.Mode.TOOLS\n    )\n\n\n    async def main():\n        results = await aclient.create(\n            messages=[\n                {\"role\": \"system\", \"content\": \"You must always use tools\"},\n                {\n                    \"role\": \"user\",\n                    \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n                },\n            ],\n            response_model=Iterable[Union[Weather, GoogleSearch]],\n            stream=True,\n        )\n        async for item in results:\n            print(item)\n            #> location='Toronto' units='metric'\n            #> location='Dallas' units='imperial'\n            #> query='Super Bowl winner'\n\n\n    asyncio.run(main())\n    ```\n\n=== \"Using `create_iterable` (recommended)\"\n\n    ```python\n    import asyncio\n    from typing import Literal, Union\n\n    import instructor\n    from pydantic import BaseModel\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    aclient = instructor.from_provider(\n        \"openai/gpt-4.1-mini\", async_client=True, mode=instructor.Mode.TOOLS\n    )\n\n\n    async def iter_results():\n        async for item in aclient.create_iterable(\n            messages=[\n                {\"role\": \"system\", \"content\": \"You must always use tools\"},\n                {\n                    \"role\": \"user\",\n                    \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n                },\n            ],\n            response_model=Union[Weather, GoogleSearch],\n        ):\n            yield item\n\n\n    async def main():\n        async for item in iter_results():\n            print(item)\n            #> location='Toronto' units='metric'\n            #> location='Dallas' units='imperial'\n            #> query='Super Bowl winner'\n\n\n    asyncio.run(main())\n    ```\n"
  },
  {
    "path": "docs/concepts/lists.md",
    "content": "---\ntitle: Streaming Lists with Instructor - Extract Multiple Objects\ndescription: Learn how to extract multiple structured objects from a single LLM call using streaming lists. Stream collections of Pydantic models as they're generated.\n---\n\n# Multi-task and Streaming\n\nA common use case of structured extraction is defining a single schema class and then making another schema to create a list to do multiple extraction\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass Users(BaseModel):\n    users: List[User]\n\n\nprint(Users.model_json_schema())\n\"\"\"\n{\n    '$defs': {\n        'User': {\n            'properties': {\n                'name': {'title': 'Name', 'type': 'string'},\n                'age': {'title': 'Age', 'type': 'integer'},\n            },\n            'required': ['name', 'age'],\n            'title': 'User',\n            'type': 'object',\n        }\n    },\n    'properties': {\n        'users': {'items': {'$ref': '#/$defs/User'}, 'title': 'Users', 'type': 'array'}\n    },\n    'required': ['users'],\n    'title': 'Users',\n    'type': 'object',\n}\n\"\"\"\n```\n\nDefining a task and creating a list of classes is a common enough pattern that we make this convenient by making use of `Iterable[T]`. This lets us dynamically create a new class that:\n\n1. Has dynamic docstrings and class name based on the task\n2. Support streaming by collecting tokens until a task is received back out.\n\n## Extracting Tasks using Iterable\n\nBy using `Iterable` you get a very convenient class with prompts and names automatically defined:\n\n```python\nimport instructor\nfrom typing import Iterable\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini-1106\",\n    mode=instructor.Mode.JSON,\n)\n\nusers = client.create(\n    temperature=0.1,\n    response_model=Iterable[User],\n    stream=False,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": (\n                \"Consider this data: Jason is 10 and John is 30. \"\n                \"Correctly segment it into entities. \"\n                \"Make sure the JSON is correct.\"\n            ),\n        },\n    ],\n)\nfor user in users:\n    print(user)\n    #> name='Jason' age=10\n    #> name='John' age=30\n```\n\n## Streaming Tasks\n\nWe can also generate tasks as the tokens are streamed in by defining an `Iterable[T]` type.\n\nLets look at an example in action with the same class\n\n```python hl_lines=\"6 26\"\nimport instructor\nfrom typing import Iterable\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\",\n    mode=instructor.Mode.TOOLS,\n)\n\nusers = client.create(\n    temperature=0.1,\n    stream=True,\n    response_model=Iterable[User],\n    messages=[\n        {\"role\": \"system\", \"content\": \"You are a perfect entity extraction system\"},\n        {\"role\": \"user\", \"content\": \"Extract `Jason is 10 and John is 10`\"},\n    ],\n    max_tokens=1000,\n)\n\nfor user in users:\n    print(user)\n    #> name='Jason' age=10\n    #> name='John' age=10\n```\n\n## Asynchronous Streaming\n\nI also just want to call out in this example that `instructor` also supports asynchronous streaming. This is useful when you want to stream a response model and process the results as they come in, but you'll need to use the `async for` syntax to iterate over the results.\n\n```python\nimport instructor\nfrom typing import Iterable\nfrom pydantic import BaseModel\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nasync def print_iterable_results():\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        async_client=True,\n        mode=instructor.Mode.TOOLS,\n    )\n\n    model = await client.create(\n        response_model=Iterable[UserExtract],\n        max_retries=2,\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Make two up people\"},\n        ],\n    )\n    async for m in model:\n        print(m)\n        #> name='Alice' age=30\n        #> name='Bob' age=25\n\n\nimport asyncio\n\nasyncio.run(print_iterable_results())\n```\n\n## See Also\n\n- [Streaming Partial](./partial.md) - Stream partially completed objects\n- [Streaming Lists Tutorial](../learning/streaming/lists.md) - Step-by-step list streaming guide\n- [Iterable Patterns](../learning/patterns/list_extraction.md) - List extraction patterns\n- [Raw Response](./raw_response.md) - Access original LLM responses\n"
  },
  {
    "path": "docs/concepts/logging.md",
    "content": "---\ntitle: Logging and Monitoring with Instructor - Debug Guide\ndescription: Implement comprehensive logging for Instructor LLM calls. Track API usage, debug issues, and monitor performance with DEBUG level logging.\n---\n\nIn order to see the requests made to OpenAI and the responses, you can set logging to DEBUG. This will show the requests and responses made to OpenAI. This can be useful for debugging and understanding the requests and responses made to OpenAI. I would love some contributions that make this a lot cleaner, but for now this is the fastest way to see the prompts.\n\n```python\nimport instructor\nimport logging\n\nfrom pydantic import BaseModel\n\n\n# Set logging to DEBUG\nlogging.basicConfig(level=logging.DEBUG)\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\nuser = client.create(\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"},\n    ],\n)  # type: ignore\n\n\"\"\"\n...\nDEBUG:instructor:Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>\nDEBUG:instructor:Instructor Request: mode.value='tool_call', response_model=<class '__main__.UserDetail'>, new_kwargs={'model': 'gpt-4.1-mini', 'messages': [{'role': 'user', 'content': 'Extract Jason is 25 years old'}], 'tools': [{'type': 'function', 'function': {'name': 'UserDetail', 'description': 'Correctly extracted `UserDetail` with all the required parameters with correct types', 'parameters': {'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'integer'}}, 'required': ['age', 'name'], 'type': 'object'}}}], 'tool_choice': {'type': 'function', 'function': {'name': 'UserDetail'}}}\nDEBUG:instructor:max_retries: 1\n...\nDEBUG:instructor:Instructor Pre-Response: ChatCompletion(id='chatcmpl-8zBxMxsOqm5Sj6yeEI38PnU2r6ncC', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_E1cftF5U0zEjzIbWt3q0ZLbN', function=Function(arguments='{\"name\":\"Jason\",\"age\":25}', name='UserDetail'), type='function')]))], created=1709594660, model='gpt-4.1-mini-0125', object='chat.completion', system_fingerprint='fp_2b778c6b35', usage=CompletionUsage(completion_tokens=9, prompt_tokens=81, total_tokens=90))\nDEBUG:httpcore.connection:close.started\nDEBUG:httpcore.connection:close.complete\n\"\"\"\n```\n\n## Provider initialization logs\n\n`from_provider()` now emits structured logs at the `INFO` level when a provider\nis initialized. Enable logging to see which provider and model are being used.\n\n```python\nimport logging\nimport instructor\n\nlogging.basicConfig(level=logging.INFO)\n\ninstructor.from_provider(\"openai/gpt-4.1-mini\")\n```\n\nExample output:\n\n```\nINFO:instructor.auto_client:Initializing openai provider with model gpt-4.1-mini\nINFO:instructor.auto_client:Client initialized\n```\n"
  },
  {
    "path": "docs/concepts/maybe.md",
    "content": "---\ntitle: Maybe Types and Optional Handling in Instructor\ndescription: Handle optional and nullable data with Maybe types in Instructor. Learn to work with potentially missing fields and optional responses from LLMs.\n---\n\n# Handling Missing Data\n\nThe `Maybe` pattern is a concept in functional programming used for error handling. Instead of raising exceptions or returning `None`, you can use a `Maybe` type to encapsulate both the result and potential errors.\n\nThis pattern is particularly useful when making LLM calls, as providing language models with an escape hatch can effectively reduce hallucinations.\n\n## Defining the Model\n\nUsing Pydantic, we'll first define the `UserDetail` and `MaybeUser` classes.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Optional[str] = Field(default=None)\n\n\nclass MaybeUser(BaseModel):\n    result: Optional[UserDetail] = Field(default=None)\n    error: bool = Field(default=False)\n    message: Optional[str] = Field(default=None)\n\n    def __bool__(self):\n        return self.result is not None\n```\n\nNotice that `MaybeUser` has a `result` field that is an optional `UserDetail` instance where the extracted data will be stored. The `error` field is a boolean that indicates whether an error occurred, and the `message` field is an optional string that contains the error message.\n\n## Defining the function\n\nOnce we have the model defined, we can create a function that uses the `Maybe` pattern to extract the data.\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\n# This enables the `response_model` keyword\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Optional[str] = Field(default=None)\n\n\nclass MaybeUser(BaseModel):\n    result: Optional[UserDetail] = Field(default=None)\n    error: bool = Field(default=False)\n    message: Optional[str] = Field(default=None)\n\n    def __bool__(self):\n        return self.result is not None\n\n\ndef extract(content: str) -> MaybeUser:\n    return client.create(\n        response_model=MaybeUser,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Extract `{content}`\"},\n        ],\n    )\n\n\nuser1 = extract(\"Jason is a 25-year-old scientist\")\nprint(user1.model_dump_json(indent=2))\n\"\"\"\n{\n  \"result\": {\n    \"age\": 25,\n    \"name\": \"Jason\",\n    \"role\": \"scientist\"\n  },\n  \"error\": false,\n  \"message\": null\n}\n\"\"\"\n\nuser2 = extract(\"Unknown user\")\nprint(user2.model_dump_json(indent=2))\n\"\"\"\n{\n  \"result\": null,\n  \"error\": false,\n  \"message\": null\n}\n\"\"\"\n```\n\nAs you can see, when the data is extracted successfully, the `result` field contains the `UserDetail` instance. When an error occurs, the `error` field is set to `True`, and the `message` field contains the error message.\n\nIf you want to learn more about pattern matching, check out Pydantic's docs on [Structural Pattern Matching](https://docs.pydantic.dev/latest/concepts/models/#structural-pattern-matching)\n"
  },
  {
    "path": "docs/concepts/migration.md",
    "content": "---\ntitle: Migration Guide\ndescription: Migrate from older Instructor patterns to the modern from_provider approach.\n---\n\n# Migration Guide\n\nThis guide helps you migrate from older Instructor patterns to `from_provider`, the recommended approach for all providers.\n\n## Why Migrate?\n\n- **Simpler code**: Less boilerplate, easier to read\n- **Consistent interface**: Same pattern works for all providers\n- **Better type safety**: Improved IDE support\n- **Future-proof**: Recommended pattern going forward\n\n## Quick Reference\n\n| Old Pattern | New Pattern |\n|-------------|-------------|\n| `instructor.patch(openai.OpenAI())` | `instructor.from_provider(\"openai/model\")` |\n| `instructor.apatch(openai.AsyncOpenAI())` | `instructor.from_provider(\"openai/model\", async_client=True)` |\n| `from_openai(client)` | `instructor.from_provider(\"openai/model\")` |\n| `from_anthropic(client)` | `instructor.from_provider(\"anthropic/model\")` |\n| `from_genai(client)` | `instructor.from_provider(\"google/model\")` |\n| `client.chat.completions.create(...)` | `client.create(...)` |\n| `client.messages.create(...)` | `client.create(...)` |\n\n## Basic Migration\n\n**Before:**\n\n```python\nimport openai\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nopenai_client = openai.OpenAI()\nclient = instructor.patch(openai_client)\n\nuser = client.chat.completions.create(\n    model=\"gpt-4o-mini\",\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30\"}],\n)\n```\n\n**After:**\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30\"}],\n)\n```\n\n## Async Migration\n\n**Before:**\n\n```python\nimport openai\nimport instructor\n\nopenai_client = openai.AsyncOpenAI()\nclient = instructor.apatch(openai_client)\n\nuser = await client.chat.completions.create(...)\n```\n\n**After:**\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", async_client=True)\n\nuser = await client.create(...)\n```\n\n## Provider-Specific Migrations\n\n### Anthropic\n\n```python\n# Before (removed)\nimport anthropic\nfrom instructor import from_anthropic\n\nclient = from_anthropic(anthropic.Anthropic())\nuser = client.messages.create(model=\"claude-3-5-sonnet\", ...)\n\n# After\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet\")\nuser = client.create(...)\n```\n\n### Google/Gemini\n\n```python\n# Before (removed)\nimport google.genai as genai\nfrom instructor import from_genai\n\nclient = from_genai(genai.Client(), model=\"gemini-pro\")\nuser = client.generate_content(...)\n\n# After\nclient = instructor.from_provider(\"google/gemini-pro\")\nuser = client.create(messages=[...])\n```\n\n## Configuration Options\n\nPass configuration directly to `from_provider`:\n\n```python\nimport instructor\n\n# Mode configuration\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", mode=instructor.Mode.JSON)\n\n# Custom API settings\nclient = instructor.from_provider(\n    \"openai/gpt-4o-mini\",\n    api_key=\"custom-key\",\n    organization=\"org-id\",\n    timeout=30.0,\n)\n```\n\n## Multiple Providers\n\n**Before:**\n\n```python\nimport openai\nimport anthropic\nimport instructor\nfrom instructor import from_anthropic\n\nopenai_client = instructor.patch(openai.OpenAI())\nanthropic_client = from_anthropic(anthropic.Anthropic())\n```\n\n**After:**\n\n```python\nimport instructor\n\nopenai_client = instructor.from_provider(\"openai/gpt-4o-mini\")\nanthropic_client = instructor.from_provider(\"anthropic/claude-3-5-sonnet\")\n```\n\n## Migration Checklist\n\n1. **Identify your current pattern**: `patch()`, `apatch()`, or `from_*()` functions\n2. **Find your model name**: e.g., `gpt-4o-mini`, `claude-3-5-sonnet`\n3. **Replace client creation**: Use `from_provider(\"provider/model\")`\n4. **Update method calls**: Change to `client.create(...)`\n5. **Use standard message format**: `[{\"role\": \"user\", \"content\": \"...\"}]`\n6. **Test your code**\n\n## Troubleshooting\n\n| Error | Cause | Solution |\n|-------|-------|----------|\n| `'Instructor' object has no attribute 'chat'` | Using old method call | Use `client.create()` instead of `client.chat.completions.create()` |\n| Invalid model string | Wrong format | Use `\"provider/model-name\"` format |\n| Message format error | Provider-specific format | Use standard `messages` list format |\n\n## Backward Compatibility\n\nLegacy helpers have been removed:\n\n- `instructor.patch()` → Use `from_provider` instead\n- `instructor.apatch()` → Use `from_provider` with `async_client=True`\n- `from_openai()`, `from_anthropic()`, etc. → Use `from_provider`\n\nUpdate all call sites before upgrading.\n\n## See Also\n\n- [from_provider Guide](./from_provider.md) - Complete guide to using from_provider\n- [Patching](./patching.md) - How Instructor enhances clients\n"
  },
  {
    "path": "docs/concepts/mode-migration.md",
    "content": "---\ntitle: Mode Migration Guide\ndescription: Migrate from provider-specific modes to the core modes in Instructor.\n---\n\n# Mode Migration Guide\n\nThis guide helps you move from provider-specific modes to the core modes.\nCore modes work across providers and are the recommended choice for new code.\n\n!!! note \"V2 Preview\"\n\n    Provider-specific modes are deprecated in v2. They still work, emit warnings, and map to core modes.\n\n## Core Modes\n\nThese are the core modes you should use:\n\n- `TOOLS`: Tool or function calling\n- `JSON_SCHEMA`: Native schema support when a provider has it\n- `MD_JSON`: JSON extracted from text or code blocks\n- `PARALLEL_TOOLS`: Multiple tool calls in one response\n- `RESPONSES_TOOLS`: OpenAI Responses API tools\n\n## Quick Mapping\n\nUse this table to replace legacy modes:\n\n| Legacy Mode | Core Mode |\n|------------|-----------|\n| `FUNCTIONS` | `TOOLS` |\n| `TOOLS_STRICT` | `TOOLS` |\n| `ANTHROPIC_TOOLS` | `TOOLS` |\n| `ANTHROPIC_JSON` | `MD_JSON` |\n| `COHERE_TOOLS` | `TOOLS` |\n| `COHERE_JSON_SCHEMA` | `JSON_SCHEMA` |\n| `XAI_TOOLS` | `TOOLS` |\n| `XAI_JSON` | `MD_JSON` |\n| `MISTRAL_TOOLS` | `TOOLS` |\n| `MISTRAL_STRUCTURED_OUTPUTS` | `JSON_SCHEMA` |\n| `FIREWORKS_TOOLS` | `TOOLS` |\n| `FIREWORKS_JSON` | `MD_JSON` |\n| `CEREBRAS_TOOLS` | `TOOLS` |\n| `CEREBRAS_JSON` | `MD_JSON` |\n| `WRITER_TOOLS` | `TOOLS` |\n| `WRITER_JSON` | `MD_JSON` |\n| `BEDROCK_TOOLS` | `TOOLS` |\n| `BEDROCK_JSON` | `MD_JSON` |\n| `PERPLEXITY_JSON` | `MD_JSON` |\n| `VERTEXAI_TOOLS` | `TOOLS` |\n| `VERTEXAI_JSON` | `MD_JSON` |\n| `VERTEXAI_PARALLEL_TOOLS` | `PARALLEL_TOOLS` |\n\n## Example: Anthropic\n\n**Before:**\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"anthropic/claude-3-5-haiku-latest\",\n    mode=Mode.ANTHROPIC_TOOLS,\n)\n```\n\n**After:**\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"anthropic/claude-3-5-haiku-latest\",\n    mode=Mode.TOOLS,\n)\n```\n\n## Example: Bedrock\n\n**Before:**\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\",\n    mode=Mode.BEDROCK_TOOLS,\n)\n```\n\n**After:**\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\",\n    mode=Mode.BEDROCK_TOOLS,\n)\n```\n\n## Notes\n\n- Legacy modes still work but show a deprecation warning.\n- Use core modes for new code and docs.\n- Core tests are parameterized by provider and mode for consistent coverage.\n- Streaming extraction is now handled by provider handlers instead of the DSL.\n- Legacy `ResponseSchema.parse_*` helpers are deprecated. Use `process_response` or\n  `ResponseSchema.from_response` with core modes so the v2 registry handles parsing.\n- See [Mode Comparison](../modes-comparison.md) for details.\n"
  },
  {
    "path": "docs/concepts/models.md",
    "content": "---\ntitle: Using Pydantic Models for Structured Outputs\ndescription: Learn how to define LLM output schemas with Pydantic models.\n---\n\n# Response Model\n\nDefine LLM output schemas using `pydantic.BaseModel`. For more details, see the [Pydantic documentation](https://docs.pydantic.dev/latest/concepts/models/).\n\nAfter defining a Pydantic model, use it as the `response_model` in your client `create` calls. The `response_model` parameter:\n\n- Defines the schema and prompts for the language model\n- Validates the response from the API\n- Returns a Pydantic model instance\n\n## Prompting\n\nUse docstrings and field annotations to define the prompt for generating responses.\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\n\n\nclass User(BaseModel):\n    \"\"\"\n    This is the prompt that will be used to generate the response.\n    Any instructions here will be passed to the language model.\n    \"\"\"\n\n    name: str = Field(description=\"The name of the user.\")\n    age: int = Field(description=\"The age of the user.\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old\"}],\n)\n```\n\nDocstrings, types, and field annotations are used to generate the prompt. The `create` method uses this prompt to generate the response.\n\n## Optional Values\n\nUse `Optional` and `default` to make fields optional when sent to the language model.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\nimport instructor\n\n\nclass User(BaseModel):\n    name: str = Field(description=\"The name of the user.\")\n    age: int = Field(description=\"The age of the user.\")\n    email: Optional[str] = Field(description=\"The email of the user.\", default=None)\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old\"}],\n)\n```\n\nFields can also be omitted from the schema sent to the language model using Pydantic's `SkipJsonSchema` annotation. See [Fields](fields.md#omitting-fields-from-schema-sent-to-the-language-model) for details.\n\n## Dynamic Model Creation\n\nCreate models at runtime using Pydantic's `create_model` function:\n\n```python\nfrom pydantic import BaseModel, create_model\n\n\nclass FooModel(BaseModel):\n    foo: str\n    bar: int = 123\n\n\nBarModel = create_model(\n    'BarModel',\n    apple=(str, 'russet'),\n    banana=(str, 'yellow'),\n    __base__=FooModel,\n)\nprint(BarModel)\n#> <class '__main__.BarModel'>\nprint(BarModel.model_fields.keys())\n#> dict_keys(['foo', 'bar', 'apple', 'banana'])\n```\n\n??? notes \"When would I use this?\"\n\n    Consider a situation where the model is dynamically defined, based on some configuration or database. For example, we could have a database table that stores the properties of a model for\n    some model name or id. We could then query the database for the properties of the model and use that to create the model.\n\n    ```sql\n    SELECT property_name, property_type, description\n    FROM prompt\n    WHERE model_name = {model_name}\n    ```\n\n    We can then use this information to create the model.\n\n    ```python\n    from pydantic import BaseModel, create_model, Field\n    from typing import List\n\n    types = {\n        'string': str,\n        'integer': int,\n        'boolean': bool,\n        'number': float,\n        'List[str]': List[str],\n    }\n\n    # Mocked cursor.fetchall()\n    cursor = [\n        ('name', 'string', 'The name of the user.'),\n        ('age', 'integer', 'The age of the user.'),\n        ('email', 'string', 'The email of the user.'),\n    ]\n\n    BarModel = create_model(\n        'User',\n        **{\n            property_name: (types[property_type], Field(description=description))\n            for property_name, property_type, description in cursor\n        },\n        __base__=BaseModel,\n    )\n\n    print(BarModel.model_json_schema())\n    \"\"\"\n    {\n        'properties': {\n            'name': {\n                'description': 'The name of the user.',\n                'title': 'Name',\n                'type': 'string',\n            },\n            'age': {\n                'description': 'The age of the user.',\n                'title': 'Age',\n                'type': 'integer',\n            },\n            'email': {\n                'description': 'The email of the user.',\n                'title': 'Email',\n                'type': 'string',\n            },\n        },\n        'required': ['name', 'age', 'email'],\n        'title': 'User',\n        'type': 'object',\n    }\n    \"\"\"\n    ```\n\n    This would be useful when different users have different descriptions for the same model. We can use the same model but have different prompts for each user.\n\n## Adding Behavior\n\nAdd methods to Pydantic models like any Python class. This lets you add custom logic to your models.\n\n```python\nfrom pydantic import BaseModel\nfrom typing import Literal\n\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass SearchQuery(BaseModel):\n    query: str\n    query_type: Literal[\"web\", \"image\", \"video\"]\n\n    def execute(self):\n        print(f\"Searching for {self.query} of type {self.query_type}\")\n        #> Searching for cat of type image\n        return \"Results for cat\"\n\n\nquery = client.create(\n    model=\"gpt-4.1-mini\",\n    messages=[{\"role\": \"user\", \"content\": \"Search for a picture of a cat\"}],\n    response_model=SearchQuery,\n)\n\nresults = query.execute()\nprint(results)\n#> Results for cat\n```\n\nNow we can call `execute` on our model instance after extracting it from a language model. If you want to see more examples of this checkout our post on [RAG is more than embeddings](../blog/posts/rag-and-beyond.md)\n\n## See Also\n\n- [Response Models Tutorial](../learning/getting_started/response_models.md) - Step-by-step guide to creating response models\n- [Simple Object Extraction](../learning/patterns/simple_object.md) - Basic extraction patterns\n- [Nested Structures](../learning/patterns/nested_structure.md) - Complex hierarchical models\n- [Optional Fields](../learning/patterns/optional_fields.md) - Working with optional data\n- [Types](./types.md) - Working with different data types\n- [Fields](./fields.md) - Advanced field configuration\n"
  },
  {
    "path": "docs/concepts/multimodal.md",
    "content": "---\ntitle: Seamless Multimodal Interactions with Instructor\ndescription: Learn how the Image, PDF and Audio class in Instructor enables seamless handling of multimodal content across different AI models.\n---\n\n---\ntitle: Multimodal Processing with Instructor - Vision and Audio\ndescription: Process images, audio, and video with Instructor for multimodal structured outputs. Extract data from visual content using GPT-4 Vision and Gemini models.\n---\n\n# Multimodal\n\n> We've provided a few different sample files for you to use to test out these new features. All examples below use these files.\n>\n> - (Image) : An image of some blueberry plants [image.jpg](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg)\n> - (Audio) : A Recording of the Original Gettysburg Address : [gettysburg.wav](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav)\n> - (PDF) : A sample PDF file which contains a fake invoice [invoice.pdf](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf)\n>   Instructor provides a unified, provider-agnostic interface for working with multimodal inputs like images and PDFs.\n\nInstructor provides a unified, provider-agnostic interface for working with multimodal inputs like images, PDFs, and audio files.\n\nWith Instructor's multimodal objects, you can easily load media from URLs, Google Cloud Storage URLs, local files, or base64 strings using a consistent API that works across different AI providers (OpenAI, Anthropic, Mistral, etc.).\n\nInstructor handles all the provider-specific formatting requirements behind the scenes, ensuring your code remains clean and future-proof as provider APIs evolve. Let's see how to use the Image, Audio and PDF classes.\n\n## `Image`\n\nThis class represents an image that can be loaded from a URL or file path. It provides a set of methods to create `Image` instances from different sources (Eg. URLs, paths and base64 strings). The following shows which methods are supported for the individual providers.\n\n| Method            | OpenAI | Anthropic | Google GenAI |\n| ----------------- | ------ | --------- | ------------ |\n| `from_url()`      | ✅     | ✅        | ✅           |\n| `from_gs_url()`   | ✅     | ✅        | ✅           |\n| `from_path()`     | ✅     | ✅        | ✅           |\n| `from_base64()`   | ✅     | ✅        | ✅           |\n| `autodetect()`    | ✅     | ✅        | ✅           |\n\nWe also support Anthropic Prompt Caching for images with the `ImageWith\n\n### Usage\n\nBy using the `Image` class, we can abstract away the differences between the different formats, allowing you to work with a unified interface.\n\nYou can create an `Image` instance from a URL, Google Cloud Storage (GCS) URL, or file path using the `from_url`, `from_gs_url`, or `from_path` methods. The `Image` class will automatically convert the image to a base64-encoded string and include it in the API request.\n\n```python\nimport instructor\nfrom instructor.processing.multimodal import Image\nfrom pydantic import BaseModel\n\n\nclass ImageDescription(BaseModel):\n    description: str\n    items: list[str]\n\n\n# Use our sample image provided above.\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresponse = client.create(\n    response_model=ImageDescription,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What is in this image?\",\n                Image.from_url(url),\n            ],\n        }\n    ],\n)\n\nprint(response)\n\"\"\"\ndescription='Blueberry bushes with clusters of ripe and unripe blueberries. The berries are blue to purplish in color, and the leaves are green. The sky in the background is cloudy.' items=['blueberry bushes', 'ripe blueberries', 'unripe blueberries', 'green leaves', 'cloudy sky']\n\"\"\"\n```\n\n### Google Cloud Storage Support\n\nInstructor now supports loading images directly from Google Cloud Storage URLs. This is particularly useful when working with images stored in GCS buckets.\n\n```python\nimport instructor\nfrom instructor.processing.multimodal import Image\nfrom pydantic import BaseModel\n\n\nclass ImageDescription(BaseModel):\n    description: str\n    items: list[str]\n\n\n# Load image from GCS URL (must be publicly accessible)\ngs_url = \"gs://my-bucket/path/to/image.jpg\"\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresponse = client.create(\n    response_model=ImageDescription,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What is in this image?\",\n                Image.from_gs_url(gs_url),\n            ],\n        }\n    ],\n)\n\nprint(response)\n\"\"\"\ndescription='A sample image loaded from Google Cloud Storage.' items=['sample image']\n\"\"\"\n```\n\n> **Note**: GCS URLs must point to publicly accessible objects. The `from_gs_url` method converts `gs://` URLs to `https://storage.googleapis.com/` URLs for access.\n\nWe also provide an `autodetect_images` keyword argument that allows you to provide URLs, GCS URLs, or file paths as normal strings when you set it to true. The system will automatically detect and handle different media types including images, audio, and PDFs.\n\nYou can see an example below.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass ImageDescription(BaseModel):\n    description: str\n    items: list[str]\n\n\n# Download a sample image for demonstration\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresponse = client.create(\n    response_model=ImageDescription,\n    autodetect_images=True,  # Set this to True\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\"What is in this image?\", url],\n        }\n    ],\n)\n\nprint(response)\n\"\"\"\ndescription='The image shows a close-up of a blueberry bush with ripe blueberries and green leaves. The background includes more blueberry bushes and a cloudy sky.' items=['Blueberry bush', 'Ripe blueberries', 'Green leaves', 'Cloudy sky']\n\"\"\"\n```\n\nIf you'll like to support Anthropic prompt caching with images, we provide the `ImageWithCacheControl` Object to do so. Simply use the `from_image_params` method and you'll be able to leverage Anthropic's prompt caching.\n\n```python\nimport instructor\nfrom instructor.processing.multimodal import ImageWithCacheControl\nfrom pydantic import BaseModel\n\n\nclass ImageDescription(BaseModel):\n    description: str\n    items: list[str]\n\n\n# Download a sample image for demonstration\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet-20240620\")\n\nresponse, completion = client.create_with_completion(\n    response_model=ImageDescription,\n    autodetect_images=True,  # Set this to True\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What is in this image?\",\n                ImageWithCacheControl.from_image_params(\n                    {\n                        \"source\": url,\n                        \"cache_control\": {\n                            \"type\": \"ephemeral\",\n                        },\n                    }\n                ),\n            ],\n        }\n    ],\n    max_tokens=1000,\n)\n\nprint(response)\n\"\"\"\ndescription='A bush with numerous clusters of blueberries surrounded by green leaves, under a cloudy sky.' items=['blueberries', 'green leaves', 'cloudy sky']\n\"\"\"\n\nprint(completion.usage.cache_creation_input_tokens)\n#> 1820\n```\n\nBy leveraging Instructor's multimodal capabilities, you can focus on building your application logic without worrying about the intricacies of each provider's image handling format. This not only saves development time but also makes your code more maintainable and adaptable to future changes in AI provider APIs.\n\n## `Audio`\n\n> Note : Only OpenAI and Gemini support audio files at the moment. For Gemini, we're passing in the raw bytes as bytes for this feature. If you'd like to use the `Files` API instead, we also support it, [read more at](../integrations/genai.md) to see how to do so.\n\nSimilar to the Image class, we provide methods to create `Audio` instances.\n\n| Method          | OpenAI | Google GenAI |\n| --------------- | ------ | ------------ |\n| `from_url()`    | ✅     | ✅           |\n| `from_gs_url()` | ✅     | ✅           |\n| `from_path()`   | ✅     | ✅           |\n| `from_base64()` | ✅     | ✅           |\n| `autodetect()`  | ✅     | ✅           |\n\nThe `Audio` class represents an audio file that can be loaded from a URL, Google Cloud Storage URL, or file path. It provides methods to create `Audio` instances using the `from_path`, `from_url`, `from_gs_url`, `from_base64`, and `autodetect` methods.\n\nThe `Audio` class will automatically convert it to the right format and include it in the API request.\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor.processing.multimodal import Audio\n\n# Initialize the client\nclient = instructor.from_provider(\"openai/gpt-4o-audio-preview\")\n\n\n# Define our response model\nclass AudioDescription(BaseModel):\n    summary: str\n    transcript: str\n\n\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav\"\n\n# Make the API call with the audio file\nresp = client.create(\n    response_model=AudioDescription,\n    modalities=[\"text\"],\n    audio={\"voice\": \"alloy\", \"format\": \"wav\"},\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract the following information from the audio:\",\n                Audio.from_url(url),\n            ],\n        },\n    ],\n)\n\nprint(resp)\n\"\"\"\nsummary='This excerpt is from a famous historical speech discussing the founding principles of equality and liberty, and the ongoing civil war testing the endurance of those principles.' transcript='Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.'\n\"\"\"\n```\n\n### Google Cloud Storage Support\n\nYou can also load audio files directly from Google Cloud Storage:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor.processing.multimodal import Audio\n\n# Initialize the client\nclient = instructor.from_provider(\"openai/gpt-4o-audio-preview\")\n\n\n# Define our response model\nclass AudioDescription(BaseModel):\n    summary: str\n    transcript: str\n\n\n# Load audio from GCS URL (must be publicly accessible)\ngs_url = \"gs://my-bucket/path/to/audio.wav\"\n\n# Make the API call with the GCS audio file\nresp = client.create(\n    response_model=AudioDescription,\n    modalities=[\"text\"],\n    audio={\"voice\": \"alloy\", \"format\": \"wav\"},\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract the following information from the audio:\",\n                Audio.from_gs_url(gs_url),\n            ],\n        },\n    ],\n)\n\nprint(resp)\n\"\"\"\nsummary='A short historical speech about equality and liberty.' transcript='Four score and seven years ago our fathers brought forth...'\n\"\"\"\n```\n\n## `PDF`\n\nThe `PDF` class represents a PDF file that can be loaded from a URL or file path.\n\nIt provides methods to create `PDF` instances and is currently supported for OpenAI, Mistral, GenAI, Anthropic, and Bedrock client integrations.\n\n| Method            | OpenAI | Anthropic | Google GenAI | Mistral | Bedrock |\n| ----------------- | ------ | --------- | ------------ | ------- | ------- |\n| `from_url()`      | ✅     | ✅        | ✅           | ✅      | ✅      |\n| `from_gs_url()`   | ✅     | ✅        | ✅           | ✅      | ✅      |\n| `from_path()`     | ✅     | ✅        | ✅           | ❎      | ✅      |\n| `from_base64()`   | ✅     | ✅        | ✅           | ❎      | ✅      |\n| `autodetect()`    | ✅     | ✅        | ✅           | ✅      | ✅      |\n\nFor Gemini, we also provide two additional methods that make working with the google-genai files package easy which you can access in the `PDFWithGenaiFile` object.\n\nFor Anthropic, you can enable caching with the `PDFWithCacheControl` object. Note that this has caching configured by default for easy usage.\n\nWe provide examples of how to use all three object classes below.\n\nFor Bedrock, you can convert a `PDF` into the Bedrock-native document format with `PDF.to_bedrock()` and include the result in the message content list.\n\n### Usage\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.processing.multimodal import PDF\n\n# Set up the client\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\n# Create a model for analyzing PDFs\nclass Invoice(BaseModel):\n    total: float\n    items: list[str]\n\n\n# Load and analyze a PDF\nresponse = client.create(\n    response_model=Invoice,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Analyze this document\",\n                PDF.from_url(url),\n            ],\n        }\n    ],\n)\n\nprint(response)\n\"\"\"\ntotal=220.0 items=['English Tea - 2 units at $100 each', 'Tofu - 10 units at $2 each']\n\"\"\"\n```\n\n### Google Cloud Storage Support\n\nYou can load PDF files directly from Google Cloud Storage URLs:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.processing.multimodal import PDF\n\n# Set up the client\ngs_url = \"gs://my-bucket/path/to/document.pdf\"\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\n# Create a model for analyzing PDFs\nclass Invoice(BaseModel):\n    total: float\n    items: list[str]\n\n\n# Load and analyze a PDF from GCS (must be publicly accessible)\nresponse = client.create(\n    response_model=Invoice,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Analyze this document\",\n                PDF.from_gs_url(gs_url),\n            ],\n        }\n    ],\n)\n\nprint(f\"Total = {response.total:.0f}, items = {response.items}\")\n#> Total = 220, items = ['English Tea', 'Tofu']\n```\n\n### Caching\n\nIf you'd like to cache the PDF for Anthropic, we provide the `PDFWithCacheControl` class which has caching configured by default.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.processing.multimodal import PDFWithCacheControl\n\n# Set up the client\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet-20240620\")\n\n\n# Create a model for analyzing PDFs\nclass Invoice(BaseModel):\n    total: float\n    items: list[str]\n\n\n# Load and analyze a PDF\nresponse, completion = client.create_with_completion(\n    response_model=Invoice,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Analyze this document\",\n                PDFWithCacheControl.from_url(url),\n            ],\n        }\n    ],\n    max_tokens=1000,\n)\n\nprint(f\"Total = {response.total:.0f}, items = {response.items}\")\n#> Total = 220, items = ['English Tea', 'Tofu']\n\nprint(completion.usage.cache_creation_input_tokens)\n#> 2091\n```\n\n### Using Files\n\nWe also provide a convinient wrapper around the Files API - allowing you to use both uploaded files and to block the main thread while your file is uploading.\n\nIn this example below, we download the sample PDF and then upload it using the `Files` api provided by the `google.genai` sdk.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.processing.multimodal import PDFWithGenaiFile\nimport requests\n\n# Set up the client\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n\n# Create a model for analyzing PDFs\nclass Invoice(BaseModel):\n    total: float\n    items: list[str]\n\n\n# Load and analyze a PDF\nwith requests.get(url) as download_response:\n    pdf_data = download_response.content\n    with open(\"./invoice.pdf\", \"wb\") as f:\n        f.write(pdf_data)\n\nresponse = client.create(\n    response_model=Invoice,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Analyze this document\",\n                PDFWithGenaiFile.from_new_genai_file(\n                    file_path=\"./invoice.pdf\",\n                    retry_delay=10,\n                    max_retries=20,\n                ),\n            ],\n        }\n    ],\n)\n\nprint(response)\n#> total=220.0 items=['English Tea', 'Tofu']\n```\n\nIf you've already uploaded your file ahead of time, we also support it. Just provide us with the file name as seen below\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.processing.multimodal import PDFWithGenaiFile\nimport requests\n\n# Set up the client\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n\n# Create a model for analyzing PDFs\nclass Invoice(BaseModel):\n    total: float\n    items: list[str]\n\n\n# Load and analyze a PDF\nwith requests.get(url) as download_response:\n    pdf_data = download_response.content\n    with open(\"./invoice.pdf\", \"wb\") as f:\n        f.write(pdf_data)\n\nfile = client.files.upload(\n    file=\"invoice.pdf\",\n)\n\nresponse = client.create(\n    response_model=Invoice,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Analyze this document\",\n                PDFWithGenaiFile.from_existing_genai_file(file_name=file.name),\n            ],\n        }\n    ],\n)\n\nprint(response)\n#> total=220.0 items=['English Tea', 'Tofu']\n```\n\nThis way you have more granular control over how the file is uploaded, potentially also processing multiple file uploads at once too.\n"
  },
  {
    "path": "docs/concepts/parallel.md",
    "content": "---\ntitle: Parallel Tools\ndescription: Learn about parallel tools in OpenAI, Google, and Anthropic.\n---\n\n## See Also\n\n- [from_provider Guide](./from_provider.md#async-clients) - Async client setup\n- [Batch Processing](../examples/batch_job_oai.md) - Process multiple requests efficiently\n- [Iterable](./iterable.md) - Extract multiple objects\n- [Lists](./lists.md) - Working with collections\n\n# Parallel Tools\n\nParallel Tool Calling is a feature that allows you to call multiple functions in a single request.\n\n!!! warning \"Experimental Feature\"\n\n    Parallel Tool Calling is supported by Google, OpenAI, and Anthropic. Make sure to use the equivalent parallel tool `mode` for your client.\n\n## Understanding Parallel Tool Calling\n\nParallel Function Calling helps you to significantly reduce the latency of your application without having to build a parent schema as a wrapper around these tool calls.\n\n=== \"OpenAI\"\n\n    ```python hl_lines=\"20 32\"\n    from __future__ import annotations\n\n    import instructor\n\n    from typing import Iterable, Literal\n    from pydantic import BaseModel\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.PARALLEL_TOOLS,\n    )\n    function_calls = client.create(\n        messages=[\n            {\"role\": \"system\", \"content\": \"You must always use tools\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n            },\n        ],\n        response_model=Iterable[Weather | GoogleSearch],\n    )\n\n    for fc in function_calls:\n        print(fc)\n        #> location='Toronto' units='metric'\n        #> location='Dallas' units='metric'\n        #> query='who won the super bowl 2023'\n    ```\n\n=== \"Vertex AI\"\n\n    ```python\n    from typing import Iterable, Literal\n\n    import instructor\n    from pydantic import BaseModel\n\n    try:\n        import vertexai\n        import vertexai.generative_models as gm\n        from instructor import from_vertexai\n    except ImportError:\n        vertexai = None\n        gm = None\n        from_vertexai = None\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    if from_vertexai is not None and vertexai is not None and gm is not None:\n        vertexai.init(project=\"your-project-id\", location=\"us-central1\")\n        client = from_vertexai(\n            gm.GenerativeModel(\"gemini-2.5-flash\"),\n            mode=instructor.Mode.PARALLEL_TOOLS,\n        )\n        function_calls = client.create(\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n                },\n            ],\n            response_model=Iterable[Weather | GoogleSearch],\n        )\n\n        for fc in function_calls:\n            print(fc)\n            #> location='Toronto' units='metric'\n            #> location='Dallas' units='imperial'\n            #> query='who won the super bowl'\n    ```\n\n=== \"Anthropic\"\n\n    ```python hl_lines=\"20 32\"\n    import instructor\n    from typing import Iterable, Literal\n    from pydantic import BaseModel\n\n\n    class Weather(BaseModel):\n        location: str\n        units: Literal[\"imperial\", \"metric\"]\n\n\n    class GoogleSearch(BaseModel):\n        query: str\n\n\n    client = instructor.from_provider(\n        \"anthropic/claude-3-7-sonnet-latest\",\n        mode=instructor.Mode.PARALLEL_TOOLS,\n    )\n    function_calls = client.create(\n        messages=[\n            {\"role\": \"system\", \"content\": \"You must always use tools\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n            },\n        ],\n        response_model=Iterable[Weather | GoogleSearch],\n    )\n\n    for fc in function_calls:\n        print(fc)\n        #> location='Toronto' units='metric'\n    ```\n\nWe need to set the response model to `Iterable[Weather | GoogleSearch]` to indicate that the response will be a list of `Weather` and `GoogleSearch` objects.\n\nThis is necessary because the response will be a list of objects, and we need to specify the types of the objects in the list. This returns an iterable which you can then iterate over\n"
  },
  {
    "path": "docs/concepts/partial.md",
    "content": "---\ntitle: Streaming Partial Responses with Instructor and OpenAI\ndescription: Learn to utilize field-level streaming with Instructor and OpenAI for incremental responses in Python.\n---\n\n# Streaming Partial Responses\n\n!!! info \"Literal\"\n\n    If the data structure you're using has literal values, you need to make sure to import the `PartialLiteralMixin` mixin.\n\n    ```python\n    from typing import Literal\n    from pydantic import BaseModel\n    from instructor.dsl.partial import PartialLiteralMixin\n\n\n    class User(BaseModel, PartialLiteralMixin):\n        name: str\n        age: int\n        category: Literal[\"admin\", \"user\", \"guest\"]\n\n\n    # The rest of your code below\n    ```\n\n    This is because `jiter` throws an error otherwise if it encounters a incomplete Literal value while it's being streamed in\n\nField level streaming provides incremental snapshots of the current state of the response model that are immediately useable. This approach is particularly relevant in contexts like rendering UI components.\n\nInstructor supports this pattern by making use of `create_partial`. This lets us dynamically create a new class that treats all of the original model's fields as `Optional`.\n\n## Understanding Partial Responses\n\nConsider what happens whene we define a response model:\n\n```python\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n```\n\nIf we streamed json out from OpenAI, we would only be able to parse when the object is completed returned!\n\n```\n{\"name\": \"Jo\n{\"name\": \"John\", \"ag\n{\"name\": \"John\", \"age:\n{\"name\": \"John\", \"age\": 25} # Completed\n```\n\nWhen specifying a `create_partial` and setting `stream=True`, the response from `instructor` becomes a `Generator[T]`. As the generator yields results, you can iterate over these incremental updates. The last value yielded by the generator represents the completed extraction!\n\n```\n{\"name\": \"Jo                 => User(name=\"Jo\", age=None)\n{\"name\": \"John\", \"ag         => User(name=\"John\", age=None)\n{\"name\": \"John\", \"age:       => User(name=\"John\", age=None)\n{\"name\": \"John\", \"age\": 25}  => User(name=\"John\", age=25)\n```\n\n!!! warning \"Limited Validator Support\"\n\n    Due to the streaming nature of the response model, we do not support validators since they would not be able to be applied to the streaming response.\n\nLet's look at an example of streaming an extraction of conference information, that would be used to stream in an react component.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import List\nfrom rich.console import Console\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\ntext_block = \"\"\"\nIn our recent online meeting, participants from various backgrounds joined to discuss the upcoming tech conference. The names and contact details of the participants were as follows:\n\n- Name: John Doe, Email: johndoe@email.com, Twitter: @TechGuru44\n- Name: Jane Smith, Email: janesmith@email.com, Twitter: @DigitalDiva88\n- Name: Alex Johnson, Email: alexj@email.com, Twitter: @CodeMaster2023\n\nDuring the meeting, we agreed on several key points. The conference will be held on March 15th, 2024, at the Grand Tech Arena located at 4521 Innovation Drive. Dr. Emily Johnson, a renowned AI researcher, will be our keynote speaker.\n\nThe budget for the event is set at $50,000, covering venue costs, speaker fees, and promotional activities. Each participant is expected to contribute an article to the conference blog by February 20th.\n\nA follow-up meetingis scheduled for January 25th at 3 PM GMT to finalize the agenda and confirm the list of speakers.\n\"\"\"\n\n\nclass User(BaseModel):\n    name: str\n    email: str\n    twitter: str\n\n\nclass MeetingInfo(BaseModel):\n    users: List[User]\n    date: str\n    location: str\n    budget: int\n    deadline: str\n\n\nextraction_stream = client.create_partial(\n    response_model=MeetingInfo,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Get the information about the meeting and the users {text_block}\",\n        },\n    ],\n    stream=True,\n)\n\n\nconsole = Console()\n\nfor extraction in extraction_stream:\n    obj = extraction.model_dump()\n    console.clear()\n    console.print(obj)\n\nprint(extraction.model_dump_json(indent=2))\n\"\"\"\n{\n  \"users\": [\n    {\n      \"name\": \"John Doe\",\n      \"email\": \"johndoe@email.com\",\n      \"twitter\": \"@TechGuru44\"\n    },\n    {\n      \"name\": \"Jane Smith\",\n      \"email\": \"janesmith@email.com\",\n      \"twitter\": \"@DigitalDiva88\"\n    },\n    {\n      \"name\": \"Alex Johnson\",\n      \"email\": \"alexj@email.com\",\n      \"twitter\": \"@CodeMaster2023\"\n    }\n  ],\n  \"date\": \"March 15th, 2024\",\n  \"location\": \"Grand Tech Arena, 4521 Innovation Drive\",\n  \"budget\": 50000,\n  \"deadline\": \"February 20th\"\n}\n\"\"\"\n```\n\nThis will output the following:\n\n![Partial Streaming Gif](../img/partial.gif)\n\n## Asynchronous Streaming\n\nI also just want to call out in this example that `instructor` also supports asynchronous streaming. This is useful when you want to stream a response model and process the results as they come in, but you'll need to use the `async for` syntax to iterate over the results.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\n    \"openai/gpt-5-nano\",\n    async_client=True,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def print_partial_results():\n    user = client.create_partial(\n        response_model=User,\n        max_retries=2,\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Jason is 12 years old\"},\n        ],\n    )\n    async for m in user:\n        print(m)\n        #> name=None age=None\n        #> name=None age=None\n        #> name=None age=None\n        #> name='' age=None\n        #> name='Jason' age=None\n        #> name='Jason' age=None\n        #> name='Jason' age=None\n        #> name='Jason' age=None\n        #> name='Jason' age=12\n        #> name='Jason' age=12\n\n\nimport asyncio\n\nasyncio.run(print_partial_results())\n```\n\n## See Also\n\n- [Streaming Lists](./lists.md) - Stream collections of completed objects\n- [Streaming Basics](../learning/streaming/basics.md) - Introduction to streaming concepts\n- [Iterable Streaming](./iterable.md) - Stream multiple objects\n- [Raw Response](./raw_response.md) - Access original LLM responses\n"
  },
  {
    "path": "docs/concepts/patching.md",
    "content": "---\ntitle: How Instructor Patches LLM Clients\ndescription: Learn how Instructor adds structured output capabilities to LLM clients through patching.\n---\n\n# Patching\n\nPatching adds structured output features to LLM client libraries. This page explains how it works. For most users, [`from_provider`](./from_provider.md) is simpler than manual patching.\n\n!!! tip \"Recommended Approach\"\n    Use [`from_provider`](./from_provider.md) instead of manual patching. It works the same way across all providers. See the [Migration Guide](./migration.md) if you're using older patching patterns.\n\n## What is Patching?\n\nPatching adds new features to LLM client objects without changing their original code. When Instructor patches a client, it adds:\n\n- New parameters: `response_model`, `max_retries`, and `context` to completion methods\n- Validation: Checks responses against Pydantic models\n- Retry logic: Retries when validation fails\n- Compatibility: The patched client still works with all original methods\n\n## How Patching Works\n\nWhen Instructor patches a client, it:\n\n1. Wraps the completion method: Intercepts calls to `create()` or `chat.completions.create()`\n2. Converts schemas: Changes Pydantic models into provider-specific formats (JSON schema, tool definitions, etc.)\n3. Validates responses: Checks LLM outputs against your Pydantic model\n4. Handles retries: Retries with validation feedback if needed\n5. Returns typed objects: Converts validated JSON into Pydantic model instances\n\n## Patching Modes\n\nDifferent providers support different modes for structured extraction. Instructor automatically selects the best mode for each provider, but you can override it:\n\n### Tool Calling (TOOLS)\n\nUses the provider's function/tool calling API. This is the default for OpenAI.\n\nSupported by: OpenAI, Anthropic (ANTHROPIC_TOOLS), Google (GENAI_TOOLS), Ollama (for supported models)\n\n### JSON Mode\n\nInstructs the model to return JSON directly. Works with most providers.\n\nSupported by: OpenAI, Anthropic, Google, Ollama, and most providers\n\n### Markdown JSON (MD_JSON)\n\nAsks for JSON wrapped in markdown. Only use for specific providers like Databricks.\n\nSupported by: Databricks, some vision models\n\n## Default Modes by Provider\n\nEach provider uses a recommended default mode:\n\n- **OpenAI**: `Mode.TOOLS` (function calling)\n- **Anthropic**: `Mode.TOOLS` (tool use)\n- **Google**: `Mode.TOOLS` (function calling)\n- **Ollama**: `Mode.TOOLS` (if model supports it) or `Mode.JSON`\n- **Others**: Provider-specific defaults\n\nWhen using `from_provider`, these defaults are applied automatically. You can override them with the `mode` parameter.\n\n## Manual Patching (Advanced)\n\nIf you need to patch a client manually (not recommended for most users):\n\n```python\nimport openai\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass YourModel(BaseModel):\n    message: str\n\n\n# Create the base client\nopenai_client = openai.OpenAI()\n\n# Patch it manually\nclient = instructor.patch(openai_client, mode=instructor.Mode.TOOLS)\n\n# Now use it\nresponse = client.chat.completions.create(\n    response_model=YourModel,\n    messages=[{\"role\": \"user\", \"content\": \"Say hello\"}],\n)\n```\n\nHowever, using `from_provider` is simpler and recommended:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Simpler approach\nclass YourModel(BaseModel):\n    message: str\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n_response = client.create(\n    response_model=YourModel,\n    messages=[{\"role\": \"user\", \"content\": \"Say hello\"}],\n)\n```\n\n## What Gets Patched?\n\nInstructor adds these features to patched clients:\n\n### New Parameters\n\n- `response_model`: A Pydantic model or type that defines the expected output structure\n- `max_retries`: Number of retry attempts if validation fails (default: 0)\n- `context`: Additional context for validation hooks\n\n### Enhanced Methods\n\nThe patched client's `create()` method:\n- Accepts `response_model` parameter\n- Validates responses automatically\n- Retries on validation failures\n- Returns typed Pydantic objects instead of raw responses\n\n## Provider-Specific Considerations\n\n### OpenAI\n\n- Default mode: `TOOLS` (function calling)\n- Supports streaming with structured outputs\n\n### Anthropic\n\n- Default mode: `ANTHROPIC_TOOLS` (tool use)\n- Uses Claude's native tool calling API\n\n### Google Gemini\n\n- Default mode: `GENAI_TOOLS` (function calling)\n- Requires `jsonref` package for tool calling\n- Some limitations with strict validation and enums\n\n### Ollama (Local Models)\n\n- Default mode: `TOOLS` (if model supports it) or `JSON`\n- Models like llama3.1, llama3.2, mistral-nemo support tools\n- Older models fall back to JSON mode\n\n## When to Use Manual Patching\n\nManual patching is rarely needed. Use it only if:\n\n1. You need fine-grained control over the patching process\n2. You're working with a custom client implementation\n3. You're debugging patching behavior\n\nFor 99% of use cases, `from_provider` is the better choice.\n\n## Related Documentation\n\n- [from_provider Guide](./from_provider.md) - Recommended way to create patched clients\n- [Migration Guide](./migration.md) - Migrating from manual patching to from_provider\n- [Modes Comparison](../modes-comparison.md) - Detailed comparison of different modes\n- [Integrations](../integrations/index.md) - Provider-specific documentation\n"
  },
  {
    "path": "docs/concepts/philosophy.md",
    "content": "---\ntitle: Philosophy\ndescription: The principles behind Instructor - why simple beats complex every time.\n---\n\n# Philosophy\n\nGreat tools make hard things easy without making easy things hard. That's Instructor.\n\n## Start with what developers know\n\nMost AI frameworks invent their own abstractions. We don't.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\n# What you already know (Pydantic)\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# What Instructor adds\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n_user = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n)  # That's it\n```\n\nIf you know Pydantic, you know Instructor. No new concepts, no new syntax, no 200-page manual.\n\n## Your escape hatch is always there\n\nThe worst frameworks are roach motels - easy to get in, impossible to get out. Instructor is different:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# With Instructor\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n_result = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n)\n\n# Want to go back to raw API? Just remove response_model:\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n_result = client.create(messages=[{\"role\": \"user\", \"content\": \"Say hello\"}])\n\n# Or use the provider directly:\nfrom openai import OpenAI\n\n_raw_client = OpenAI()  # Back to vanilla\n```\n\nWe patch, we don't wrap. Your code, your control.\n\n## Show, don't hide\n\nBad frameworks hide complexity. Good tools help you understand it.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# See exactly what Instructor sends\ninstructor.logfire.configure()  # Full observability\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\nresult = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n)\n\n# Access raw responses\n_raw_response = result._raw_response  # See what the LLM actually returned\n```\n\nWhen something goes wrong (and it will), you can see exactly what happened.\n\n## Composition beats configuration\n\nNo YAML files. No decorators. No magic. Just functions.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass Company(BaseModel):\n    name: str\n    industry: str\n\n\nclass Analysis(BaseModel):\n    user: User\n    company: Company\n\n\n# Build complex systems with simple functions\ndef extract_user(text: str) -> User:\n    return client.create(\n        response_model=User, messages=[{\"role\": \"user\", \"content\": text}]\n    )\n\n\ndef extract_company(text: str) -> Company:\n    return client.create(\n        response_model=Company, messages=[{\"role\": \"user\", \"content\": text}]\n    )\n\n\ndef analyze_email(email: str) -> Analysis:\n    user = extract_user(email)\n    company = extract_company(email)\n    return Analysis(user=user, company=company)\n\n\n# Compose however makes sense for YOUR application\n_analysis = analyze_email(\"Please introduce Jane from Acme.\")\n```\n\n## Start simple, grow naturally\n\nThe best code is code that grows with your needs:\n\n```python\nimport instructor\nfrom instructor import Partial\nfrom pydantic import BaseModel, field_validator\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Day 1: Just get it working\n_user = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n)\n\n\n# Day 7: Add validation\nclass User(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"age\")\n    def check_age(cls, value: int) -> int:\n        if value < 0 or value > 150:\n            raise ValueError(\"Invalid age\")\n        return value\n\n\n# Day 14: Add retries for production\n_user = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n    max_retries=3,\n)\n\n\n# Day 30: Add streaming for better UX\ndef update_ui(_partial: Partial[User]) -> None:\n    pass\n\n\nfor partial in client.create(\n    response_model=Partial[User],\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n    stream=True,\n):\n    update_ui(partial)\n```\n\nEach addition is one line. No refactoring. No migration guide.\n\n## What we intentionally DON'T do\n\n### No prompt engineering\n\nWe don't write prompts for you. You know your domain better than we do.\n\n```python\n# We DON'T do this:\n# @instructor.prompt(\"Extract the user information carefully\")\n# def extract_user(text: str):\n#     ...\n\n\n# You write your own prompts:\ntext = \"Jane is 33\"\n_messages = [\n    {\"role\": \"system\", \"content\": \"You are a precise data extractor\"},\n    {\"role\": \"user\", \"content\": f\"Extract user from: {text}\"},\n]\n```\n\n### No new abstractions\n\nWe don't invent concepts like \"Agents\", \"Chains\", or \"Tools\". Those are your domain concepts.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# We DON'T do this:\n# class UserExtractionAgent(instructor.Agent):\n#     tools = [instructor.WebSearch(), instructor.Calculator()]\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\ndef search_web(query: str) -> str:\n    return f\"Results for {query}\"\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\n# You build what makes sense:\ndef extract_user_with_search(query: str) -> User:\n    # Your logic, your way\n    search_results = search_web(query)\n    return client.create(\n        response_model=User, messages=[{\"role\": \"user\", \"content\": search_results}]\n    )\n\n\n_user = extract_user_with_search(\"Find Jane\")\n```\n\n### No framework lock-in\n\nYour code should work with or without us:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\n# This is just a Pydantic model\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# This is just a function\ndef process_user(user: User) -> dict:\n    return {\"name\": user.name.upper(), \"adult\": user.age >= 18}\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n# Instructor just connects them to LLMs\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"Jane is 33\"}],\n)\n\n_result = process_user(user)  # Works with or without Instructor\n```\n\n## The result\n\nBy following these principles, we get:\n\n- **Tiny API surface**: Learn it in minutes, not days\n- **Zero vendor lock-in**: Switch providers or remove Instructor anytime\n- **Debuggable**: When things break, you can see why\n- **Composable**: Build complex systems from simple parts\n- **Pythonic**: If it feels natural in Python, it feels natural in Instructor\n\n## In practice\n\nHere's what building with Instructor actually looks like:\n\n```python\nfrom enum import Enum\nfrom typing import List\n\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Your domain models (not ours)\nclass Priority(str, Enum):\n    HIGH = \"high\"\n    MEDIUM = \"medium\"\n    LOW = \"low\"\n\n\nclass Ticket(BaseModel):\n    title: str\n    description: str\n    priority: Priority\n    estimated_hours: float\n\n\n# Your business logic (not ours)\ndef prioritize_tickets(tickets: List[Ticket]) -> List[Ticket]:\n    return sorted(tickets, key=lambda t: (t.priority.value, -t.estimated_hours))\n\n\n# Connect to LLM (one line)\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n# Extract structured data (simple function call)\ntickets = client.create(\n    response_model=List[Ticket],\n    messages=[{\"role\": \"user\", \"content\": \"Parse these support tickets: ...\"}],\n)\n\n# Use your business logic\n_prioritized = prioritize_tickets(tickets)\n```\n\nNo framework. No abstractions. Just Python.\n\n## The philosophy in one sentence\n\n**Make structured LLM outputs as easy as defining a Pydantic model.**\n\nEverything else follows from that.\n"
  },
  {
    "path": "docs/concepts/prompt_caching.md",
    "content": "---\ntitle: Understanding Prompt Caching for API Efficiency\ndescription: Explore how prompt caching optimizes performance for API calls in OpenAI and Anthropic, enhancing efficiency and reducing costs.\n---\n\n## See Also\n\n- [Caching](./caching.md) - General caching concepts\n- [Cost Optimization](../examples/batch_job_oai.md) - Reduce API costs\n- [Performance Optimization](../examples/sqlmodel.md#performance-optimization) - Performance best practices\n- [Anthropic Integration](../integrations/anthropic.md) - Anthropic prompt caching support\n\n# Prompt Caching\n\nPrompt Caching is a feature that allows you to cache portions of your prompt, optimizing performance for multiple API calls with shared context. This helps to reduce cost and improve response times.\n\n## Prompt Caching in OpenAI\n\nOpenAI implements a prompt caching mechanism to optimize performance for API requests with similar prompts.\n\n> Prompt Caching works automatically on all your API requests (no code changes required) and has no additional fees associated with it.\n\nThis optimization is especially useful for applications making multiple API calls with shared context, minimizing redundant processing and improving overall performance.\n\nPrompt Caching is enabled for the following models:\n\n- gpt-4o\n- gpt-4.1-mini\n- o1-preview\n- o1-mini\n\nCaching is based on prefix matching, so if you're using a system prompt that contains a common set of instructions, you're likely to see a cache hit as long as you move all variable parts of the prompt to the end of the message when possible.\n\n## Prompt Caching in Anthropic\n\nPrompt Caching is now generally avaliable for Anthropic. This enables you to cache specific prompt portions, reuse cached content in subsequent calls, and reduce processed data per request.\n\n??? note \"Source Text\"\n\n    In the following example, we'll be using a short excerpt from the novel \"Pride and Prejudice\" by Jane Austen. This text serves as an example of a substantial context that might typically lead to slow response times and high costs when working with language models. You can download it manually [here](https://www.gutenberg.org/cache/epub/1342/pg1342.txt)\n\n    ```\n        _Walt Whitman has somewhere a fine and just distinction between \"loving\n    by allowance\" and \"loving with personal love.\" This distinction applies\n    to books as well as to men and women; and in the case of the not very\n    numerous authors who are the objects of the personal affection, it\n    brings a curious consequence with it. There is much more difference as\n    to their best work than in the case of those others who are loved \"by\n    allowance\" by convention, and because it is felt to be the right and\n    proper thing to love them. And in the sect--fairly large and yet\n    unusually choice--of Austenians or Janites, there would probably be\n    found partisans of the claim to primacy of almost every one of the\n    novels. To some the delightful freshness and humour of_ Northanger\n    Abbey, _its completeness, finish, and_ entrain, _obscure the undoubted\n    critical facts that its scale is small, and its scheme, after all, that\n    of burlesque or parody, a kind in which the first rank is reached with\n    difficulty._ Persuasion, _relatively faint in tone, and not enthralling\n    in interest, has devotees who exalt above all the others its exquisite\n    delicacy and keeping. The catastrophe of_ Mansfield Park _is admittedly\n    theatrical, the hero and heroine are insipid, and the author has almost\n    wickedly destroyed all romantic interest by expressly admitting that\n    Edmund only took Fanny because Mary shocked him, and that Fanny might\n    very likely have taken Crawford if he had been a little more assiduous;\n    yet the matchless rehearsal-scenes and the characters of Mrs. Norris and\n    others have secured, I believe, a considerable party for it._ Sense and\n    Sensibility _has perhaps the fewest out-and-out admirers; but it does\n    not want them._\n    _I suppose, however, that the majority of at least competent votes\n    would, all things considered, be divided between_ Emma _and the present\n    book; and perhaps the vulgar verdict (if indeed a fondness for Miss\n    Austen be not of itself a patent of exemption from any possible charge\n    of vulgarity) would go for_ Emma. _It is the larger, the more varied, the\n    more popular; the author had by the time of its composition seen rather\n    more of the world, and had improved her general, though not her most\n    peculiar and characteristic dialogue; such figures as Miss Bates, as the\n    Eltons, cannot but unite the suffrages of everybody. On the other hand,\n    I, for my part, declare for_ Pride and Prejudice _unhesitatingly. It\n    seems to me the most perfect, the most characteristic, the most\n    eminently quintessential of its author's works; and for this contention\n    in such narrow space as is permitted to me, I propose here to show\n    cause._\n    _In the first place, the book (it may be barely necessary to remind the\n    reader) was in its first shape written very early, somewhere about 1796,\n    when Miss Austen was barely twenty-one; though it was revised and\n    finished at Chawton some fifteen years later, and was not published till\n    1813, only four years before her death. I do not know whether, in this\n    combination of the fresh and vigorous projection of youth, and the\n    critical revision of middle life, there may be traced the distinct\n    superiority in point of construction, which, as it seems to me, it\n    possesses over all the others. The plot, though not elaborate, is almost\n    regular enough for Fielding; hardly a character, hardly an incident\n    could be retrenched without loss to the story. The elopement of Lydia\n    and Wickham is not, like that of Crawford and Mrs. Rushworth, a_ coup de\n    théâtre; _it connects itself in the strictest way with the course of the\n    story earlier, and brings about the denouement with complete propriety.\n    All the minor passages--the loves of Jane and Bingley, the advent of Mr.\n    Collins, the visit to Hunsford, the Derbyshire tour--fit in after the\n    same unostentatious, but masterly fashion. There is no attempt at the\n    hide-and-seek, in-and-out business, which in the transactions between\n    Frank Churchill and Jane Fairfax contributes no doubt a good deal to the\n    intrigue of_ Emma, _but contributes it in a fashion which I do not think\n    the best feature of that otherwise admirable book. Although Miss Austen\n    always liked something of the misunderstanding kind, which afforded her\n    opportunities for the display of the peculiar and incomparable talent to\n    be noticed presently, she has been satisfied here with the perfectly\n    natural occasions provided by the false account of Darcy's conduct given\n    by Wickham, and by the awkwardness (arising with equal naturalness) from\n    the gradual transformation of Elizabeth's own feelings from positive\n    aversion to actual love. I do not know whether the all-grasping hand of\n    the playwright has ever been laid upon_ Pride and Prejudice; _and I dare\n    say that, if it were, the situations would prove not startling or\n    garish enough for the footlights, the character-scheme too subtle and\n    delicate for pit and gallery. But if the attempt were made, it would\n    certainly not be hampered by any of those loosenesses of construction,\n    which, sometimes disguised by the conveniences of which the novelist can\n    avail himself, appear at once on the stage._\n    _I think, however, though the thought will doubtless seem heretical to\n    more than one school of critics, that construction is not the highest\n    merit, the choicest gift, of the novelist. It sets off his other gifts\n    and graces most advantageously to the critical eye; and the want of it\n    will sometimes mar those graces--appreciably, though not quite\n    consciously--to eyes by no means ultra-critical. But a very badly-built\n    novel which excelled in pathetic or humorous character, or which\n    displayed consummate command of dialogue--perhaps the rarest of all\n    faculties--would be an infinitely better thing than a faultless plot\n    acted and told by puppets with pebbles in their mouths. And despite the\n    ability which Miss Austen has shown in working out the story, I for one\n    should put_ Pride and Prejudice _far lower if it did not contain what\n    seem to me the very masterpieces of Miss Austen's humour and of her\n    faculty of character-creation--masterpieces who may indeed admit John\n    Thorpe, the Eltons, Mrs. Norris, and one or two others to their company,\n    but who, in one instance certainly, and perhaps in others, are still\n    superior to them._\n    _The characteristics of Miss Austen's humour are so subtle and delicate\n    that they are, perhaps, at all times easier to apprehend than to\n    express, and at any particular time likely to be differently\n    apprehended by different persons. To me this humour seems to possess a\n    greater affinity, on the whole, to that of Addison than to any other of\n    the numerous species of this great British genus. The differences of\n    scheme, of time, of subject, of literary convention, are, of course,\n    obvious enough; the difference of sex does not, perhaps, count for much,\n    for there was a distinctly feminine element in \"Mr. Spectator,\" and in\n    Jane Austen's genius there was, though nothing mannish, much that was\n    masculine. But the likeness of quality consists in a great number of\n    common subdivisions of quality--demureness, extreme minuteness of touch,\n    avoidance of loud tones and glaring effects. Also there is in both a\n    certain not inhuman or unamiable cruelty. It is the custom with those\n    who judge grossly to contrast the good nature of Addison with the\n    savagery of Swift, the mildness of Miss Austen with the boisterousness\n    of Fielding and Smollett, even with the ferocious practical jokes that\n    her immediate predecessor, Miss Burney, allowed without very much\n    protest. Yet, both in Mr. Addison and in Miss Austen there is, though a\n    restrained and well-mannered, an insatiable and ruthless delight in\n    roasting and cutting up a fool. A man in the early eighteenth century,\n    of course, could push this taste further than a lady in the early\n    nineteenth; and no doubt Miss Austen's principles, as well as her heart,\n    would have shrunk from such things as the letter from the unfortunate\n    husband in the_ Spectator, _who describes, with all the gusto and all the\n    innocence in the world, how his wife and his friend induce him to play\n    at blind-man's-buff. But another_ Spectator _letter--that of the damsel\n    of fourteen who wishes to marry Mr. Shapely, and assures her selected\n    Mentor that \"he admires your_ Spectators _mightily\"--might have been\n    written by a rather more ladylike and intelligent Lydia Bennet in the\n    days of Lydia's great-grandmother; while, on the other hand, some (I\n    think unreasonably) have found \"cynicism\" in touches of Miss Austen's\n    own, such as her satire of Mrs. Musgrove's self-deceiving regrets over\n    her son. But this word \"cynical\" is one of the most misused in the\n    English language, especially when, by a glaring and gratuitous\n    falsification of its original sense, it is applied, not to rough and\n    snarling invective, but to gentle and oblique satire. If cynicism means\n    the perception of \"the other side,\" the sense of \"the accepted hells\n    beneath,\" the consciousness that motives are nearly always mixed, and\n    that to seem is not identical with to be--if this be cynicism, then\n    every man and woman who is not a fool, who does not care to live in a\n    fool's paradise, who has knowledge of nature and the world and life, is\n    a cynic. And in that sense Miss Austen certainly was one. She may even\n    have been one in the further sense that, like her own Mr. Bennet, she\n    took an epicurean delight in dissecting, in displaying, in setting at\n    work her fools and her mean persons. I think she did take this delight,\n    and I do not think at all the worse of her for it as a woman, while she\n    was immensely the better for it as an artist.\n    ```\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Character(BaseModel):\n    name: str\n    description: str\n\n\n# Note: For testing this example locally, create a book.txt file with content like:\n# Sample book.txt content:\n# \"Pride and Prejudice by Jane Austen\n#\n# It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.\n# However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is\n# so well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or\n# other of their daughters...\"\nbook = \"\"\"\nPride and Prejudice by Jane Austen\n\nIt is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.\nHowever little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is\nso well fixed in the minds of the surrounding families, that he is considered the rightful property of some one or\nother of their daughters...\n\"\"\"\n\n# Uncomment to read from an actual file instead of using the sample text above\n# with open(\"./book.txt\") as f:\n#     book = f.read()\n\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet-20240620\")\n\nresp, completion = client.create_with_completion(\n        model=\"claude-3-5-sonnet-20240620\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"<book>\" + book + \"</book>\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},  # (1)!\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract a character from the text given above\",\n                    },\n                ],\n            },\n        ],\n        response_model=Character,\n        max_tokens=1000,\n    )\n\nprint(completion)\n# Message(\n#     id='msg_01QcqjktYc1PXL8nk7y5hkMV',\n#     content=[\n#         ToolUseBlock(\n#             id='toolu_019wABRzQxtSbXeuuRwvJo15',\n#             input={\n#                 'name': 'Jane Austen',\n#                 'description': 'A renowned English novelist of the early 19th century, known for her wit, humor, and keen observations of human nature. She is the author of\n# several classic novels including \"Pride and Prejudice,\" \"Emma,\" \"Sense and Sensibility,\" and \"Mansfield Park.\" Austen\\'s writing is characterized by its subtlety, delicate touch,\n# and ability to create memorable characters. Her work often involves social commentary and explores themes of love, marriage, and societal expectations in Regency-era England.'\n#             },\n#             name='Character',\n#             type='tool_use'\n#         )\n#     ],\n#     model='claude-3-5-sonnet-20240620',\n#     role='assistant',\n#     stop_reason='tool_use',\n#     stop_sequence=None,\n#     type='message',\n#     usage=Usage(cache_creation_input_tokens=2777, cache_read_input_tokens=0, input_tokens=30, output_tokens=161)\n# )\n```\n\n1. Anthropic requires that you explicitly pass in the `cache_control` parameter to indicate that you want to cache the content.\n\n!!! Warning \"Caching Considerations\"\n\n    **Minimum cache size**: For Claude Haiku, your cached content needs to be a minimum of 2048 tokens. For Claude Sonnet, the minimum is 1024 tokens.\n\n**Benefits**: The cost of reading from the cache is 10x lower than if we were to process the same message again and enables us to execute our queries significantly faster.\n\nWe've written a more detailed blog on how to use the `create_with_completion` method [here](../blog/posts/anthropic-prompt-caching.md) to validate you're getting a cache hit with instructor.\n"
  },
  {
    "path": "docs/concepts/prompting.md",
    "content": "---\ntitle: Prompt Engineering Best Practices\ndescription: Learn prompt engineering tips for using Pydantic and Instructor effectively.\n---\n\n# General Tips for Prompt Engineering\n\nWhen using Instructor and Pydantic, make your models self-descriptive, modular, and flexible while keeping data integrity.\n\n- Modularity: Design self-contained components for reuse\n- Self-description: Use Pydantic's `Field` for clear field descriptions\n- Optionality: Use Python's `Optional` type for nullable fields and set defaults\n- Standardization: Use enumerations for fields with fixed values; include a fallback option\n- Dynamic data: Use key-value pairs for arbitrary properties and limit list lengths\n- Entity relationships: Define explicit identifiers and relationship fields\n- Contextual logic: Optionally add a \"chain of thought\" field in reusable components for extra context\n\n## Modular Chain of Thought {#chain-of-thought}\n\nUse chain of thought to improve data quality. You can add it to specific components rather than making it global.\n\n```python hl_lines=\"4 5\"\nfrom pydantic import BaseModel, Field\n\n\nclass Role(BaseModel):\n    chain_of_thought: str = Field(\n        ..., description=\"Think step by step to determine the correct title\"\n    )\n    title: str\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Role\n```\n\n## Utilize Optional Attributes\n\nUse Python's Optional type and set a default value to prevent undesired defaults like empty strings.\n\n```python hl_lines=\"6\"\nfrom typing import Optional\nfrom pydantic import BaseModel, Field\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Optional[str] = Field(default=None)\n```\n\n## Handling Errors Within Function Calls\n\nCreate a wrapper class to hold either the result of an operation or an error message. This lets you stay within a function call even if an error occurs, improving error handling without breaking the code flow.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Optional[str] = Field(default=None)\n\n\nclass MaybeUser(BaseModel):\n    result: Optional[UserDetail] = Field(default=None)\n    error: bool = Field(default=False)\n    message: Optional[str]\n\n    def __bool__(self):\n        return self.result is not None\n```\n\nWith the `MaybeUser` class, you can either receive a `UserDetail` object in result or get an error message in message.\n\n### Simplification with the Maybe Pattern\n\nSimplify this using Instructor to create the `Maybe` pattern dynamically from any `BaseModel`.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n\n\nMaybeUser = instructor.Maybe(UserDetail)\n```\n\nThis lets you quickly create a Maybe type for any class.\n\n## Tips for Enumerations\n\nUse Enums for standardized fields to prevent data misalignment. Always include an \"Other\" option as a fallback so the model can signal uncertainty.\n\n```python hl_lines=\"7 12\"\nfrom enum import Enum, auto\nfrom pydantic import BaseModel, Field\n\n\nclass Role(Enum):\n    PRINCIPAL = auto()\n    TEACHER = auto()\n    STUDENT = auto()\n    OTHER = auto()\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Role = Field(\n        description=\"Correctly assign one of the predefined roles to the user.\"\n    )\n```\n\n## Literals {#literals}\n\nIf you're having a hard time with `Enum` an alternative is to use `Literal`\n\n```python hl_lines=\"4\"\nfrom typing import Literal\nfrom pydantic import BaseModel\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Literal[\"PRINCIPAL\", \"TEACHER\", \"STUDENT\", \"OTHER\"]\n```\n\nIf you'd like to improve performance more you can reiterate the requirements in the field descriptions or in the docstrings.\n\n## Reiterate Long Instructions\n\nFor complex attributes, repeat the instructions in the field's description.\n\n```python hl_lines=\"5 11\"\nfrom pydantic import BaseModel, Field\n\n\nclass Role(BaseModel):\n    \"\"\"\n    Extract the role based on the following rules ...\n    \"\"\"\n\n    instructions: str = Field(\n        ...,\n        description=\"Restate the instructions and rules to correctly determine the title.\",\n    )\n    title: str\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Role\n```\n\n## Handle Arbitrary Properties\n\nWhen you need to extract undefined attributes, use a list of key-value pairs.\n\n```python hl_lines=\"10\"\nfrom typing import List\nfrom pydantic import BaseModel, Field\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    properties: List[Property] = Field(\n        ..., description=\"Extract any other properties that might be relevant.\"\n    )\n```\n\n## Limiting the Length of Lists\n\nWhen dealing with lists of attributes, especially arbitrary properties, manage the length. Use prompting and enumeration to limit the list length and keep a manageable set of properties.\n\n```python hl_lines=\"2 9\"\nfrom typing import List\nfrom pydantic import BaseModel, Field\n\n\nclass Property(BaseModel):\n    index: str = Field(..., description=\"Monotonically increasing ID\")\n    key: str\n    value: str\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    properties: List[Property] = Field(\n        ...,\n        description=\"Numbered list of arbitrary extracted properties, should be less than 6\",\n    )\n```\n\n### Using Tuples for Simple Types\n\nFor simple types, tuples can be a more compact alternative to custom classes, especially when the properties don't require additional descriptions.\n\n```python hl_lines=\"4\"\nfrom typing import List, Tuple\nfrom pydantic import BaseModel, Field\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    properties: List[Tuple[int, str]] = Field(\n        ...,\n        description=\"Numbered list of arbitrary extracted properties, should be less than 6\",\n    )\n```\n\n## Advanced Arbitrary Properties\n\nFor multiple users, use consistent key names when extracting properties.\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel\n\n\nclass UserDetail(BaseModel):\n    id: int\n    age: int\n    name: str\n\n\nclass UserDetails(BaseModel):\n    \"\"\"\n    Extract information for multiple users.\n    Use consistent key names for properties across users.\n    \"\"\"\n\n    users: List[UserDetail]\n```\n\nThis refined guide should offer a cleaner and more organized approach to structure engineering in Python.\n\n## Defining Relationships Between Entities\n\nWhen relationships exist between entities, define them explicitly in the model. The following example shows how to define relationships between users by adding an id and a friends field:\n\n```python hl_lines=\"2 5 8\"\nfrom typing import List\nfrom pydantic import BaseModel, Field\n\n\nclass UserDetail(BaseModel):\n    id: int = Field(..., description=\"Unique identifier for each user.\")\n    age: int\n    name: str\n    friends: List[int] = Field(\n        ...,\n        description=\"Correct and complete list of friend IDs, representing relationships between users.\",\n    )\n\n\nclass UserRelationships(BaseModel):\n    users: List[UserDetail] = Field(\n        ...,\n        description=\"Collection of users, correctly capturing the relationships among them.\",\n    )\n```\n\n## Reusing Components with Different Contexts\n\nYou can reuse the same component for different contexts within a model. In this example, the TimeRange component is used for both work_time and leisure_time.\n\n```python hl_lines=\"9 10\"\nfrom pydantic import BaseModel, Field\n\n\nclass TimeRange(BaseModel):\n    start_time: int = Field(..., description=\"The start time in hours.\")\n    end_time: int = Field(..., description=\"The end time in hours.\")\n\n\nclass UserDetail(BaseModel):\n    id: int = Field(..., description=\"Unique identifier for each user.\")\n    age: int\n    name: str\n    work_time: TimeRange = Field(\n        ..., description=\"Time range during which the user is working.\"\n    )\n    leisure_time: TimeRange = Field(\n        ..., description=\"Time range reserved for leisure activities.\"\n    )\n```\n\nSometimes, a component like TimeRange may need context or additional logic to work well. Adding a \"chain of thought\" field within the component can help understand or optimize the time range allocations.\n\n```python hl_lines=\"2\"\nfrom pydantic import BaseModel, Field\n\n\nclass TimeRange(BaseModel):\n    chain_of_thought: str = Field(\n        ..., description=\"Step by step reasoning to get the correct time range\"\n    )\n    start_time: int = Field(..., description=\"The start time in hours.\")\n    end_time: int = Field(..., description=\"The end time in hours.\")\n```\n"
  },
  {
    "path": "docs/concepts/raw_response.md",
    "content": "---\ntitle: Creating a Model with OpenAI Completions\ndescription: Learn how to create a custom model using OpenAI's API to extract user data efficiently with Python.\n---\n\n\n# Creating a model with completions\n\nIn instructor>1.0.0 we have a custom client, if you wish to use the raw response you can do the following\n\n```python\nimport instructor\n\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser, completion = client.create_with_completion(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nprint(user)\n#> name='jason' age=25\n\nprint(completion)\n\"\"\"\nChatCompletion(\n    id='chatcmpl-D1KqvmcGn5zeYfqRdquwERAH0wIVB',\n    choices=[\n        Choice(\n            finish_reason='stop',\n            index=0,\n            logprobs=None,\n            message=ChatCompletionMessage(\n                content=None,\n                refusal=None,\n                role='assistant',\n                annotations=[],\n                audio=None,\n                function_call=None,\n                tool_calls=[\n                    ChatCompletionMessageFunctionToolCall(\n                        id='call_8VastKJ2gYWNrYEQmBXGWnRv',\n                        function=Function(\n                            arguments='{\"name\":\"jason\",\"age\":25}', name='UserExtract'\n                        ),\n                        type='function',\n                    )\n                ],\n            ),\n        )\n    ],\n    created=1769210857,\n    model='gpt-4.1-mini-2025-04-14',\n    object='chat.completion',\n    service_tier='default',\n    system_fingerprint='fp_376a7ccef1',\n    usage=CompletionUsage(\n        completion_tokens=10,\n        prompt_tokens=79,\n        total_tokens=89,\n        completion_tokens_details=CompletionTokensDetails(\n            accepted_prediction_tokens=None,\n            audio_tokens=0,\n            reasoning_tokens=0,\n            rejected_prediction_tokens=None,\n        ),\n        prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0),\n    ),\n)\n\"\"\"\n```\n\n## Raw response with a list response model\n\nIf your response model is a list (for example, `list[UserExtract]`), you can still use `create_with_completion()`. Instructor wraps the list in a `ResponseList` (also called `ListResponse`) that behaves like a normal list but also preserves the raw response.\n\n### What is ResponseList?\n\n`ResponseList` is a special list type that Instructor uses when your `response_model` is a list. It extends Python's built-in `list` type and adds a `_raw_response` attribute to store the provider's raw response object.\n\nThis is necessary because `create_with_completion()` needs to return both the parsed result and the raw response. For single objects, this is straightforward: `(model_instance, raw_response)`. For lists, we need a way to attach the raw response to the list itself, which is what `ResponseList` does.\n\n### Using ResponseList\n\nThe returned value behaves exactly like a normal Python list, but you can access the raw response using `get_raw_response()`:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nusers, completion = client.create_with_completion(\n    response_model=list[UserExtract],\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract users: Jason is 25, Ivan is 30\"},\n    ],\n)\n\n# Use it like a normal list\nprint(users[0])\n#> name='Jason' age=25\nprint(len(users))\n#> 2\n\n# Access the raw response\nraw = users.get_raw_response()\nassert raw == completion\n\n# ResponseList supports all list operations\nfor user in users:\n    print(user.name)\n#> Jason\n#> Ivan\n```\n\n## See Also\n\n- [Hooks](./hooks.md) - Monitor LLM interactions without accessing raw responses\n- [Debugging](../debugging.md) - Debugging techniques for LLM outputs\n- [Response Models](./models.md) - Working with structured response models\n\n## Anthropic Raw Response\n\nYou can also access the raw response from Anthropic models. This is useful for debugging or when you need to access additional information from the response.\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet-latest\")\n\n\nuser, completion = client.create_with_completion(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nprint(user)\n#> name='Jason' age=25\n\nprint(completion)\n\"\"\""
  },
  {
    "path": "docs/concepts/reask_validation.md",
    "content": "---\ntitle: Enhancing AI Validations with Pydantic's Framework\ndescription: Learn how to improve AI outputs using Pydantic for validation and reasking techniques.\n---\n\n# Validation and Reasking\n\nInstead of framing \"self-critique\" or \"self-reflection\" in AI as new concepts, we can view them as validation errors with clear error messages that the system can use to self-correct.\n\n## Pydantic\n\nPydantic offers a customizable and expressive validation framework for Python. Instructor leverages Pydantic's validation framework to provide a uniform developer experience for both code-based and LLM-based validation, as well as a reasking mechanism for correcting LLM outputs based on validation errors. To learn more check out the [Pydantic docs](https://docs.pydantic.dev/latest/concepts/validators/) on validators.\n\n!!! note \"Good llm validation is just good validation\"\n\n    If you want to see some more examples on validators checkout our blog post [Good LLM validation is just good validation](https://python.useinstructor.com/blog/2023/10/23/good-llm-validation-is-just-good-validation/)\n\n### Code-based Validation Example\n\nFirst define a Pydantic model with a validator using the `Annotation` class from `typing_extensions`.\n\nEnforce a naming rule using Pydantic's built-in validation:\n\n```python hl_lines=\"5-8 12\"\nfrom pydantic import BaseModel, ValidationError\nfrom typing_extensions import Annotated\nfrom pydantic import AfterValidator\n\n\ndef name_must_contain_space(v: str) -> str:\n    if \" \" not in v:\n        raise ValueError(\"Name must contain a space.\")\n    return v.lower()\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: Annotated[str, AfterValidator(name_must_contain_space)]\n\n\ntry:\n    person = UserDetail(age=29, name=\"Jason\")\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserDetail\n    name\n      Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.11/v/value_error\n    \"\"\"\n```\n\n#### Output for Code-Based Validation\n\n```plaintext\n1 validation error for UserDetail\nname\n   Value error, name must contain a space (type=value_error)\n```\n\nAs we can see, Pydantic raises a validation error when the name attribute does not contain a space. This is a simple example, but it demonstrates how Pydantic can be used to validate attributes of a model.\n\n### LLM-Based Validation Example\n\nLLM-based validation can also be plugged into the same Pydantic model. Here, if the answer attribute contains content that violates the rule \"don't say objectionable things,\" Pydantic will raise a validation error.\n\n```python hl_lines=\"9 15\"\nimport instructor\nfrom instructor import llm_validator\nfrom pydantic import BaseModel, ValidationError, BeforeValidator\nfrom typing_extensions import Annotated\n\n\n# Apply the patch to the OpenAI client\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass QuestionAnswer(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(llm_validator(\"don't say objectionable things\", client=client)),\n    ]\n\n\ntry:\n    qa = QuestionAnswer(\n        question=\"What is the meaning of life?\",\n        answer=\"The meaning of life is to be evil and steal\",\n    )\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for QuestionAnswer\n    answer\n      Assertion failed, The statement promotes objectionable behavior by encouraging evil and stealing. [type=assertion_error, input_value='The meaning of life is to be evil and steal', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.11/v/assertion_error\n    \"\"\"\n```\n\n#### Output for LLM-Based Validation\n\nIt is important to note here that the error message is generated by the LLM, not the code, so it'll be helpful for re-asking the model.\n\n```plaintext\n1 validation error for QuestionAnswer\nanswer\n   Assertion failed, The statement is objectionable. (type=assertion_error)\n```\n\n## Using Reasking Logic to Correct Outputs\n\nValidators are a great tool for ensuring some property of the outputs. When you use the `patch()` method with the `openai` client, you can use the `max_retries` parameter to set the number of times you can reask the model to correct the output.\n\nIt is a great layer of defense against bad outputs of two forms:\n\n1. Pydantic Validation Errors (code or llm based)\n2. JSON Decoding Errors (when the model returns a bad response)\n\n### Step 1: Define the Response Model with Validators\n\nNotice that the field validator wants the name in uppercase, but the user input is lowercase. The validator will raise a `ValueError` if the name is not in uppercase.\n\n```python hl_lines=\"12-17\"\nimport instructor\nfrom pydantic import BaseModel, field_validator\n\n# Apply the patch to the OpenAI client\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"name\")\n    @classmethod\n    def validate_name(cls, v):\n        if v.upper() != v:\n            raise ValueError(\"Name must be in uppercase.\")\n        return v\n```\n\n### Step 2. Using the Client with Retries\n\nHere, the `UserDetails` model is passed as the `response_model`, and `max_retries` is set to 2.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\",\n    mode=instructor.Mode.TOOLS,\n)\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\nmodel = client.create(\n    response_model=UserDetails,\n    max_retries=2,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nprint(model.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"jason\",\n  \"age\": 25\n}\n\"\"\"\n```\n\n### What happens behind the scenes?\n\nBehind the scenes, the `instructor.from_provider()` method adds a `max_retries` parameter to the `openai.ChatCompletion.create()` method. The `max_retries` parameter will trigger up to 2 reattempts if the `name` attribute fails the uppercase validation in `UserDetails`.\n\n```python\nfrom pydantic import ValidationError\n\n\ntry:\n    ...\nexcept ValidationError as e:\n    kwargs[\"messages\"].append(response.choices[0].message)\n    kwargs[\"messages\"].append(\n        {\n            \"role\": \"user\",\n            \"content\": f\"Please correct the function call; errors encountered:\\n{e}\",\n        }\n    )\n```\n\n## Advanced Validation Techniques\n\n### Using Context for Dynamic Validation\n\nThe `context` parameter allows you to pass additional data to your validators, enabling validation against runtime data like source documents, allowed values, or external references. This is accessed in validators via `ValidationInfo`.\n\nHere's a complete example showing context-based validation:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, ValidationInfo, field_validator\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass QuoteExtraction(BaseModel):\n    \"\"\"Extract a claim with a supporting quote from source text.\"\"\"\n\n    claim: str\n    supporting_quote: str\n\n    @field_validator('supporting_quote')\n    @classmethod\n    def verify_quote_in_source(cls, v: str, info: ValidationInfo):\n        \"\"\"Verify the quote exists in the source text.\"\"\"\n        import re\n\n        context = info.context\n        if context:\n            source_text = context.get('source_text', '')\n            # Normalize whitespace for comparison\n            normalized_source = re.sub(r'\\s+', ' ', source_text.strip())\n            normalized_quote = re.sub(r'\\s+', ' ', v.strip())\n            if normalized_quote not in normalized_source:\n                raise ValueError(\n                    f\"The quote must be an exact substring from the source text. \"\n                    f\"Quote '{v}' was not found in the source.\"\n                )\n        return v\n\n\nsource_text = \"\"\"\nThe Python programming language was created by Guido van Rossum \nand first released in 1991. It emphasizes code readability and \nsimplicity, making it popular for beginners and experts alike.\n\"\"\"\n\nextraction = client.create(\n    response_model=QuoteExtraction,\n    max_retries=2,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"Extract a claim and find an exact quote from the text that supports it.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": \"Source text: {{ source_text }}\\n\\nExtract a claim about Python.\",\n        },\n    ],\n    context={\"source_text\": source_text},\n)\n\nprint(f\"Claim: {extraction.claim}\")\n#> Claim: Python emphasizes code readability and simplicity.\nprint(f\"Quote: {extraction.supporting_quote}\")\n\"\"\"\nQuote: It emphasizes code readability and simplicity, making it popular for beginners and experts alike.\n\"\"\"\n```\n\nIn this example:\n- The `context` parameter passes the source text to the validator\n- `ValidationInfo` provides access to the context in the validator\n- If the LLM generates a quote that doesn't exist in the source, validation fails and the model is re-asked\n\nFor more advanced examples including multi-field validation and citation verification, check out our [exact citations example](../examples/exact_citations.md).\n\n## Optimizing Token usage\n\nPydantic automatically includes a URL within the error message itself when an error is thrown so that users can learn more about the specific error that was thrown. Some users might want to remove this URL since it adds extra tokens that otherwise might not add much value to the validation process.\n\nWe've created a small helper function that you can use below which removes this url in the error message\n\n```python hl_lines=\"6\"\nfrom instructor.utils import disable_pydantic_error_url\nfrom pydantic import BaseModel, ValidationError\nfrom typing_extensions import Annotated\nfrom pydantic import AfterValidator\n\ndisable_pydantic_error_url()  # (1)!\n\n\ndef name_must_contain_space(v: str) -> str:\n    if \" \" not in v:\n        raise ValueError(\"Name must contain a space.\")\n    return v.lower()\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: Annotated[str, AfterValidator(name_must_contain_space)]\n\n\ntry:\n    person = UserDetail(age=29, name=\"Jason\")\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserDetail\n    name\n      Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\n    \"\"\"\n```\n\n1.  We disable the error by setting an environment variable `PYDANTIC_ERRORS_INCLUDE_URL` to `0`. This is valid only for the duration that the script is executed for, once the function is not called, the original behaviour is restored.\n\n## See Also\n\n- [Validation](./validation.md) - Core validation concepts and strategies\n- [Retrying](./retrying.md) - Configure automatic retry behavior with Tenacity\n- [Custom Validators](../learning/validation/custom_validators.md) - Build custom validation logic\n- [Field Validation](../learning/patterns/field_validation.md) - Field-level validation patterns\n- [Retry Mechanisms](../learning/validation/retry_mechanisms.md) - Practical retry configuration guide\n\n## Takeaways\n\nBy integrating these advanced validation techniques, we not only improve the quality and reliability of LLM-generated content, but also pave the way for more autonomous and effective systems.\n"
  },
  {
    "path": "docs/concepts/retrying.md",
    "content": "---\ntitle: \"Retry Logic with Tenacity\"\ndescription: \"Learn how to implement retry logic with Tenacity for LLM applications, including exponential backoff, conditional retries, and error handling.\"\n---\n\n# Retry Logic with Tenacity\n\nTenacity is a Python library for adding retry logic to your applications. Combined with Instructor, it helps handle API failures, rate limits, and validation errors.\n\n## Basic Retry with Exponential Backoff\n\nThe most common pattern uses exponential backoff to delay retries:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom tenacity import retry, stop_after_attempt, wait_exponential\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\n@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))\ndef extract_user_info(text: str) -> UserInfo:\n    \"\"\"Extract user information with retry logic.\"\"\"\n    return client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": f\"Extract user info: {text}\"}],\n    )\n\n\ntry:\n    user = extract_user_info(\"John is 30 years old with email john@example.com\")\n    print(f\"Success: {user.name}, {user.age}, {user.email}\")\n    #> Success: John, 30, john@example.com\nexcept Exception as e:\n    print(f\"Failed after retries: {e}\")\n```\n\n## Error-Specific Retries\n\nRetry only on specific error types for better control:\n\n```python\nimport instructor\nfrom openai import APIError, RateLimitError\nfrom pydantic import BaseModel, ValidationError\nfrom tenacity import (\n    retry,\n    retry_if_exception_type,\n    stop_after_attempt,\n    wait_exponential,\n)\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\n# Retry on API errors with longer delays\n@retry(\n    retry=retry_if_exception_type((RateLimitError, APIError)),\n    stop=stop_after_attempt(5),\n    wait=wait_exponential(multiplier=2, min=1, max=60),\n)\ndef handle_api_errors(text: str) -> UserInfo:\n    return client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": text}],\n    )\n\n\n# Retry on validation errors with shorter delays\n@retry(\n    retry=retry_if_exception_type(ValidationError),\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=1, max=10),\n)\ndef handle_validation_errors(text: str) -> UserInfo:\n    return client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": text}],\n    )\n```\n\n## Custom Retry Conditions\n\nRetry based on the result content rather than exceptions:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom tenacity import retry, retry_if_result, stop_after_attempt\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\ndef should_retry(result: UserInfo) -> bool:\n    \"\"\"Retry if the result doesn't meet quality criteria.\"\"\"\n    return result.age < 0 or result.age > 150 or not result.email\n\n\n@retry(retry=retry_if_result(should_retry), stop=stop_after_attempt(3))\ndef extract_valid_user(text: str) -> UserInfo:\n    return client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": text}],\n    )\n```\n\n## Context-Based Validation with Retries\n\nUse the `context` parameter to pass runtime data to validators:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, ValidationInfo, field_validator, ValidationError\nfrom tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Citation(BaseModel):\n    \"\"\"A claim with a supporting quote from source text.\"\"\"\n\n    claim: str\n    quote: str\n\n    @field_validator('quote')\n    @classmethod\n    def verify_quote_exists(cls, v: str, info: ValidationInfo):\n        context = info.context\n        if context:\n            source_text = context.get('source_text', '')\n            if v not in source_text:\n                raise ValueError(f\"Quote '{v}' not found in source text.\")\n        return v\n\n\n@retry(\n    retry=retry_if_exception_type(ValidationError),\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=2, max=10),\n)\ndef extract_citation(claim: str, source_text: str) -> Citation:\n    return client.create(\n        response_model=Citation,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Extract the claim and find an exact quote from the source.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Source: {{ source_text }}\\n\\nClaim: {{ claim }}\",\n            },\n        ],\n        context={\"source_text\": source_text, \"claim\": claim},\n    )\n\n\nsource = \"The Eiffel Tower was completed in 1889 and stands 330 meters tall.\"\ncitation = extract_citation(\"The tower is over 300 meters\", source)\nprint(f\"Quote: {citation.quote}\")\n```\n\n## Logging and Monitoring\n\nAdd logging to track retry attempts:\n\n```python\nimport logging\nimport instructor\nfrom pydantic import BaseModel\nfrom tenacity import after_log, before_log, retry, stop_after_attempt, wait_exponential\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\nlogger = logging.getLogger(__name__)\nlogging.basicConfig(level=logging.INFO)\n\n\n@retry(\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=4, max=10),\n    before=before_log(logger, logging.INFO),\n    after=after_log(logger, logging.ERROR),\n)\ndef logged_extraction(text: str) -> UserInfo:\n    return client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": text}],\n    )\n```\n\n## Instructor's Built-in Retries\n\nInstructor has built-in retry support that works alongside Tenacity:\n\n```python\nimport instructor\nfrom instructor import Mode\nfrom pydantic import BaseModel\nfrom tenacity import retry, stop_after_attempt\n\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\",\n    mode=Mode.JSON,\n    max_retries=3,\n    retry_delay=1,\n)\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\n# Combine Instructor and Tenacity retries for additional resilience\n@retry(stop=stop_after_attempt(2))\ndef double_retry_extraction(text: str) -> UserInfo:\n    return client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": text}],\n    )\n```\n\n## Failed Attempts Tracking\n\nWhen retries fail, Instructor provides detailed failure history:\n\n```python\nimport instructor\nfrom instructor.core.exceptions import InstructorRetryException\nfrom pydantic import BaseModel, field_validator\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n    @field_validator('age')\n    @classmethod\n    def validate_age(cls, v):\n        if v < 0 or v > 150:\n            raise ValueError(f\"Age {v} is invalid\")\n        return v\n\n\ntry:\n    result = client.create(\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: John is -5 years old\"}],\n        max_retries=3,\n    )\nexcept InstructorRetryException as e:\n    print(f\"Failed after {e.n_attempts} attempts\")\n    for attempt in e.failed_attempts:\n        print(f\"Attempt {attempt.attempt_number}: {attempt.exception}\")\n```\n\nFailed attempts are automatically propagated to reask handlers, enabling contextual error messages and progressive corrections.\n\n## Best Practices\n\n### Choose Appropriate Strategies\n\n| Error Type | Attempts | Min Delay | Max Delay |\n|------------|----------|-----------|-----------|\n| Rate limits | 5 | 1s | 60-120s |\n| Validation errors | 2-3 | 1s | 10s |\n| Network errors | 4 | 2s | 30s |\n\n### Always Set Stop Conditions\n\n```python\nfrom tenacity import retry, stop_after_attempt\n\n# Good: bounded retries\n@retry(stop=stop_after_attempt(3))\ndef bounded_retry():\n    pass\n\n# Bad: could retry forever\n@retry()  # Don't do this!\ndef unbounded_retry():\n    pass\n```\n\n## Troubleshooting\n\n**Infinite retries**: Always set `stop_after_attempt()` or `stop_after_delay()`.\n\n**Too many retries**: Use `retry_if_exception_type()` to retry only on specific errors.\n\n**Still hitting rate limits**: Increase max delay and use `wait_exponential()` with higher multipliers.\n\n## Related Resources\n\n- [Tenacity Documentation](https://tenacity.readthedocs.io/)\n- [Error Handling](./error_handling.md)\n- [Validation](./validation.md)\n"
  },
  {
    "path": "docs/concepts/semantic_validation.md",
    "content": "---\ntitle: Semantic Validation with LLMs\ndescription: Using LLMs for complex validation that goes beyond rule-based approaches to evaluate content based on natural language criteria.\n---\n\n## See Also\n\n- [Validation](./validation.md) - Core validation concepts and strategies\n- [Custom Validators](../learning/validation/custom_validators.md) - Build custom validation logic\n- [Field Validation](../learning/patterns/field_validation.md) - Field-level validation patterns\n- [Reask Validation](./reask_validation.md) - Automatic retry with validation feedback\n- [LLM Validator](./validation.md#semantic-validation) - Semantic validation examples\n\n# Semantic Validation with LLMs\n\nThis guide covers semantic validation in Instructor - using LLMs themselves to validate content against complex, subjective, or contextual criteria that would be difficult to implement with traditional rule-based approaches.\n\n## Overview\n\nSemantic validation leverages the language understanding capabilities of LLMs to validate inputs against natural language criteria. While traditional validation uses explicit rules and patterns, semantic validation can understand nuance, context, and subjective qualities in data.\n\n### When to Use Semantic Validation\n\nSemantic validation is particularly useful for:\n\n- **Complex criteria** that are difficult to express with rules\n- **Subjective qualities** like tone, style, or appropriateness\n- **Contextual validation** that requires understanding relationships between fields\n- **Policy enforcement** that involves nuanced understanding of guidelines\n- **Content moderation** for detecting harmful or inappropriate content\n\n### How It Works\n\nIn Instructor, semantic validation is implemented through the `llm_validator` function, which creates a validator that uses an LLM to check if values conform to specified requirements:\n\n```python\nimport instructor\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nfrom instructor import llm_validator\n\n# Initialize client\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserComment(BaseModel):\n    username: str\n    comment: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"Comment must be constructive, respectful, and not contain hate speech or profanity\",\n                client=client,\n            )\n        ),\n    ]\n```\n\nThe `llm_validator` function takes:\n\n1. A natural language description of the validation criteria\n2. An Instructor client instance to perform the validation\n3. Optional parameters for configuration\n\nDuring validation, the LLM evaluates whether the input matches the specified criteria and either passes the value or raises a validation error with a detailed explanation.\n\n## Validation Flow\n\nThe following diagram illustrates how semantic validation works in Instructor:\n\n```mermaid\nflowchart TD\n    A[Input Data] --> B[Pydantic Validation Process]\n    B --> C{Field has Semantic\\nValidator?}\n    C -->|No| D[Standard Validation]\n    C -->|Yes| E[Call LLM with Validation Criteria]\n    E --> F{LLM Determines\\nValue is Valid?}\n    F -->|Yes| G[Validation Passes]\n    F -->|No| H[Validation Fails with LLM-Generated Error]\n    H --> I{Auto-Retry Enabled?}\n    I -->|Yes| J[Try Again with Error Context]\n    I -->|No| K[Return Validation Error]\n    J --> E\n\n    classDef process fill:#e2f0fb,stroke:#b8daff,color:#004085;\n    classDef decision fill:#fff3cd,stroke:#ffeeba,color:#856404;\n    classDef success fill:#d4edda,stroke:#c3e6cb,color:#155724;\n    classDef error fill:#f8d7da,stroke:#f5c6cb,color:#721c24;\n\n    class A,B,E,J process\n    class C,F,I decision\n    class G,D success\n    class H,K error\n```\n\n## Basic Usage\n\nHere's a complete example of semantic validation in action:\n\n```python\n# Standard library imports\nfrom typing import Annotated\n\n# Third-party imports\nfrom pydantic import BaseModel, BeforeValidator\nimport instructor\nfrom instructor import llm_validator\n\n# Initialize client\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass ProductDescription(BaseModel):\n    \"\"\"Model for validating product descriptions.\"\"\"\n\n    name: str\n    description: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"\"\"The description must be:\n                1. Professional and factual\n                2. Free of excessive hyperbole or unsubstantiated claims\n                3. Between 50-200 words in length\n                4. Written in third person (no \"you\" or \"your\")\n                5. Free of spelling and grammar errors\"\"\",\n                client=client,\n            )\n        ),\n    ]\n\n\n# Example usage with Jinja templating\ntry:\n    product = client.create(\n        response_model=ProductDescription,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Generate a product description based on the product name.\",\n            },\n            {\"role\": \"user\", \"content\": \"Create a description for: {{ product_name }}\"},\n        ],\n        context={\"product_name\": \"UltraClean 9000 Washing Machine\"},\n    )\n    print(product.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"name\": \"UltraClean 9000 Washing Machine\",\n      \"description\": \"The UltraClean 9000 Washing Machine offers reliable and efficient cleaning with multiple wash settings and a high-capacity drum. It features an easy-to-use control panel and a design that suits modern home environments. The machine aims to provide a practical solution for everyday laundry needs with standard noise levels and energy consumption.\"\n    }\n    \"\"\"\nexcept Exception as e:\n    print(f\"Validation error: {e}\")\n    \"\"\"\n    Validation error: <failed_attempts>\n\n    <generation number=\"1\">\n    <exception>\n        1 validation error for ProductDescription\n    description\n      Assertion failed, The description contains excessive hyperbole and unsubstantiated claims. It needs to be more professional and factual. [type=assertion_error, input_value='The UltraClean 9000 Wash...ior laundry experience.', input_type=str]\n    </exception>\n    <completion>\n        ChatCompletion(id='chatcmpl-D08R5P8Ne4q4TvAbiSa6Kh18wQxQd', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='call_RZlWM3SJheQAv84bS1apYcFJ', function=Function(arguments='{\"name\":\"UltraClean 9000 Washing Machine\",\"description\":\"The UltraClean 9000 Washing Machine is a state-of-the-art appliance designed to deliver exceptional cleaning performance with maximum efficiency. Featuring advanced cleaning technology, multiple wash cycles, and energy-saving modes, it ensures your clothes come out spotless every time. Its sleek design and user-friendly interface make laundry effortless and convenient, while durable construction guarantees long-lasting use. Ideal for modern households, the UltraClean 9000 combines powerful washing capabilities with quiet operation for a superior laundry experience.\"}', name='ProductDescription'), type='function')]))], created=1768924799, model='gpt-4.1-mini-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_376a7ccef1', usage=CompletionUsage(completion_tokens=300, prompt_tokens=2619, total_tokens=2919, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=None), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))\n    </completion>\n    </generation>\n\n    <generation number=\"2\">\n    <exception>\n        1 validation error for ProductDescription\n    description\n      Assertion failed, The description contains hyperbolic and exaggerated language, which does not align with the requirement of being professional and factual. It also includes unsubstantiated claims such as 'efficient laundry' and 'reliable performance'. [type=assertion_error, input_value='The UltraClean 9000 Wash...lar home laundry needs.', input_type=str]\n    </exception>\n    <completion>\n        ChatCompletion(id='chatcmpl-D08R96HSWzEZhcj9nWHCn4th6IIxB', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='call_jsbD8AbEK8MvFWkVPOK0mooT', function=Function(arguments='{\"name\":\"UltraClean 9000 Washing Machine\",\"description\":\"The UltraClean 9000 Washing Machine is designed for efficient laundry with multiple wash settings to suit different fabric types. It includes energy-saving features to reduce power consumption during operation. The machine has a capacity suitable for medium to large households and operates with reduced noise levels. The user interface is straightforward, offering ease of use. Built with durable materials, the UltraClean 9000 provides reliable performance for regular home laundry needs.\"}', name='ProductDescription'), type='function')]))], created=1768924803, model='gpt-4.1-mini-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_376a7ccef1', usage=CompletionUsage(completion_tokens=300, prompt_tokens=2619, total_tokens=2919, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=None), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))\n    </completion>\n    </generation>\n\n    <generation number=\"3\">\n    <exception>\n        1 validation error for ProductDescription\n    description\n      Assertion failed, The description contains some marketing language and exaggerated claims, which do not align with a professional and factual tone. It also lacks specific details and technical information about the washing machine. [type=assertion_error, input_value=\"The UltraClean 9000 Wash...ehold washing machines.\", input_type=str]\n    </exception>\n    <completion>\n        ChatCompletion(id='chatcmpl-D08RCpkeVCnl1jfV4HXHHRxogx46h', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='call_1MdJh2HvUMYzIxU8qj9BPmCG', function=Function(arguments='{\"name\":\"UltraClean 9000 Washing Machine\",\"description\":\"The UltraClean 9000 Washing Machine features multiple wash cycles and fabric care settings. It is designed to operate with an energy-saving mode to reduce electricity usage. The machine\\'s capacity supports the needs of medium to large households. It includes noise reduction technology for quieter operation and has a user interface with basic controls for ease of operation. The machine is constructed from standard materials commonly used in household washing machines.\"}', name='ProductDescription'), type='function')]))], created=1768924806, model='gpt-4.1-mini-2025-04-14', object='chat.completion', service_tier='default', system_fingerprint='fp_376a7ccef1', usage=CompletionUsage(completion_tokens=300, prompt_tokens=2619, total_tokens=2919, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=None), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)))\n    </completion>\n    </generation>\n\n    </failed_attempts>\n\n    <last_exception>\n        1 validation error for ProductDescription\n    description\n      Assertion failed, The description contains some marketing language and exaggerated claims, which do not align with a professional and factual tone. It also lacks specific details and technical information about the washing machine. [type=assertion_error, input_value=\"The UltraClean 9000 Wash...ehold washing machines.\", input_type=str]\n    </last_exception>\n    \"\"\"\n```\n\n## Advanced Validation Patterns\n\n### Content Policy Enforcement\n\nThis example validates user-generated content against community guidelines:\n\n```python\nimport instructor\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nfrom instructor import llm_validator\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Comment(BaseModel):\n    \"\"\"Model representing a user comment with content moderation.\"\"\"\n\n    user_id: str\n    content: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"\"\"Content must comply with community guidelines:\n                - No hate speech, harassment, or discrimination\n                - No explicit sexual or violent content\n                - No promotion of illegal activities\n                - No sharing of personal information\n                - No spamming or excessive self-promotion\"\"\",\n                client=client,\n            )\n        ),\n    ]\n```\n\n### Topic Relevance Validation\n\nThis validator ensures that responses stay on topic:\n\n```python\nimport instructor\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nfrom instructor import llm_validator\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass ForumPost(BaseModel):\n    topic: str\n    post: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"The post must be directly relevant to the specified topic and not drift to unrelated subjects\",\n                client=client,\n            )\n        ),\n    ]\n\n    # Using Jinja templating for validation against dynamic values\n    @classmethod\n    def validate_post(cls, topic_name: str, post_content: str) -> \"ForumPost\":\n        return client.create(\n            response_model=cls,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"Validate that the forum post content stays relevant to the topic.\n                    If it's not relevant, explain why in detail.\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": \"\"\"\n                    Topic: {{ topic }}\n\n                    Post content:\n                    {{ post }}\n\n                    Is this post relevant to the topic?\n                    \"\"\",\n                },\n            ],\n            context={\n                \"topic\": topic_name,\n                \"post\": post_content,\n            },\n        )\n```\n\n### Fact-Checking Validator\n\nThis complex validator assesses factual accuracy:\n\n```python\nimport instructor\nfrom typing import List\nfrom pydantic import BaseModel, Field\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass FactCheckedClaim(BaseModel):\n    \"\"\"Model for validating factual accuracy of claims.\"\"\"\n\n    claim: str\n    is_accurate: bool = Field(description=\"Whether the claim is factually accurate\")\n    supporting_evidence: List[str] = Field(\n        default_factory=list,\n        description=\"Evidence supporting or refuting the claim\",\n    )\n\n    @classmethod\n    def validate_claim(cls, text: str) -> \"FactCheckedClaim\":\n        return client.create(\n            response_model=cls,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a fact-checking system. Assess the factual accuracy of the claim.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": \"Fact check this claim: {{ claim }}\",\n                },\n            ],\n            context={\"claim\": text},\n        )\n```\n\n## Complex Multi-Field Validation\n\nFor validation that needs to compare multiple fields, you can use model validators:\n\n```python\nimport instructor\nfrom typing import List\nfrom pydantic import BaseModel, model_validator\nfrom instructor.validation import Validator  # For response type\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Report(BaseModel):\n    \"\"\"Model representing a report with related fields that need semantic validation.\"\"\"\n\n    title: str\n    summary: str\n    key_findings: List[str]\n\n    @model_validator(mode=\"after\")\n    def validate_consistency(self):\n        # Semantic validation at the model level using Jinja templating\n        validation_result = client.create(\n            response_model=Validator,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"Validate that the summary accurately reflects the key findings.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": \"\"\"\n                        Please validate if this summary accurately reflects the key findings:\n\n                        Title: {{ title }}\n                        Summary: {{ summary }}\n\n                        Key findings:\n                        {% for finding in findings %}\n                        - {{ finding }}\n                        {% endfor %}\n\n                        Evaluate for consistency, completeness, and accuracy.\n                    \"\"\",\n                },\n            ],\n            context={\n                \"title\": self.title,\n                \"summary\": self.summary,\n                \"findings\": self.key_findings,\n            },\n        )\n\n        if not validation_result.is_valid:\n            raise ValueError(f\"Consistency error: {validation_result.reason}\")\n\n        return self\n```\n\n## Best Practices\n\n1. **Be Specific in Criteria**: Provide clear, detailed validation criteria in natural language\n2. **Use Appropriate Models**: Larger models tend to give better, more nuanced validation\n3. **Balance Cost and Latency**: Remember that each validation adds an LLM API call\n4. **Provide Examples**: Include examples of both valid and invalid content in your criteria\n5. **Handle Retries**: Configure retry logic for edge cases\n6. **Use Jinja Templates**: When validating against dynamic values, use Jinja templating\n7. **Separate Concerns**: Keep validation criteria focused on specific aspects\n8. **Consider Context**: Use model-level validation when comparing multiple fields\n\n## Advanced Configuration\n\nThe `llm_validator` function supports several configuration options:\n\n```python\nimport instructor\nfrom instructor import llm_validator\nfrom pydantic import BaseModel, BeforeValidator\nfrom typing import Annotated\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n# Configure the validator with options\nvalidator = llm_validator(\n    statement=\"Must be a professional, concise product description\",\n    client=client,  # Required Instructor client\n    allow_override=True,  # Allow LLM to fix invalid values\n    model=\"gpt-4o\",  # Specify model to use for validation\n    temperature=0.2,  # Add variability (default is 0)\n)\n\n\nclass Product(BaseModel):\n    description: Annotated[str, BeforeValidator(validator)]\n```\n\n## Performance Considerations\n\nSemantic validation adds API calls to your workflow, which impacts:\n\n1. **Latency**: Each validation requires an additional API call\n2. **Cost**: More API calls mean higher usage costs\n3. **Reliability**: Depends on API availability and response quality\n\nConsider these trade-offs when implementing semantic validation, especially for high-volume applications.\n\n## Comparison with Rule-Based Validation\n\n| Aspect | Rule-Based Validation | Semantic Validation |\n|--------|----------------------|---------------------|\n| **Implementation** | Regular expressions, constraints | Natural language criteria |\n| **Complexity** | Simple rules, explicit patterns | Can handle subjective criteria |\n| **Speed** | Fast, no external calls | Slower, requires API calls |\n| **Cost** | No additional API costs | Each validation costs tokens |\n| **Flexibility** | Limited to programmable rules | Can validate against any natural language criteria |\n| **Maintenance** | Rules must be updated manually | Criteria can be more adaptable |\n\n## Related Resources\n\n- [Validation in Instructor](./validation.md) - Core validation concepts\n- [Custom Validators](../learning/validation/custom_validators.md) - Creating custom validators\n- [llm_validator API Reference](../api.md#api-reference) - Full API reference\n\n---\n\nSemantic validation expands what's possible with validation beyond traditional rule-based approaches. By using LLMs to validate content against natural language criteria, you can build more sophisticated validation systems that understand context, nuance, and complex relationships.\n"
  },
  {
    "path": "docs/concepts/templating.md",
    "content": "---\ntitle: Prompt Templating with Jinja - Dynamic Prompt Generation\ndescription: Create dynamic prompts using Jinja templating with Instructor. Build reusable, versioned prompts with Pydantic validation and security.\n---\n\n# Prompt Templating\n\nWith Instructor's Jinja templating, you can:\n\n- Dynamically adapt prompts to any context\n- Easily manage and version your prompts better\n- Integrate seamlessly with validation processes\n- Handle sensitive information securely\n\nOur solution offers:\n\n- Separation of prompt structure and content\n- Complex logic implementation within prompts\n- Template reusability across scenarios\n- Enhanced prompt versioning and logging\n- Pydantic integration for validation and type safety\n\n## Context is available to the templating engine\n\nThe `context` parameter is a dictionary that is passed to the templating engine. It is used to pass in the relevant variables to the templating engine. This single `context` parameter will be passed to jinja to render out the final prompt.\n\n```python hl_lines=\"14-15 19-22\"\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"Extract the information from the\n        following text: `{{ data }}`\"\"\",  # (1)!\n        },\n    ],\n    response_model=User,\n    context={\"data\": \"John Doe is thirty years old\"},  # (2)!\n)\n\nprint(resp)\n#> name='John Doe' age=30\n```\n\n1. Declare jinja style template variables inside the prompt itself (e.g. `{{ name }}`)\n2. Pass in the variables to be used in the `context` parameter\n\n### Context is available to Pydantic validators\n\nIn this example, we demonstrate how to leverage the `context` parameter with Pydantic validators to enhance our validation and data processing capabilities. By passing the `context` to the validators, we can implement dynamic validation rules and data transformations based on the input context. This approach allows for flexible and context-aware validation, such as checking for banned words or applying redaction patterns to sensitive information.\n\n```python hl_lines=\"15-16 26-30\"\nimport instructor\nfrom pydantic import BaseModel, ValidationInfo, field_validator\nimport re\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Response(BaseModel):\n    text: str\n\n    @field_validator('text')\n    @classmethod\n    def redact_regex(cls, v: str, info: ValidationInfo):\n        context = info.context\n        if context:\n            redact_patterns = context.get('redact_patterns', [])\n            for pattern in redact_patterns:\n                v = re.sub(pattern, '****', v)\n        return v\n\n\nresponse = client.create(\n    response_model=Response,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n                Write about a {{ topic }}\n\n                {% if banned_words %}\n                You must not use the following banned words:\n\n                <banned_words>\n                {% for word in banned_words %}\n                * {{ word }}\n                {% endfor %}\n                </banned_words>\n                {% endif %}\n              \"\"\",\n        },\n    ],\n    context={\n        \"topic\": \"jason and now his phone number is 123-456-7890\",\n        \"redact_patterns\": [\n            r\"\\b\\d{3}[-.]?\\d{3}[-.]?\\d{4}\\b\",  # Phone number pattern\n            r\"\\b\\d{3}-\\d{2}-\\d{4}\\b\",  # SSN pattern\n        ],\n    },\n    max_retries=3,\n)\n\nprint(response.text)\n\"\"\"\nJason is a young man who loves technology and enjoys staying connected with his friends and family. He is known for his friendly demeanor and his passion for learning new things. Recently, he got a new phone, and his contact number is ****. Jason uses his phone not only to communicate but also to explore various apps, stay organized, and capture moments through photography.\n\"\"\"\n```\n\n1. Access the variables passed into the `context` variable inside your Pydantic validator\n\n2. Pass in the variables to be used for validation and/or rendering into the `context` parameter\n\n### Jinja Syntax\n\nJinja is used to render the prompts, allowing the use of familiar Jinja syntax. This enables rendering of lists, conditionals, and more. It also allows calling functions and methods within Jinja.\n\nThis makes formatting of prompts and rendering logic extremely easy.\n\n```python hl_lines=\"29-34 37-43\"\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Citation(BaseModel):\n    source_ids: list[int]\n    text: str\n\n\nclass Response(BaseModel):\n    answer: list[Citation]\n\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n                You are a {{ role }} tasks with the following question\n\n                <question>\n                {{ question }}\n                </question>\n\n                Use the following context to answer the question, make sure to return [id] for every citation:\n\n                <context>\n                {% for chunk in context %}\n                  <context_chunk>\n                    <id>{{ chunk.id }}</id>\n                    <text>{{ chunk.text }}</text>\n                  </context_chunk>\n                {% endfor %}\n                </context>\n\n                {% if rules %}\n                Make sure to follow these rules:\n\n                {% for rule in rules %}\n                  * {{ rule }}\n                {% endfor %}\n                {% endif %}\n            \"\"\",\n        },\n    ],\n    response_model=Response,\n    context={\n        \"role\": \"professional educator\",\n        \"question\": \"What is the capital of France?\",\n        \"context\": [\n            {\"id\": 1, \"text\": \"Paris is the capital of France.\"},\n            {\"id\": 2, \"text\": \"France is a country in Europe.\"},\n        ],\n        \"rules\": [\"Use markdown.\"],\n    },\n)\n\nprint(resp)\n#> answer=[Citation(source_ids=[1], text='The capital of France is Paris.')]\n# answer=[Citation(source_ids=[1], text='The capital of France is Paris.')]\n```\n\n### Working with Secrets\n\nYour prompts might need to include sensitive user information when they're sent to your model provider. This is probably something you don't want to hard code into your prompt or captured in your logs. An easy way to get around this is to use the `SecretStr` type from `Pydantic` in your model definitions.\n\n```python\nfrom pydantic import BaseModel, SecretStr\nimport instructor\n\n\nclass UserContext(BaseModel):\n    name: str\n    address: SecretStr\n\n\nclass Address(BaseModel):\n    street: SecretStr\n    city: str\n    state: str\n    zipcode: str\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\ncontext = UserContext(name=\"scolvin\", address=\"secret address\")\n\naddress = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"{{ user.name }} is `{{ user.address.get_secret_value() }}`, normalize it to an address object\",\n        },\n    ],\n    context={\"user\": context},\n    response_model=Address,\n)\nprint(context)\n#> name='scolvin' address=SecretStr('**********')\nprint(address)\n\"\"\"\nstreet=SecretStr('**********') city='secret address' state='secret address' zipcode='secret address'\n\"\"\"\n```\n\nThis allows you to preserve your sensitive information while still using it in your prompts.\n\n## Security\n\nWe use the `jinja2.sandbox.SandboxedEnvironment` to prevent security issues with the templating engine. This means that you can't use arbitrary python code in your prompts. But this doesn't mean that you should pass untrusted input to the templating engine, as this could still be abused for things like Denial of Service attacks.\n\nYou should [always sanitize](https://jinja.palletsprojects.com/en/stable/sandbox/#security-considerations) any input that you pass to the templating engine.\n"
  },
  {
    "path": "docs/concepts/typeadapter.md",
    "content": "---\ntitle: TypeAdapter in Instructor - Custom Type Handling\ndescription: Use Pydantic TypeAdapter for custom type validation and serialization with Instructor. Handle complex types and custom validation logic in structured outputs.\n---\n\n!!! warning \"This page is a work in progress\"\n\n    This page is a work in progress. Check out [Pydantic's documentation](https://docs.pydantic.dev/latest/concepts/type_adapter/)\n"
  },
  {
    "path": "docs/concepts/typeddicts.md",
    "content": "---\ntitle: Using TypedDicts with OpenAI API\ndescription: Learn how to utilize TypedDicts in Python with the OpenAI API for structured data responses.\n---\n\n---\ntitle: TypedDict Support in Instructor - Dictionary Validation\ndescription: Use Python TypedDict for type-safe dictionary structures with Instructor. Validate dictionary schemas without Pydantic models for lightweight structured outputs.\n---\n\n# TypedDicts\n\nWe also support typed dicts.\n\n```python\nfrom typing_extensions import TypedDict\nimport instructor\n\n\nclass User(TypedDict):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nresponse = client.create(\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Timothy is a man from New York who is turning 32 this year\",\n        }\n    ],\n)\n```"
  },
  {
    "path": "docs/concepts/types.md",
    "content": "---\ntitle: Working with Types in Instructor\ndescription: Learn how to use different data types with Instructor, from simple primitives to complex types.\n---\n\n# Working with Types in Instructor\n\nInstructor supports a wide range of types for your structured outputs, from simple primitives to complex nested structures.\n\n## Simple Types\n\nIn addition to `pydantic.BaseModel` (the recommended approach), Instructor also supports:\n\n- Primitive types: `str`, `int`, `float`, `bool`\n- Collection types: `List`, `Dict`\n- Type composition: `Union`, `Literal`, `Optional`\n- Specialized outputs: [Iterable](lists.md), [Partial](partial.md)\n\nYou can use these types directly in your `response_model` parameter without wrapping them in a Pydantic model.\n\nFor better documentation and control, use `typing.Annotated` to add more context to your types.\n\n## What happens behind the scenes?\n\nWe will actually wrap the response model with a `pydantic.BaseModel` of the following form:\n\n```python\nfrom typing import Annotated\nfrom pydantic import create_model, Field, BaseModel\n\ntypehint = Annotated[bool, Field(description=\"Sample Description\")]\n\nmodel = create_model(\"Response\", content=(typehint, ...), __base__=BaseModel)\n\nprint(model.model_json_schema())\n\"\"\"\n{\n    'properties': {\n        'content': {\n            'description': 'Sample Description',\n            'title': 'Content',\n            'type': 'boolean',\n        }\n    },\n    'required': ['content'],\n    'title': 'Response',\n    'type': 'object',\n}\n\"\"\"\n```\n\n## Primitive Types (str, int, float, bool)\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n# Response model with simple types like str, int, float, bool\nresp = client.create(\n    response_model=bool,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Is it true that Paris is the capital of France?\",\n        },\n    ],\n)\nassert resp is True, \"Paris is the capital of France\"\nprint(resp)\n#> True\n```\n\n## Annotated\n\nAnnotations can be used to add more information about the type. This can be useful for adding descriptions to the type, along with more complex information like field names, and more.\n\n```python\nimport instructor\nfrom typing import Annotated\nfrom pydantic import Field\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nUpperCaseStr = Annotated[str, Field(description=\"string must be upper case\")]\n\n# Response model with simple types like str, int, float, bool\nresp = client.create(\n    response_model=UpperCaseStr,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"What is the capital of france?\",\n        },\n    ],\n)\nassert resp == \"PARIS\", \"Paris is the capital of France\"\nprint(resp)\n#> PARIS\n```\n\n## Literal\n\nWhen doing simple classification Literals go quite well, they support literal of string, int, bool.\n\n```python\nimport instructor\nfrom typing import Literal\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresp = client.create(\n    response_model=Literal[\"BILLING\", \"SHIPPING\"],\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Classify the following messages: 'I am having trouble with my billing'\",\n        },\n    ],\n)\nassert resp == \"BILLING\"\nprint(resp)\n#> BILLING\n```\n\n## Enum\n\nEnums are harder to get right without some addition promping but are useful if these are values that are shared across the application.\n\n```python\nimport instructor\nfrom enum import Enum\n\n\nclass Label(str, Enum):\n    BILLING = \"BILLING\"\n    SHIPPING = \"SHIPPING\"\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresp = client.create(\n    response_model=Label,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Classify the following messages: 'I am having trouble with my billing'\",\n        },\n    ],\n)\nassert resp == Label.BILLING\nprint(resp)\n#> BILLING\n```\n\n## List\n\n```python\nimport instructor\nfrom typing import List\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresp = client.create(\n    response_model=List[int],\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Give me the first 5 prime numbers\",\n        },\n    ],\n)\n\nassert resp == [2, 3, 5, 7, 11]\nprint(resp)\n#> [2, 3, 5, 7, 11]\n```\n\n## Union\n\nUnion is a great way to handle multiple types of responses, similar to multiple function calls but not limited to the function calling api, like in JSON_SCHEMA modes.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Union\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Add(BaseModel):\n    a: int\n    b: int\n\n\nclass Weather(BaseModel):\n    location: str\n\n\nresp = client.create(\n    response_model=Union[Add, Weather],\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"What is 5 + 5?\",\n        },\n    ],\n)\n\nassert resp == Add(a=5, b=5)\nprint(resp)\n#> a=5 b=5\n```\n\n## See Also\n\n- [Response Models](./models.md) - Using Pydantic models for structured outputs\n- [Enums](./enums.md) - Working with enumerated types\n- [Union Types](./unions.md) - Handling multiple possible types\n- [Lists](./lists.md) - Working with collections\n- [Optional Fields](../learning/patterns/optional_fields.md) - Handling missing data\n\n## Complex Types\n\n### Pandas DataFrame\n\nThis is a more complex example, where we use a custom type to convert markdown to a pandas DataFrame.\n\n```python\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import BeforeValidator, PlainSerializer, InstanceOf, WithJsonSchema\nimport pandas as pd\nimport instructor\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    # Validates final type\n    InstanceOf[pd.DataFrame],\n    # Converts markdown to DataFrame\n    BeforeValidator(md_to_df),\n    # Converts DataFrame to markdown on model_dump_json\n    PlainSerializer(lambda df: df.to_markdown()),\n    # Adds a description to the type\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n            The markdown representation of the table,\n            each one should be tidy, do not try to join\n            tables that should be seperate\"\"\",\n        }\n    ),\n]\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresp = client.create(\n    response_model=MarkdownDataFrame,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Jason is 20, Sarah is 30, and John is 40\",\n        },\n    ],\n)\n\nassert isinstance(resp, pd.DataFrame)\nprint(resp)\n\"\"\"\n        Age\n Name\nJason     20\nSarah     30\nJohn      40\n\"\"\"\n```\n\n### Lists of Unions\n\nJust like Unions we can use List of Unions to represent multiple types of responses. This will feel similar to the parallel function calls but not limited to the function calling api, like in JSON_SCHEMA modes.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Union, List\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass Weather(BaseModel, frozen=True):\n    location: str\n\n\nclass Add(BaseModel, frozen=True):\n    a: int\n    b: int\n\n\nresp = client.create(\n    response_model=List[Union[Add, Weather]],\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Add 5 and 5, and also whats the weather in Toronto?\",\n        },\n    ],\n)\n\nassert resp == [Add(a=5, b=5), Weather(location=\"Toronto\")]\nprint(resp)\n#> [Add(a=5, b=5), Weather(location='Toronto')]\n```\n"
  },
  {
    "path": "docs/concepts/union.md",
    "content": "---\ntitle: Using Union Types in Pydantic Models\ndescription: Learn how to implement Union types in Pydantic models to handle multiple action types in Python.\n---\n\n!!! note \"Redirect Notice\"\n    This page has been consolidated into the comprehensive [Union Types](./unions.md) guide.\n    Please visit that page for complete information about working with union types in Instructor.\n\n<!-- Redirect to the consolidated page -->\n<meta http-equiv=\"refresh\" content=\"0; url=./unions.md\">\n"
  },
  {
    "path": "docs/concepts/unions.md",
    "content": "---\ntitle: Union Types in Instructor\ndescription: Learn how to use Union types to handle multiple possible response types in Instructor\n---\n\n# Working with Union Types in Instructor\n\nThis guide explains how to work with union types in Instructor, allowing you to handle multiple possible response types from language models. Union types are particularly useful when you need the LLM to choose between different response formats or action types.\n\n!!! note \"Union vs. union\"\n    The content from the original `union.md` page has been consolidated into this more comprehensive guide. That page showed a basic example of using Union types for multiple action types.\n\n## Basic Union Types\n\nUnion types let you specify that a field can be one of several types:\n\n```python\nfrom typing import Union\nfrom pydantic import BaseModel\n\n\nclass Response(BaseModel):\n    value: Union[str, int]  # Can be either string or integer\n```\n\n## Discriminated Unions\n\nUse discriminated unions to handle different response types:\n\n```python\nfrom typing import Literal, Union\nfrom pydantic import BaseModel\nimport instructor\n\n\nclass UserQuery(BaseModel):\n    type: Literal[\"user\"]\n    username: str\n\n\nclass SystemQuery(BaseModel):\n    type: Literal[\"system\"]\n    command: str\n\n\nQuery = Union[UserQuery, SystemQuery]\n\n# Usage with Instructor\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresponse = client.create(\n    response_model=Query,\n    messages=[{\"role\": \"user\", \"content\": \"Parse: user lookup jsmith\"}],\n)\n```\n\n## Optional Fields\n\nCombine Union with Optional for nullable fields:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    email: Optional[str] = None  # Same as Union[str, None]\n```\n\n## Best Practices\n\n1. **Type Hints**: Use proper type hints for clarity and better IDE support\n2. **Discriminators**: Add discriminator fields (like `type`) for complex unions to help the LLM choose correctly\n3. **Validation**: Add validators for union fields to ensure the data is valid\n4. **Documentation**: Document expected types clearly in your models with docstrings\n5. **Field Names**: Use descriptive field names to guide the model's output\n6. **Examples**: Include examples in your Pydantic models to help the LLM understand the expected format\n\n## Common Patterns\n\n### Multiple Response Types\n```python\nfrom typing import Union, Literal\nfrom pydantic import BaseModel\n\n\nclass SuccessResponse(BaseModel):\n    status: Literal[\"success\"]\n    data: dict\n\n\nclass ErrorResponse(BaseModel):\n    status: Literal[\"error\"]\n    message: str\n\n\nResponse = Union[SuccessResponse, ErrorResponse]\n```\n\n### Nested Unions\n```python\nfrom typing import Literal, Union, List\nfrom pydantic import BaseModel\n\n\nclass TextContent(BaseModel):\n    type: Literal[\"text\"]\n    text: str\n\n\nclass ImageContent(BaseModel):\n    type: Literal[\"image\"]\n    url: str\n\n\nclass Message(BaseModel):\n    content: List[Union[TextContent, ImageContent]]\n```\n\n## Dynamic Action Selection with Unions\n\nYou can use Union types to write \"agents\" that dynamically choose actions by selecting an output class. For example, in a search and lookup function:\n\n```python\nfrom pydantic import BaseModel\nfrom typing import Union\n\n\nclass Search(BaseModel):\n    query: str\n\n    def execute(self):\n        # Implementation for search\n        return f\"Searching for: {self.query}\"\n\n\nclass Lookup(BaseModel):\n    key: str\n\n    def execute(self):\n        # Implementation for lookup\n        return f\"Looking up key: {self.key}\"\n\n\nclass Action(BaseModel):\n    action: Union[Search, Lookup]\n\n    def execute(self):\n        return self.action.execute()\n```\n\nWith this pattern, the LLM can decide whether to perform a search or a lookup based on the user's input:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Union\n\n\nclass Search(BaseModel):\n    query: str\n\n    def execute(self):\n        # Implementation for search\n        return f\"Searching for: {self.query}\"\n\n\nclass Lookup(BaseModel):\n    key: str\n\n    def execute(self):\n        # Implementation for lookup\n        return f\"Looking up key: {self.key}\"\n\n\nclass Action(BaseModel):\n    action: Union[Search, Lookup]\n\n    def execute(self):\n        return self.action.execute()\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n# Let the LLM decide what action to take\nresult = client.create(\n    response_model=Action,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You're an assistant that helps search or lookup information.\",\n        },\n        {\"role\": \"user\", \"content\": \"Find information about climate change\"},\n    ],\n)\n\n# Execute the chosen action\nprint(result.execute())  # Likely outputs: \"Searching for: climate change\"\n#> Searching for: climate change\n```\n\n## Integration with Instructor\n\n### import instructor\nfrom typing import Union, Literal\nfrom pydantic import BaseModel\n\n\nclass SuccessResponse(BaseModel):\n    status: Literal[\"success\"]\n    data: dict\n\n\nclass ErrorResponse(BaseModel):\n    status: Literal[\"error\"]\n    message: str\n\n\nResponse = Union[SuccessResponse, ErrorResponse]\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nresult = client.create(\n    response_model=Response,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a helpful assistant that processes requests and returns either a success or error response.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": \"Process this request: Get user information for id 123\",\n        },\n    ],\n)\n\n# Check the result type\nif isinstance(result, ErrorResponse):\n    print(f\"Error: {result.message}\")\n    #> Error: Request not supported: Get user information for id 123\nelse:\n    print(f\"Success: {result.data}\")\n: User information for id 123 is not available.\nelse:\n    print(f\"Success: {result.data}\")\n```\n\n### Streaming with Unions\n```python\ndef stream_content():\n    response = client.create(\n        response_model=Message,\n        stream=True,\n        messages=[{\"role\": \"user\", \"content\": \"Generate mixed content\"}],\n    )\n    for partial in response:\n        if partial.content:\n            for item in partial.content:\n                if isinstance(item, TextContent):\n                    print(f\"Text: {item.text}\")\n                elif isinstance(item, ImageContent):\n                    print(ffrom pydantic import ValidationError, BaseModel\nfrom typing import Union, Literal\n\n\nclass SuccessResponse(BaseModel):\n    status: Literal[\"success\"]\n    data: dict\n\n\nclass ErrorResponse(BaseModel):\n    status: Literal[\"error\"]\n    message: str\n\n\nResponse = Union[SuccessResponse, ErrorResponse]\n\ntry:\n    # This will fail because \"invalid\" is not a valid status\n    response = SuccessResponse(status=\"invalid\", data={\"key\": \"value\"})\nexcept ValidationError as e:\n    print(f\"Validation error: {e}\")\n    \"\"\"\n    Validation error: 1 validation error for SuccessResponse\n    status\n      Input should be 'success' [type=literal_error, input_value='invalid', input_type=str]\n    \"\"\"\nid\", data={\"key\": \"value\"})\nexcept ValidationError as e:\n    print(f\"Validation error: {e}\")\n    \"\"\"\n    Validation error: 1 validation error for SuccessResponse\n    status\n      Input should be 'success' [type=literal_error, input_value='invalid', input_type=str]\n    \"\"\"\n```\n\n## Type Checking\n\nUse isinstance() for runtime type checking:\n\n```python\nfrom typing import Union, Literal\nfrom pydantic import BaseModel\n\n\nclass SuccessResponse(BaseModel):\n    status: Literal[\"success\"]\n    data: dict\n\n\nclass ErrorResponse(BaseModel):\n    status: Literal[\"error\"]\n    message: str\n\n\nResponse = Union[SuccessResponse, ErrorResponse]\n\n\ndef process_response(response: Response):\n    if isinstance(response, SuccessResponse):\n        # Handle success case\n        print(f\"Success: {response.data}\")\n    elif isinstance(response, ErrorResponse):\n        # Handle error case\n        print(f\"Error: {response.message}\")\n```\n\nFor more information about union types, check out the [Pydantic documentation on unions](https://docs.pydantic.dev/latest/concepts/types/#unions).\n\n```from typing import Literal, Union\nfrom pydantic import BaseModel\nimport instructor\nfrom openai import OpenAI\n\n\nclass Action(BaseModel):\n    \"\"\"Base action class.\"\"\"\n\n    type: str\n\n\nclass SendMessage(BaseModel):\n    type: Literal[\"send_message\"]\n    message: str\n    recipient: str\n\n\nclass MakePayment(BaseModel):\n    type: Literal[\"make_payment\"]\n    amount: float\n    recipient: str\n\n\nAction = Union[SendMessage, MakePayment]\n\n# Usage with Instructor\nclient = instructor.from_provider(\"openai/gpt-4o\")\nresponse = client.create(\n    response_model=Action,\n    messages=[{\"role\": \"user\", \"content\": \"Send a payment of $50 to John.\"}],\n)\n  ],\n)\n```\n\n```from typing import Literal, Union\nfrom pydantic import BaseModel\nimport instructor\nfrom openai import OpenAI\n\n\nclass SearchAction(BaseModel):\n    type: Literal[\"search\"]\n    query: str\n\n\nclass EmailAction(BaseModel):\n    type: Literal[\"email\"]\n    to: str\n    subject: str\n    body: str\n\n\nAction = Union[SearchAction, EmailAction]\n\n# The model can choose which action to take\nclient = instructor.from_provider(\"openai/gpt-4o\")\nresponse = client.create(\n    response_model=Action,\n    messages=[{\"role\": \"user\", \"content\": \"Find me information about climate change.\"}],\n)\n  ],\n)\n```\n\n```from typing import Literal, Union\nfrom pydantic import BaseModel\nimport instructor\nfrom openai import OpenAI\n\n\nclass TextResponse(BaseModel):\n    type: Literal[\"text\"]\n    content: str\n\n\nclass ImageResponse(BaseModel):\n    type: Literal[\"image\"]\n    url: str\n    caption: str\n\n\nResponse = Union[TextResponse, ImageResponse]\n\n# Patched client\n```\n\n## See Also\n\n- [Types](./types.md) - Working with different data types in Instructor\n- [Enums](./enums.md) - Using enumerated types for structured choices\n- [Optional Fields](../learning/patterns/optional_fields.md) - Handling optional data\n- [Validation](./validation.md) - Validating union type responses\n- [Union Examples](../examples/index.md) - Practical union type examples\nclient = instructor.from_provider(\"openai/gpt-4o\")\nresponse = client.create(\n    response_model=Response,\n    messages=[{\"role\": \"user\", \"content\": \"Tell me a joke about programming.\"}],\n)\n  ],\n)\n```\n\n```from typing import Union\nfrom pydantic import BaseModel\n\n\nclass Response(BaseModel):\n    \"\"\"A more complex example showing nested Union fields.\"\"\"\n\n    result: Union[str, int, float, bool]\n bool]\n```\n\n```from typing import Dict, List, Union, Any\nfrom pydantic import BaseModel\n\n\nclass Response(BaseModel):\n    \"\"\"A more complex example showing nested Union fields.\"\"\"\n\n    data: Dict[str, Union[str, int, List[Any]]]\nAny]]]\n```\n"
  },
  {
    "path": "docs/concepts/usage.md",
    "content": "---\ntitle: Handling Non-Streaming Requests in OpenAI with Usage Tracking\ndescription: Learn how to manage non-streaming requests in OpenAI, track token usage, and handle exceptions with Python.\n---\n\n## See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](./from_provider.md) - Detailed client configuration\n- [Response Models](./models.md) - Working with Pydantic models\n- [Raw Response](./raw_response.md) - Access original LLM responses\n\nThe easiest way to get usage for non streaming requests is to access the raw response.\n\n```python\nimport instructor\n\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser, completion = client.create_with_completion(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nprint(completion.usage)\n\"\"\"\nCompletionUsage(\n    completion_tokens=10,\n    prompt_tokens=79,\n    total_tokens=89,\n    completion_tokens_details=CompletionTokensDetails(\n        accepted_prediction_tokens=None,\n        audio_tokens=0,\n        reasoning_tokens=0,\n        rejected_prediction_tokens=None,\n    ),\n    prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0),\n)\n\"\"\"\n```\n\nYou can catch an IncompleteOutputException whenever the context length is exceeded and react accordingly, such as by trimming your prompt by the number of exceeding tokens.\n\n```python\nfrom instructor.core.exceptions import IncompleteOutputException\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\ntry:\n    client.create_with_completion(\n        response_model=UserExtract,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\nexcept IncompleteOutputException as e:\n    token_count = e.last_completion.usage.total_tokens  # type: ignore\n    # your logic here\n```\n"
  },
  {
    "path": "docs/concepts/validation.md",
    "content": "---\ntitle: Validation\ndescription: Learn how to validate LLM outputs with Pydantic for type safety and data consistency.\n---\n\n# Validation\n\nInstructor uses Pydantic for validation, providing type checking, data coercion, custom validators, and field constraints.\n\n## Validation Flow\n\n```mermaid\nflowchart TD\n    A[Define Pydantic Model] --> B[Send Request to LLM]\n    B --> C[LLM Generates Response]\n    C --> D{Validate Response}\n\n    D -->|Valid| E[Return Pydantic Object]\n    D -->|Invalid| F{Auto-Retry Enabled?}\n\n    F -->|Yes| G[Send Error Context to LLM]\n    F -->|No| H[Raise ValidationError]\n\n    G --> I[LLM Generates New Response]\n    I --> J{Validate Again}\n\n    J -->|Valid| E\n    J -->|Invalid| K{Max Retries Reached?}\n\n    K -->|No| G\n    K -->|Yes| H\n```\n\n## Basic Validation\n\nDefine models with type hints and field constraints:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field, field_validator\n\n\nclass User(BaseModel):\n    name: str = Field(..., min_length=2, description=\"User's full name\")\n    age: int = Field(..., ge=0, le=150, description=\"User's age\")\n    emails: List[str] = Field(description=\"List of email addresses\")\n\n    @field_validator('emails')\n    @classmethod\n    def validate_emails(cls, v):\n        if not all('@' in email for email in v):\n            raise ValueError('Invalid email format')\n        return v\n```\n\n## Field Validation\n\nUse `Field()` for basic constraints:\n\n```python\nfrom pydantic import BaseModel, Field\n\n\nclass Product(BaseModel):\n    name: str = Field(..., min_length=1, max_length=100)\n    price: float = Field(..., gt=0)\n    quantity: int = Field(..., ge=0)\n```\n\n## Custom Validators\n\nUse `@field_validator` for complex validation:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator\n\n\nclass Order(BaseModel):\n    items: list[str] = Field(description=\"List of item names\")\n    total: float = Field(description=\"Total order amount\")\n\n    @field_validator('total')\n    @classmethod\n    def validate_total(cls, v):\n        if v < 0:\n            raise ValueError('Total cannot be negative')\n        return v\n```\n\n## Pre-validation Transformation\n\nTransform data before validation:\n\n```python\nfrom pydantic import BaseModel, field_validator\n\n\nclass UserProfile(BaseModel):\n    username: str\n\n    @field_validator('username', mode='before')\n    @classmethod\n    def lowercase_username(cls, v):\n        return v.lower() if isinstance(v, str) else v\n```\n\n## Semantic Validation\n\nUse `llm_validator` for validations that are hard to express programmatically:\n\n```python\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nimport instructor\nfrom instructor import llm_validator\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass ContentReview(BaseModel):\n    title: str\n    content: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"Content must be family-friendly and not contain profanity\",\n                client=client,\n            )\n        ),\n    ]\n```\n\nSemantic validation works well for content moderation, tone validation, consistency checks, and complex relationships. For more patterns and details, see the [Semantic Validation](./semantic_validation.md) guide.\n\n## Nested Validation\n\nValidate nested structures:\n\n```python\nfrom pydantic import BaseModel, Field\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    addresses: list[Address] = Field(description=\"User's addresses\")\n```\n\n## Error Handling\n\nHandle validation failures with appropriate error types:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field, field_validator\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n    @field_validator('age')\n    @classmethod\n    def validate_age(cls, v):\n        if v < 0:\n            raise ValueError(\"Age cannot be negative\")\n        return v\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\ntry:\n    user = client.create(\n        response_model=User,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: John, age: -5\"},\n        ],\n    )\nexcept instructor.exceptions.InstructorValidationError as e:\n    print(f\"Validation error: {e}\")\n```\n\n## Best Practices\n\n1. **Start simple**: Begin with basic type validation before adding complex rules\n2. **Use type hints**: Always specify types for clarity\n3. **Document constraints**: Add descriptions to Field() definitions\n4. **Choose the right validation type**: Rule-based for objective criteria, semantic for subjective\n5. **Handle errors**: Implement proper error handling for validation failures\n6. **Consider costs**: Semantic validation with LLMs incurs API costs and latency\n\n## See Also\n\n- [Semantic Validation](./semantic_validation.md) - LLM-based validation patterns\n- [Reask Validation](./reask_validation.md) - Automatic retry with validation feedback\n- [Retrying](./retrying.md) - Configure retry behavior\n- [Error Handling](./error_handling.md) - Handle validation failures\n"
  },
  {
    "path": "docs/contributing.md",
    "content": "---\ntitle: Contribute to Instructor: Evals, Issues, and Pull Requests\ndescription: Join us in enhancing the Instructor library with evals, report issues, and submit pull requests on GitHub. Collaborate and contribute!\n---\n\n# Contributing to Instructor\n\nWe welcome contributions to Instructor! This page covers the different ways you can help improve the library.\n\n## Ways to Contribute\n\n### Evaluation Tests (Evals)\n\nEvals help us monitor the quality of both the OpenAI models and the Instructor library. To contribute:\n\n1. **Explore Existing Evals**: Check out [our evals directory](https://github.com/instructor-ai/instructor/tree/main/tests/llm/test_openai/evals)\n2. **Create a New Eval**: Add new pytest tests that evaluate specific capabilities or edge cases\n3. **Follow the Pattern**: Structure your eval similar to existing ones\n4. **Submit a PR**: We'll review and incorporate your eval\n\nEvals are run weekly, and results are tracked to monitor performance over time.\n\n### Reporting Issues\n\nIf you encounter a bug or problem, please [file an issue on GitHub](https://github.com/instructor-ai/instructor/issues) with:\n\n1. A clear, descriptive title\n2. Detailed information including:\n   - The `response_model` you're using\n   - The `messages` you sent\n   - The `model` you're using\n   - Steps to reproduce the issue\n   - Expected vs. actual behavior\n   - Your environment details (Python version, OS, package versions)\n\n### Contributing Code\n\nWe welcome pull requests! Here's the process:\n\n1. **For Small Changes**: Feel free to submit a PR directly\n2. **For Larger Changes**: [Start with an issue](https://github.com/instructor-ai/instructor/issues) to discuss approach\n3. **Looking for Ideas?** Check issues labeled [help wanted](https://github.com/instructor-ai/instructor/labels/help%20wanted) or [good first issue](https://github.com/instructor-ai/instructor/labels/good%20first%20issue)\n\n## Setting Up Your Development Environment\n\n### Using UV (Recommended)\n\nUV is a fast Python package installer and resolver that makes development easier.\n\n1. **Install UV** (official method):\n   ```bash\n   # macOS/Linux\n   curl -LsSf https://astral.sh/uv/install.sh | sh\n\n   # Windows PowerShell\n   powershell -ExecutionPolicy ByPass -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n   ```\n\n2. **Install Project in Development Mode**:\n   ```bash\n   # Clone the repository\n   git clone https://github.com/YOUR-USERNAME/instructor.git\n   cd instructor\n\n   # Install with development dependencies\n   uv pip install -e \".[dev,docs]\"\n   ```\n\n3. **Adding New Dependencies**:\n   ```bash\n   # Add a regular dependency\n   uv pip install some-package\n\n   # Install a specific version\n   uv pip install \"some-package>=1.0.0,<2.0.0\"\n   ```\n\n4. **Common UV Commands**:\n   ```bash\n   # Update UV itself\n   uv self update\n\n   # Create a requirements file\n   uv pip freeze > requirements.txt\n   ```\n\n### Using Poetry\n\nPoetry provides comprehensive dependency management and packaging.\n\n1. **Install Poetry**:\n   ```bash\n   curl -sSL https://install.python-poetry.org | python3 -\n   ```\n\n2. **Install Dependencies**:\n   ```bash\n   # Clone the repository\n   git clone https://github.com/YOUR-USERNAME/instructor.git\n   cd instructor\n\n   # Install with development dependencies\n   poetry install --with dev,docs\n   ```\n\n3. **Working with Poetry**:\n   ```bash\n   # Activate virtual environment\n   poetry shell\n\n   # Run a command in the virtual environment\n   poetry run pytest\n\n   # Add a dependency\n   poetry add package-name\n\n   # Add a development dependency\n   poetry add --group dev package-name\n   ```\n\n## Adding Support for New LLM Providers\n\nInstructor uses optional dependencies to support different LLM providers. Provider-specific utilities live in the `instructor/utils` directory. To add a new provider:\n\n1. **Add Dependencies to pyproject.toml**:\n   ```toml\n   [project.optional-dependencies]\n   # Add your provider\n   my-provider = [\"my-provider-sdk>=1.0.0,<2.0.0\"]\n\n   [dependency-groups]\n   # Mirror in dependency groups\n   my-provider = [\"my-provider-sdk>=1.0.0,<2.0.0\"]\n   ```\n\n2. **Create Provider Client**:\n   - Create a new file at `instructor/clients/client_myprovider.py`\n   - Implement `from_myprovider` function that patches the provider's client\n\n3. **Add Tests**: Create tests in `tests/llm/test_myprovider/`\n\n4. **Document Installation**:\n   ```bash\n   # Installation command for your provider\n   uv pip install \"instructor[my-provider]\"\n   # or with poetry\n   poetry install --with my-provider\n   ```\n\n5. **Create Provider Utilities and Handlers**:\n   - Add `instructor/utils/myprovider.py` with `reask` and `handle_*` helpers\n   - Define `MYPROVIDER_HANDLERS` mapping `Mode` values to these functions\n\n6. **Register the Provider**:\n   - Update `instructor/utils/providers.py` with your provider enum value\n   - Extend `get_provider` detection for your base URL\n\n7. **Update `process_response.py`**:\n   - Import your handlers and add them to `mode_handlers`\n   - This script uses the handlers to prepare kwargs and parse results\n\n8. **Write Documentation**:\n   - Add a new markdown file in `docs/integrations/` for your provider\n   - Update `mkdocs.yml` to include your new page\n   - Make sure to include a complete example\n\n## Development Workflow\n\n1. **Fork the Repository**: Create your own fork of the project\n2. **Clone and Set Up**:\n   ```bash\n   git clone https://github.com/YOUR-USERNAME/instructor.git\n   cd instructor\n   git remote add upstream https://github.com/instructor-ai/instructor.git\n   ```\n3. **Create a Branch**:\n   ```bash\n   git checkout -b feature/your-feature-name\n   ```\n4. **Make Changes, Test, and Commit**:\n   ```bash\n   # Run tests\n   pytest tests/ -k 'not llm and not openai'  # Skip LLM tests for faster local dev\n\n   # Commit changes\n   git add .\n   git commit -m \"Your descriptive commit message\"\n   ```\n5. **Keep Updated and Push**:\n   ```bash\n   git fetch upstream\n   git rebase upstream/main\n   git push origin feature/your-feature-name\n   ```\n6. **Create a Pull Request**: Submit your PR with a clear description of changes\n\n## Utility Scripts\n\nThe `scripts/` directory contains utility scripts that help maintain code quality and documentation. These scripts are integrated into pre-commit hooks and can also be run manually.\n\n### Available Scripts\n\n#### `make_clean.py` - Markdown File Cleaner\nCleans markdown files by removing special whitespace characters and replacing em dashes with regular dashes.\n\n```bash\n# Clean all markdown files\npython scripts/make_clean.py\n\n# Preview changes without modifying files\npython scripts/make_clean.py --dry-run\n```\n\n#### `check_blog_excerpts.py` - Blog Post Excerpt Validator\nEnsures all blog posts contain the `<!-- more -->` tag for proper excerpt handling.\n\n```bash\n# Check all blog posts\npython scripts/check_blog_excerpts.py\n```\n\n#### `make_sitemap.py` - Enhanced Documentation Sitemap Generator\nGenerates an enhanced sitemap (`sitemap.yaml`) with AI-powered content analysis and cross-link suggestions.\n\n```bash\n# Generate sitemap with default settings\npython scripts/make_sitemap.py\n\n# Customize settings\npython scripts/make_sitemap.py \\\n  --root-dir docs \\\n  --output-file sitemap.yaml \\\n  --max-concurrency 10\n```\n\n**Requirements for sitemap generation**:\n- OpenAI API key (set as `OPENAI_API_KEY` environment variable)\n- Additional dependencies: `openai`, `typer`, `rich`, `tenacity`, `pyyaml`\n\n### Pre-commit Integration\n\nThese scripts run automatically during the commit process:\n\n- **Markdown cleaning**: Runs on commits with markdown files in `docs/`\n- **Blog excerpt validation**: Runs on commits with blog post files\n\n### Manual Usage\n\nYou can run scripts manually for testing or one-time operations:\n\n```bash\n# Test markdown cleaning\npython scripts/make_clean.py --dry-run\n\n# Check blog excerpts\npython scripts/check_blog_excerpts.py\n\n# Generate fresh sitemap\npython scripts/make_sitemap.py\n```\n\nFor detailed documentation on each script, see the `scripts/README.md` file in the project repository.\n\n## Using Cursor to Build PRs\n\n[Cursor](https://cursor.sh) is an AI-powered code editor that can help you contribute to Instructor.\n\n1. **Getting Started with Cursor**:\n   - Download Cursor from [cursor.sh](https://cursor.sh)\n   - Open the Instructor project in Cursor\n   - Cursor will automatically detect our rules in `.cursor/rules/`\n\n2. **Using Cursor Rules**:\n   - `new-features-planning`: Helps plan and structure new features\n   - `simple-language`: Guidelines for writing clear documentation\n   - `documentation-sync`: Ensures documentation stays in sync with code changes\n\n3. **Creating PRs with Cursor**:\n   - Use Cursor's Git integration to create a new branch\n   - Make your changes with AI assistance\n   - Create a PR with:\n     ```bash\n     # Use GitHub CLI to create the PR\n     gh pr create -t \"Your feature title\" -b \"Description of your changes\" -r jxnl,ivanleomk\n     ```\n   - Add `This PR was written by [Cursor](https://cursor.sh)` to your PR description\n\n4. **Benefits of Using Cursor**:\n   - AI helps generate code that follows our style guidelines\n   - Simplifies PR creation process\n   - Helps maintain documentation standards\n\n## Code Style Guidelines\n\nWe use the following tools to maintain code quality:\n\n- **Ruff**: For linting and formatting\n- **ty**: For type checking\n- **Pre-commit**: For automatic checks before committing\n\n```bash\n# Install pre-commit hooks\npip install pre-commit\npre-commit install\n```\n\nKey style guidelines:\n- Use strict typing\n- Follow import order: standard lib → third-party → local\n- Use snake_case for functions/variables, PascalCase for classes\n- Write comprehensive docstrings for public API functions\n\n### Conventional Comments\n\nWhen reviewing code or writing commit messages, we use conventional comments to make feedback clearer:\n\n```\n<label>: <subject>\n\n<description>\n```\n\nCommon labels:\n- **praise:** highlights something positive\n- **suggestion:** proposes a change or improvement\n- **question:** asks for clarification\n- **issue:** points out a problem that needs fixing\n- **todo:** notes something to be addressed later\n- **fix:** resolves an issue\n\nExamples:\n\n```\nsuggestion: use a validator for this field\nThis would ensure the value is always properly formatted.\n\nquestion: why not use async processing here?\nI'm curious if this would improve performance.\n\nfix: correct the parameter type\nIt should be an OpenAI client instance, not a string.\n```\n\nThis format helps everyone understand the purpose and importance of each comment. Visit [conventionalcomments.org](https://conventionalcomments.org/) to learn more.\n\n### Conventional Commits\n\nWe use conventional commit messages to make our project history clear and generate automated changelogs. A conventional commit has this structure:\n\n```\n<type>[optional scope]: <description>\n\n[optional body]\n\n[optional footer]\n```\n\n#### Common Types\n\n- **feat**: New feature\n- **fix**: Bug fix\n- **docs**: Documentation changes\n- **style**: Formatting changes\n- **refactor**: Code change that neither fixes a bug nor adds a feature\n- **test**: Adding or fixing tests\n- **chore**: Maintenance tasks\n\n#### Examples\n\n```\nfeat(openai): add streaming response support\n\nfix(anthropic): resolve tool calling response format\n\ndocs: update installation instructions\n\ntest(evals): add new recursive schema test cases\n```\n\nFor breaking changes, add an exclamation mark before the colon:\n\n```\nfeat(api)!: change return type of from_openai function\n```\n\nUsing conventional commits helps automatically generate release notes and makes the project history easier to navigate.\n\nFor more details, see the [Conventional Commits specification](https://www.conventionalcommits.org/).\n\n## Documentation Contributions\n\nDocumentation improvements are highly valued:\n\n1. **Docs Structure**: All documentation is in Markdown in the `docs/` directory\n2. **Adding New Pages**: When adding a new page, include it in `mkdocs.yml` in the right section\n3. **Local Preview**: Run `mkdocs serve` to preview changes locally\n4. **Style Guidelines**:\n   - Write at a grade 10 reading level (simple, clear language)\n   - Include working code examples\n   - Add links to related documentation\n   - Use consistent formatting\n   - Make sure each code example is complete with imports\n\nExample of a good documentation code block:\n\n```python\n# Complete example with imports\nimport instructor\nfrom pydantic import BaseModel\n# Define your model\nclass Person(BaseModel):\n    name: str\n    age: int\n\n# Create the patched client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n# Use the model\nperson = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: John Doe is 25 years old\"}\n    ]\n)\n\nprint(person.name)  # \"John Doe\"\nprint(person.age)   # 25\n```\n\n## Contributors\n\n<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->\n<!-- prettier-ignore-start -->\n<!-- markdownlint-disable -->\n\n<!-- markdownlint-restore -->\n<!-- prettier-ignore-end -->\n\n<!-- ALL-CONTRIBUTORS-LIST:END -->\n\n<a href=\"https://github.com/instructor-ai/instructor/graphs/contributors\">\n  <img src=\"https://contrib.rocks/image?repo=jxnl/instructor\" />\n</a>\n\n## Documentation Resources\n\nWhen working on documentation, these resources may be helpful:\n\n- **mkdocs serve**: Preview documentation locally. Install dependencies from `requirements-doc.txt` first.\n\n- **hl_lines in Code Blocks**: Highlight specific lines in a code block to draw attention:\n  ````markdown\n  ```python hl_lines=\"2 3\"\n  def example():\n      # This line is highlighted\n      # This line is also highlighted\n      return \"normal line\"\n  ```\n  ````\n\n- **Admonitions**: Create styled callout boxes for important information:\n  ```markdown\n  !!! note \"Optional Title\"\n      This is a note admonition.\n\n  !!! warning\n      This is a warning.\n  ```\n\nFor more documentation features, check the [MkDocs Material documentation](https://squidfunk.github.io/mkdocs-material/).\n\nThank you for your contributions to Instructor!\n"
  },
  {
    "path": "docs/debugging.md",
    "content": "---\ntitle: Debugging Instructor Applications\ndescription: Learn how to debug Instructor applications with hooks, logging, and exception handling. Practical techniques for inspecting inputs, outputs, and retries.\n---\n\n# Debugging\n\nThis guide shows how to quickly inspect inputs/outputs, capture retries, and reproduce failures when working with Instructor. It focuses on practical techniques using hooks, logging, and exception data.\n\n## Enable Logs\n\n### Quick Debug Mode (Recommended)\n\nThe fastest way to enable debug logging is with the `INSTRUCTOR_DEBUG` environment variable:\n\n```bash\nexport INSTRUCTOR_DEBUG=1\npython your_script.py\n```\n\nOr inline:\n```bash\nINSTRUCTOR_DEBUG=1 python your_script.py\n```\n\nThis automatically enables debug logging with correlation IDs for request tracing.\n\n### Manual Debug Configuration\n\nYou can also use the standard Python `logging` module for more control:\n\n```python\nimport logging\nlogging.basicConfig(level=logging.DEBUG)\nlogging.getLogger(\"instructor\").setLevel(logging.DEBUG)\n```\n\nYou will see messages for:\n- Raw responses (provider-specific objects)\n- Handler/mode selection\n- Retry attempts and parse errors\n- Reask adjustments to `messages`\n- **Correlation IDs** for tracing requests (format: `[a1b2c3d4]`)\n\nTip: Set a handler/formatter to include timestamps and module names.\n\n## Observe the Flow with Hooks\n\nHooks let you tap into key moments without modifying core code:\n\n```python\nfrom instructor.core.hooks import HookName\n\n# Attach one or more handlers\nclient.on(HookName.COMPLETION_KWARGS, lambda **kw: print(\"KWARGS:\", kw))\nclient.on(HookName.COMPLETION_RESPONSE, lambda resp: print(\"RESPONSE:\", type(resp)))\nclient.on(HookName.PARSE_ERROR, lambda e: print(\"PARSE ERROR:\", e))\nclient.on(HookName.COMPLETION_LAST_ATTEMPT, lambda e: print(\"LAST ATTEMPT:\", e))\nclient.on(HookName.COMPLETION_ERROR, lambda e: print(\"COMPLETION ERROR:\", e))\n```\n\nCommon uses:\n- Capture the final `kwargs` passed to the provider (including mode/tools/response_format).\n- Record raw responses (e.g., to logs or a file) for offline analysis.\n- Inspect parse errors and how reask modifies the next attempt.\n\nNote: Handlers that accept `**kwargs` (or a parameter named `_instructor_meta`) receive a metadata dict with:\n- `attempt_number`, `correlation_id`, `mode`, `response_model_name`.\nAdd `**kwargs` to your handler signature to access it:\n\n```python\nclient.on(HookName.COMPLETION_KWARGS, lambda **kw: print(kw.get(\"_instructor_meta\")))\n```\n\n## Inspect Raw Responses\n\nMost parsed models returned by Instructor carry the original provider response for debugging:\n\n```python\nmodel = client.create(...)\nraw = getattr(model, \"_raw_response\", None)\nprint(raw)\n```\n\nThis is useful for checking provider metadata like token usage, model version, and provider-specific fields.\n\n## Handling Failures & Retries\n\nWhen all retries are exhausted, an `InstructorRetryException` is raised. It includes detailed context:\n\n```python\nfrom instructor.core.exceptions import InstructorRetryException\n\ntry:\n    client.create(...)\nexcept InstructorRetryException as e:\n    print(\"Attempts:\", e.n_attempts)\n    print(\"Last completion:\", e.last_completion)\n    print(\"Create kwargs:\", e.create_kwargs)  # reproducible input\n    print(\"Failed attempts:\", e.failed_attempts)  # list of (attempt, exception, completion)\n    # If available, a compact trace packet to help debugging\n    if hasattr(e, \"trace_packet\") and e.trace_packet:\n        print(\"Trace packet:\", e.trace_packet)\n```\n\nUse `e.create_kwargs` and `e.failed_attempts` to craft a minimal reproduction.\n\n## Minimal Reproduction Template\n\n```python\nimport openai\nimport instructor\nfrom pydantic import BaseModel\n\nclass MyModel(BaseModel):\n    # fields...\n    pass\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\ncreate_kwargs = {\n    # paste from InstructorRetryException.create_kwargs\n}\n\ntry:\n    client.create(response_model=MyModel, **create_kwargs)\nexcept Exception as err:\n    # Inspect and iterate\n    raise\n```\n\nThis pattern captures the exact inputs that triggered a failure.\n\n## Strict vs Non-Strict Parsing\n\n- `strict=True` enforces exact schema matches and can surface schema drift early.\n- If providers sometimes return extra fields or slightly different types (e.g., floats for ints), try `strict=False` to validate non‑strictly.\n\n```python\nclient.create(..., response_model=MyModel, strict=True)\n```\n\n## Customizing Retries\n\nYou can pass an integer (attempt count) or a `tenacity` retrying object to control behavior:\n\n```python\nfrom tenacity import Retrying, stop_after_attempt, stop_after_delay\n\nmax_retries = Retrying(stop=stop_after_attempt(3) | stop_after_delay(10))\nclient.create(..., max_retries=max_retries)\n```\n\nThis is helpful when balancing latency and robustness.\n\n## Multimodal & Message Conversion\n\nIf you send images/audio/PDFs or text that may include media paths/URIs, Instructor can convert messages for provider formats.\n\n- For supported modes, `processing.multimodal.convert_messages` runs automatically.\n- If debugging content issues, log `messages` before and after conversion using the hooks above, and ensure media types/URIs are valid.\n\n## Caching Considerations\n\nIf you’re using a cache (`cache=...`), remember:\n- Successful parsed responses are stored; retrieving from cache skips the provider call.\n- If debugging live provider behavior, temporarily disable cache or change the cache key (e.g., tweak a message).\n\n```python\nmodel = client.create(..., cache=None)\n```\n\n## Common Troubleshooting Tips\n\n- Validate the `response_model.model_json_schema()` matches what you expect the provider to return.\n- Confirm `mode` is valid for your provider; mismatches can cause parsing failures.\n- Check provider‑side limits (max tokens/response length); incomplete outputs raise specific exceptions.\n- If using markdown JSON (`MD_JSON`), ensure the provider is actually returning a ```json code block.\n\nIf you need deeper visibility, add a custom handler to write kwargs/responses/errors to disk with a timestamp and correlation id.\n\n## Example: Local Debug Run\n\nYou can run a minimal, no‑network example that exercises hooks, logging, and parsing flow using a fake provider function:\n\n- File: `examples/debugging/run.py`\n- Run:\n\n```bash\npython examples/debugging/run.py\n```\n\nThis script:\n- Enables DEBUG logging for `instructor.*`\n- Patches a fake provider `create` with `instructor.patch(mode=Mode.JSON)`\n- Attaches hook handlers to print kwargs, response types, and parse errors\n- Parses a simple JSON payload into a Pydantic model and prints the result\n"
  },
  {
    "path": "docs/examples/action_items.md",
    "content": "---\ntitle: Automating Action Item Extraction from Meeting Transcripts\ndescription: Learn to extract actionable items from meeting transcripts using OpenAI's API and Pydantic for efficient project management.\n---\n\n# Extracting Action Items from Meeting Transcripts\n\nIn this guide, we'll walk through how to extract action items from meeting transcripts using OpenAI's API and Pydantic. This use case is essential for automating project management tasks, such as task assignment and priority setting.\n\nFor multi-label classification, we introduce a new enum class and a different Pydantic model to handle multiple labels.\n\n!!! tips \"Motivation\"\n\n    Significant amount of time is dedicated to meetings, where action items are generated as the actionable outcomes of these discussions. Automating the extraction of action items can save time and guarantee that no critical tasks are overlooked.\n\n## Defining the Structures\n\nWe'll model a meeting transcript as a collection of **`Ticket`** objects, each representing an action item. Every **`Ticket`** can have multiple **`Subtask`** objects, representing smaller, manageable pieces of the main task.\n\n## Extracting Action Items\n\nTo extract action items from a meeting transcript, we use the **`generate`** function. It calls OpenAI's API, processes the text, and returns a set of action items modeled as **`ActionItems`**.\n\n## Evaluation and Testing\n\nTo test the **`generate`** function, we provide it with a sample transcript, and then print the JSON representation of the extracted action items.\n\n```python\nimport instructor\nfrom typing import Iterable, List, Optional\nfrom enum import Enum\nfrom pydantic import BaseModel\n\n\nclass PriorityEnum(str, Enum):\n    high = \"High\"\n    medium = \"Medium\"\n    low = \"Low\"\n\n\nclass Subtask(BaseModel):\n    \"\"\"Correctly resolved subtask from the given transcript\"\"\"\n\n    id: int\n    name: str\n\n\nclass Ticket(BaseModel):\n    \"\"\"Correctly resolved ticket from the given transcript\"\"\"\n\n    id: int\n    name: str\n    description: str\n    priority: PriorityEnum\n    assignees: List[str]\n    subtasks: Optional[List[Subtask]]\n    dependencies: Optional[List[int]]\n\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate(data: str) -> Iterable[Ticket]:\n    return client.create(\n        model=\"gpt-4\",\n        response_model=Iterable[Ticket],\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"The following is a transcript of a meeting...\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Create the action items for the following transcript: {data}\",\n            },\n        ],\n    )\n\n\nprediction = generate(\n    \"\"\"\nAlice: Hey team, we have several critical tasks we need to tackle for the upcoming release. First, we need to work on improving the authentication system. It's a top priority.\nBob: Got it, Alice. I can take the lead on the authentication improvements. Are there any specific areas you want me to focus on?\nAlice: Good question, Bob. We need both a front-end revamp and back-end optimization. So basically, two sub-tasks.\nCarol: I can help with the front-end part of the authentication system.\nBob: Great, Carol. I'll handle the back-end optimization then.\nAlice: Perfect. Now, after the authentication system is improved, we have to integrate it with our new billing system. That's a medium priority task.\nCarol: Is the new billing system already in place?\nAlice: No, it's actually another task. So it's a dependency for the integration task. Bob, can you also handle the billing system?\nBob: Sure, but I'll need to complete the back-end optimization of the authentication system first, so it's dependent on that.\nAlice: Understood. Lastly, we also need to update our user documentation to reflect all these changes. It's a low-priority task but still important.\nCarol: I can take that on once the front-end changes for the authentication system are done. So, it would be dependent on that.\nAlice: Sounds like a plan. Let's get these tasks modeled out and get started.\"\"\"\n)\n```\n\n## Visualizing the tasks\n\nIn order to quickly visualize the data we used code interpreter to create a graphviz export of the json version of the ActionItems array.\n\n![Action items visualization showing extracted tasks with priorities and dependencies](../img/action_items.png)\n\n```json\n[\n  {\n    \"id\": 1,\n    \"name\": \"Improve Authentication System\",\n    \"description\": \"Revamp the front-end and optimize the back-end of the authentication system\",\n    \"priority\": \"High\",\n    \"assignees\": [\"Bob\", \"Carol\"],\n    \"subtasks\": [\n      {\n        \"id\": 2,\n        \"name\": \"Front-end Revamp\"\n      },\n      {\n        \"id\": 3,\n        \"name\": \"Back-end Optimization\"\n      }\n    ],\n    \"dependencies\": []\n  },\n  {\n    \"id\": 4,\n    \"name\": \"Integrate Authentication System with Billing System\",\n    \"description\": \"Integrate the improved authentication system with the new billing system\",\n    \"priority\": \"Medium\",\n    \"assignees\": [\"Bob\"],\n    \"subtasks\": [],\n    \"dependencies\": [1]\n  },\n  {\n    \"id\": 5,\n    \"name\": \"Update User Documentation\",\n    \"description\": \"Update the user documentation to reflect the changes in the authentication system\",\n    \"priority\": \"Low\",\n    \"assignees\": [\"Carol\"],\n    \"subtasks\": [],\n    \"dependencies\": [2]\n  }\n]\n```\n\nIn this example, the **`generate`** function successfully identifies and segments the action items, assigning them priorities, assignees, subtasks, and dependencies as discussed in the meeting.\n\nBy automating this process, you can ensure that important tasks and details are not lost in the sea of meeting minutes, making project management more efficient and effective.\n"
  },
  {
    "path": "docs/examples/audio_extraction.md",
    "content": "---\ntitle: Audio Information Extraction with OpenAI\ndescription: Learn how to extract structured information from audio files using OpenAI's audio capabilities and Instructor for type-safe data extraction.\n---\n\n# Audio Information Extraction with OpenAI\n\nThis example demonstrates how to use Instructor with OpenAI's audio capabilities to extract structured information from audio files. The example shows how to process audio input and extract specific fields into a Pydantic model.\n\n## Prerequisites\n\n- OpenAI API key with access to GPT-4 audio models\n- An audio file in WAV format\n- Instructor library installed with OpenAI support\n\n## Code Example\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor.processing.multimodal import Audio\nimport base64\n\n# Initialize the OpenAI client with Instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\n# Define the structure for extracted information\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\n# Read and encode the audio file\nwith open(\"./output.wav\", \"rb\") as f:\n    encoded_string = base64.b64encode(f.read()).decode(\"utf-8\")\n\n# Extract information from the audio\nresp = client.create(\n    model=\"gpt-4-audio-preview\",\n    response_model=Person,\n    modalities=[\"text\"],\n    audio={\"voice\": \"alloy\", \"format\": \"wav\"},\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract the following information from the audio\",\n                Audio.from_path(\"./output.wav\"),\n            ],\n        },\n    ],\n)\n\nprint(resp)\n# Example output: Person(name='Jason', age=20)\n```\n\n## How It Works\n\n1. First, we import the necessary libraries including the `Audio` class from `instructor.processing.multimodal`.\n\n2. We define a Pydantic model `Person` that specifies the structure of the information we want to extract from the audio:\n   - `name`: The person's name\n   - `age`: The person's age\n\n3. The audio file is read and encoded in base64 format.\n\n4. We use OpenAI's audio-capable model to process the audio and extract the specified information:\n   - The `model` parameter specifies the GPT-4 audio model\n   - `response_model` tells Instructor to structure the output according to our `Person` model\n   - `modalities` specifies that we want text output\n   - The `audio` parameter configures audio-specific settings\n   - In the message, we use `Audio.from_path()` to include the audio file\n\n5. The response is automatically parsed into our Pydantic model, making the extracted information easily accessible in a structured format.\n\n## Use Cases\n\nThis pattern is particularly useful for:\n\n- Transcribing and extracting information from recorded interviews\n- Processing voice messages or audio notes\n- Automated form filling from voice input\n- Voice-based data entry systems\n\n## Tips\n\n- Ensure your audio file is in a supported format (WAV in this example)\n- The audio model works best with clear speech and minimal background noise\n- Consider the length of the audio file, as there may be model-specific limitations\n- Structure your Pydantic model to match the information you expect to extract\n\n## Related Examples\n\n- [Multi-Modal Data with Gemini](multi_modal_gemini.md)\n- [Structured Outputs with OpenAI](../integrations/openai.md)"
  },
  {
    "path": "docs/examples/batch_classification_langsmith.md",
    "content": "---\ntitle: Enhancing OpenAI Client with LangSmith and Instructor\ndescription: Discover how to integrate LangSmith with the OpenAI client for improved observability and functionality using instructor.\n---\n\n# Seamless Support with Langsmith\n\nIts a common misconception that LangChain's [LangSmith](https://www.langchain.com/langsmith) is only compatible with LangChain's models. In reality, LangSmith is a unified DevOps platform for developing, collaborating, testing, deploying, and monitoring LLM applications. In this blog we will explore how LangSmith can be used to enhance the OpenAI client alongside `instructor`.\n\nFirst, install the necessary packages:\n\n```bash\npip install -U langsmith\n```\n\n## LangSmith\n\nIn order to use langsmith, you first need to set your LangSmith API key.\n\n```bash\nexport LANGCHAIN_API_KEY=<your-api-key>\n```\n\nNext, you will need to install the LangSmith SDK:\n\n```bash\npip install -U langsmith\npip install -U instructor\n```\n\nIn this example we'll use the `wrap_openai` function to wrap the OpenAI client with LangSmith. This will allow us to use LangSmith's observability and monitoring features with the OpenAI client. Then we'll use `instructor` to patch the client with the `TOOLS` mode. This will allow us to use `instructor` to add additional functionality to the client.\n\n```python\nimport instructor\nimport asyncio\n\nfrom langsmith import traceable\nfrom langsmith.wrappers import wrap_openai\n\nfrom openai import AsyncOpenAI\nfrom pydantic import BaseModel, Field, field_validator\nfrom typing import List\nfrom enum import Enum\n\n# Wrap the OpenAI client with LangSmith\nclient = wrap_openai(AsyncOpenAI())\n\n# Patch the client with instructor\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\n# Rate limit the number of requests\nsem = asyncio.Semaphore(5)\n\n\n# Use an Enum to define the types of questions\nclass QuestionType(Enum):\n    CONTACT = \"CONTACT\"\n    TIMELINE_QUERY = \"TIMELINE_QUERY\"\n    DOCUMENT_SEARCH = \"DOCUMENT_SEARCH\"\n    COMPARE_CONTRAST = \"COMPARE_CONTRAST\"\n    EMAIL = \"EMAIL\"\n    PHOTOS = \"PHOTOS\"\n    SUMMARY = \"SUMMARY\"\n\n\n# You can add more instructions and examples in the description\n# or you can put it in the prompt in `messages=[...]`\nclass QuestionClassification(BaseModel):\n    \"\"\"\n    Predict the type of question that is being asked.\n    Here are some tips on how to predict the question type:\n    CONTACT: Searches for some contact information.\n    TIMELINE_QUERY: \"When did something happen?\n    DOCUMENT_SEARCH: \"Find me a document\"\n    COMPARE_CONTRAST: \"Compare and contrast two things\"\n    EMAIL: \"Find me an email, search for an email\"\n    PHOTOS: \"Find me a photo, search for a photo\"\n    SUMMARY: \"Summarize a large amount of data\"\n    \"\"\"\n\n    # If you want only one classification, just change it to\n    #   `classification: QuestionType` rather than `classifications: List[QuestionType]``\n    chain_of_thought: str = Field(\n        ..., description=\"The chain of thought that led to the classification\"\n    )\n    classification: List[QuestionType] = Field(\n        description=f\"An accuracy and correct prediction predicted class of question. Only allowed types: {[t.value for t in QuestionType]}, should be used\",\n    )\n\n    @field_validator(\"classification\", mode=\"before\")\n    def validate_classification(cls, v):\n        # sometimes the API returns a single value, just make sure it's a list\n        if not isinstance(v, list):\n            v = [v]\n        return v\n\n\n@traceable(name=\"classify-question\")\nasync def classify(data: str) -> QuestionClassification:\n    \"\"\"\n    Perform multi-label classification on the input text.\n    Change the prompt to fit your use case.\n    Args:\n        data (str): The input text to classify.\n    \"\"\"\n    async with sem:  # some simple rate limiting\n        return data, await client.create(\n            model=\"gpt-4-turbo-preview\",\n            response_model=QuestionClassification,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Classify the following question: {data}\",\n                },\n            ],\n        )\n\n\nasync def main(questions: List[str]):\n    tasks = [classify(question) for question in questions]\n\n    for task in asyncio.as_completed(tasks):\n        question, label = await task\n        resp = {\n            \"question\": question,\n            \"classification\": [c.value for c in label.classification],\n            \"chain_of_thought\": label.chain_of_thought,\n        }\n        resps.append(resp)\n    return resps\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    questions = [\n        \"What was that ai app that i saw on the news the other day?\",\n        \"Can you find the trainline booking email?\",\n        \"what did I do on Monday?\",\n        \"Tell me about todays meeting and how it relates to the email on Monday\",\n    ]\n\n    resp = asyncio.run(main(questions))\n\n    for r in resp:\n        print(\"q:\", r[\"question\"])\n        #> q: what did I do on Monday?\n        print(\"c:\", r[\"classification\"])\n        #> c: ['SUMMARY']\n```\n\nIf you follow what we've done is wrapped the client and proceeded to quickly use asyncio to classify a list of questions. This is a simple example of how you can use LangSmith to enhance the OpenAI client. You can use LangSmith to monitor and observe the client, and use `instructor` to add additional functionality to the client.\n\nTo take a look at trace of this run check out this shareable [link](https://smith.langchain.com/public/eaae9f95-3779-4bbb-824d-97aa8a57a4e0/r).\n"
  },
  {
    "path": "docs/examples/batch_in_memory.md",
    "content": "---\ntitle: In-Memory Batch Processing for Serverless Applications\ndescription: Learn how to use Instructor's in-memory batch processing feature for serverless deployments without disk I/O.\n---\n\n## See Also\n\n- [Batch Processing](./batch_job_oai.md) - File-based batch processing\n- [Bulk Classification](./bulk_classification.md) - Process multiple classifications\n- [from_provider Guide](../concepts/from_provider.md#async-clients) - Async client setup\n- [Cost Optimization](./batch_job_oai.md) - Reduce API costs with batch processing\n\n# In-Memory Batch Processing for Serverless\n\nThis guide demonstrates how to use Instructor's in-memory batch processing feature, which is perfect for serverless deployments and applications that need to avoid disk I/O.\n\n## Overview\n\nIn-memory batch processing allows you to create and submit batch requests without writing to disk, using BytesIO buffers instead of files. This is ideal for:\n\n- **Serverless environments** (AWS Lambda, Google Cloud Functions, Azure Functions)\n- **Containerized applications** with read-only file systems\n- **Security-sensitive applications** that avoid temporary files\n- **High-performance applications** that minimize I/O overhead\n\n## Quick Start\n\n```python\nimport time\nfrom pydantic import BaseModel\nfrom instructor.batch.processor import BatchProcessor\n\n\nclass User(BaseModel):\n    \"\"\"User model for extraction.\"\"\"\n\n    name: str\n    age: int\n    email: str\n\n\ndef main():\n    # Initialize batch processor\n    processor = BatchProcessor(\"openai/gpt-5-nano\", User)\n\n    # Sample messages for batch processing\n    messages_list = [\n        [\n            {\"role\": \"system\", \"content\": \"Extract user information from the text.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"John Doe is 25 years old and his email is john@example.com\",\n            },\n        ],\n        [\n            {\"role\": \"system\", \"content\": \"Extract user information from the text.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Jane Smith, age 30, can be reached at jane.smith@company.com\",\n            },\n        ],\n        [\n            {\"role\": \"system\", \"content\": \"Extract user information from the text.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Bob Wilson (bob.wilson@email.com) is 28 years old\",\n            },\n        ],\n    ]\n\n    # Create batch in memory (no file_path specified)\n    batch_buffer = processor.create_batch_from_messages(\n        messages_list,\n        file_path=None,  # This triggers in-memory mode\n        max_tokens=150,\n        temperature=0.1,\n    )\n\n    print(f\"Created batch buffer: {type(batch_buffer)}\")\n    print(f\"Buffer size: {len(batch_buffer.getvalue())} bytes\")\n\n    # Submit the batch using the in-memory buffer\n    batch_id = processor.submit_batch(\n        batch_buffer, metadata={\"description\": \"In-memory batch example\"}\n    )\n\n    print(f\"Batch submitted successfully! Batch ID: {batch_id}\")\n\n    # Poll for completion\n    print(\"Waiting for batch to complete...\")\n    max_wait_time = 300  # 5 minutes max\n    start_time = time.time()\n\n    while time.time() - start_time < max_wait_time:\n        status = processor.get_batch_status(batch_id)\n        current_status = status.get(\"status\", \"unknown\")\n\n        print(f\"Current status: {current_status}\")\n\n        if current_status in [\"completed\", \"failed\", \"cancelled\", \"expired\"]:\n            break\n\n        time.sleep(10)\n\n    # Retrieve and process results\n    if status.get(\"status\") == \"completed\":\n        print(\"Batch completed! Retrieving results...\")\n\n        results = processor.get_results(batch_id)\n\n        successful_results = [r for r in results if hasattr(r, \"result\")]\n        error_results = [r for r in results if hasattr(r, \"error_message\")]\n\n        print(f\"Total results: {len(results)}\")\n        print(f\"Successful: {len(successful_results)}\")\n        print(f\"Errors: {len(error_results)}\")\n\n        # Show successful extractions\n        if successful_results:\n            print(\"\\nExtracted Users:\")\n            for result in successful_results:\n                user = result.result\n                print(f\"   - {user.name}, {user.age} years old, {user.email}\")\n\n        # Show any errors\n        if error_results:\n            print(\"\\nErrors encountered:\")\n            for error in error_results:\n                print(f\"   - {error.custom_id}: {error.error_message}\")\n\n\nif __name__ == \"__main__\":\n    main()\n```\n\n## File vs In-Memory Comparison\n\n### Traditional File-Based Approach\n\n```python\n# File-based approach\nprocessor = BatchProcessor(\"openai/gpt-5-nano\", User)\n\n# Creates file on disk\nfile_path = processor.create_batch_from_messages(\n    messages_list,\n    file_path=\"temp_batch.jsonl\",  # Specify file path\n    max_tokens=150,\n    temperature=0.1,\n)\n\n# Submit using file path\nbatch_id = processor.submit_batch(file_path)\n\n# Remember to clean up\nimport os\n\nif os.path.exists(file_path):\n    os.remove(file_path)\n```\n\n### New In-Memory Approach\n\n```python\n# In-memory approach\nprocessor = BatchProcessor(\"openai/gpt-5-nano\", User)\n\n# Creates BytesIO buffer in memory\nbuffer = processor.create_batch_from_messages(\n    messages_list,\n    file_path=None,  # No file path = in-memory\n    max_tokens=150,\n    temperature=0.1,\n)\n\n# Submit using buffer\nbatch_id = processor.submit_batch(buffer)\n\n# No cleanup required - buffer is automatically garbage collected\n```\n\n## Benefits of In-Memory Processing\n\n### ✅ Perfect for Serverless\n\n```python\n# AWS Lambda example\nimport json\n\n\ndef lambda_handler(event, context):\n    \"\"\"AWS Lambda function using in-memory batch processing.\"\"\"\n\n    # Extract data from event\n    messages_list = event.get(\"messages\", [])\n\n    # Process in memory - no disk I/O\n    processor = BatchProcessor(\"openai/gpt-5-nano\", User)\n    buffer = processor.create_batch_from_messages(\n        messages_list,\n        file_path=None,  # Essential for Lambda\n    )\n\n    batch_id = processor.submit_batch(buffer)\n\n    return {\n        'statusCode': 200,\n        'body': json.dumps(\n            {'batch_id': batch_id, 'message': 'Batch submitted successfully'}\n        ),\n    }\n```\n\n### ✅ Memory Efficient\n\n```python\n# Check buffer size before submission\nbuffer = processor.create_batch_from_messages(messages_list, file_path=None)\n\nprint(f\"Buffer size: {len(buffer.getvalue())} bytes\")\nprint(f\"Buffer type: {type(buffer)}\")\n\n# Buffer content is accessible\nbuffer.seek(0)\ncontent_preview = buffer.read(200).decode(\"utf-8\")\nprint(f\"Preview: {content_preview}...\")\n\n# Reset for submission\nbuffer.seek(0)\nbatch_id = processor.submit_batch(buffer)\n```\n\n### ✅ Security Benefits\n\n```python\n# No temporary files on disk\n# No file permissions to manage\n# No cleanup required\n# Buffer is automatically garbage collected\n\nprocessor = BatchProcessor(\"openai/gpt-5-nano\", User)\n\n# This approach leaves no trace on the file system\nbuffer = processor.create_batch_from_messages(\n    sensitive_messages,\n    file_path=None,  # Keeps everything in memory\n)\n\nbatch_id = processor.submit_batch(buffer)\n# When buffer goes out of scope, it's automatically cleaned up\n```\n\n## Error Handling\n\n```python\ntry:\n    # Create batch buffer\n    buffer = processor.create_batch_from_messages(\n        messages_list,\n        file_path=None,\n    )\n\n    # Submit batch\n    batch_id = processor.submit_batch(buffer)\n\n    # Process results\n    results = processor.get_results(batch_id)\n\nexcept Exception as e:\n    print(f\"Error during batch processing: {e}\")\n    #> Error during batch processing: name 'processor' is not defined\n    # No file cleanup needed with in-memory approach\n```\n\n## Provider Support\n\nAll providers support in-memory batch processing:\n\n### OpenAI\n```python\nprocessor = BatchProcessor(\"openai/gpt-5-nano\", User)\nbuffer = processor.create_batch_from_messages(messages_list, file_path=None)\nbatch_id = processor.submit_batch(buffer)\n```\n\n### Anthropic\n```python\nprocessor = BatchProcessor(\"anthropic/claude-3-5-sonnet-20241022\", User)\nbuffer = processor.create_batch_from_messages(messages_list, file_path=None)\nbatch_id = processor.submit_batch(buffer)\n```\n\n### Google GenAI\n```python\nprocessor = BatchProcessor(\"google/gemini-2.5-flash\", User)\nbuffer = processor.create_batch_from_messages(messages_list, file_path=None)\nbatch_id = processor.submit_batch(buffer)\n```\n\n## Best Practices\n\n1. **Always set `file_path=None`** to enable in-memory mode\n2. **Monitor buffer size** for large batches to avoid memory issues\n3. **Use appropriate models** that support JSON schema (e.g., gpt-4o-mini)\n4. **Handle errors gracefully** - no file cleanup needed\n5. **Consider memory limits** in serverless environments\n\n## Limitations\n\n- **Memory usage**: Large batches may consume significant memory\n- **No debugging files**: Can't inspect batch files for troubleshooting\n- **Temporary storage**: Buffer contents are lost if not submitted immediately\n\n## Troubleshooting\n\n### Buffer Size Issues\n```python\n# Check buffer size before submission\nbuffer = processor.create_batch_from_messages(messages_list, file_path=None)\nsize_mb = len(buffer.getvalue()) / (1024 * 1024)\nprint(f\"Buffer size: {size_mb:.2f} MB\")\n\nif size_mb > 100:  # Adjust threshold as needed\n    print(\"Warning: Large buffer size, consider splitting batch\")\n```\n\n### Memory Monitoring\n```python\nimport psutil\nimport os\n\n# Check memory usage\nprocess = psutil.Process(os.getpid())\nmemory_before = process.memory_info().rss / 1024 / 1024  # MB\n\nbuffer = processor.create_batch_from_messages(messages_list, file_path=None)\n\nmemory_after = process.memory_info().rss / 1024 / 1024  # MB\nprint(f\"Memory increase: {memory_after - memory_before:.2f} MB\")\n```\n\nThis in-memory approach makes Instructor's batch processing perfect for modern serverless and containerized applications while maintaining the same powerful API and provider support.\n"
  },
  {
    "path": "docs/examples/batch_job_oai.md",
    "content": "---\ntitle: Generating Synthetic Data with OpenAI's Batch API\ndescription: Learn to use OpenAI's Batch API for large-scale synthetic data generation, focusing on question-answer pairs from the ms-marco dataset.\n---\n\n## See Also\n\n- [In-Memory Batch Processing](./batch_in_memory.md) - Serverless batch processing without disk I/O\n- [Bulk Classification](./bulk_classification.md) - Process multiple classifications efficiently\n- [Cost Optimization](../examples/index.md#api-integration) - Reduce API costs\n- [from_provider Guide](../concepts/from_provider.md#async-clients) - Async client setup\n\n# Bulk Generation of Synthetic Data\n\nThis tutorial shows how to use `instructor` to generate large quantities of synthetic data at scale using Open AI's new Batch API. In this example, we'll be generating synthetic questions using the `ms-marco` dataset to evaluate RAG retrieval.\n\n??? tips \"Why use the batch API?\"\n\n    There are a few reasons why you might want to use the Batch API\n\n    1. Batch Jobs are 50% cheaper than running an inference job on demand ( see Open AI's pricing page [here](https://openai.com/api/pricing/) )\n\n    2. Batch Jobs have higher rate limits than normal api calls\n\n    3. Batch Jobs support both normal models **and fine-tuned models**\n\n    This makes them perfect for non time-sensitive tasks that involve large quantities of data.\n\n## Getting Started\n\nLet's first see how we can generate a Question and Answer Pair using Instructor with a normal OpenAI function call.\n\n```python\nfrom pydantic import BaseModel, Field\n\nclient = from_openai(OpenAI())\n\n\nclass QuestionAnswerPair(BaseModel):\n    \"\"\"\n    This model represents a pair of a question generated from a text chunk, its corresponding answer,\n    and the chain of thought leading to the answer. The chain of thought provides insight into how the answer\n    was derived from the question.\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        description=\"The reasoning process leading to the answer.\"\n    )\n    question: str = Field(description=\"The generated question from the text chunk.\")\n    answer: str = Field(description=\"The answer to the generated question.\")\n\n\ndef generate_question(chunk: str) -> QuestionAnswerPair:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world class AI that excels at generating hypothethical search queries. You're about to be given a text snippet and asked to generate a search query which is specific to the specific text chunk that you'll be given. Make sure to use information from the text chunk.\",\n            },\n            {\"role\": \"user\", \"content\": f\"Here is the text chunk: {chunk}\"},\n        ],\n        response_model=QuestionAnswerPair,\n    )\n\n\ntext_chunk = \"\"\"\nThe Reserve Bank of Australia (RBA) came into being on 14 January 1960 as Australia 's central bank and banknote issuing authority, when the Reserve Bank Act 1959 removed the central banking functions from the Commonwealth Bank. The assets of the bank include the gold and foreign exchange reserves of Australia, which is estimated to have a net worth of A$101 billion. Nearly 94% of the RBA's employees work at its headquarters in Sydney, New South Wales and at the Business Resumption Site.\n\"\"\"\nprint(generate_question(text_chunk).model_dump_json(indent=2))\n\"\"\"\n{\n  \"chain_of_thought\": \"The text discusses the formation of the Reserve Bank of Australia (RBA) and provides key details about its establishment date, the removal of central banking functions from the Commonwealth Bank, its asset worth, and its employee distribution. By focusing on these details, a search query can be framed around the establishment date and purpose of the RBA.\",\n  \"question\": \"When was the Reserve Bank of Australia established and what are its main functions?\",\n  \"answer\": \"The Reserve Bank of Australia was established on 14 January 1960 as Australia's central bank and banknote issuing authority.\"\n}\n\"\"\"\n```\n\nAs the number of chunks we'd like to generate these synthetic questions for increases, the cost will grow proportionally.\n\nLet's see how we can use the `BatchJob` object to create a `.jsonl` file which is compatible with the Batch API.\n\n```python hl_lines=\"9-18 35-40\"\nfrom datasets import load_dataset\nfrom instructor.batch import BatchJob\nfrom pydantic import BaseModel, Field\nfrom datasets import load_dataset\n\ndataset = load_dataset(\"ms_marco\", \"v1.1\", split=\"train\", streaming=True).take(200)\n\n\ndef get_messages(dataset):  # (1)!\n    for row in dataset:\n        for passage in row['passages']['passage_text']:\n            yield [\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a world class AI that excels at generating hypothethical search queries. You're about to be given a text snippet and asked to generate a search query which is specific to the specific text chunk that you'll be given. Make sure to use information from the text chunk.\",\n                },\n                {\"role\": \"user\", \"content\": f\"Here is the text chunk: {passage}\"},\n            ]\n\n\nclass QuestionAnswerPair(BaseModel):\n    \"\"\"\n    This model represents a pair of a question generated from a text chunk, its corresponding answer,\n    and the chain of thought leading to the answer. The chain of thought provides insight into how the answer\n    was derived from the question.\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        description=\"The reasoning process leading to the answer.\"\n    )\n    question: str = Field(description=\"The generated question from the text chunk.\")\n    answer: str = Field(description=\"The answer to the generated question.\")\n\n\nBatchJob.create_from_messages(\n    messages_batch=get_messages(dataset),\n    model=\"gpt-4o\",\n    file_path=\"./test.jsonl\",\n    response_model=QuestionAnswerPair,\n)  # (2)!\n```\n\n1.  We first define a generator which generates a list of messages which we would have made in a normal `openai` api call\n\n2.  We then use the `create_from_messages` class method to specify the model and response_model that we want. `instructor` will handle the generation of the openai schema behind the scenes as well as write the output to the file path you specify\n\nOnce we've got this new `.jsonl` file, we can then use the new `instructor` cli's `batch` command to create a new batch job.\n\n```bash\n> % ls -a | grep test.jsonl\ntest.jsonl\n\n> % instructor batch create-from-file --file-path test.jsonl\n```\n\nThis will create a table like what you see below. In my case, my batch job took around 6 minutes to complete and cost me $2.72 to run.\n\n| Batch ID                       | Created At          | Status      | Failed | Completed | Total |\n| ------------------------------ | ------------------- | ----------- | ------ | --------- | ----- |\n| batch_Z8XUudoweH43R9c4sr4wRYub | 2024-07-16 12:45:22 | in_progress | 0      | 483       | 1627  |\n\nOnce our batch job is complete, the status will change to `completed`.\n\n??? \"Cancelling A Job\"\n\n    If you'd like to cancel a batch job midway, you can do so too with the instructor `batch` cli command\n\n    ```bash\n    instructor batch cancel --batch-id <batch id here>\n    ```\n\nWe can then download the file generated by the batch job using the cli command\n\n```bash\ninstructor batch download-file --download-file-path output.jsonl --batch-id batch_Z8XUudoweH43R9c4sr4wRYub\n```\n\nThis will then create a `.jsonl` file with the generated content at the path that you specify.\n\n## Parsing the generated response\n\nWe can then parse the generated response by using the `.parse_from_file` command provided by the `BatchJob` class.\n\n```python hl_lines=\"19-21\"\nfrom instructor.batch import BatchJob\nfrom pydantic import BaseModel, Field\n\n# <%hide%>\nwith open(\"./output.jsonl\", \"w\") as f:\n    f.write('')\n# <%hide%>\n\n\nclass QuestionAnswerPair(BaseModel):\n    \"\"\"\n    This model represents a pair of a question generated from a text chunk, its corresponding answer,\n    and the chain of thought leading to the answer. The chain of thought provides insight into how the answer\n    was derived from the question.\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        description=\"The reasoning process leading to the answer.\"\n    )\n    question: str = Field(description=\"The generated question from the text chunk.\")\n    answer: str = Field(description=\"The answer to the generated question.\")\n\n\nparsed, unparsed = BatchJob.parse_from_file(  # (1)!\n    file_path=\"./output.jsonl\", response_model=QuestionAnswerPair\n)\n\nprint(len(parsed))\n#> 0\nprint(len(unparsed))\n#> 0\n\n# <%hide%>\nimport os\n\nif os.path.exists(\"./output.jsonl\"):\n    os.remove(\"./output.jsonl\")\n# <%hide%>\n```\n\n1.  We can then use a generic `Pydantic` schema to parse the generated function calls back\n\nThis will then return a list of two elements\n\n- `parsed` is a list of responses that have been succesfully parsed into the `QuestionAnswerPair` Base Model class\n- `unparsed` is a second list which contains responses which were not able to be parsed into the `QuestionAnswerPair` Base Model class\n"
  },
  {
    "path": "docs/examples/building_knowledge_graphs.md",
    "content": "---\ntitle: Building Knowledge Graphs from Text\ndescription: Learn to construct knowledge graphs from textual data using OpenAI's API and Pydantic in this comprehensive tutorial.\n---\n\n## See Also\n\n- [Knowledge Graph](./knowledge_graph.md) - Visualize knowledge graphs\n- [Entity Resolution](./entity_resolution.md) - Identify and resolve entities\n- [Document Segmentation](./document_segmentation.md) - Break down documents for analysis\n- [Nested Structures](../learning/patterns/nested_structure.md) - Complex hierarchical models\n\n# Building Knowledge Graphs from Textual Data\n\nIn this tutorial, we will explore the process of constructing knowledge graphs from textual data using OpenAI's API and Pydantic. This approach is crucial for efficiently automating the extraction of structured information from unstructured text.\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nimport instructor\n\n\nclass Node(BaseModel):\n    id: int\n    label: str\n    color: str = \"blue\"  # Default color set to blue\n\n\nclass Edge(BaseModel):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"  # Default color for edges\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: List[Node] = Field(default_factory=list)\n    edges: List[Edge] = Field(default_factory=list)\n\n\n# Patch the OpenAI client to add response_model support\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_graph(input_text: str) -> KnowledgeGraph:\n    \"\"\"Generates a knowledge graph from the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Help me understand the following by describing it as a detailed knowledge graph: {input_text}\",\n            }\n        ],\n        response_model=KnowledgeGraph,\n    )\n\n\nif __name__ == \"__main__\":\n    input_text = \"Jason is Sarah's friend and he is a doctor\"\n    graph = generate_graph(input_text)\n    print(graph.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"nodes\": [\n        {\n          \"id\": 1,\n          \"label\": \"Jason\",\n          \"color\": \"blue\"\n        },\n        {\n          \"id\": 2,\n          \"label\": \"Sarah\",\n          \"color\": \"blue\"\n        },\n        {\n          \"id\": 3,\n          \"label\": \"Doctor\",\n          \"color\": \"blue\"\n        }\n      ],\n      \"edges\": [\n        {\n          \"source\": 1,\n          \"target\": 2,\n          \"label\": \"is a friend of\",\n          \"color\": \"black\"\n        },\n        {\n          \"source\": 1,\n          \"target\": 3,\n          \"label\": \"is a\",\n          \"color\": \"black\"\n        }\n      ]\n    }\n    \"\"\"\n```\n"
  },
  {
    "path": "docs/examples/bulk_classification.md",
    "content": "---\ntitle: User-Provided Tag Classification Tutorial\ndescription: Learn to classify user-provided tags effectively using async functions and FastAPI for parallel processing.\n---\n\n## See Also\n\n- [Batch Processing](./batch_job_oai.md) - Process large datasets efficiently\n- [Classification Examples](./classification.md) - More classification patterns\n- [FastAPI Integration](../integrations/index.md) - Building APIs with Instructor\n- [from_provider Guide](../concepts/from_provider.md#async-clients) - Async client setup\n\n# Bulk Classification from User-Provided Tags.\n\nThis tutorial shows how to do classification from user provided tags. This is valuable when you want to provide services that allow users to do some kind of classification.\n\n!!! tips \"Motivation\"\n\n    Imagine allowing the user to upload documents as part of a RAG application. Oftentimes, we might want to allow the user to specify an existing set of tags, give descriptions, and do the classification for them.\n\n## Defining the Structures\n\nOne of the easy things to do is to allow users to define a set of tags in some kind of schema and save that in a database. Here's an example of a schema that we might use:\n\n| tag_id | name     | instructions         |\n| ------ | -------- | -------------------- |\n| 0      | personal | Personal information |\n| 1      | phone    | Phone number         |\n| 2      | email    | Email address        |\n| 3      | address  | Address              |\n| 4      | Other    | Other information    |\n\n1. **tag_id** - The unique identifier for the tag.\n2. **name** - The name of the tag.\n3. **instructions** - A description of the tag, which can be used as a prompt to describe the tag.\n\n## Implementing the Classification\n\nIn order to do this we'll do a couple of things:\n\n0. We'll use the `instructor` library with async client support.\n1. Implement a `Tag` model that will be used to validate the tags from the context. (This will allow us to avoid hallucinating tags that are not in the context.)\n2. Helper models for the request and response.\n3. An async function to do the classification.\n4. A main function to run the classification using the `asyncio.gather` function to run the classification in parallel.\n\nIf you want to learn more about how to do bad computations, check out our post on AsyncIO [here](../blog/posts/learn-async.md).\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4o\", async_client=True)\n```\n\nFirst, we'll need to import all of our Pydantic and instructor code and use the AsyncOpenAI client. Then, we'll define the tag model along with the tag instructions to provide input and output.\n\nThis is very helpful because once we use something like FastAPI to create endpoints, the Pydantic functions will serve as multiple tools:\n\n1. A description for the developer\n2. Type hints for the IDE\n3. OpenAPI documentation for the FastAPI endpoint\n4. Schema and Response Model for the language model.\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, model_validator\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: List[Tag] = context.get(\"tags\")\n            assert self.id in {\n                tag.id for tag in tags\n            }, f\"Tag ID {self.id} not found in context\"\n            assert self.name in {\n                tag.name for tag in tags\n            }, f\"Tag name {self.name} not found in context\"\n        return self\n\n\nclass TagWithInstructions(Tag):\n    instructions: str\n\n\nclass TagRequest(BaseModel):\n    texts: List[str]\n    tags: List[TagWithInstructions]\n\n\nclass TagResponse(BaseModel):\n    texts: List[str]\n    predictions: List[Tag]\n```\n\nLet's delve deeper into what the `validate_ids` function does. Notice that its purpose is to extract tags from the context and ensure that each ID and name exists in the set of tags. This approach helps minimize hallucinations. If we mistakenly identify either the ID or the tag, an error will be thrown, and the instructor will prompt the language model to retry until the correct item is successfully extracted.\n\n```python\nfrom pydantic import model_validator, ValidationInfo\n\n\n@model_validator(mode=\"after\")\ndef validate_ids(self, info: ValidationInfo):\n    context = info.context\n    if context:\n        tags: List[Tag] = context.get(\"tags\")\n        assert self.id in {\n            tag.id for tag in tags\n        }, f\"Tag ID {self.id} not found in context\"\n        assert self.name in {\n            tag.name for tag in tags\n        }, f\"Tag name {self.name} not found in context\"\n    return self\n```\n\nNow, let's implement the function to do the classification. This function will take a single text and a list of tags and return the predicted tag.\n\n```python\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, model_validator\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: List[Tag] = context.get(\"tags\")\n            assert self.id in {\n                tag.id for tag in tags\n            }, f\"Tag ID {self.id} not found in context\"\n            assert self.name in {\n                tag.name for tag in tags\n            }, f\"Tag name {self.name} not found in context\"\n        return self\n\n\nclass TagWithInstructions(Tag):\n    instructions: str\n\n\nclass TagRequest(BaseModel):\n    texts: List[str]\n    tags: List[TagWithInstructions]\n\n\nclass TagResponse(BaseModel):\n    texts: List[str]\n    predictions: List[Tag]\n\n\n# <%hide%>\nasync def tag_single_request(text: str, tags: List[Tag]) -> Tag:\n    allowed_tags = [(tag.id, tag.name) for tag in tags]\n    allowed_tags_str = \", \".join([f\"`{tag}`\" for tag in allowed_tags])\n\n    return await client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world-class text tagging system.\",\n            },\n            {\"role\": \"user\", \"content\": f\"Describe the following text: `{text}`\"},\n            {\n                \"role\": \"user\",\n                \"content\": f\"Here are the allowed tags: {allowed_tags_str}\",\n            },\n        ],\n        response_model=Tag,  # Minimizes the hallucination of tags that are not in the allowed tags.\n        context={\"tags\": tags},\n    )\n\n\nasync def tag_request(request: TagRequest) -> TagResponse:\n    predictions = await asyncio.gather(\n        *[tag_single_request(text, request.tags) for text in request.texts]\n    )\n    return TagResponse(\n        texts=request.texts,\n        predictions=predictions,\n    )\n```\n\nNotice that we first define a single async function that makes a prediction of a tag, and we pass it into the validation context in order to minimize hallucinations.\n\nFinally, we'll implement the main function to run the classification using the `asyncio.gather` function to run the classification in parallel.\n\n```python\nimport asyncio\n\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, model_validator\nimport instructor\nimport asyncio\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", async_client=True)\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: List[Tag] = context.get(\"tags\")\n            assert self.id in {\n                tag.id for tag in tags\n            }, f\"Tag ID {self.id} not found in context\"\n            assert self.name in {\n                tag.name for tag in tags\n            }, f\"Tag name {self.name} not found in context\"\n        return self\n\n\nclass TagWithInstructions(Tag):\n    instructions: str\n\n\nclass TagRequest(BaseModel):\n    texts: List[str]\n    tags: List[TagWithInstructions]\n\n\nclass TagResponse(BaseModel):\n    texts: List[str]\n    predictions: List[Tag]\n\n\nasync def tag_single_request(text: str, tags: List[Tag]) -> Tag:\n    allowed_tags = [(tag.id, tag.name) for tag in tags]\n    allowed_tags_str = \", \".join([f\"`{tag}`\" for tag in allowed_tags])\n\n    return await client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world-class text tagging system.\",\n            },\n            {\"role\": \"user\", \"content\": f\"Describe the following text: `{text}`\"},\n            {\n                \"role\": \"user\",\n                \"content\": f\"Here are the allowed tags: {allowed_tags_str}\",\n            },\n        ],\n        response_model=Tag,  # Minimizes the hallucination of tags that are not in the allowed tags.\n        context={\"tags\": tags},\n    )\n\n\nasync def tag_request(request: TagRequest) -> TagResponse:\n    predictions = await asyncio.gather(\n        *[tag_single_request(text, request.tags) for text in request.texts]\n    )\n    return TagResponse(\n        texts=request.texts,\n        predictions=predictions,\n    )\n\n\n# <%hide%>\ntags = [\n    TagWithInstructions(id=0, name=\"personal\", instructions=\"Personal information\"),\n    TagWithInstructions(id=1, name=\"phone\", instructions=\"Phone number\"),\n    TagWithInstructions(id=2, name=\"email\", instructions=\"Email address\"),\n    TagWithInstructions(id=3, name=\"address\", instructions=\"Address\"),\n    TagWithInstructions(id=4, name=\"Other\", instructions=\"Other information\"),\n]\n\n# Texts will be a range of different questions.\n# Such as \"How much does it cost?\", \"What is your privacy policy?\", etc.\ntexts = [\n    \"What is your phone number?\",\n    \"What is your email address?\",\n    \"What is your address?\",\n    \"What is your privacy policy?\",\n]\n\n# The request will contain the texts and the tags.\nrequest = TagRequest(texts=texts, tags=tags)\n\n# The response will contain the texts, the predicted tags, and the confidence.\nresponse = asyncio.run(tag_request(request))\nprint(response.model_dump_json(indent=2))\n\"\"\"\n{\n  \"texts\": [\n    \"What is your phone number?\",\n    \"What is your email address?\",\n    \"What is your address?\",\n    \"What is your privacy policy?\"\n  ],\n  \"predictions\": [\n    {\n      \"id\": 1,\n      \"name\": \"phone\"\n    },\n    {\n      \"id\": 2,\n      \"name\": \"email\"\n    },\n    {\n      \"id\": 3,\n      \"name\": \"address\"\n    },\n    {\n      \"id\": 4,\n      \"name\": \"Other\"\n    }\n  ]\n}\n\"\"\"\n```\n\nWhich would result in:\n\n```json\n{\n  \"texts\": [\n    \"What is your phone number?\",\n    \"What is your email address?\",\n    \"What is your address?\",\n    \"What is your privacy policy?\"\n  ],\n  \"predictions\": [\n    {\n      \"id\": 1,\n      \"name\": \"phone\"\n    },\n    {\n      \"id\": 2,\n      \"name\": \"email\"\n    },\n    {\n      \"id\": 3,\n      \"name\": \"address\"\n    },\n    {\n      \"id\": 4,\n      \"name\": \"Other\"\n    }\n  ]\n}\n```\n\n## What happens in production?\n\nIf we were to use this in production, we might expect to have some kind of fast API endpoint.\n\n```python\nfrom fastapi import FastAPI\n\napp = FastAPI()\n\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, model_validator\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: List[Tag] = context.get(\"tags\")\n            assert self.id in {\n                tag.id for tag in tags\n            }, f\"Tag ID {self.id} not found in context\"\n            assert self.name in {\n                tag.name for tag in tags\n            }, f\"Tag name {self.name} not found in context\"\n        return self\n\n\nclass TagWithInstructions(Tag):\n    instructions: str\n\n\nclass TagRequest(BaseModel):\n    texts: List[str]\n    tags: List[TagWithInstructions]\n\n\nclass TagResponse(BaseModel):\n    texts: List[str]\n    predictions: List[Tag]\n\n\n# <%hide%>\n@app.post(\"/tag\", response_model=TagResponse)\nasync def tag(request: TagRequest) -> TagResponse:\n    return await tag_request(request)\n```\n\nSince everything is already annotated with Pydantic, this code is very simple to write!\n\n!!! warning \"Where do tags come from?\"\n\n    I just want to call out that here you can also imagine the tag spec IDs and names and instructions for example could come from a database or somewhere else. I'll leave this as an exercise to the reader, but I hope this gives us a clear understanding of how we can do something like user-defined classification.\n\n## Improving the Model\n\nThere's a couple things we could do to make this system a little bit more robust.\n\n1. Use confidence score:\n\n```python\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, model_validator, Field\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: List[Tag] = context.get(\"tags\")\n            assert self.id in {\n                tag.id for tag in tags\n            }, f\"Tag ID {self.id} not found in context\"\n            assert self.name in {\n                tag.name for tag in tags\n            }, f\"Tag name {self.name} not found in context\"\n        return self\n\n\n# <%hide%>\nclass TagWithConfidence(Tag):\n    confidence: float = Field(\n        ...,\n        ge=0,\n        le=1,\n        description=\"The confidence of the prediction, 0 is low, 1 is high\",\n    )\n```\n\n2. Use multiclass classification:\n\nNotice in the example we use Iterable[Tag] vs Tag. This is because we might want to use a multiclass classification model that returns multiple tag!\n\n```python\nimport instructor\nimport instructor\nimport asyncio\nfrom typing import Iterable\n\nclient = instructor.from_openai(\n    openai.AsyncOpenAI(),\n)\n\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, ValidationInfo, model_validator\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: List[Tag] = context.get(\"tags\")\n            assert self.id in {\n                tag.id for tag in tags\n            }, f\"Tag ID {self.id} not found in context\"\n            assert self.name in {\n                tag.name for tag in tags\n            }, f\"Tag name {self.name} not found in context\"\n        return self\n\n\n# <%hide%>\ntags = [\n    Tag(id=0, name=\"personal\"),\n    Tag(id=1, name=\"phone\"),\n    Tag(id=2, name=\"email\"),\n    Tag(id=3, name=\"address\"),\n    Tag(id=4, name=\"Other\"),\n]\n\n# Texts will be a range of different questions.\n# Such as \"How much does it cost?\", \"What is your privacy policy?\", etc.\ntext = \"What is your phone number?\"\n\n\nasync def get_tags(text: List[str], tags: List[Tag]) -> List[Tag]:\n    allowed_tags = [(tag.id, tag.name) for tag in tags]\n    allowed_tags_str = \", \".join([f\"`{tag}`\" for tag in allowed_tags])\n\n    return await client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world-class text tagging system.\",\n            },\n            {\"role\": \"user\", \"content\": f\"Describe the following text: `{text}`\"},\n            {\n                \"role\": \"user\",\n                \"content\": f\"Here are the allowed tags: {allowed_tags_str}\",\n            },\n        ],\n        response_model=Iterable[Tag],\n        context={\"tags\": tags},\n    )\n\n\ntag_results = asyncio.run(get_tags(text, tags))\nfor tag in tag_results:\n    print(tag)\n    #> id=1 name='phone'\n"
  },
  {
    "path": "docs/examples/classification.md",
    "content": "---\ntitle: Text Classification with OpenAI and Pydantic\ndescription: Learn to implement single-label and multi-label text classification using OpenAI API and Pydantic models in Python.\n---\n\n# Text Classification using OpenAI and Pydantic\n\nThis tutorial showcases how to implement text classification tasks-specifically, single-label and multi-label classifications-using the OpenAI API and Pydantic models. For complete examples, check out our [single classification](./bulk_classification.md) and [multi-label classification](./bulk_classification.md) examples in the cookbook.\n\n!!! tips \"Motivation\"\n\n    Text classification is a common problem in many NLP applications, such as spam detection or support ticket categorization. The goal is to provide a systematic way to handle these cases using OpenAI's GPT models in combination with Python data structures.\n\n## Single-Label Classification\n\n### Defining the Structures\n\nFor single-label classification, we define a Pydantic model with a [Literal](../concepts/prompting.md#literals) field for the possible labels.\n\n!!! note \"Literals vs Enums\"\n\n    We prefer using `Literal` types over `enum` for classification labels. Literals provide better type checking and are more straightforward to use with Pydantic models.\n\n!!! important \"Few-Shot Examples\"\n\n    Including few-shot examples in the model's docstring is crucial for improving the model's classification accuracy. These examples guide the AI in understanding the task and expected outputs.\n\n    If you want to learn more prompting tips check out our [prompting guide](../prompting/index.md)\n\n!!! note \"Chain of Thought\"\n\n    Using [Chain of Thought](../concepts/prompting.md#chain-of-thought) has been shown to improve the quality of the predictions by ~ 10%\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ClassificationResponse(BaseModel):\n    \"\"\"\n    A few-shot example of text classification:\n\n    Examples:\n    - \"Buy cheap watches now!\": SPAM\n    - \"Meeting at 3 PM in the conference room\": NOT_SPAM\n    - \"You've won a free iPhone! Click here\": SPAM\n    - \"Can you pick up some milk on your way home?\": NOT_SPAM\n    - \"Increase your followers by 10000 overnight!\": SPAM\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        ...,\n        description=\"The chain of thought that led to the prediction.\",\n    )\n    label: Literal[\"SPAM\", \"NOT_SPAM\"] = Field(\n        ...,\n        description=\"The predicted class label.\",\n    )\n```\n\n### Classifying Text\n\nThe function **`classify`** will perform the single-label classification.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\nimport instructor\n\n\nclass ClassificationResponse(BaseModel):\n    \"\"\"\n    A few-shot example of text classification:\n\n    Examples:\n    - \"Buy cheap watches now!\": SPAM\n    - \"Meeting at 3 PM in the conference room\": NOT_SPAM\n    - \"You've won a free iPhone! Click here\": SPAM\n    - \"Can you pick up some milk on your way home?\": NOT_SPAM\n    - \"Increase your followers by 10000 overnight!\": SPAM\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        ...,\n        description=\"The chain of thought that led to the prediction.\",\n    )\n    label: Literal[\"SPAM\", \"NOT_SPAM\"] = Field(\n        ...,\n        description=\"The predicted class label.\",\n    )\n\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\n# <%hide%>\ndef classify(data: str) -> ClassificationResponse:\n    \"\"\"Perform single-label classification on the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=ClassificationResponse,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: <text>{data}</text>\",\n            },\n        ],\n    )\n```\n\n### Testing and Evaluation\n\nLet's run examples to see if it correctly identifies spam and non-spam messages.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ClassificationResponse(BaseModel):\n    \"\"\"\n    A few-shot example of text classification:\n\n    Examples:\n    - \"Buy cheap watches now!\": SPAM\n    - \"Meeting at 3 PM in the conference room\": NOT_SPAM\n    - \"You've won a free iPhone! Click here\": SPAM\n    - \"Can you pick up some milk on your way home?\": NOT_SPAM\n    - \"Increase your followers by 10000 overnight!\": SPAM\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        ...,\n        description=\"The chain of thought that led to the prediction.\",\n    )\n    label: Literal[\"SPAM\", \"NOT_SPAM\"] = Field(\n        ...,\n        description=\"The predicted class label.\",\n    )\n\n\ndef classify(data: str) -> ClassificationResponse:\n    \"\"\"Perform single-label classification on the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=ClassificationResponse,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: <text>{data}</text>\",\n            },\n        ],\n    )\n\n\n# <%hide%>\nif __name__ == \"__main__\":\n    for text, label in [\n        (\"Hey Jason! You're awesome\", \"NOT_SPAM\"),\n        (\"I am a nigerian prince and I need your help.\", \"SPAM\"),\n    ]:\n        prediction = classify(text)\n        assert prediction.label == label\n        print(f\"Text: {text}, Predicted Label: {prediction.label}\")\n        #> Text: Hey Jason! You're awesome, Predicted Label: NOT_SPAM\n        #> Text: I am a nigerian prince and I need your help., Predicted Label: SPAM\n```\n\n## Multi-Label Classification\n\n### Defining the Structures\n\nFor multi-label classification, we'll update our approach to use Literals instead of enums, and include few-shot examples in the model's docstring.\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\n\n\nclass MultiClassPrediction(BaseModel):\n    \"\"\"\n    Class for a multi-class label prediction.\n\n    Examples:\n    - \"My account is locked\": [\"TECH_ISSUE\"]\n    - \"I can't access my billing info\": [\"TECH_ISSUE\", \"BILLING\"]\n    - \"When do you close for holidays?\": [\"GENERAL_QUERY\"]\n    - \"My payment didn't go through and now I can't log in\": [\"BILLING\", \"TECH_ISSUE\"]\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        ...,\n        description=\"The chain of thought that led to the prediction.\",\n    )\n\n    class_labels: List[Literal[\"TECH_ISSUE\", \"BILLING\", \"GENERAL_QUERY\"]] = Field(\n        ...,\n        description=\"The predicted class labels for the support ticket.\",\n    )\n```\n\n### Classifying Text\n\nThe function **`multi_classify`** is responsible for multi-label classification.\n\n```python\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\n\n\nclass MultiClassPrediction(BaseModel):\n    \"\"\"\n    Class for a multi-class label prediction.\n\n    Examples:\n    - \"My account is locked\": [\"TECH_ISSUE\"]\n    - \"I can't access my billing info\": [\"TECH_ISSUE\", \"BILLING\"]\n    - \"When do you close for holidays?\": [\"GENERAL_QUERY\"]\n    - \"My payment didn't go through and now I can't log in\": [\"BILLING\", \"TECH_ISSUE\"]\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        ...,\n        description=\"The chain of thought that led to the prediction.\",\n    )\n\n    class_labels: List[Literal[\"TECH_ISSUE\", \"BILLING\", \"GENERAL_QUERY\"]] = Field(\n        ...,\n        description=\"The predicted class labels for the support ticket.\",\n    )\n\n\n# <%hide%>\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef multi_classify(data: str) -> MultiClassPrediction:\n    \"\"\"Perform multi-label classification on the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=MultiClassPrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following support ticket: <ticket>{data}</ticket>\",\n            },\n        ],\n    )\n```\n\n### Testing and Evaluation\n\nFinally, we test the multi-label classification function using a sample support ticket.\n\n```python\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\nimport instructor\n\n\nclass MultiClassPrediction(BaseModel):\n    \"\"\"\n    Class for a multi-class label prediction.\n\n    Examples:\n    - \"My account is locked\": [\"TECH_ISSUE\"]\n    - \"I can't access my billing info\": [\"TECH_ISSUE\", \"BILLING\"]\n    - \"When do you close for holidays?\": [\"GENERAL_QUERY\"]\n    - \"My payment didn't go through and now I can't log in\": [\"BILLING\", \"TECH_ISSUE\"]\n    \"\"\"\n\n    chain_of_thought: str = Field(\n        ...,\n        description=\"The chain of thought that led to the prediction.\",\n    )\n\n    class_labels: List[Literal[\"TECH_ISSUE\", \"BILLING\", \"GENERAL_QUERY\"]] = Field(\n        ...,\n        description=\"The predicted class labels for the support ticket.\",\n    )\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef multi_classify(data: str) -> MultiClassPrediction:\n    \"\"\"Perform multi-label classification on the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=MultiClassPrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following support ticket: <ticket>{data}</ticket>\",\n            },\n        ],\n    )\n\n\n# <%hide%>\n# Test multi-label classification\nticket = \"My account is locked and I can't access my billing info.\"\nprediction = multi_classify(ticket)\nassert \"TECH_ISSUE\" in prediction.class_labels\nassert \"BILLING\" in prediction.class_labels\nprint(f\"Ticket: {ticket}\")\n#> Ticket: My account is locked and I can't access my billing info.\nprint(f\"Predicted Labels: {prediction.class_labels}\")\n#> Predicted Labels: ['TECH_ISSUE', 'BILLING']\n```\n\nBy using Literals and including few-shot examples, we've improved both the single-label and multi-label classification implementations. These changes enhance type safety and provide better guidance for the AI model, potentially leading to more accurate classifications.\n"
  },
  {
    "path": "docs/examples/document_segmentation.md",
    "content": "---\ntitle: \"Document Segmentation with LLMs: A Comprehensive Guide\"\ndescription: Learn effective document segmentation techniques using Cohere's LLM, enhancing comprehension of complex texts.\n---\n\n## See Also\n\n- [Knowledge Graph](./knowledge_graph.md) - Build knowledge graphs from documents\n- [Entity Resolution](./entity_resolution.md) - Identify and disambiguate entities\n- [List Extraction](../learning/patterns/list_extraction.md) - Extract multiple objects\n- [Nested Structures](../learning/patterns/nested_structure.md) - Complex hierarchical models\n\n# Document Segmentation\n\nIn this guide, we demonstrate how to do document segmentation using structured output from an LLM. We'll be using [command-a](https://docs.cohere.com/docs/command-a) - one of Cohere's latest LLMs with 256k context length and testing the approach on an article explaining the Transformer architecture. Same approach to document segmentation can be applied to any other domain where we need to break down a complex long document into smaller chunks.\n\n!!! tips \"Motivation\"\nSometimes we need a way to split the document into meaningful parts that center around a single key concept/idea. Simple length-based / rule-based text-splitters are not reliable enough. Consider the cases where documents contain code snippets or math equations - we don't want to split those on `'\\n\\n'` or have to write extensive rules for different types of documents. It turns out that LLMs with sufficiently long context length are well suited for this task.\n\n## Defining the Data Structures\n\nFirst, we need to define a **`Section`** class for each of the document's segments. **`StructuredDocument`** class will then encapsulate a list of these sections.\n\nNote that in order to avoid LLM regenerating the content of each section, we can simply enumerate each line of the input document and then ask LLM to segment it by providing start-end line numbers for each section.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Section(BaseModel):\n    title: str = Field(description=\"main topic of this section of the document\")\n    start_index: int = Field(description=\"line number where the section begins\")\n    end_index: int = Field(description=\"line number where the section ends\")\n\n\nclass StructuredDocument(BaseModel):\n    \"\"\"obtains meaningful sections, each centered around a single concept/topic\"\"\"\n\n    sections: List[Section] = Field(description=\"a list of sections of the document\")\n```\n\n## Document Preprocessing\n\nPreprocess the input `document` by prepending each line with its number.\n\n```python\ndef doc_with_lines(document):\n    document_lines = document.split(\"\\n\")\n    document_with_line_numbers = \"\"\n    line2text = {}\n    for i, line in enumerate(document_lines):\n        document_with_line_numbers += f\"[{i}] {line}\\n\"\n        line2text[i] = line\n    return document_with_line_numbers, line2text\n```\n\n## Segmentation\n\nNext use a Cohere client to extract `StructuredDocument` from the preprocessed doc.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Section(BaseModel):\n    title: str = Field(description=\"main topic of this section of the document\")\n    start_index: int = Field(description=\"line number where the section begins\")\n    end_index: int = Field(description=\"line number where the section ends\")\n\n\nclass StructuredDocument(BaseModel):\n    \"\"\"obtains meaningful sections, each centered around a single concept/topic\"\"\"\n\n    sections: List[Section] = Field(description=\"a list of sections of the document\")\n\n\n# <%hide%>\n\nimport instructor\n\n# Apply the patch to the cohere client\n# enables response_model keyword\nclient = instructor.from_provider(\"cohere/command-r-plus\")\n\n\nsystem_prompt = f\"\"\"\\\nYou are a world class educator working on organizing your lecture notes.\nRead the document below and extract a StructuredDocument object from it where each section of the document is centered around a single concept/topic that can be taught in one lesson.\nEach line of the document is marked with its line number in square brackets (e.g. [1], [2], [3], etc). Use the line numbers to indicate section start and end.\n\"\"\"\n\n\ndef get_structured_document(document_with_line_numbers) -> StructuredDocument:\n    return client.create(\n        model=\"command-a-03-2025\",\n        response_model=StructuredDocument,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": system_prompt,\n            },\n            {\n                \"role\": \"user\",\n                \"content\": document_with_line_numbers,\n            },\n        ],\n    )  # type: ignore\n```\n\nNext, we need to get back the section text based on the start/end indices and our `line2text` dict from the preprocessing step.\n\n```python\ndef get_sections_text(structured_doc, line2text):\n    segments = []\n    for s in structured_doc.sections:\n        contents = []\n        for line_id in range(s.start_index, s.end_index):\n            contents.append(line2text.get(line_id, ''))\n        segments.append(\n            {\n                \"title\": s.title,\n                \"content\": \"\\n\".join(contents),\n                \"start\": s.start_index,\n                \"end\": s.end_index,\n            }\n        )\n    return segments\n```\n\n## Example\n\nHere's an example of using these classes and functions to segment a tutorial on Transformers from [Sebastian Raschka](https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html). We can use `trafilatura` package to scrape the web page content of the article.\n\n```python\nfrom trafilatura import fetch_url, extract\n\n# <%hide%>\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\ndef doc_with_lines(document):\n    document_lines = document.split(\"\\n\")\n    document_with_line_numbers = \"\"\n    line2text = {}\n    for i, line in enumerate(document_lines):\n        document_with_line_numbers += f\"[{i}] {line}\\n\"\n        line2text[i] = line\n    return document_with_line_numbers, line2text\n\n\nclient = instructor.from_provider(\"cohere/command-r-plus\")\n\n\nsystem_prompt = f\"\"\"\\\nYou are a world class educator working on organizing your lecture notes.\nRead the document below and extract a StructuredDocument object from it where each section of the document is centered around a single concept/topic that can be taught in one lesson.\nEach line of the document is marked with its line number in square brackets (e.g. [1], [2], [3], etc). Use the line numbers to indicate section start and end.\n\"\"\"\n\n\nclass Section(BaseModel):\n    title: str = Field(description=\"main topic of this section of the document\")\n    start_index: int = Field(description=\"line number where the section begins\")\n    end_index: int = Field(description=\"line number where the section ends\")\n\n\nclass StructuredDocument(BaseModel):\n    \"\"\"obtains meaningful sections, each centered around a single concept/topic\"\"\"\n\n    sections: List[Section] = Field(description=\"a list of sections of the document\")\n\n\ndef get_structured_document(document_with_line_numbers) -> StructuredDocument:\n    return client.create(\n        model=\"command-a-03-2025\",\n        response_model=StructuredDocument,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": system_prompt,\n            },\n            {\n                \"role\": \"user\",\n                \"content\": document_with_line_numbers,\n            },\n        ],\n    )  # type: ignore\n\n\ndef get_sections_text(structured_doc, line2text):\n    segments = []\n    for s in structured_doc.sections:\n        contents = []\n        for line_id in range(s.start_index, s.end_index):\n            contents.append(line2text.get(line_id, ''))\n        segments.append(\n            {\n                \"title\": s.title,\n                \"content\": \"\\n\".join(contents),\n                \"start\": s.start_index,\n                \"end\": s.end_index,\n            }\n        )\n    return segments\n\n\n# <%hide%>\n\nurl = 'https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html'\ndownloaded = fetch_url(url)\ndocument = extract(downloaded)\n\n\ndocument_with_line_numbers, line2text = doc_with_lines(document)\nstructured_doc = get_structured_document(document_with_line_numbers)\nsegments = get_sections_text(structured_doc, line2text)\n```\n\n```\nprint(segments[5]['title'])\n\"\"\"\nIntroduction to Multi-Head Attention\n\"\"\"\nprint(segments[5]['content'])\n\"\"\"\nMulti-Head Attention\nIn the very first figure, at the top of this article, we saw that transformers use a module called multi-head attention. How does that relate to the self-attention mechanism (scaled-dot product attention) we walked through above?\nIn the scaled dot-product attention, the input sequence was transformed using three matrices representing the query, key, and value. These three matrices can be considered as a single attention head in the context of multi-head attention. The figure below summarizes this single attention head we covered previously:\nAs its name implies, multi-head attention involves multiple such heads, each consisting of query, key, and value matrices. This concept is similar to the use of multiple kernels in convolutional neural networks.\nTo illustrate this in code, suppose we have 3 attention heads, so we now extend the \\(d' \\times d\\) dimensional weight matrices so \\(3 \\times d' \\times d\\):\nIn:\nh = 3\nmultihead_W_query = torch.nn.Parameter(torch.rand(h, d_q, d))\nmultihead_W_key = torch.nn.Parameter(torch.rand(h, d_k, d))\nmultihead_W_value = torch.nn.Parameter(torch.rand(h, d_v, d))\nConsequently, each query element is now \\(3 \\times d_q\\) dimensional, where \\(d_q=24\\) (here, let’s keep the focus on the 3rd element corresponding to index position 2):\nIn:\nmultihead_query_2 = multihead_W_query.matmul(x_2)\nprint(multihead_query_2.shape)\nOut:\ntorch.Size([3, 24])\n\"\"\"\n```\n"
  },
  {
    "path": "docs/examples/entity_resolution.md",
    "content": "---\ntitle: Entity Resolution and Visualization for Legal Documents\ndescription: Learn how to extract, resolve, and visualize entities from legal contracts for better understanding and analysis.\n---\n\n## See Also\n\n- [Knowledge Graph](./knowledge_graph.md) - Build knowledge graphs from entities\n- [Building Knowledge Graphs](./building_knowledge_graphs.md) - Advanced graph construction\n- [Document Segmentation](./document_segmentation.md) - Break down documents for analysis\n- [Response Models](../concepts/models.md) - Working with complex data structures\n\n# Entity Resolution and Visualization for Legal Documents\n\nIn this guide, we demonstrate how to extract and resolve entities from a sample legal contract. Then, we visualize these entities and their dependencies as an entity graph. This approach can be invaluable for legal tech applications, aiding in the understanding of complex documents.\n\n!!! tips \"Motivation\"\n\n    Legal contracts are full of intricate details and interconnected clauses. Automatically extracting and visualizing these elements can make it easier to understand the document's overall structure and terms.\n\n## Defining the Data Structures\n\nThe **`Entity`** and **`Property`** classes model extracted entities and their attributes. **`DocumentExtraction`** encapsulates a list of these entities.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n    resolved_absolute_value: str\n\n\nclass Entity(BaseModel):\n    id: int = Field(\n        ...,\n        description=\"Unique identifier for the entity, used for deduplication, design a scheme allows multiple entities\",\n    )\n    subquote_string: List[str] = Field(\n        ...,\n        description=\"Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution\",\n    )\n    entity_title: str\n    properties: List[Property] = Field(\n        ..., description=\"List of properties of the entity\"\n    )\n    dependencies: List[int] = Field(\n        ...,\n        description=\"List of entity ids that this entity depends  or relies on to resolve it\",\n    )\n\n\nclass DocumentExtraction(BaseModel):\n    entities: List[Entity] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be a separate object with a body and a list of sources\",\n    )\n```\n\n## Entity Extraction and Resolution\n\nThe **`ask_ai`** function utilizes OpenAI's API to extract and resolve entities from the input content.\n\n```python\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n    resolved_absolute_value: str\n\n\nclass Entity(BaseModel):\n    id: int = Field(\n        ...,\n        description=\"Unique identifier for the entity, used for deduplication, design a scheme allows multiple entities\",\n    )\n    subquote_string: List[str] = Field(\n        ...,\n        description=\"Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution\",\n    )\n    entity_title: str\n    properties: List[Property] = Field(\n        ..., description=\"List of properties of the entity\"\n    )\n    dependencies: List[int] = Field(\n        ...,\n        description=\"List of entity ids that this entity depends  or relies on to resolve it\",\n    )\n\n\nclass DocumentExtraction(BaseModel):\n    entities: List[Entity] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be a separate object with a body and a list of sources\",\n    )\n\n\n# <%hide%>\n\n\ndef ask_ai(content) -> DocumentExtraction:\n    return client.create(\n        model=\"gpt-4\",\n        response_model=DocumentExtraction,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Extract and resolve a list of entities from the following document:\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n    )  # type: ignore\n```\n\n## Graph Visualization\n\n**`generate_graph`** takes the extracted entities and visualizes them using Graphviz. It creates nodes for each entity and edges for their dependencies.\n\n```python\nfrom graphviz import Digraph\n\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n    resolved_absolute_value: str\n\n\nclass Entity(BaseModel):\n    id: int = Field(\n        ...,\n        description=\"Unique identifier for the entity, used for deduplication, design a scheme allows multiple entities\",\n    )\n    subquote_string: List[str] = Field(\n        ...,\n        description=\"Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution\",\n    )\n    entity_title: str\n    properties: List[Property] = Field(\n        ..., description=\"List of properties of the entity\"\n    )\n    dependencies: List[int] = Field(\n        ...,\n        description=\"List of entity ids that this entity depends  or relies on to resolve it\",\n    )\n\n\nclass DocumentExtraction(BaseModel):\n    entities: List[Entity] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be a separate object with a body and a list of sources\",\n    )\n\n\n# <%hide%>\ndef generate_html_label(entity: Entity) -> str:\n    rows = [\n        f\"<tr><td>{prop.key}</td><td>{prop.resolved_absolute_value}</td></tr>\"\n        for prop in entity.properties\n    ]\n    table_rows = \"\".join(rows)\n    return f\"<<table border='0' cellborder='1' cellspacing='0'><tr><td colspan='2'><b>{entity.entity_title}</b></td></tr>{table_rows}</table>>\"\n\n\ndef generate_graph(data: DocumentExtraction):\n    dot = Digraph(comment=\"Entity Graph\", node_attr={\"shape\": \"plaintext\"})\n\n    for entity in data.entities:\n        label = generate_html_label(entity)\n        dot.node(str(entity.id), label)\n\n    for entity in data.entities:\n        for dep_id in entity.dependencies:\n            dot.edge(str(entity.id), str(dep_id))\n\n    dot.render(\"entity.gv\", view=True)\n```\n\n## Execution\n\nFinally, execute the code to visualize the entity graph for the sample legal contract.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List\nfrom graphviz import Digraph\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n    resolved_absolute_value: str\n\n\nclass Entity(BaseModel):\n    id: int = Field(\n        ...,\n        description=\"Unique identifier for the entity, used for deduplication, design a scheme allows multiple entities\",\n    )\n    subquote_string: List[str] = Field(\n        ...,\n        description=\"Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution\",\n    )\n    entity_title: str\n    properties: List[Property] = Field(\n        ..., description=\"List of properties of the entity\"\n    )\n    dependencies: List[int] = Field(\n        ...,\n        description=\"List of entity ids that this entity depends  or relies on to resolve it\",\n    )\n\n\nclass DocumentExtraction(BaseModel):\n    entities: List[Entity] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be a separate object with a body and a list of sources\",\n    )\n\n\ndef ask_ai(content) -> DocumentExtraction:\n    return client.create(\n        model=\"gpt-4\",\n        response_model=DocumentExtraction,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Extract and resolve a list of entities from the following document:\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n    )  # type: ignore\n\n\ndef generate_html_label(entity: Entity) -> str:\n    rows = [\n        f\"<tr><td>{prop.key}</td><td>{prop.resolved_absolute_value}</td></tr>\"\n        for prop in entity.properties\n    ]\n    table_rows = \"\".join(rows)\n    return f\"<<table border='0' cellborder='1' cellspacing='0'><tr><td colspan='2'><b>{entity.entity_title}</b></td></tr>{table_rows}</table>>\"\n\n\ndef generate_graph(data: DocumentExtraction):\n    dot = Digraph(comment=\"Entity Graph\", node_attr={\"shape\": \"plaintext\"})\n\n    for entity in data.entities:\n        label = generate_html_label(entity)\n        dot.node(str(entity.id), label)\n\n    for entity in data.entities:\n        for dep_id in entity.dependencies:\n            dot.edge(str(entity.id), str(dep_id))\n\n    dot.render(\"entity.gv\", view=True)\n\n\n# <%hide%>\ncontent = \"\"\"\nSample Legal Contract\nAgreement Contract\n\nThis Agreement is made and entered into on 2020-01-01 by and between Company A (\"the Client\") and Company B (\"the Service Provider\").\n\nArticle 1: Scope of Work\n\nThe Service Provider will deliver the software product to the Client 30 days after the agreement date.\n\nArticle 2: Payment Terms\n\nThe total payment for the service is $50,000.\nAn initial payment of $10,000 will be made within 7 days of the the signed date.\nThe final payment will be due 45 days after [SignDate].\n\nArticle 3: Confidentiality\n\nThe parties agree not to disclose any confidential information received from the other party for 3 months after the final payment date.\n\nArticle 4: Termination\n\nThe contract can be terminated with a 30-day notice, unless there are outstanding obligations that must be fulfilled after the [DeliveryDate].\n\"\"\"  # Your legal contract here\nmodel = ask_ai(content)\ngenerate_graph(model)\n```\n\nThis will produce a graphical representation of the entities and their dependencies, stored as \"entity.gv\".\n\n![Entity Graph visualization showing relationships between legal document entities](entity_resolution.png)\n"
  },
  {
    "path": "docs/examples/exact_citations.md",
    "content": "---\ntitle: Citation Validation with Instructor - Prevent Hallucinations\ndescription: Validate AI-generated answers with contextual citations using Instructor. Ensure every statement is backed by source quotes to prevent hallucinations.\n---\n\n# Example: Answering Questions with Validated Citations\n\nFor the full code example, check out [examples/citation_fuzzy_match.py](https://github.com/jxnl/instructor/blob/main/examples/citation_with_extraction/citation_fuzzy_match.py)\n\n## Overview\n\nThis example shows how to use Instructor with validators to not only add citations to answers generated but also prevent hallucinations by ensuring that every statement made by the LLM is backed up by a direct quote from the context provided, and that those quotes exist!\nTwo Python classes, `Fact` and `QuestionAnswer`, are defined to encapsulate the information of individual facts and the entire answer, respectively.\n\n## Data Structures\n\n### The `Fact` Class\n\nThe `Fact` class encapsulates a single statement or fact. It contains two fields:\n\n- `fact`: A string representing the body of the fact or statement.\n- `substring_quote`: A list of strings. Each string is a direct quote from the context that supports the `fact`.\n\n#### Validation Method: `validate_sources`\n\nThis method validates the sources (`substring_quote`) in the context. It utilizes regex to find the span of each substring quote in the given context. If the span is not found, the quote is removed from the list.\n\n```python hl_lines=\"6 8-13\"\nfrom pydantic import Field, BaseModel, model_validator, ValidationInfo\nfrom typing import List\n\n\nclass Fact(BaseModel):\n    fact: str = Field(...)\n    substring_quote: List[str] = Field(...)\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self, info: ValidationInfo) -> \"Fact\":\n        text_chunks = info.context.get(\"text_chunk\", None)\n        spans = list(self.get_spans(text_chunks))\n        self.substring_quote = [text_chunks[span[0] : span[1]] for span in spans]\n        return self\n\n    def get_spans(self, context):\n        for quote in self.substring_quote:\n            yield from self._get_span(quote, context)\n\n    def _get_span(self, quote, context):\n        for match in re.finditer(re.escape(quote), context):\n            yield match.span()\n```\n\n### The `QuestionAnswer` Class\n\nThis class encapsulates the question and its corresponding answer. It contains two fields:\n\n- `question`: The question asked.\n- `answer`: A list of `Fact` objects that make up the answer.\n\n#### Validation Method: `validate_sources`\n\nThis method checks that each `Fact` object in the `answer` list has at least one valid source. If a `Fact` object has no valid sources, it is removed from the `answer` list.\n\n```python hl_lines=\"5-8\"\nfrom pydantic import BaseModel, Field, model_validator\nfrom typing import List\n\n# <%hide%>\nfrom pydantic import ValidationInfo\n\n\nclass Fact(BaseModel):\n    fact: str = Field(...)\n    substring_quote: List[str] = Field(...)\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self, info: ValidationInfo) -> \"Fact\":\n        text_chunks = info.context.get(\"text_chunk\", None)\n        spans = list(self.get_spans(text_chunks))\n        self.substring_quote = [text_chunks[span[0] : span[1]] for span in spans]\n        return self\n\n    def get_spans(self, context):\n        for quote in self.substring_quote:\n            yield from self._get_span(quote, context)\n\n    def _get_span(self, quote, context):\n        for match in re.finditer(re.escape(quote), context):\n            yield match.span()\n\n\n# <%hide%>\nclass QuestionAnswer(BaseModel):\n    question: str = Field(...)\n    answer: List[Fact] = Field(...)\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self) -> \"QuestionAnswer\":\n        self.answer = [fact for fact in self.answer if len(fact.substring_quote) > 0]\n        return self\n```\n\n## Function to Ask AI a Question\n\n### The `ask_ai` Function\n\nThis function takes a string `question` and a string `context` and returns a `QuestionAnswer` object. It uses the OpenAI API to fetch the answer and then validates the sources using the defined classes.\n\nTo understand the validation context work from pydantic check out [pydantic's docs](https://docs.pydantic.dev/usage/validators/#model-validators)\n\n```python hl_lines=\"5 6 14\"\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model, context keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\n# <%hide%>\nfrom pydantic import ValidationInfo, BaseModel, Field, model_validator\nfrom typing import List\n\n\nclass Fact(BaseModel):\n    fact: str = Field(...)\n    substring_quote: List[str] = Field(...)\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self, info: ValidationInfo) -> \"Fact\":\n        text_chunks = info.context.get(\"text_chunk\", None)\n        spans = list(self.get_spans(text_chunks))\n        self.substring_quote = [text_chunks[span[0] : span[1]] for span in spans]\n        return self\n\n    def get_spans(self, context):\n        for quote in self.substring_quote:\n            yield from self._get_span(quote, context)\n\n    def _get_span(self, quote, context):\n        for match in re.finditer(re.escape(quote), context):\n            yield match.span()\n\n\nclass QuestionAnswer(BaseModel):\n    question: str = Field(...)\n    answer: List[Fact] = Field(...)\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self) -> \"QuestionAnswer\":\n        self.answer = [fact for fact in self.answer if len(fact.substring_quote) > 0]\n        return self\n\n\n# <%hide%>\ndef ask_ai(question: str, context: str) -> QuestionAnswer:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        temperature=0,\n        response_model=QuestionAnswer,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world class algorithm to answer questions with correct and exact citations.\",\n            },\n            {\"role\": \"user\", \"content\": f\"{context}\"},\n            {\"role\": \"user\", \"content\": f\"Question: {question}\"},\n        ],\n        context={\"text_chunk\": context},\n    )\n```\n\n## Example\n\nHere's an example of using these classes and functions to ask a question and validate the answer.\n\n```python\nquestion = \"What did the author do during college?\"\ncontext = \"\"\"\nMy name is Jason Liu, and I grew up in Toronto Canada but I was born in China.\nI went to an arts high school but in university I studied Computational Mathematics and physics.\nAs part of coop I worked at many companies including Stitchfix, Facebook.\nI also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.\n\"\"\"\n```\n\nThe output would be a `QuestionAnswer` object containing validated facts and their sources.\n\n```python\n{\n    \"question\": \"where did he go to school?\",\n    \"answer\": [\n        {\n            \"statement\": \"Jason Liu went to an arts highschool.\",\n            \"substring_phrase\": [\"arts highschool\"],\n        },\n        {\n            \"statement\": \"Jason Liu studied Computational Mathematics and physics in university.\",\n            \"substring_phrase\": [\"university\"],\n        },\n    ],\n}\n```\n\nThis ensures that every piece of information in the answer has been validated against the context.\n"
  },
  {
    "path": "docs/examples/examples.md",
    "content": "---\ntitle: Few-Shot Learning with Examples - Pydantic Models\ndescription: Enhance Pydantic models with practical examples for few-shot learning. Improve LLM understanding with example-driven JSON schemas.\n---\n\n# How should I include examples?\n\nTo enhance the clarity and usability of your model and prompt, incorporating examples directly into the JSON schema extra of your Pydantic model is highly recommended. This approach not only streamlines the integration of practical examples but also ensures that they are easily accessible and understandable within the context of your model's schema.\n\n```python\nimport instructor\nfrom typing import Iterable\nfrom pydantic import BaseModel, ConfigDict\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass SyntheticQA(BaseModel):\n    question: str\n    answer: str\n\n    model_config = ConfigDict(\n        json_schema_extra={\n            \"examples\": [\n                {\"question\": \"What is the capital of France?\", \"answer\": \"Paris\"},\n                {\n                    \"question\": \"What is the largest planet in our solar system?\",\n                    \"answer\": \"Jupiter\",\n                },\n                {\n                    \"question\": \"Who wrote 'To Kill a Mockingbird'?\",\n                    \"answer\": \"Harper Lee\",\n                },\n                {\n                    \"question\": \"What element does 'O' represent on the periodic table?\",\n                    \"answer\": \"Oxygen\",\n                },\n            ]\n        }\n    )\n\n\ndef get_synthetic_data() -> Iterable[SyntheticQA]:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\"role\": \"system\", \"content\": \"Generate synthetic examples\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Generate the exact examples you see in the examples of this prompt. \",\n            },\n        ],\n        response_model=Iterable[SyntheticQA],\n    )  # type: ignore\n\n\nif __name__ == \"__main__\":\n    for example in get_synthetic_data():\n        print(example)\n        #> question='What is the capital of France?' answer='Paris'\n        #> question='What is the largest planet in our solar system?' answer='Jupiter'\n        #> question=\"Who wrote 'To Kill a Mockingbird'?\" answer='Harper Lee'\n        \"\"\"\n        question=\"What element does 'O' represent on the periodic table?\" answer='Oxygen'\n        \"\"\"\n        \"\"\"\n        question=\"What element does 'O' represent on the periodic table?\" answer='Oxygen'\n        \"\"\"\n        \"\"\"\n        question=\"What element does 'O' represent on the periodic table?\" answer='Oxygen'\n        \"\"\"\n```"
  },
  {
    "path": "docs/examples/extract_contact_info.md",
    "content": "---\ntitle: Contact Information Extraction - Lead Generation Automation\ndescription: Automate customer lead extraction from text using Instructor. Extract names, phone numbers, and contact details with automatic validation.\n---\n\n# Customer Information Extraction\n\nIn this guide, we'll walk through how to extract customer lead information using OpenAI's API and Pydantic. This use case is essential for seamlessly automating the process of extracting specific information from a context.\n\n## Motivation\n\nYou could potentially integrate this into a chatbot to extract relevant user information from user messages. With the use of machine learning driven validation it would reduce the need for a human to verify the information.\n\n## Defining the Structure\n\nWe'll model a customer lead as a Lead object, including attributes for the name and phone number. We'll use a Pydantic PhoneNumber type to validate the phone numbers entered and provide a Field to give the model more information on correctly populating the object.\n\n## Extracting Lead Information\n\nTo extract lead information, we create the `parse_lead_from_message` function which integrates Instructor. It calls OpenAI's API, processes the text, and returns the extracted lead information as a Lead object.\n\n## Evaluating Lead Extraction\n\nTo showcase the `parse_lead_from_message` function we can provide sample user messages that may be obtained from a dialogue with a chatbot assistant. Also take note of the response model being set as `Iterable[Lead]` this allows for multiple leads being extracted from the same message.\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom pydantic_extra_types.phone_numbers import PhoneNumber\nfrom typing import Iterable\n\n\nclass Lead(BaseModel):\n    name: str\n    phone_number: PhoneNumber = Field(\n        description=\"Needs to be a phone number with a country code. If none, assume +1\"\n    )\n\n    # Can define some function here to send Lead information to a database using an API\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef parse_lead_from_message(user_message: str):\n    return client.create(\n        model=\"gpt-4-turbo-preview\",\n        response_model=Iterable[Lead],\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a data extraction system that extracts a user's name and phone number from a message.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Extract the user's lead information from this user's message: {user_message}\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    lead = parse_lead_from_message(\n        \"Yes, that would be great if someone can reach out my name is Patrick King 9175554587\"\n    )\n    assert all(isinstance(item, Lead) for item in lead)\n    for item in lead:\n        print(item.model_dump_json(indent=2))\n        \"\"\"\n        {\n          \"name\": \"Patrick King\",\n          \"phone_number\": \"tel:+1-917-555-4587\"\n        }\n        \"\"\"\n\n    # Invalid phone number example:\n    try:\n        lead2 = parse_lead_from_message(\n            \"Yes, that would be great if someone can reach out my name is Patrick King 9172234\"\n        )\n        assert all(isinstance(item, Lead) for item in lead2)\n        for item in lead2:\n            print(item.model_dump_json(indent=2))\n            \"\"\"\n            {\n              \"name\": \"Patrick King\",\n              \"phone_number\": \"tel:+1-917-223-4999\"\n            }\n            \"\"\"\n\n    except Exception as e:\n        print(\"ERROR:\", e)\n        \"\"\"\n        ERROR:\n        1 validation error for IterableLead\n        tasks.0.phone_number\n          value is not a valid phone number [type=value_error, input_value='+19172234', input_type=str]\n        \"\"\"\n```\n\nIn this example, the `parse_lead_from_message` function successfully extracts lead information from a user message, demonstrating how automation can enhance the efficiency of collecting accurate customer details. It also shows how the function successfully catches that the phone number is invalid so functionality can be implemented for the user to get prompted again to give a correct phone number.\n"
  },
  {
    "path": "docs/examples/extract_slides.md",
    "content": "---\ntitle: Extracting Competitor Data from Slides Using AI\ndescription: Learn how to extract competitor data from presentation slides, leveraging AI for comprehensive information gathering.\n---\n\n# Data extraction from slides\n\nIn this guide, we demonstrate how to extract data from slides.\n\n!!! tips \"Motivation\"\n\n   When we want to translate key information from slides into structured data, simply isolating the text and running extraction might not be enough. Sometimes the important data is in the images on the slides, so we should consider including them in our extraction pipeline.\n\n## Defining the necessary Data Structures\n\nLet's say we want to extract the competitors from various presentations and categorize them according to their respective industries.\n\nOur data model will have `Industry` which will be a list of `Competitor`'s for a specific industry, and `Competition` which will aggregate the competitors for all the industries.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Optional, List\n\n\nclass Competitor(BaseModel):\n    name: str\n    features: Optional[List[str]]\n\n\n# Define models\nclass Industry(BaseModel):\n    \"\"\"\n    Represents competitors from a specific industry extracted from an image using AI.\n    \"\"\"\n\n    name: str = Field(description=\"The name of the industry\")\n    competitor_list: List[Competitor] = Field(\n        description=\"A list of competitors for this industry\"\n    )\n\n\nclass Competition(BaseModel):\n    \"\"\"\n    This class serves as a structured representation of\n    competitors and their qualities.\n    \"\"\"\n\n    industry_list: List[Industry] = Field(\n        description=\"A list of industries and their competitors\"\n    )\n```\n\n## Competitors extraction\n\nTo extract competitors from slides we will define a function which will read images from urls and extract the relevant information from them.\n\n```python\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import Optional, List\n\n\nclass Competitor(BaseModel):\n    name: str\n    features: Optional[List[str]]\n\n\n# Define models\nclass Industry(BaseModel):\n    \"\"\"\n    Represents competitors from a specific industry extracted from an image using AI.\n    \"\"\"\n\n    name: str = Field(description=\"The name of the industry\")\n    competitor_list: List[Competitor] = Field(\n        description=\"A list of competitors for this industry\"\n    )\n\n\nclass Competition(BaseModel):\n    \"\"\"\n    This class serves as a structured representation of\n    competitors and their qualities.\n    \"\"\"\n\n    industry_list: List[Industry] = Field(\n        description=\"A list of industries and their competitors\"\n    )\n\n\n# <%hide%>\n\n\n# Define functions\ndef read_images(image_urls: List[str]) -> Competition:\n    \"\"\"\n    Given a list of image URLs, identify the competitors in the images.\n    \"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=Competition,\n        max_tokens=2048,\n        temperature=0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Identify competitors and generate key features for each competitor.\",\n                    },\n                    *[\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": url}}\n                        for url in image_urls\n                    ],\n                ],\n            }\n        ],\n    )\n```\n\n## Execution\n\nFinally, we will run the previous function with a few sample slides to see the data extractor in action.\n\nAs we can see, our model extracted the relevant information for each competitor regardless of how this information was formatted in the original presentations.\n\n```python\n# <%hide%>\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nfrom pydantic import BaseModel, Field\nfrom typing import Optional, List\n\n\nclass Competitor(BaseModel):\n    name: str\n    features: Optional[List[str]]\n\n\n# Define models\nclass Industry(BaseModel):\n    \"\"\"\n    Represents competitors from a specific industry extracted from an image using AI.\n    \"\"\"\n\n    name: str = Field(description=\"The name of the industry\")\n    competitor_list: List[Competitor] = Field(\n        description=\"A list of competitors for this industry\"\n    )\n\n\nclass Competition(BaseModel):\n    \"\"\"\n    This class serves as a structured representation of\n    competitors and their qualities.\n    \"\"\"\n\n    industry_list: List[Industry] = Field(\n        description=\"A list of industries and their competitors\"\n    )\n\n\n# Define functions\ndef read_images(image_urls: List[str]) -> Competition:\n    \"\"\"\n    Given a list of image URLs, identify the competitors in the images.\n    \"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=Competition,\n        max_tokens=2048,\n        temperature=0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Identify competitors and generate key features for each competitor.\",\n                    },\n                    *[\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": url}}\n                        for url in image_urls\n                    ],\n                ],\n            }\n        ],\n    )\n\n\n# <%hide%>\nurl = [\n    'https://miro.medium.com/v2/resize:fit:1276/0*h1Rsv-fZWzQUyOkt',\n]\nmodel = read_images(url)\nprint(model.model_dump_json(indent=2))\n\"\"\"\n{\n  \"industry_list\": [\n    {\n      \"name\": \"Accommodation Booking\",\n      \"competitor_list\": [\n        {\n          \"name\": \"CouchSurfing\",\n          \"features\": [\n            \"Free accommodation\",\n            \"Community-driven\",\n            \"Cultural exchange\"\n          ]\n        },\n        {\n          \"name\": \"Craigslist\",\n          \"features\": [\n            \"Local listings\",\n            \"Variety of options\",\n            \"User-generated content\"\n          ]\n        },\n        {\n          \"name\": \"BedandBreakfast.com\",\n          \"features\": [\n            \"Specialized in B&Bs\",\n            \"Personalized service\",\n            \"Local experiences\"\n          ]\n        },\n        {\n          \"name\": \"AirBed & Breakfast (Airbnb)\",\n          \"features\": [\n            \"Wide range of accommodations\",\n            \"User reviews\",\n            \"Instant booking\"\n          ]\n        },\n        {\n          \"name\": \"Hostels.com\",\n          \"features\": [\n            \"Budget-friendly hostels\",\n            \"Global reach\",\n            \"User ratings\"\n          ]\n        },\n        {\n          \"name\": \"RentDigs.com\",\n          \"features\": [\n            \"Rental listings\",\n            \"Long-term stays\",\n            \"User-friendly interface\"\n          ]\n        },\n        {\n          \"name\": \"VRBO\",\n          \"features\": [\n            \"Vacation rentals\",\n            \"Family-friendly options\",\n            \"Direct owner contact\"\n          ]\n        },\n        {\n          \"name\": \"Hotels.com\",\n          \"features\": [\n            \"Wide selection of hotels\",\n            \"Rewards program\",\n            \"Price match guarantee\"\n          ]\n        }\n      ]\n    }\n  ]\n}\n\"\"\"\n```\n"
  },
  {
    "path": "docs/examples/extracting_receipts.md",
    "content": "---\ntitle: Receipt Data Extraction with GPT-4 Vision - Expense Tracking\ndescription: Extract and validate receipt data from images using GPT-4 Vision and Instructor. Automate expense tracking with structured receipt parsing.\n---\n\n# Extracting Receipt Data using GPT-4 and Python\n\nThis post demonstrates how to use Python's Pydantic library and OpenAI's GPT-4 model to extract receipt data from images and validate the total amount. This method is particularly useful for automating expense tracking and financial analysis tasks.\n\n## Defining the Item and Receipt Classes\n\nFirst, we define two Pydantic models, `Item` and `Receipt`, to structure the extracted data. The `Item` class represents individual items on the receipt, with fields for name, price, and quantity. The `Receipt` class contains a list of `Item` objects and the total amount.\n\n```python\nfrom pydantic import BaseModel\n\n\nclass Item(BaseModel):\n    name: str\n    price: float\n    quantity: int\n\n\nclass Receipt(BaseModel):\n    items: list[Item]\n    total: float\n```\n\n## Validating the Total Amount\n\nTo ensure the accuracy of the extracted data, we use Pydantic's `model_validator` decorator to define a custom validation function, `check_total`. This function calculates the sum of item prices and compares it to the extracted total amount. If there's a discrepancy, it raises a `ValueError`.\n\n```python\nfrom pydantic import model_validator\n\n\n@model_validator(mode=\"after\")\ndef check_total(self):\n    items = self.items\n    total = self.total\n    calculated_total = sum(item.price * item.quantity for item in items)\n    if calculated_total != total:\n        raise ValueError(\n            f\"Total {total} does not match the sum of item prices {calculated_total}\"\n        )\n    return self\n```\n\n## Extracting Receipt Data from Images\n\nThe `extract_receipt` function uses OpenAI's GPT-4 model to process an image URL and extract receipt data. We utilize the `instructor` library to configure the OpenAI client for this purpose.\n\n```python\nimport instructor\n\n# <%hide%>\nfrom pydantic import BaseModel, model_validator\n\n\nclass Item(BaseModel):\n    name: str\n    price: float\n    quantity: int\n\n\nclass Receipt(BaseModel):\n    items: list[Item]\n    total: float\n\n    @model_validator(mode=\"after\")\n    def check_total(cls, values: \"Receipt\"):\n        items = values.items\n        total = values.total\n        calculated_total = sum(item.price * item.quantity for item in items)\n        if calculated_total != total:\n            raise ValueError(\n                f\"Total {total} does not match the sum of item prices {calculated_total}\"\n            )\n        return values\n\n\n# <%hide%>\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef extract(url: str) -> Receipt:\n    return client.create(\n        model=\"gpt-4\",\n        max_tokens=4000,\n        response_model=Receipt,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Analyze the image and return the items in the receipt and the total amount.\",\n                    },\n                ],\n            }\n        ],\n    )\n```\n\n## Practical Examples\n\nIn these examples, we apply the method to extract receipt data from two different images. The custom validation function ensures that the extracted total amount matches the sum of item prices.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, model_validator\nimport instructor\n\n\nclass Item(BaseModel):\n    name: str\n    price: float\n    quantity: int\n\n\nclass Receipt(BaseModel):\n    items: list[Item]\n    total: float\n\n    @model_validator(mode=\"after\")\n    def check_total(cls, values: \"Receipt\"):\n        items = values.items\n        total = values.total\n        calculated_total = round(sum(item.price * item.quantity for item in items), 2)\n        if calculated_total != total:\n            raise ValueError(\n                f\"Total {total} does not match the sum of item prices {calculated_total}\"\n            )\n        return values\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef extract(url: str) -> Receipt:\n    return client.create(\n        model=\"gpt-4o\",\n        max_tokens=4000,\n        response_model=Receipt,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Analyze the image and return the items in the receipt and the total amount.\",\n                    },\n                ],\n            }\n        ],\n    )\n\n\n# <%hide%>\nurl = \"https://templates.mediamodifier.com/645124ff36ed2f5227cbf871/supermarket-receipt-template.jpg\"\n\n\nreceipt = extract(url)\nprint(receipt)\n\"\"\"\nitems=[Item(name='Lorem ipsum', price=9.2, quantity=1), Item(name='Lorem ipsum dolor sit', price=19.2, quantity=1), Item(name='Lorem ipsum dolor sit amet', price=15.0, quantity=1), Item(name='Lorem ipsum', price=15.0, quantity=1), Item(name='Lorem ipsum', price=15.0, quantity=1), Item(name='Lorem ipsum dolor sit', price=15.0, quantity=1), Item(name='Lorem ipsum', price=19.2, quantity=1)] total=107.6\n\"\"\"\n```\n\nBy combining the power of GPT-4 and Python's Pydantic library, we can accurately extract and validate receipt data from images, streamlining expense tracking and financial analysis tasks."
  },
  {
    "path": "docs/examples/extracting_tables.md",
    "content": "---\ntitle: Extracting Tables from Images using GPT-Vision\ndescription: Learn how to use Python and GPT-Vision to extract and convert tables from images into markdown for data analysis.\n---\n\n## See Also\n\n- [Vision Processing](./tables_from_vision.md) - More vision-based table extraction\n- [Multi-Modal Processing](./multi_modal_gemini.md) - Using Gemini for vision tasks\n- [Image Processing Examples](./index.md#vision-processing) - More vision examples\n- [Raw Response](../concepts/raw_response.md) - Access original LLM responses\n\n# Extracting Tables using GPT-Vision\n\nThis post demonstrates how to use Python's type annotations and OpenAI's new vision model to extract tables from images and convert them into markdown format. This method is particularly useful for data analysis and automation tasks.\n\nThe full code is available on [GitHub](https://github.com/jxnl/instructor/blob/main/examples/vision/run_table.py)\n\n## Building the Custom Type for Markdown Tables\n\nFirst, we define a custom type, `MarkdownDataFrame`, to handle pandas DataFrames formatted in markdown. This type uses Python's `Annotated` and `InstanceOf` types, along with decorators `BeforeValidator` and `PlainSerializer`, to process and serialize the data.\n\n```python\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import BeforeValidator, PlainSerializer, InstanceOf, WithJsonSchema\nimport pandas as pd\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda df: df.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be seperate\",\n        }\n    ),\n]\n```\n\n## Defining the Table Class\n\nThe `Table` class is essential for organizing the extracted data. It includes a caption and a dataframe, processed as a markdown table. Since most of the complexity is handled by the `MarkdownDataFrame` type, the `Table` class is straightforward!\n\n```python\nfrom pydantic import BaseModel\n\n# <%hide%>\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import BeforeValidator, PlainSerializer, InstanceOf, WithJsonSchema\nimport pandas as pd\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda df: df.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be seperate\",\n        }\n    ),\n]\n# <%hide%>\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n```\n\n## Extracting Tables from Images\n\nThe `extract_table` function uses OpenAI's vision model to process an image URL and extract tables in markdown format. We utilize the `instructor` library to patch the OpenAI client for this purpose.\n\n```python\nimport instructor\nfrom typing import Iterable\n\n# <%hide%>\nfrom pydantic import BaseModel\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import BeforeValidator, PlainSerializer, InstanceOf, WithJsonSchema\nimport pandas as pd\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda df: df.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be separate\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\n# <%hide%>\n\n# Use MD_JSON mode since the vision model does not support any special structured output mode\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", mode=instructor.Mode.MD_JSON)\n\n\ndef extract_table(url: str) -> Iterable[Table]:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=Iterable[Table],\n        max_tokens=1800,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\"type\": \"text\", \"text\": \"Extract table from image.\"},\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": url}},\n                ],\n            }\n        ],\n    )\n```\n\n## Practical Example\n\nIn this example, we apply the method to extract data from an image showing the top grossing apps in Ireland for October 2023.\n\n```python\n# <%hide%>\nimport instructor\nfrom typing import Iterable\nfrom pydantic import BaseModel\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import BeforeValidator, PlainSerializer, InstanceOf, WithJsonSchema\nimport pandas as pd\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda df: df.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be separate\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef extract_table(url: str) -> Iterable[Table]:\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Iterable[Table],\n        max_tokens=1800,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\"type\": \"text\", \"text\": \"Extract table from image.\"},\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": url}},\n                ],\n            }\n        ],\n    )\n\n\n# <%hide%>\n\nurl = \"https://a.storyblok.com/f/47007/2400x2000/bf383abc3c/231031_uk-ireland-in-three-charts_table_v01_b.png\"\ntables = extract_table(url)\nfor table in tables:\n\n    print(table.dataframe)\n    \"\"\"\n                                      Android App   ... Category\n     Android Rank                                   ...\n    1                                   Google One  ...    Social networking\n    2                                      Disney+  ...        Entertainment\n    3                TikTok - Videos, Music & LIVE  ...        Entertainment\n    4                             Candy Crush Saga  ...        Entertainment\n    5               Tinder: Dating, Chat & Friends  ...                Games\n    6                                  Coin Master  ...        Entertainment\n    7                                       Roblox  ...               Dating\n    8               Bumble - Dating & Make Friends  ...                Games\n    9                                  Royal Match  ...             Business\n    10                 Spotify: Music and Podcasts  ...            Education\n\n    [10 rows x 5 columns]\n    \"\"\"\n```\n\n??? Note \"Expand to see the output\"\n\n    ![Top 10 Grossing Apps in October 2023 for Ireland - Table extraction example showing structured data from image](https://a.storyblok.com/f/47007/2400x2000/bf383abc3c/231031_uk-ireland-in-three-charts_table_v01_b.png)\n\n    ### Top 10 Grossing Apps in October 2023 (Ireland) for Android Platforms\n\n    | Rank | App Name                         | Category           |\n    |------|----------------------------------|--------------------|\n    | 1    | Google One                       | Productivity       |\n    | 2    | Disney+                          | Entertainment      |\n    | 3    | TikTok - Videos, Music & LIVE    | Entertainment      |\n    | 4    | Candy Crush Saga                 | Games              |\n    | 5    | Tinder: Dating, Chat & Friends   | Social networking  |\n    | 6    | Coin Master                      | Games              |\n    | 7    | Roblox                           | Games              |\n    | 8    | Bumble - Dating & Make Friends   | Dating             |\n    | 9    | Royal Match                      | Games              |\n    | 10   | Spotify: Music and Podcasts      | Music & Audio      |\n\n    ### Top 10 Grossing Apps in October 2023 (Ireland) for iOS Platforms\n\n    | Rank | App Name                         | Category           |\n    |------|----------------------------------|--------------------|\n    | 1    | Tinder: Dating, Chat & Friends   | Social networking  |\n    | 2    | Disney+                          | Entertainment      |\n    | 3    | YouTube: Watch, Listen, Stream   | Entertainment      |\n    | 4    | Audible: Audio Entertainment     | Entertainment      |\n    | 5    | Candy Crush Saga                 | Games              |\n    | 6    | TikTok - Videos, Music & LIVE    | Entertainment      |\n    | 7    | Bumble - Dating & Make Friends   | Dating             |\n    | 8    | Roblox                           | Games              |\n    | 9    | LinkedIn: Job Search & News      | Business           |\n    | 10   | Duolingo - Language Lessons      | Education          |\n"
  },
  {
    "path": "docs/examples/groq.md",
    "content": "---\ntitle: Groq AI Integration - Fast Structured Outputs\ndescription: Use Groq AI with Instructor for fast structured outputs. Leverage Groq's high-speed inference for real-time structured data extraction.\n---\n\n# Structured Outputs using Groq\nInstead of using openai or antrophic you can now also use groq for inference by using from_groq.\n\nThe examples are using mixtral-8x7b model.\n\n## GroqCloud API\nTo use groq you need to obtain a groq API key.\nGoto [groqcloud](https://console.groq.com) and login. Select API Keys from the left menu and then select Create API key to create a new key.\n\n## Use example\nSome pip packages need to be installed to use the example:\n```\npip install instructor groq pydantic openai anthropic\n```\nYou need to export the groq API key:\n```\nexport GROQ_API_KEY=<your-api-key>\n```\n\nAn example:\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\nimport instructor\n\n\nclass Character(BaseModel):\n    name: str\n    fact: List[str] = Field(..., description=\"A list of facts about the subject\")\n\n\n# Use from_provider for simplified setup\nclient = instructor.from_provider(\"groq/mixtral-8x7b-32768\", mode=instructor.Mode.TOOLS)\n\nresp = client.create(\n    model=\"mixtral-8x7b-32768\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Tell me about the company Tesla\",\n        }\n    ],\n    response_model=Character,\n)\nprint(resp.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"Tesla\",\n  \"fact\": [\n    \"electric vehicle manufacturer\",\n    \"solar panel producer\",\n    \"based in Palo Alto, California\",\n    \"founded in 2003 by Elon Musk\"\n  ]\n}\n\"\"\"\n```\nYou can find another example called groq_example2.py under examples/groq of this repository.\n"
  },
  {
    "path": "docs/examples/image_to_ad_copy.md",
    "content": "---\ntitle: Automatically Generate Advertising Copy from Product Images Using GPT-4 Vision\ndescription: Learn how to use GPT-4 Vision API to create engaging advertising copy from product images, ideal for e-commerce and marketing teams.\n---\n\n# Use Vision API to detect products and generate advertising copy\n\nThis post demonstrates how to use GPT-4 Vision API and the Chat API to automatically generate advertising copy from product images. This method can be useful for marketing and advertising teams, as well as for e-commerce platforms.\n\nThe full code is available on [GitHub](https://www.github.com/jxnl/instructor/tree/main/examples/vision/image_to_ad_copy.py).\n\n## Building the models\n\n### Product\n\nFor the `Product` model, we define a class that represents a product extracted from an image and store the name, key features, and description. The product attributes are dynamically determined based on the content of the image.\n\nNote that it is easy to add [Validators](https://jxnl.github.io/instructor/concepts/reask_validation/) and other Pydantic features to the model to ensure that the data is valid and consistent.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\n\n\nclass Product(BaseModel):\n    \"\"\"\n    Represents a product extracted from an image using AI.\n\n    The product attributes are dynamically determined based on the content\n    of the image and the AI's interpretation. This class serves as a structured\n    representation of the identified product characteristics.\n    \"\"\"\n\n    name: str = Field(\n        description=\"A generic name for the product.\", example=\"Headphones\"\n    )\n    key_features: Optional[List[str]] = Field(\n        description=\"A list of key features of the product that stand out.\",\n        default=None,\n    )\n\n    description: Optional[str] = Field(\n        description=\"A description of the product.\",\n        default=None,\n    )\n\n    # Can be customized and automatically generated\n    def generate_prompt(self):\n        prompt = f\"Product: {self.name}\\n\"\n        if self.description:\n            prompt += f\"Description: {self.description}\\n\"\n        if self.key_features:\n            prompt += f\"Key Features: {', '.join(self.key_features)}\\n\"\n        return prompt\n```\n\n### Identified Product\n\nWe also define a class that represents a list of products identified in the images. We also add an error flag and message to indicate if there was an error in the processing of the image.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Optional, List\n\n\n# <%hide%>\nclass Product(BaseModel):\n    \"\"\"\n    Represents a product extracted from an image using AI.\n\n    The product attributes are dynamically determined based on the content\n    of the image and the AI's interpretation. This class serves as a structured\n    representation of the identified product characteristics.\n    \"\"\"\n\n    name: str = Field(\n        description=\"A generic name for the product.\", example=\"Headphones\"\n    )\n    key_features: Optional[List[str]] = Field(\n        description=\"A list of key features of the product that stand out.\",\n        default=None,\n    )\n\n    description: Optional[str] = Field(\n        description=\"A description of the product.\",\n        default=None,\n    )\n\n    # Can be customized and automatically generated\n    def generate_prompt(self):\n        prompt = f\"Product: {self.name}\\n\"\n        if self.description:\n            prompt += f\"Description: {self.description}\\n\"\n        if self.key_features:\n            prompt += f\"Key Features: {', '.join(self.key_features)}\\n\"\n        return prompt\n\n\n# <%hide%>\nclass IdentifiedProduct(BaseModel):\n    \"\"\"\n    Represents a list of products identified in the images.\n    \"\"\"\n\n    products: Optional[List[Product]] = Field(\n        description=\"A list of products identified by the AI.\",\n        example=[\n            Product(\n                name=\"Headphones\",\n                description=\"Wireless headphones with noise cancellation.\",\n                key_features=[\"Wireless\", \"Noise Cancellation\"],\n            )\n        ],\n        default=None,\n    )\n\n    error: bool = Field(default=False)\n    message: Optional[str] = Field(default=None)\n```\n\n### Advertising Copy\n\nFinally, the `AdCopy` models stores the output in a structured format with a headline and the text.\n\n```python\nfrom pydantic import BaseModel, Field\n\n\nclass AdCopy(BaseModel):\n    \"\"\"\n    Represents a generated ad copy.\n    \"\"\"\n\n    headline: str = Field(\n        description=\"A short, catchy, and memorable headline for the given product. The headline should invoke curiosity and interest in the product.\",\n    )\n    ad_copy: str = Field(\n        description=\"A long-form advertisement copy for the given product. This will be used in campaigns to promote the product with a persuasive message and a call-to-action with the objective of driving sales.\",\n    )\n    name: str = Field(description=\"The name of the product being advertised.\")\n```\n\n## Calling the API\n\n### Product Detection\n\nThe `read_images` function uses OpenAI's vision model to process a list of image URLs and identify products in each of them. We utilize the `instructor` library to patch the OpenAI client for this purpose.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import Optional, List\n\n\nclass Product(BaseModel):\n    \"\"\"\n    Represents a product extracted from an image using AI.\n\n    The product attributes are dynamically determined based on the content\n    of the image and the AI's interpretation. This class serves as a structured\n    representation of the identified product characteristics.\n    \"\"\"\n\n    name: str = Field(\n        description=\"A generic name for the product.\", example=\"Headphones\"\n    )\n    key_features: Optional[List[str]] = Field(\n        description=\"A list of key features of the product that stand out.\",\n        default=None,\n    )\n\n    description: Optional[str] = Field(\n        description=\"A description of the product.\",\n        default=None,\n    )\n\n    # Can be customized and automatically generated\n    def generate_prompt(self):\n        prompt = f\"Product: {self.name}\\n\"\n        if self.description:\n            prompt += f\"Description: {self.description}\\n\"\n        if self.key_features:\n            prompt += f\"Key Features: {', '.join(self.key_features)}\\n\"\n        return prompt\n\n\nclass IdentifiedProduct(BaseModel):\n    \"\"\"\n    Represents a list of products identified in the images.\n    \"\"\"\n\n    products: Optional[List[Product]] = Field(\n        description=\"A list of products identified by the AI.\",\n        example=[\n            Product(\n                name=\"Headphones\",\n                description=\"Wireless headphones with noise cancellation.\",\n                key_features=[\"Wireless\", \"Noise Cancellation\"],\n            )\n        ],\n        default=None,\n    )\n\n    error: bool = Field(default=False)\n    message: Optional[str] = Field(default=None)\n\n\n# <%hide%>\ndef read_images(image_urls: list[str]) -> IdentifiedProduct:\n    \"\"\"\n    Given a list of image URLs, identify the products in the images.\n    \"\"\"\n\n    logger.info(f\"Identifying products in images... {len(image_urls)} images\")\n\n    return client_image.create(\n        response_model=IdentifiedProduct,\n        max_tokens=1024,  # can be changed\n        temperature=0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Identify products using the given images and generate key features for each product.\",\n                    },\n                    *[\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": url}}\n                        for url in image_urls\n                    ],\n                ],\n            }\n        ],\n    )\n```\n\nThis gives us a list of products identified in all the images.\n\n### Generate advertising copy\n\nThen, we can use the `generate_ad_copy` function to generate advertising copy for each of the products identified in the images.\n\nTwo clients are defined for the two different models. This is because the `gpt-4-vision-preview` model is not compatible with the `gpt-4-1106-preview` model in terms of their response format.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\n\n\nclass Product(BaseModel):\n    \"\"\"\n    Represents a product extracted from an image using AI.\n\n    The product attributes are dynamically determined based on the content\n    of the image and the AI's interpretation. This class serves as a structured\n    representation of the identified product characteristics.\n    \"\"\"\n\n    name: str = Field(\n        description=\"A generic name for the product.\", example=\"Headphones\"\n    )\n    key_features: Optional[List[str]] = Field(\n        description=\"A list of key features of the product that stand out.\",\n        default=None,\n    )\n\n    description: Optional[str] = Field(\n        description=\"A description of the product.\",\n        default=None,\n    )\n\n    # Can be customized and automatically generated\n    def generate_prompt(self):\n        prompt = f\"Product: {self.name}\\n\"\n        if self.description:\n            prompt += f\"Description: {self.description}\\n\"\n        if self.key_features:\n            prompt += f\"Key Features: {', '.join(self.key_features)}\\n\"\n        return prompt\n\n\nclass AdCopy(BaseModel):\n    \"\"\"\n    Represents a generated ad copy.\n    \"\"\"\n\n    headline: str = Field(\n        description=\"A short, catchy, and memorable headline for the given product. The headline should invoke curiosity and interest in the product.\",\n    )\n    ad_copy: str = Field(\n        description=\"A long-form advertisement copy for the given product. This will be used in campaigns to promote the product with a persuasive message and a call-to-action with the objective of driving sales.\",\n    )\n    name: str = Field(description=\"The name of the product being advertised.\")\n\n\n# <%hide%>\ndef generate_ad_copy(product: Product) -> AdCopy:\n    \"\"\"\n    Given a product, generate an ad copy for the product.\n    \"\"\"\n\n    logger.info(f\"Generating ad copy for product: {product.name}\")\n\n    return client_copy.create(\n        response_model=AdCopy,\n        temperature=0.3,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are an expert marketing assistant for all products. Your task is to generate an advertisement copy for a product using the name, description, and key features.\",\n            },\n            {\"role\": \"user\", \"content\": product.generate_prompt()},\n        ],\n    )\n```\n\n### Putting it all together\n\nFinally, we can put it all together in a single function that takes a list of image URLs and generates advertising copy for the products identified in the images. Please refer to the [full code](https://www.github.com/jxnl/instructor/tree/main/examples/vision/image_to_ad_copy.py) for the complete implementation.\n\n## Input file\n\nThe input file is currently a list of image URLs, but this trivial to change to any required format.\n\n```plaintext\nhttps://contents.mediadecathlon.com/p1279823/9a1c59ad97a4084a346c014740ae4d3ff860ea70b485ee65f34017ff5e9ae5f7/recreational-ice-skates-fit-50-black.jpg?format=auto\nhttps://contents.mediadecathlon.com/p1279822/a730505231dbd6747c14ee93e8f89e824d3fa2a5b885ec26de8d7feb5626638a/recreational-ice-skates-fit-50-black.jpg?format=auto\nhttps://contents.mediadecathlon.com/p2329893/1ed75517602a5e00245b89ab6a1c6be6d8968a5a227c932b10599f857f3ed4cd/mens-hiking-leather-boots-sh-100-x-warm.jpg?format=auto\nhttps://contents.mediadecathlon.com/p2047870/8712c55568dd9928c83b19c6a4067bf161811a469433dc89244f0ff96a50e3e9/men-s-winter-hiking-boots-sh-100-x-warm-grey.jpg?format=auto\n```\n\n??? Note \"Expand to see the output\"\n\n    ![Recreational ice skates product image for ad copy generation](https://contents.mediadecathlon.com/p1279823/9a1c59ad97a4084a346c014740ae4d3ff860ea70b485ee65f34017ff5e9ae5f7/recreational-ice-skates-fit-50-black.jpg?format=auto)\n    ![Men's hiking leather boots product image for ad copy generation](https://contents.mediadecathlon.com/p2329893/1ed75517602a5e00245b89ab6a1c6be6d8968a5a227c932b10599f857f3ed4cd/mens-hiking-leather-boots-sh-100-x-warm.jpg?format=auto)\n\n    ```json\n    {\n        \"products\":\n        [\n            {\n                \"name\": \"Ice Skates\",\n                \"key_features\": [\n                    \"Lace-up closure\",\n                    \"Durable blade\",\n                    \"Ankle support\"\n                ],\n                \"description\": \"A pair of ice skates with lace-up closure for secure fit, durable blade for ice skating, and reinforced ankle support.\"\n            },\n            {\n                \"name\": \"Hiking Boots\",\n                \"key_features\": [\n                    \"High-top design\",\n                    \"Rugged outsole\",\n                    \"Water-resistant\"\n                ],\n                \"description\": \"Sturdy hiking boots featuring a high-top design for ankle support, rugged outsole for grip on uneven terrain, and water-resistant construction.\"\n            },\n            {\n                \"name\": \"Winter Boots\",\n                \"key_features\": [\n                    \"Insulated lining\",\n                    \"Waterproof lower\",\n                    \"Slip-resistant sole\"\n                ],\n                \"description\": \"Warm winter boots with insulated lining for cold weather, waterproof lower section to keep feet dry, and a slip-resistant sole for stability.\"\n            }\n        ],\n        \"ad_copies\": [\n            {\n                \"headline\": \"Glide with Confidence - Discover the Perfect Ice Skates!\",\n                \"ad_copy\": \"Step onto the ice with poise and precision with our premium Ice Skates. Designed for both beginners and seasoned skaters, these skates offer a perfect blend of comfort and performance. The lace-up closure ensures a snug fit that keeps you stable as you carve through the ice. With a durable blade that withstands the test of time, you can focus on perfecting your moves rather than worrying about your equipment. The reinforced ankle support provides the necessary protection and aids in preventing injuries, allowing you to skate with peace of mind. Whether you're practicing your spins, jumps, or simply enjoying a leisurely glide across the rink, our Ice Skates are the ideal companion for your ice adventures. Lace up and get ready to experience the thrill of ice skating like never before!\",\n                \"name\": \"Ice Skates\"\n            },\n            {\n                \"headline\": \"Conquer Every Trail with Confidence!\",\n                \"ad_copy\": \"Embark on your next adventure with our top-of-the-line Hiking Boots! Designed for the trail-blazing spirits, these boots boast a high-top design that provides unparalleled ankle support to keep you steady on any path. The rugged outsole ensures a firm grip on the most uneven terrains, while the water-resistant construction keeps your feet dry as you traverse through streams and muddy trails. Whether you're a seasoned hiker or just starting out, our Hiking Boots are the perfect companion for your outdoor escapades. Lace up and step into the wild with confidence - your journey awaits!\",\n                \"name\": \"Hiking Boots\"\n            },\n            {\n                \"headline\": \"Conquer the Cold with Comfort!\",\n                \"ad_copy\": \"Step into the season with confidence in our Winter Boots, the ultimate ally against the chill. Designed for those who don't let the cold dictate their moves, these boots feature an insulated lining that wraps your feet in a warm embrace, ensuring that the biting cold is a worry of the past. But warmth isn't their only virtue. With a waterproof lower section, your feet will remain dry and cozy, come rain, snow, or slush. And let's not forget the slip-resistant sole that stands between you and the treacherous ice, offering stability and peace of mind with every step you take. Whether you're braving a blizzard or just nipping out for a coffee, our Winter Boots are your trusty companions, keeping you warm, dry, and upright. Don't let winter slow you down. Lace up and embrace the elements!\",\n                \"name\": \"Winter Boots\"\n            }\n        ]\n    }\n    ```\n"
  },
  {
    "path": "docs/examples/index.md",
    "content": "---\ntitle: Instructor Cookbook Collection\ndescription: Practical examples and recipes for solving real-world problems with structured outputs\n---\n\n# Instructor Cookbooks\n\n<div class=\"grid cards\" markdown>\n\n- :material-text-box-multiple: **Text Processing**\n\n    Extract structured information from text documents\n\n    [:octicons-arrow-right-16: View Recipes](#text-processing)\n\n- :material-image: **Multi-Modal**\n\n    Work with images and other media types\n\n    [:octicons-arrow-right-16: View Recipes](#multi-modal-examples)\n\n- :material-database: **Data Tools**\n\n    Integrate with databases and data processing tools\n\n    [:octicons-arrow-right-16: View Recipes](#data-tools)\n\n- :material-server: **Deployment**\n\n    Options for local and cloud deployment\n\n    [:octicons-arrow-right-16: View Recipes](#deployment-options)\n\n</div>\n\nOur cookbooks demonstrate how to use Instructor to solve real-world problems with structured outputs. Each example includes complete code and explanations to help you implement similar solutions in your own projects.\n\n## Text Processing\n\n### Classification Examples\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Single Classification](single_classification.md) | Basic classification with a single category | Content categorization |\n| [Multiple Classification](multiple_classification.md) | Handling multiple classification categories | Multi-label document tagging |\n| [Enum-Based Classification](classification.md) | Using Python enums for structured classification | Standardized taxonomies |\n| [Batch Classification](bulk_classification.md) | Process multiple items efficiently | High-volume text processing |\n| [Batch Classification with LangSmith](batch_classification_langsmith.md) | Using LangSmith for batch processing | Performance monitoring |\n| [Local Classification](local_classification.md) | Classification without external APIs | Offline processing |\n\n### Information Extraction\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Entity Resolution](entity_resolution.md) | Identify and disambiguate entities | Name standardization |\n| [Contact Information](extract_contact_info.md) | Extract structured contact details | CRM data entry |\n| [PII Sanitization](pii.md) | Detect and redact sensitive information | Privacy compliance |\n| [Citation Extraction](exact_citations.md) | Accurately extract formatted citations | Academic research |\n| [Action Items](action_items.md) | Extract tasks from text | Meeting follow-ups |\n| [Search Query Processing](search.md) | Structure complex search queries | Search enhancement |\n\n### Document Processing\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Document Segmentation](document_segmentation.md) | Divide documents into meaningful sections | Long-form content analysis |\n| [Planning and Tasks](planning-tasks.md) | Break down complex queries into subtasks | Project management |\n| [Knowledge Graph Generation](knowledge_graph.md) | Create relationship graphs from text | Information visualization |\n| [Knowledge Graph Building](../examples/building_knowledge_graphs.md) | Build and query knowledge graphs | Semantic data modeling |\n| [Chain of Density](../tutorials/6-chain-of-density.ipynb) | Implement iterative summarization | Content distillation |\n\n## Multi-Modal Examples\n\n### Vision Processing\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Table Extraction](tables_from_vision.md) | Convert image tables to structured data | Data entry automation |\n| [Table Extraction with GPT-4](extracting_tables.md) | Advanced table extraction | Complex table processing |\n| [Receipt Information](extracting_receipts.md) | Extract data from receipt images | Expense management |\n| [Slide Content Extraction](extract_slides.md) | Convert slides to structured text | Presentation analysis |\n| [Image to Ad Copy](image_to_ad_copy.md) | Generate ad text from images | Marketing automation |\n| [YouTube Clip Analysis](youtube_clips.md) | Extract info from video clips | Content moderation |\n\n### Multi-Modal Processing\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Gemini Multi-Modal](multi_modal_gemini.md) | Process text, images, and other data | Mixed-media analysis |\n\n## Data Tools\n\n### Database Integration\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [SQLModel Integration](sqlmodel.md) | Store AI-generated data in SQL databases | Persistent storage |\n| [Pandas DataFrame](pandas_df.md) | Work with structured data in Pandas | Data analysis |\n\n### Streaming and Processing\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Partial Response Streaming](partial_streaming.md) | Stream partial results in real-time | Interactive applications |\n| [Self-Critique and Correction](self_critique.md) | Implement self-assessment | Quality improvement |\n\n### API Integration\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Content Moderation](moderation.md) | Implement content filtering | Trust & safety |\n| [Cost Optimization with Batch API](batch_job_oai.md) | Reduce API costs | Production efficiency |\n| [Few-Shot Learning](examples.md) | Use contextual examples in prompts | Performance tuning |\n\n### Observability & Tracing\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Langfuse Tracing](tracing_with_langfuse.md) | Open-source LLM engineering | Observability & Debugging\n\n## Deployment Options\n\n### Model Providers\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Groq Cloud API](groq.md) | High-performance inference | Low-latency applications |\n| [Mistral/Mixtral Models](mistral.md) | Open-source model integration | Cost-effective deployment |\n| [IBM watsonx.ai](watsonx.md) | Enterprise AI platform | Business applications |\n\n### Local Deployment\n\n| Example | Description | Use Case |\n|---------|-------------|----------|\n| [Ollama Integration](ollama.md) | Local open-source models | Privacy-focused applications |\n\n## Stay Updated\n\nSubscribe to our newsletter for updates on new features and usage tips:\n\n<iframe src=\"https://embeds.beehiiv.com/2faf420d-8480-4b6e-8d6f-9c5a105f917a?slim=true\" data-test-id=\"beehiiv-embed\" height=\"52\" frameborder=\"0\" scrolling=\"no\" style=\"margin: 0; border-radius: 0px !important; background-color: transparent;\"></iframe>\n\nLooking for more structured learning? Check out our [Tutorial series](../tutorials/index.md) for step-by-step guides.\n"
  },
  {
    "path": "docs/examples/knowledge_graph.md",
    "content": "---\ntitle: 'Visualizing Knowledge Graphs: A Guide to Complex Topics'\ndescription: Learn how to create and update knowledge graphs using Python, OpenAI's API, Pydantic, and Graphviz for enhanced understanding of complex subjects.\n---\n\n# Visualizing Knowledge Graphs for Complex Topics\n\nIn this guide, you'll discover how to visualise a detailed knowledge graph when dealing with complex topics. We'll then move on to iteratively updating our knowledge graph with new information through a series of sequential api calls using only the Instructor library, Pydantic and Graphviz to visualise our graph.\n\n!!! tips \"Motivation\"\n\n    Knowledge graphs offer a visually appealing and coherent way to understand complicated topics like quantum mechanics. By generating these graphs automatically, you can accelerate the learning process and make it easier to digest complex information.\n\n## Defining the Structures\n\nLet's model a knowledge graph with **`Node`** and **`Edge`** objects. **`Node`** objects represent key concepts or entities, while **`Edge`** objects indicate the relationships between them.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Node(BaseModel, frozen=True):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel, frozen=True):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: List[Node] = Field(..., default_factory=list)\n    edges: List[Edge] = Field(..., default_factory=list)\n```\n\n## Generating Knowledge Graphs\n\nThe **`generate_graph`** function leverages OpenAI's API to generate a knowledge graph based on the input query.\n\n```python hl_lines=\"8\"\nimport instructor\n\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Node(BaseModel, frozen=True):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel, frozen=True):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: List[Node] = Field(..., default_factory=list)\n    edges: List[Edge] = Field(..., default_factory=list)\n\n\n# <%hide%>\n\n# Adds response_model to ChatCompletion\n# Allows the return of Pydantic model rather than raw JSON\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_graph(input) -> KnowledgeGraph:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Help me understand the following by describing it as a detailed knowledge graph: {input}\",\n            }\n        ],\n        response_model=KnowledgeGraph,\n    )  # type: ignore\n```\n\n## Visualizing the Graph\n\nThe **`visualize_knowledge_graph`** function uses the Graphviz library to render the generated knowledge graph.\n\n```python\nfrom graphviz import Digraph\n\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List\nimport instructor\n\n\nclass Node(BaseModel, frozen=True):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel, frozen=True):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: List[Node] = Field(..., default_factory=list)\n    edges: List[Edge] = Field(..., default_factory=list)\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_graph(input) -> KnowledgeGraph:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Help me understand the following by describing it as a detailed knowledge graph: {input}\",\n            }\n        ],\n        response_model=KnowledgeGraph,\n    )  # type: ignore\n\n\n# <%hide%>\n\n\ndef visualize_knowledge_graph(kg: KnowledgeGraph):\n    dot = Digraph(comment=\"Knowledge Graph\")\n\n    # Add nodes\n    for node in kg.nodes:\n        dot.node(str(node.id), node.label, color=node.color)\n\n    # Add edges\n    for edge in kg.edges:\n        dot.edge(str(edge.source), str(edge.target), label=edge.label, color=edge.color)\n\n    # Render the graph\n    dot.render(\"knowledge_graph.gv\", view=True)\n\n\ngraph = generate_graph(\"Teach me about quantum mechanics\")\nvisualize_knowledge_graph(graph)\n```\n\n![Knowledge Graph visualization showing interconnected concepts and relationships](knowledge_graph.png)\n\nThis will produce a visual representation of the knowledge graph, stored as \"knowledge_graph.gv\". You can open this file to explore the key concepts and their relationships in quantum mechanics.\n\n## Iterative Updates\n\nNow that we've seen how to generate a knowledge graph from a single input, let's see how we can iteratively update our knowledge graph with new information, or when information does not fit into a single prompt.\n\nLet's take an easy example where we want to visualise the combined knowledge graph that the following sentences represent.\n\n```python\ntext_chunks = [\n    \"Jason knows a lot about quantum mechanics. He is a physicist. He is a professor\",\n    \"Professors are smart.\",\n    \"Sarah knows Jason and is a student of his.\",\n    \"Sarah is a student at the University of Toronto. and UofT is in Canada\",\n]\n```\n\n### Updating Our Data Model\n\nTo support our new iterative approach, we need to update our data model. We can do this by adding helper methods `update` and `draw` to our Pydantic models. These methods will simplify our code and allow us to easily visualize the knowledge graph.\n\nIn the `KnowledgeGraph` class, we have migrated the code from the `visualize_knowledge_graph` method and added new lists for nodes and edges.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\n\n\nclass Node(BaseModel, frozen=True):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel, frozen=True):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: Optional[List[Node]] = Field(..., default_factory=list)\n    edges: Optional[List[Edge]] = Field(..., default_factory=list)\n\n    def update(self, other: \"KnowledgeGraph\") -> \"KnowledgeGraph\":\n        \"\"\"Updates the current graph with the other graph, deduplicating nodes and edges.\"\"\"\n        return KnowledgeGraph(\n            nodes=list(set(self.nodes + other.nodes)),\n            edges=list(set(self.edges + other.edges)),\n        )\n\n    def draw(self, prefix: str = None):\n        dot = Digraph(comment=\"Knowledge Graph\")\n\n        for node in self.nodes:  # (1)!\n            dot.node(str(node.id), node.label, color=node.color)\n\n        for edge in self.edges:  # (2)!\n            dot.edge(\n                str(edge.source), str(edge.target), label=edge.label, color=edge.color\n            )\n        dot.render(prefix, format=\"png\", view=True)\n```\n\n1. We iterate through all the nodes in our graph and add them to the graph\n2. We iterate through all the edges in our graph and add them to the graph\n\nWe can modify our `generate_graph` function to now take in a list of strings. At each step, it'll extract out the key insights from the sentences in the form of edges and nodes like we've seen before. We can then combine these new edges and nodes with our existing knowledge graph through iterative updates to our graph before arriving at our final result.\n\n```python hl_lines=\"2 21-25 31-32\"\nfrom typing import List\n\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\n\n\nclass Node(BaseModel, frozen=True):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel, frozen=True):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: Optional[List[Node]] = Field(..., default_factory=list)\n    edges: Optional[List[Edge]] = Field(..., default_factory=list)\n\n    def update(self, other: \"KnowledgeGraph\") -> \"KnowledgeGraph\":\n        \"\"\"Updates the current graph with the other graph, deduplicating nodes and edges.\"\"\"\n        return KnowledgeGraph(\n            nodes=list(set(self.nodes + other.nodes)),\n            edges=list(set(self.edges + other.edges)),\n        )\n\n    def draw(self, prefix: str = None):\n        dot = Digraph(comment=\"Knowledge Graph\")\n\n        for node in self.nodes:  # (1)!\n            dot.node(str(node.id), node.label, color=node.color)\n\n        for edge in self.edges:  # (2)!\n            dot.edge(\n                str(edge.source), str(edge.target), label=edge.label, color=edge.color\n            )\n        dot.render(prefix, format=\"png\", view=True)\n\n\n# <%hide%>\n\n\ndef generate_graph(input: List[str]) -> KnowledgeGraph:\n    cur_state = KnowledgeGraph()  # (1)!\n    num_iterations = len(input)\n    for i, inp in enumerate(input):\n        new_updates = client.create(\n            model=\"gpt-3.5-turbo-16k\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"You are an iterative knowledge graph builder.\n                    You are given the current state of the graph, and you must append the nodes and edges\n                    to it Do not procide any duplcates and try to reuse nodes as much as possible.\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Extract any new nodes and edges from the following:\n                    # Part {i}/{num_iterations} of the input:\n\n                    {inp}\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Here is the current state of the graph:\n                    {cur_state.model_dump_json(indent=2)}\"\"\",\n                },  # (2)!\n            ],\n            response_model=KnowledgeGraph,\n        )  # type: ignore\n\n        # Update the current state\n        cur_state = cur_state.update(new_updates)  # (3)!\n        cur_state.draw(prefix=f\"iteration_{i}\")\n    return cur_state\n```\n\n1.  We first initialise an empty `KnowledgeGraph`. In this state, it has zero nodes and edges\n\n2.  We then add in the current state of the graph into the prompt so that the model knows what new information needs to be added\n\n3.  We then update the nodes and edges of our graph with the information that our model has returned before visualizing the new changes\n\nOnce we've done this, we can now run this new `generate_graph` function with the following two lines.\n\n```python\n# <%hide%>\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\nimport instructor\nfrom graphviz import Digraph\n\n\nclass Node(BaseModel, frozen=True):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel, frozen=True):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: Optional[List[Node]] = Field(..., default_factory=list)\n    edges: Optional[List[Edge]] = Field(..., default_factory=list)\n\n    def update(self, other: \"KnowledgeGraph\") -> \"KnowledgeGraph\":\n        \"\"\"Updates the current graph with the other graph, deduplicating nodes and edges.\"\"\"\n        return KnowledgeGraph(\n            nodes=list(set(self.nodes + other.nodes)),\n            edges=list(set(self.edges + other.edges)),\n        )\n\n    def draw(self, prefix: str = None):\n        dot = Digraph(comment=\"Knowledge Graph\")\n\n        for node in self.nodes:  # (1)!\n            dot.node(str(node.id), node.label, color=node.color)\n\n        for edge in self.edges:  # (2)!\n            dot.edge(\n                str(edge.source), str(edge.target), label=edge.label, color=edge.color\n            )\n        dot.render(prefix, format=\"png\", view=True)\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_graph(input: List[str]) -> KnowledgeGraph:\n    cur_state = KnowledgeGraph()  # (1)!\n    num_iterations = len(input)\n    for i, inp in enumerate(input):\n        new_updates = client.create(\n            model=\"gpt-4o-mini\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"You are an iterative knowledge graph builder.\n                    You are given the current state of the graph, and you must append the nodes and edges\n                    to it Do not procide any duplcates and try to reuse nodes as much as possible.\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Extract any new nodes and edges from the following:\n                    # Part {i}/{num_iterations} of the input:\n\n                    {inp}\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Here is the current state of the graph:\n                    {cur_state.model_dump_json(indent=2)}\"\"\",\n                },  # (2)!\n            ],\n            response_model=KnowledgeGraph,\n        )  # type: ignore\n\n        # Update the current state\n        cur_state = cur_state.update(new_updates)  # (3)!\n        cur_state.draw(prefix=f\"iteration_{i}\")\n    return cur_state\n\n\n# <%hide%>\ntext_chunks = [\n    \"Jason knows a lot about quantum mechanics. He is a physicist. He is a professor\",\n    \"Professors are smart.\",\n    \"Sarah knows Jason and is a student of his.\",\n    \"Sarah is a student at the University of Toronto. and UofT is in Canada\",\n]\ngraph: KnowledgeGraph = generate_graph(text_chunks)\ngraph.draw(prefix=\"final\")\n```\n\n## Conclusion\n\nWe've seen how we can use `Instructor` to obtain structured outputs from the OpenAI LLM API but you could use that for any of the other open-source models that the library is compatible with. If you enjoy the content or want to try out `Instructor` check out the [github](https://github.com/jxnl/instructor) and don't forget to give us a star!\n"
  },
  {
    "path": "docs/examples/local_classification.md",
    "content": "---\ntitle: Classifying Confidential Data with Local AI Models\ndescription: Learn to classify private documents securely using Llama-cpp-python with instructor while maintaining data privacy and local infrastructure.\n---\n\n# Leveraging Local Models for Classifying Private Data\n\nIn this article, we'll show you how to use Llama-cpp-python with instructor for classification. This is a perfect use-case for users who want to ensure that confidential documents are handled securely without ever leaving your own infrastructure.\n\n## Setup\n\nLet's start by installing the required libraries in your local python environment. This might take a while since we'll need to build and compile `llama-cpp` for your specific environment.\n\n```bash\npip install instructor pydantic\n```\n\nNext, we'll install `llama-cpp-python` which is a python package that allows us to use llama-cpp with our python scripts.\n\nFor this tutorial, we'll be using `Mistral-7B-Instruct-v0.2-GGUF` by `TheBloke` to do our function calls. This will require around 6GB of RAM and a GPU.\n\nWe can install the package by running the following command\n\n```bash\nCMAKE_ARGS=\"-DGGML_CUDA=on\" pip install llama-cpp-python\n```\n\n!!! note \"Don't have a GPU?\"\n\n    If you don't have a GPU, we recommend using the `Qwen2-0.5B-Instruct` model instead and compiling llama-cpp-python to use `OpenBLAS`. This allows you to run the program using your CPU instead.\n\n    You can compile `llama-cpp-python` with `OpenBLAS` support by running the command\n\n    ```bash\n    CMAKE_ARGS=\"-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS\" pip install llama-cpp-python\n    ```\n\n## Using `LLama-cpp-python`\n\nHere's an example of how to implement a system for handling confidential document queries using local models:\n\n```python hl_lines=\"7-12 14-16 43-52\"\nfrom llama_cpp import Llama  # type: ignore\nimport instructor\nfrom pydantic import BaseModel\nfrom enum import Enum\nfrom typing import Optional\n\nllm = Llama.from_pretrained(  # type: ignore\n    repo_id=\"TheBloke/Mistral-7B-Instruct-v0.2-GGUF\",  # (1)!\n    filename=\"*Q4_K_M.gguf\",\n    verbose=False,  # (2)!\n    n_gpu_layers=-1,  # (3)!\n)\n\ncreate = instructor.patch(\n    create=llm.create_chat_completion_openai_v1,  # type: ignore  # (4)!\n)\n\n\n# Define query types for document-related inquiries\nclass QueryType(str, Enum):\n    DOCUMENT_CONTENT = \"document_content\"\n    LAST_MODIFIED = \"last_modified\"\n    ACCESS_PERMISSIONS = \"access_permissions\"\n    RELATED_DOCUMENTS = \"related_documents\"\n\n\n# Define the structure for query responses\nclass QueryResponse(BaseModel):\n    query_type: QueryType\n    response: str\n    additional_info: Optional[str] = None\n\n\ndef process_confidential_query(query: str) -> QueryResponse:\n    prompt = f\"\"\"Analyze the following confidential document query and provide an appropriate response:\n    Query: {query}\n\n    Determine the type of query (document content, last modified, access permissions, or related documents),\n    provide a response, and include a confidence score and any additional relevant information.\n    Remember, you're handling confidential data, so be cautious about specific details.\n    \"\"\"\n\n    return create(\n        response_model=QueryResponse,  # (5)!\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a secure AI assistant trained to handle confidential document queries.\",\n            },\n            {\"role\": \"user\", \"content\": prompt},\n        ],\n    )\n\n\n# Sample confidential document queries\nconfidential_queries = [\n    \"What are the key findings in the Q4 financial report?\",\n    \"Who last accessed the merger proposal document?\",\n    \"What are the access permissions for the new product roadmap?\",\n    \"Are there any documents related to Project X's budget forecast?\",\n    \"When was the board meeting minutes document last updated?\",\n]\n\n# Process each query and print the results\nfor query in confidential_queries:\n    response: QueryResponse = process_confidential_query(query)\n    print(f\"{query} : {response.query_type}\")\n    \"\"\"\n    #> What are the key findings in the Q4 financial report? : document_content\n    #> Who last accessed the merger proposal document? : access_permissions\n    #> What are the access permissions for the new product roadmap? : access_permissions\n    #> Are there any documents related to Project X's budget forecast? : document_content\n    #> When was the board meeting minutes document last updated? : last_modified\n    \"\"\"\n```\n\n1. We load in the model from Hugging Face and cache it locally. This makes it quick and easy for us to experiment with different model configurations and types.\n\n2. We can set `verbose` to be `True` to log out all of the output from `llama.cpp`. This helps if you're trying to debug specific issues\n\n3. If you have a GPU with limited memory, set `n_gpu` to a lower number (Eg. 10 ). We've set it here to `-1` so that all of the model layers are loaded on the GPU by default.\n\n4. Now make sure to patch the client with the `create_chat_completion_openai_v1` api which is OpenAI compatible\n\n5. Pass in the response model as a parameter just like any other inference client we support\n\n## Conclusion\n\n`instructor` provides a robust solution for organizations needing to handle confidential document queries locally. By processing these queries on your own hardware, you can leverage advanced AI capabilities while maintaining the highest standards of data privacy and security.\n\nBut this goes far beyond just simple confidential documents, using local models unlocks a whole new world of interesting use-cases, fine-tuned specialist models and more!\n"
  },
  {
    "path": "docs/examples/mistral.md",
    "content": "---\ntitle: Using MistralAI for Structured Outputs\ndescription: Learn how to use MistralAI models for inference, including setup, API key generation, and example code.\n---\n\n# Structured Outputs using Mistral\nYou can use MistralAI models for inference with Instructor using `from_provider`.\n\nThe examples use `mistral-large-latest`.\n\n## MistralAI API\nTo use mistral you need to obtain a mistral API key.\nGoto [mistralai](https://mistral.ai/) click on Build Now and login. Select API Keys from the left menu and then select\nCreate API key to create a new key.\n\n## Use example\nSome pip packages need to be installed to use the example:\n```\npip install instructor mistralai pydantic\n```\nYou need to export the mistral API key:\n```\nexport MISTRAL_API_KEY=<your-api-key>\n```\n\nAn example:\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\n# Using from_provider (recommended)\nclient = instructor.from_provider(\"mistral/mistral-large-latest\")\n\nresp = client.create(\n    response_model=UserDetails,\n    messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n    temperature=0,\n)\n\nprint(resp)\n#> name='Jason' age=10\n\n# output: UserDetails(name='Jason', age=10)\n```\n"
  },
  {
    "path": "docs/examples/moderation.md",
    "content": "---\ntitle: OpenAI Moderation Example for Content Compliance\ndescription: Learn how to use OpenAI's moderation endpoint to filter harmful content and ensure compliance with usage policies.\n---\n\n# OpenAI Moderation\n\nThis example uses OpenAI's moderation endpoint to check content compliance with OpenAI's usage policies. It can identify and filter harmful content that violates the policies.\n\nThe model flags content and classifies it into categories including hate, harassment, self-harm, sexual content, and violence. Each category has subcategories for detailed classification.\n\nThis validator is to be used for monitoring OpenAI API inputs and outputs, other use cases are currently [not allowed](https://platform.openai.com/docs/guides/moderation/overview).\n\n## Incorporating OpenAI moderation validator\n\nThe following code defines a function to validate content using OpenAI's Moderation endpoint. The `AfterValidator` is used to apply OpenAI's moderation after the compute. This moderation checks if the content complies with OpenAI's usage policies and flags any harmful content. Here's how it works:\n\n1. Generate the OpenAI client and patch it with the `instructor`. Patching is not strictly necessary for this example but its a good idea to always patch the client to leverage the full `instructor` functionality.\n\n2. Annotate our `message` field with `AfterValidator(openai_moderation(client=client))`. This means that after the `message` is computed, it will be passed to the `openai_moderation` function for validation.\n\n```python\nimport instructor\n\nfrom instructor import openai_moderation\n\nfrom typing_extensions import Annotated\nfrom pydantic import BaseModel, AfterValidator\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Response(BaseModel):\n    message: Annotated[str, AfterValidator(openai_moderation(client=client))]\n\n\ntry:\n    Response(message=\"I want to make them suffer the consequences\")\nexcept Exception as e:\n    print(e)\n    \"\"\"\n    1 validation error for Response\n    message\n      Value error, `I want to make them suffer the consequences` was flagged for violence [type=value_error, input_value='I want to make them suffer the consequences', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.9/v/value_error\n    \"\"\"\n\ntry:\n    Response(message=\"I want to hurt myself.\")\nexcept Exception as e:\n    print(e)\n    \"\"\"\n    1 validation error for Response\n    message\n      Value error, `I want to hurt myself.` was flagged for self_harm, self_harm_intent, self-harm, self-harm/intent [type=value_error, input_value='I want to hurt myself.', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.9/v/value_error\n    \"\"\"\n```\n"
  },
  {
    "path": "docs/examples/multi_modal_gemini.md",
    "content": "---\ntitle: Utilizing Gemini for Multi-Modal Data Processing with Audio Files\ndescription: Learn how to use Gemini with Google Generative AI to process audio files efficiently in multi-modal applications.\n---\n\n# Using Gemini with Multi Modal Data\n\nThis tutorial shows how to use `instructor` with `google-generativeai` to work with multi-modal data. In this example, we'll demonstrate three ways to work with audio files.\n\nWe'll be using this [recording](https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3) that's taken from the [Google Generative AI cookbook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Audio.ipynb).\n\n## Normal Message\n\nThe first way to work with audio files is to upload the entire audio file and pass it into the LLM as a normal message. This is the easiest way to get started and doesn't require any special setup.\n\n```python\n# <%hide%>\nimport requests\nfrom pydub import AudioSegment\n\n# Download the audio file\nurl = \"https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3\"\nresponse = requests.get(url)\n\n# Save the audio file locally\nwith open(\"sample.mp3\", \"wb\") as file:\n    file.write(response.content)\n\nsound = AudioSegment.from_mp3(\"sample.mp3\")  # (2)!\nsound = sound[:60000]\nsound.export(\n    \"sample.mp3\", format=\"mp3\"\n)  # Save the processed audio segment as sample.mp3\n# <%hide>\nimport instructor\nimport google.generativeai as genai\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\"),\n    mode=instructor.Mode.JSON,  # (1)!\n)\n\nmp3_file = genai.upload_file(\"./sample.mp3\")  # (2)!\n\n\nclass Description(BaseModel):\n    description: str\n\n\nresp = client.create(\n    response_model=Description,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Summarize what's happening in this audio file and who the main speaker is\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": mp3_file,  # (3)!\n        },\n    ],\n)\n\nprint(resp)\n\"\"\"\ndescription = 'The main speaker is President John F. Kennedy, giving his State of the Union address to a joint session of Congress. He is speaking in the House of Representatives in Washington, D.C. on January 30th, 1961. He is thanking the members of Congress for their knowledge and inspiration.'\n\"\"\"\n```\n\n1. Make sure to set the mode to `Mode.JSON` (replaces deprecated `GEMINI_JSON`), this is important because Tool Calling doesn't work with multi-modal inputs.\n2. Use `genai.upload_file` to upload your file. If you've already uploaded the file, you can get it by using `genai.get_file`\n3. Pass in the file object as any normal user message\n\n## Inline Audio Segment\n\n!!! note \"Maximum File Size\"\n\n    When uploading and working with audio, there is a maximum file size that we can upload to the api as an inline segment. You'll know when this error is thrown below.\n\n    ```\n    google.api_core.exceptions.InvalidArgument: 400 Request payload size exceeds the limit: 20971520 bytes. Please upload your files with the File API instead.`f = genai.upload_file(path); m.generate_content(['tell me about this file:', f])`\n    ```\n\n    When it comes to video files, we recommend using the file.upload method as shown in the example above.\n\nSecondly, we can also pass in a audio segment as a normal message as an inline object as shown below. This requires you to install the `pydub` library in order to do so.\n\n```python\nimport instructor\nimport google.generativeai as genai\nfrom pydantic import BaseModel\nfrom pydub import AudioSegment\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\"),\n    mode=instructor.Mode.JSON,  # (1)!\n)\n\n\nsound = AudioSegment.from_mp3(\"sample.mp3\")  # (2)!\nsound = sound[:60000]\n\n\nclass Transcription(BaseModel):\n    summary: str\n    exact_transcription: str\n\n\nresp = client.create(\n    response_model=Transcription,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Please transcribe this recording\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": {\n                \"mime_type\": \"audio/mp3\",\n                \"data\": sound.export().read(),  # (3)!\n            },\n        },\n    ],\n)\n\nprint(resp)\n\"\"\"\nsummary='President addresses the joint session of Congress,  reflecting on his first time taking the oath of federal office and the knowledge and inspiration gained.' exact_transcription=\"The President's state of the union address to a joint session of the Congress from the rostrum of the House of Representatives, Washington D.C. January 30th 1961 Speaker, Mr Vice President members of the Congress It is a pleasure to return from whence I came You are among my oldest friends in Washington And this house is my oldest home It was here it was here more than 14 years ago that I first took the oath of federal office It was here for 14 years that I gained both knowledge and inspiration from members of both\"\n\"\"\"\n\n#> summary='President delivers a speech to a joint session of Congress,\n#> highlighting his history in the House of Representatives and thanking\n#> the members of Congress for their guidance.',\n# >\n#> exact_transcription=\"The President's State of the Union address to a\n#> joint session of the Congress from the rostrum of the House of\n#> Representatives, Washington DC, January 30th 1961. Mr. Speaker, Mr.\n#> Vice-President, members of the Congress, it is a pleasure to return\n#> from whence I came. You are among my oldest friends in Washington,\n#> and this house is my oldest home. It was here that I first took the\n#> oath of federal office. It was here for 14 years that I gained both\n#> knowledge and inspiration from members of both\"\n```\n\n1. Make sure to set the mode to `Mode.JSON` (replaces deprecated `GEMINI_JSON`), this is important because Tool Calling doesn't work with multi-modal inputs.\n2. Use `AudioSegment.from_mp3` to load your audio file.\n3. Pass in the audio data as bytes to the `data` field using the content as a dictionary with the right content `mime_type` and `data` as bytes\n\n## Lists of Content\n\nWe also support passing in these as a single list as per the documentation for `google-generativeai`. Here's how to do so with a audio segment snippet from the same recording.\n\nNote that the list can contain normal user messages as well as file objects. It's incredibly flexible.\n\n```python\nimport instructor\nimport google.generativeai as genai\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\"),\n    mode=instructor.Mode.JSON,  # (1)!\n)\n\nmp3_file = genai.upload_file(\"./sample.mp3\")  # (2)!\n\n\nclass Description(BaseModel):\n    description: str\n\n\ncontent = [\n    \"Summarize what's happening in this audio file and who the main speaker is\",\n    mp3_file,  # (3)!\n]\n\nresp = client.create(\n    response_model=Description,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": content,\n        }\n    ],\n)\n\nprint(resp)\n\"\"\"\ndescription = 'President John F. Kennedy delivers his State of the Union address to the Congress on January 30, 1961. The speech was delivered at the rostrum of the House of Representatives in Washington, D.C.'\n\"\"\"\n```\n\n1. Make sure to set the mode to `Mode.JSON` (replaces deprecated `GEMINI_JSON`), this is important because Tool Calling doesn't work with multi-modal inputs.\n2. Upload the file using `genai.upload_file` or get the file using `genai.get_file`\n3. Pass in the content as a list containing the normal user message and the file object.\n"
  },
  {
    "path": "docs/examples/multiple_classification.md",
    "content": "---\ntitle: Multi-Label Classification - Support Ticket Categorization\ndescription: Implement multi-label classification with Instructor for support tickets. Assign multiple categories like ACCOUNT, BILLING, and GENERAL_QUERY simultaneously.\n---\n\nFor multi-label classification, we introduce a new enum class and a different Pydantic model to handle multiple labels.\n\n```python\nimport instructor\n\nfrom typing import List, Literal\nfrom pydantic import BaseModel, Field\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nLABELS = Literal[\"ACCOUNT\", \"BILLING\", \"GENERAL_QUERY\"]\n\n\nclass MultiClassPrediction(BaseModel):\n    \"\"\"\n    A few-shot example of multi-label classification:\n    Examples:\n    - \"My account is locked and I can't access my billing info.\": ACCOUNT, BILLING\n    - \"I need help with my subscription.\": ACCOUNT\n    - \"How do I change my payment method?\": BILLING\n    - \"Can you tell me the status of my order?\": BILLING\n    - \"I have a question about the product features.\": GENERAL_QUERY\n    \"\"\"\n\n    labels: List[LABELS] = Field(\n        ...,\n        description=\"Only select the labels that apply to the support ticket.\",\n    )\n\n\ndef multi_classify(data: str) -> MultiClassPrediction:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=MultiClassPrediction,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"You are a support agent at a tech company. Only select the labels that apply to the support ticket.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following support ticket: <text>{data}</text>\",\n            },\n        ],\n    )  # type: ignore\n\n\nif __name__ == \"__main__\":\n    ticket = \"My account is locked and I can't access my billing info.\"\n    prediction = multi_classify(ticket)\n    assert {\"ACCOUNT\", \"BILLING\"} == {label for label in prediction.labels}\n    print(\"input:\", ticket)\n    #> input: My account is locked and I can't access my billing info.\n    print(\"labels:\", LABELS)\n    #> labels: typing.Literal['ACCOUNT', 'BILLING', 'GENERAL_QUERY']\n    print(\"prediction:\", prediction)\n    #> prediction: labels=['ACCOUNT', 'BILLING']\n```\n"
  },
  {
    "path": "docs/examples/ollama.md",
    "content": "---\ntitle: Harnessing Structured Outputs with Ollama and Instructor\ndescription: Discover how to utilize Ollama's Instructor library for structured outputs in LLM applications using Pydantic models.\n---\n\n## See Also\n\n- [Ollama Integration](../integrations/ollama.md) - Complete Ollama setup guide\n- [Open Source Models](./open_source.md) - More open-source model examples\n- [Local Deployment](./index.md#local-deployment) - Local model deployment options\n- [Response Models](../concepts/models.md) - Working with Pydantic models\n\n# Structured Outputs with Ollama\n\nOpen-source Large Language Models (LLMs) are rapidly gaining popularity in the AI community. With the recent release of Ollama's OpenAI compatibility layer, it has become possible to obtain structured outputs using JSON schema from these open-source models. This development opens up exciting possibilities for developers and researchers alike.\n\nIn this blog post, we'll explore how to effectively utilize the Instructor library with Ollama to harness the power of structured outputs with [Pydantic models](../concepts/models.md). We'll cover everything from setup to implementation, providing you with practical insights and code examples.\n\n## Why use Instructor?\n\nInstructor offers several key benefits:\n\n- :material-code-tags: **Simple API with Full Prompt Control**: Instructor provides a straightforward API that gives you complete ownership and control over your prompts. This allows for fine-tuned customization and optimization of your LLM interactions. [:octicons-arrow-right-16: Explore Concepts](../concepts/models.md)\n\n- :material-refresh: **Reasking and Validation**: Automatically reask the model when validation fails, ensuring high-quality outputs. Leverage Pydantic's validation for robust error handling. [:octicons-arrow-right-16: Learn about Reasking](../concepts/reask_validation.md)\n\n- :material-repeat-variant: **Streaming Support**: Stream partial results and iterables with ease, allowing for real-time processing and improved responsiveness in your applications. [:octicons-arrow-right-16: Learn about Streaming](../concepts/partial.md)\n\n- :material-code-braces: **Powered by Type Hints**: Leverage Pydantic for schema validation, prompting control, less code, and IDE integration. [:octicons-arrow-right-16: Learn more](https://docs.pydantic.dev/)\n\n- :material-lightning-bolt: **Simplified LLM Interactions**: Support for various LLM providers including OpenAI, Anthropic, Google, Vertex AI, Mistral/Mixtral, Anyscale, Ollama, llama-cpp-python, Cohere, and LiteLLM. [:octicons-arrow-right-16: See Examples](../examples/index.md)\n\nFor more details on these features, check out the [Concepts](../concepts/models.md) section of the documentation.\n\n## Patching\n\nInstructor's [patch](../concepts/patching.md) enhances an openai api with the following features:\n\n- `response_model` in `create` calls that returns a pydantic model\n- `max_retries` in `create` calls that retries the call if it fails by using a backoff strategy\n\n!!! note \"Learn More\"\n\n    To learn more, please refer to the [docs](../index.md). To understand the benefits of using Pydantic with Instructor, visit the tips and tricks section of the [why use Pydantic](../why.md) page.\n\n## Ollama\n\nStart by downloading [Ollama](https://ollama.ai/download), and then pull a model such as Llama 3 or Mistral.\n\n!!! tip \"Make sure you update your `ollama` to the latest version!\"\n\n```\nollama pull llama3\n```\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\n\nclass Character(BaseModel):\n    name: str\n    age: int\n    fact: List[str] = Field(..., description=\"A list of facts about the character\")\n\n\n# Use from_provider with base_url for Ollama\nclient = instructor.from_provider(\n    \"ollama/llama3\",\n    base_url=\"http://localhost:11434/v1\",\n    mode=instructor.Mode.JSON,\n)\n\nresp = client.create(\n    model=\"llama3\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Tell me about the Harry Potter\",\n        }\n    ],\n    response_model=Character,\n)\nprint(resp.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"Harry James Potter\",\n  \"age\": 37,\n  \"fact\": [\n    \"He is the chosen one.\",\n    \"He has a lightning-shaped scar on his forehead.\",\n    \"He is the son of James and Lily Potter.\",\n    \"He attended Hogwarts School of Witchcraft and Wizardry.\",\n    \"He is a skilled wizard and sorcerer.\",\n    \"He fought against Lord Voldemort and his followers.\",\n    \"He has a pet owl named Snowy.\"\n  ]\n}\n\"\"\"\n```\n\nThis example demonstrates how to use Instructor with Ollama, a local LLM server, to generate structured outputs. By leveraging Instructor's capabilities, we can easily extract structured information from the LLM's responses, making it simpler to work with the generated data in our applications.\n\n## Further Reading\n\nTo explore more about Instructor and its various applications, consider checking out the following resources:\n\n1. [Why use Instructor?](../why.md) - Learn about the benefits and use cases of Instructor.\n\n2. [Concepts](../concepts/models.md) - Dive deeper into the core concepts of Instructor, including models, retrying, and validation.\n\n3. [Examples](../examples/index.md) - Explore our comprehensive collection of examples and integrations with various LLM providers.\n\n4. [Tutorials](../tutorials/1-introduction.ipynb) - Step-by-step tutorials to help you get started with Instructor.\n\n5. [Learn Prompting](../prompting/index.md) - Techniques and strategies for effective prompt engineering with Instructor.\n\nBy exploring these resources, you'll gain a comprehensive understanding of Instructor's capabilities and how to leverage them in your projects.\n"
  },
  {
    "path": "docs/examples/open_source.md",
    "content": "---\ntitle: Open Source Model Providers for Chat API\ndescription: Explore tested open source models compatible with the OpenAI chat API, including OpenRouter, Perplexity, and RunPod LLMs.\n---\n\n# Instructor with open source models\nInstructor works with Open source model providers that support the [OpenAI API chat endpoint](https://platform.openai.com/docs/api-reference/chat)\n\nSee examples README [here](https://github.com/jxnl/instructor/tree/main/examples/open_source_examples)\n\n# Currently tested open source model providers\n- [OpenRouter](https://openrouter.ai/)\n- [Perplexity](https://www.perplexity.ai/)\n- [RunPod TheBloke LLMs](https://github.com/TheBlokeAI/dockerLLM/blob/main/README_Runpod_LocalLLMsUI.md) **\n\n\n** This utilizes text-generation-webui w/ Openai plugin under the hood."
  },
  {
    "path": "docs/examples/pandas_df.md",
    "content": "---\ntitle: Extracting DataFrames from Markdown using Pandas\ndescription: Learn how to extract and convert Markdown tables directly into Pandas DataFrames in Python.\n---\n\n# Extracting directly to a DataFrame\n\nIn this example we'll show you how to extract directly to a `pandas.DataFrame`\n\n```python\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n    PlainSerializer,\n    InstanceOf,\n    WithJsonSchema,\n)\nimport pandas as pd\nimport instructor\nimport instructor\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    # Validates final type\n    InstanceOf[pd.DataFrame],\n    # Converts markdown to DataFrame\n    BeforeValidator(md_to_df),\n    # Converts DataFrame to markdown on model_dump_json\n    PlainSerializer(lambda df: df.to_markdown()),\n    # Adds a description to the type\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n            The markdown representation of the table,\n            each one should be tidy, do not try to join\n            tables that should be seperate\"\"\",\n        }\n    ),\n]\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef extract_df(data: str) -> pd.DataFrame:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=MarkdownDataFrame,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a data extraction system, table of writing perfectly formatted markdown tables.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Extract the data into a table: {data}\",\n            },\n        ],\n    )\n\n\nclass Table(BaseModel):\n    title: str\n    data: MarkdownDataFrame\n\n\ndef extract_table(data: str) -> Table:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=Table,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a data extraction system, table of writing perfectly formatted markdown tables.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Extract the data into a table: {data}\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    df = extract_df(\n        \"\"\"Create a table of the last 5 presidents of the United States,\n        including their party and the years they served.\"\"\"\n    )\n    assert isinstance(df, pd.DataFrame)\n    print(df)\n    \"\"\"\n                         Party          Years Served\n     President\n    Joe Biden                  Democrat  2021 - Present\n    Donald Trump             Republican     2017 - 2021\n    Barack Obama               Democrat     2009 - 2017\n    George W. Bush           Republican     2001 - 2009\n    Bill Clinton               Democrat     1993 - 2001\n    \"\"\"\n\n    table = extract_table(\n        \"\"\"Create a table of the last 5 presidents of the United States,\n        including their party and the years they served.\"\"\"\n    )\n    assert isinstance(table, Table)\n    assert isinstance(table.data, pd.DataFrame)\n    print(table.title)\n    #> Last 5 Presidents of the United States\n    print(table.data)\n    \"\"\"\n                         Party  Years Served\n     President\n    Joe Biden        Democratic     2021-2025\n    Donald Trump     Republican     2017-2021\n    Barack Obama     Democratic     2009-2017\n    George W. Bush   Republican     2001-2009\n    Bill Clinton     Democratic     1993-2001\n    \"\"\"\n```\n\nNotice that you can extract both the raw `MarkdownDataFrame` or a more complex structure like `Table` which includes a title and the data as a DataFrame. You can even request `Iterable[Table]` to get multiple tables in a single response!\n"
  },
  {
    "path": "docs/examples/partial_streaming.md",
    "content": "---\ntitle: Partial Response Streaming - Field-Level Updates\ndescription: Stream partial responses with Instructor for real-time UI updates. Get incremental snapshots of response models as fields are generated.\n---\n\n# Streaming Partial Responses\n\nField level streaming provides incremental snapshots of the current state of the response model that are immediately useable. This approach is particularly relevant in contexts like rendering UI components.\n\nInstructor supports this pattern by making use of `Partial[T]`. This lets us dynamically create a new class that treats all of the original model's fields as `Optional`.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import List\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\ntext_block = \"\"\"\nIn our recent online meeting, participants from various backgrounds joined to discuss the upcoming tech conference. The names and contact details of the participants were as follows:\n- Name: John Doe, Email: johndoe@email.com, Twitter: @TechGuru44\n- Name: Jane Smith, Email: janesmith@email.com, Twitter: @DigitalDiva88\n- Name: Alex Johnson, Email: alexj@email.com, Twitter: @CodeMaster2023\nDuring the meeting, we agreed on several key points. The conference will be held on March 15th, 2024, at the Grand Tech Arena located at 4521 Innovation Drive. Dr. Emily Johnson, a renowned AI researcher, will be our keynote speaker.\nThe budget for the event is set at $50,000, covering venue costs, speaker fees, and promotional activities. Each participant is expected to contribute an article to the conference blog by February 20th.\nA follow-up meetingis scheduled for January 25th at 3 PM GMT to finalize the agenda and confirm the list of speakers.\n\"\"\"\n\n\nclass User(BaseModel):\n    name: str\n    email: str\n    twitter: str\n\n\nclass MeetingInfo(BaseModel):\n    users: List[User]\n    date: str\n    location: str\n    budget: int\n    deadline: str\n\n\nPartialMeetingInfo = instructor.Partial[MeetingInfo]\n\n\nextraction_stream = client.create(\n    model=\"gpt-4\",\n    response_model=PartialMeetingInfo,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Get the information about the meeting and the users {text_block}\",\n        },\n    ],\n    stream=True,\n)  # type: ignore\n\n\nfrom rich.console import Console\n\nconsole = Console()\n\nfor extraction in extraction_stream:\n    obj = extraction.model_dump()\n    console.clear()\n    console.print(obj)\n```\n"
  },
  {
    "path": "docs/examples/pii.md",
    "content": "---\ntitle: Extracting and Scrubbing PII Data with OpenAI\ndescription: Learn to extract and sanitize Personally Identifiable Information (PII) from documents using OpenAI's ChatCompletion model and Python.\n---\n\n# PII Data Extraction and Scrubbing\n\n## Overview\n\nThis example demonstrates the usage of OpenAI's ChatCompletion model for the extraction and scrubbing of Personally Identifiable Information (PII) from a document. The code defines Pydantic models to manage the PII data and offers methods for both extraction and sanitation.\n\n## Defining the Structures\n\nFirst, Pydantic models are defined to represent the PII data and the overall structure for PII data extraction.\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel\n\n\n# Define Schemas for PII data\nclass Data(BaseModel):\n    index: int\n    data_type: str\n    pii_value: str\n\n\nclass PIIDataExtraction(BaseModel):\n    \"\"\"\n    Extracted PII data from a document, all data_types should try to have consistent property names\n    \"\"\"\n\n    private_data: List[Data]\n\n    def scrub_data(self, content: str) -> str:\n        \"\"\"\n        Iterates over the private data and replaces the value with a placeholder in the form of\n        <{data_type}_{i}>\n        \"\"\"\n        for i, data in enumerate(self.private_data):\n            content = content.replace(data.pii_value, f\"<{data.data_type}_{i}>\")\n        return content\n```\n\n## Extracting PII Data\n\nThe OpenAI API is utilized to extract PII information from a given document.\n\n```python\nimport instructor\n\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel\n\n\n# Define Schemas for PII data\nclass Data(BaseModel):\n    index: int\n    data_type: str\n    pii_value: str\n\n\nclass PIIDataExtraction(BaseModel):\n    \"\"\"\n    Extracted PII data from a document, all data_types should try to have consistent property names\n    \"\"\"\n\n    private_data: List[Data]\n\n    def scrub_data(self, content: str) -> str:\n        \"\"\"\n        Iterates over the private data and replaces the value with a placeholder in the form of\n        <{data_type}_{i}>\n        \"\"\"\n        for i, data in enumerate(self.private_data):\n            content = content.replace(data.pii_value, f\"<{data.data_type}_{i}>\")\n        return content\n\n\n# <%hide%>\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nEXAMPLE_DOCUMENT = \"\"\"\n# Fake Document with PII for Testing PII Scrubbing Model\n# (The content here)\n\"\"\"\n\npii_data = client.create(\n    model=\"gpt-4o-mini\",\n    response_model=PIIDataExtraction,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a world class PII scrubbing model, Extract the PII data from the following document\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": EXAMPLE_DOCUMENT,\n        },\n    ],\n)  # type: ignore\n\nprint(\"Extracted PII Data:\")\n#> Extracted PII Data:\nprint(pii_data.model_dump_json())\n\"\"\"\n{\"private_data\":[{\"index\":1,\"data_type\":\"Name\",\"pii_value\":\"John Doe\"},{\"index\":2,\"data_type\":\"Email\",\"pii_value\":\"john.doe@example.com\"},{\"index\":3,\"data_type\":\"Phone\",\"pii_value\":\"+1234567890\"},{\"index\":4,\"data_type\":\"Address\",\"pii_value\":\"1234 Elm Street, Springfield, IL 62704\"},{\"index\":5,\"data_type\":\"SSN\",\"pii_value\":\"123-45-6789\"}]}\n\"\"\"\n```\n\n### Output of Extracted PII Data\n\n```json\n{\n  \"private_data\": [\n    {\n      \"index\": 0,\n      \"data_type\": \"date\",\n      \"pii_value\": \"01/02/1980\"\n    },\n    {\n      \"index\": 1,\n      \"data_type\": \"ssn\",\n      \"pii_value\": \"123-45-6789\"\n    },\n    {\n      \"index\": 2,\n      \"data_type\": \"email\",\n      \"pii_value\": \"john.doe@email.com\"\n    },\n    {\n      \"index\": 3,\n      \"data_type\": \"phone\",\n      \"pii_value\": \"555-123-4567\"\n    },\n    {\n      \"index\": 4,\n      \"data_type\": \"address\",\n      \"pii_value\": \"123 Main St, Springfield, IL, 62704\"\n    }\n  ]\n}\n```\n\n## Scrubbing PII Data\n\nAfter extracting the PII data, the `scrub_data` method is used to sanitize the document.\n\n```python\n# <%hide%>\nfrom typing import List\nfrom pydantic import BaseModel\n\n\n# Define Schemas for PII data\nclass Data(BaseModel):\n    index: int\n    data_type: str\n    pii_value: str\n\n\nclass PIIDataExtraction(BaseModel):\n    \"\"\"\n    Extracted PII data from a document, all data_types should try to have consistent property names\n    \"\"\"\n\n    private_data: List[Data]\n\n    def scrub_data(self, content: str) -> str:\n        \"\"\"\n        Iterates over the private data and replaces the value with a placeholder in the form of\n        <{data_type}_{i}>\n        \"\"\"\n        for i, data in enumerate(self.private_data):\n            content = content.replace(data.pii_value, f\"<{data.data_type}_{i}>\")\n        return content\n\n\npii_data = PIIDataExtraction(\n    private_data=[\n        {\"index\": 0, \"data_type\": \"date\", \"pii_value\": \"01/02/1980\"},\n        {\"index\": 1, \"data_type\": \"ssn\", \"pii_value\": \"123-45-6789\"},\n        {\"index\": 2, \"data_type\": \"email\", \"pii_value\": \"john.doe@email.com\"},\n        {\"index\": 3, \"data_type\": \"phone\", \"pii_value\": \"555-123-4567\"},\n        {\n            \"index\": 4,\n            \"data_type\": \"address\",\n            \"pii_value\": \"123 Main St, Springfield, IL, 62704\",\n        },\n    ]\n)\n\nEXAMPLE_DOCUMENT = \"\"\"\n# Fake Document with PII for Testing PII Scrubbing Model\n# He was born on 01/02/1980. His social security number is 123-45-6789. He has been using the email address john.doe@email.com for years, and he can always be reached at 555-123-4567.\n\"\"\"\n# <%hide%>\nprint(\"Scrubbed Document:\")\n#> Scrubbed Document:\nprint(pii_data.scrub_data(EXAMPLE_DOCUMENT))\n\"\"\"\n# Fake Document with PII for Testing PII Scrubbing Model\n# He was born on <date_0>. His social security number is <ssn_1>. He has been using the email address <email_2> for years, and he can always be reached at <phone_3>.\n\"\"\"\n```\n\n### Output of Scrubbed Document\n\n```plaintext\n# Fake Document with PII for Testing PII Scrubbing Model\n\n## Personal Story\n\nJohn Doe was born on <date_0>. His social security number is <ssn_1>. He has been using the email address <email_2> for years, and he can always be reached at <phone_3>.\n\n## Residence\n\nJohn currently resides at <address_4>. He's been living there for about 5 years now.\n```\n"
  },
  {
    "path": "docs/examples/planning-tasks.md",
    "content": "---\ntitle: Query Planning with Instructor - Complex Task Decomposition\ndescription: Plan and execute complex query plans using Instructor. Break down complex questions into sub-questions with dependencies for systematic information gathering.\n---\n\n# Planning and Executing a Query Plan\n\nThis example demonstrates how to use the OpenAI Function Call ChatCompletion model to plan and execute a query plan in a question-answering system. By breaking down a complex question into smaller sub-questions with defined dependencies using [lists](../concepts/lists.md), the system can systematically gather the necessary information to answer the main question similar to [knowledge graph extraction](../examples/knowledge_graph.md).\n\n!!! tips \"Motivation\"\n\n    The goal of this example is to showcase how query planning can be used to handle complex questions, facilitate iterative information gathering, automate workflows, and optimize processes. By leveraging the OpenAI Function Call model, you can design and execute a structured plan to find answers effectively.\n\n     **Use Cases:**\n\n    * Complex question answering\n    * Iterative information gathering\n    * Workflow automation\n    * Process optimization\n\nWith the OpenAI Function Call model, you can customize the planning process and integrate it into your specific application to meet your unique requirements.\n\n## Defining the Structures\n\nLet's define the necessary Pydantic models to represent the query plan and the queries.\n\n```python\nfrom typing import List, Literal\nfrom pydantic import Field, BaseModel\n\n\nclass Query(BaseModel):\n    \"\"\"Class representing a single question in a query plan.\"\"\"\n\n    id: int = Field(..., description=\"Unique id of the query\")\n    question: str = Field(\n        ...,\n        description=\"Question asked using a question answering system\",\n    )\n    dependencies: List[int] = Field(\n        default_factory=list,\n        description=\"List of sub questions that need to be answered before asking this question\",\n    )\n    node_type: Literal[\"SINGLE\", \"MERGE_MULTIPLE_RESPONSES\"] = Field(\n        default=\"SINGLE\",\n        description=\"Type of question, either a single question or a multi-question merge\",\n    )\n\n\nclass QueryPlan(BaseModel):\n    \"\"\"Container class representing a tree of questions to ask a question answering system.\"\"\"\n\n    query_graph: List[Query] = Field(\n        ..., description=\"The query graph representing the plan\"\n    )\n\n    def _dependencies(self, ids: List[int]) -> List[Query]:\n        \"\"\"Returns the dependencies of a query given their ids.\"\"\"\n        return [q for q in self.query_graph if q.id in ids]\n```\n\n!!! warning \"Graph Generation\"\n\n    Notice that this example produces a flat list of items with dependencies that resemble a graph, while pydantic allows for recursive definitions, it's much easier and less confusing for the model to generate flat schemas rather than recursive schemas. If you want to see a recursive example, see [recursive schemas](recursive.md)\n\n## Planning a Query Plan\n\nNow, let's demonstrate how to plan and execute a query plan using the defined models and the OpenAI API.\n\n```python\nimport instructor\n\n# <%hide%>\nfrom typing import List, Literal\nfrom pydantic import Field, BaseModel\n\n\nclass Query(BaseModel):\n    \"\"\"Class representing a single question in a query plan.\"\"\"\n\n    id: int = Field(..., description=\"Unique id of the query\")\n    question: str = Field(\n        ...,\n        description=\"Question asked using a question answering system\",\n    )\n    dependencies: List[int] = Field(\n        default_factory=list,\n        description=\"List of sub questions that need to be answered before asking this question\",\n    )\n    node_type: Literal[\"SINGLE\", \"MERGE_MULTIPLE_RESPONSES\"] = Field(\n        default=\"SINGLE\",\n        description=\"Type of question, either a single question or a multi-question merge\",\n    )\n\n\nclass QueryPlan(BaseModel):\n    \"\"\"Container class representing a tree of questions to ask a question answering system.\"\"\"\n\n    query_graph: List[Query] = Field(\n        ..., description=\"The query graph representing the plan\"\n    )\n\n    def _dependencies(self, ids: List[int]) -> List[Query]:\n        \"\"\"Returns the dependencies of a query given their ids.\"\"\"\n        return [q for q in self.query_graph if q.id in ids]\n\n\n# <%hide%>\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef query_planner(question: str) -> QueryPlan:\n    PLANNING_MODEL = \"gpt-4o-mini\"\n\n    messages = [\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a world class query planning algorithm capable ofbreaking apart questions into its dependency queries such that the answers can be used to inform the parent question. Do not answer the questions, simply provide a correct compute graph with good specific questions to ask and relevant dependencies. Before you call the function, think step-by-step to get a better understanding of the problem.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"Consider: {question}\\nGenerate the correct query plan.\",\n        },\n    ]\n\n    root = client.create(\n        model=PLANNING_MODEL,\n        temperature=0,\n        response_model=QueryPlan,\n        messages=messages,\n        max_tokens=1000,\n    )\n    return root\n```\n\n```\nplan = query_planner(\n    \"What is the difference in populations of Canada and the Jason's home country?\"\n)\nplan.model_dump()\n```\n\n!!! warning \"No RAG\"\n\n    While we build the query plan in this example, we do not propose a method to actually answer the question. You can implement your own answer function that perhaps makes a retrieval and calls openai for retrieval augmented generation. That step would also make use of function calls but goes beyond the scope of this example.\n\n```python\n{\n    \"query_graph\": [\n        {\n            \"dependencies\": [],\n            \"id\": 1,\n            \"node_type\": \"SINGLE\",\n            \"question\": \"Identify Jason's home country\",\n        },\n        {\n            \"dependencies\": [],\n            \"id\": 2,\n            \"node_type\": \"SINGLE\",\n            \"question\": \"Find the population of Canada\",\n        },\n        {\n            \"dependencies\": [1],\n            \"id\": 3,\n            \"node_type\": \"SINGLE\",\n            \"question\": \"Find the population of Jason's home country\",\n        },\n        {\n            \"dependencies\": [2, 3],\n            \"id\": 4,\n            \"node_type\": \"SINGLE\",\n            \"question\": \"Calculate the difference in populations between Canada and Jasons home country\",\n        },\n    ]\n}\n```\n\nIn the above code, we define a `query_planner` function that takes a question as input and generates a query plan using the OpenAI API.\n\n## Conclusion\n\nIn this example, we demonstrated how to use the OpenAI Function Call `ChatCompletion` model to plan a query using a question-answering system. We defined the necessary structures using Pydantic and created a query planner function that generates a structured plan for answering complex questions.\n\nThe query planner breaks down the main question into smaller, manageable sub-questions, establishing dependencies between them. This approach allows for a systematic and organized way to tackle multi-step queries.\n\nFor more advanced implementations and variations of this concept, you can explore:\n\n1. [Query planning and execution example](https://github.com/jxnl/instructor/blob/main/examples/query_planner_execution/query_planner_execution.py)\n2. [Task planning with topological sort](https://github.com/jxnl/instructor/blob/main/examples/task_planner/task_planner_topological_sort.py)\n\nThese examples provide additional insights into how you can leverage structured outputs for complex query planning and task management.\n\nFeel free to adapt this code to your specific use cases and explore the possibilities of using OpenAI Function Calls to plan and structure complex workflows in your applications.\n"
  },
  {
    "path": "docs/examples/recursive.md",
    "content": "---\ntitle: Working with Recursive Schemas in Instructor\ndescription: Learn how to effectively implement and use recursive Pydantic models for handling nested and hierarchical data structures.\n---\n\n## See Also\n\n- [Nested Structures](../learning/patterns/nested_structure.md) - Complex hierarchical models\n- [Knowledge Graph](./knowledge_graph.md) - Build knowledge graphs\n- [Response Models](../concepts/models.md) - Working with complex data structures\n- [Types](../concepts/types.md) - Working with different data types\n\n# Recursive Schema Implementation Guide\n\nThis guide demonstrates how to work with recursive schemas in Instructor using Pydantic models. While flat schemas are often simpler to work with, some use cases require recursive structures to represent hierarchical data effectively.\n\n!!! tips \"Motivation\"\n    Recursive schemas are particularly useful when dealing with:\n    * Nested organizational structures\n    * File system hierarchies\n    * Comment threads with replies\n    * Task dependencies with subtasks\n    * Abstract syntax trees\n\n## Defining a Recursive Schema\n\nHere's an example of how to define a recursive Pydantic model:\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel, Field\n\n\nclass RecursiveNode(BaseModel):\n    \"\"\"A node that can contain child nodes of the same type.\"\"\"\n\n    name: str = Field(..., description=\"Name of the node\")\n    value: Optional[str] = Field(\n        None, description=\"Optional value associated with the node\"\n    )\n    children: List[\"RecursiveNode\"] = Field(\n        default_factory=list, description=\"List of child nodes\"\n    )\n\n\n# Required for recursive Pydantic models\nRecursiveNode.model_rebuild()\n```\n\n## Example Usage\n\nLet's see how to use this recursive schema with Instructor:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef parse_hierarchy(text: str) -> RecursiveNode:\n    \"\"\"Parse text into a hierarchical structure.\"\"\"\n    return client.create(\n        model=\"gpt-4\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are an expert at parsing text into hierarchical structures.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Parse this text into a hierarchical structure: {text}\",\n            },\n        ],\n        response_model=RecursiveNode,\n    )\n\n\n# Example usage\nhierarchy = parse_hierarchy(\n    \"\"\"\nCompany: Acme Corp\n- Department: Engineering\n  - Team: Frontend\n    - Project: Website Redesign\n    - Project: Mobile App\n  - Team: Backend\n    - Project: API v2\n    - Project: Database Migration\n- Department: Marketing\n  - Team: Digital\n    - Project: Social Media Campaign\n  - Team: Brand\n    - Project: Logo Refresh\n\"\"\"\n)\n```\n\n## Validation and Best Practices\n\nWhen working with recursive schemas:\n\n1. Always call `model_rebuild()` after defining the model\n2. Consider adding validation for maximum depth to prevent infinite recursion\n3. Use type hints properly to maintain code clarity\n4. Consider implementing custom validators for specific business rules\n\n```python\nfrom pydantic import model_validator\n\n\nclass RecursiveNodeWithDepth(RecursiveNode):\n    @model_validator(mode='after')\n    def validate_depth(self) -> \"RecursiveNodeWithDepth\":\n        def check_depth(node: \"RecursiveNodeWithDepth\", current_depth: int = 0) -> int:\n            if current_depth > 10:  # Maximum allowed depth\n                raise ValueError(\"Maximum depth exceeded\")\n            return max(\n                [check_depth(child, current_depth + 1) for child in node.children],\n                default=current_depth,\n            )\n\n        check_depth(self)\n        return self\n```\n\n## Performance Considerations\n\nWhile recursive schemas are powerful, they can be more challenging for language models to handle correctly. Consider these tips:\n\n1. Keep structures as shallow as possible\n2. Use clear naming conventions\n3. Provide good examples in your prompts\n4. Consider breaking very large structures into smaller chunks\n\n## Conclusion\n\nRecursive schemas provide a powerful way to handle hierarchical data structures in your applications. While they require more careful handling than flat schemas, they can be invaluable for certain use cases.\n\nFor more examples of working with complex data structures, check out:\n1. [Query Planning with Dependencies](planning-tasks.md)\n2. [Knowledge Graph Generation](knowledge_graph.md)\n"
  },
  {
    "path": "docs/examples/search.md",
    "content": "---\ntitle: Search Query Segmentation with Instructor - Multi-Task Extraction\ndescription: Segment complex search queries into actionable tasks using Instructor. Break down user queries into parallel executable tasks with structured outputs.\n---\n\n# Example: Segmenting Search Queries\n\nIn this example, we will demonstrate how to leverage the `MultiTask` and `enum.Enum` features of OpenAI Function Call to segment search queries. We will define the necessary structures using Pydantic and demonstrate how segment queries into multiple sub queries and execute them in parallel with `asyncio`.\n\n!!! tips \"Motivation\"\n\n    Extracting a list of tasks from text is a common use case for leveraging language models. This pattern can be applied to various applications, such as virtual assistants like Siri or Alexa, where understanding user intent and breaking down requests into actionable tasks is crucial. In this example, we will demonstrate how to use OpenAI Function Call to segment search queries and execute them in parallel.\n\n## Structure of the Data\n\nThe `Search` class is a Pydantic model that defines the structure of the search query. It has three fields: `title`, `query`, and `type`. The `title` field is the title of the request, the `query` field is the query to search for relevant content, and the `type` field is the type of search. The `execute` method is used to execute the search query.\n\n```python\nimport instructor\nfrom typing import Iterable, Literal\nfrom pydantic import BaseModel, Field\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Search(BaseModel):\n    query: str = Field(..., description=\"Query to search for relevant content\")\n    type: Literal[\"web\", \"image\", \"video\"] = Field(..., description=\"Type of search\")\n\n    async def execute(self):\n        print(\n            f\"Searching for `{self.title}` with query `{self.query}` using `{self.type}`\"\n        )\n\n\ndef segment(data: str) -> Search:\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=Iterable[Search],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Consider the data below: '\\n{data}' and segment it into multiple search queries\",\n            },\n        ],\n        max_tokens=1000,\n    )\n\n\nfor search in segment(\"Search for a picture of a cat and a video of a dog\"):\n    print(search.model_dump_json())\n    #> {\"query\":\"picture of a cat\",\"type\":\"image\"}\n    #> {\"query\":\"video of a dog\",\"type\":\"video\"}\n```\n"
  },
  {
    "path": "docs/examples/self_critique.md",
    "content": "---\ntitle: Implementing Self-Correction with LLM Validator\ndescription: Learn how to use llm_validator for self-healing in NLP applications and improve response accuracy with validation errors.\n---\n\n# Self-Correction with `llm_validator`\n\n## Introduction\n\nThis guide demonstrates how to use `llm_validator` for implementing self-healing. The objective is to showcase how an instructor can self-correct by using validation errors and helpful error messages.\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass QuestionAnswer(BaseModel):\n    question: str\n    answer: str\n\n\nquestion = \"What is the meaning of life?\"\ncontext = \"The according to the devil the meaning of live is to live a life of sin and debauchery.\"\n\nqa: QuestionAnswer = client.create(\n    response_model=QuestionAnswer,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n        },\n    ],\n)\n```\n\n### Output Before Validation\n\nWhile it calls out the objectionable content, it doesn't provide any details on how to correct it.\n\n```json\n{\n  \"question\": \"What is the meaning of life?\",\n  \"answer\": \"The meaning of life, according to the context, is to live a life of sin and debauchery.\"\n}\n```\n\n## Adding Custom Validation\n\nBy adding a validator to the `answer` field, we can try to catch the issue and correct it.\nLets integrate `llm_validator` into the model and see the error message. Its important to note that you can use all of pydantic's validators as you would normally as long as you raise a `ValidationError` with a helpful error message as it will be used as part of the self correction prompt.\n\n```python\nfrom pydantic import BaseModel, BeforeValidator\nfrom typing_extensions import Annotated\nfrom instructor import llm_validator\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass QuestionAnswerNoEvil(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"don't say objectionable things\", client=client, allow_override=True\n            )\n        ),\n    ]\n\n\ntry:\n    qa: QuestionAnswerNoEvil = client.create(\n        response_model=QuestionAnswerNoEvil,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n            },\n        ],\n    )\nexcept Exception as e:\n    print(e)\n    #> name 'context' is not defined\n```\n\n### Output After Validation\n\nNow, we throw validation error that its objectionable and provide a helpful error message.\n\n```text\n1 validation error for QuestionAnswerNoEvil\nanswer\n  Assertion failed, The statement promotes sin and debauchery, which is objectionable.\n```\n\n## Retrying with Corrections\n\nBy adding the `max_retries` parameter, we can retry the request with corrections. and use the error message to correct the output.\n\n```python\n# <%hide%>\nimport instructor\nfrom pydantic import BaseModel, BeforeValidator\nfrom typing_extensions import Annotated\nfrom instructor import llm_validator\n\nquestion = \"What is the meaning of life?\"\ncontext = \"The according to the devil the meaning of live is to live a life of sin and debauchery.\"\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n\nclass QuestionAnswerNoEvil(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"don't say objectionable things\", client=client, allow_override=True\n            )\n        ),\n    ]\n\n\n# <%hide%>\n\nqa: QuestionAnswerNoEvil = client.create(\n    response_model=QuestionAnswerNoEvil,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n        },\n    ],\n)\n```\n\n### Final Output\n\nNow, we get a valid response that is not objectionable!\n\n```json\n{\n  \"question\": \"What is the meaning of life?\",\n  \"answer\": \"The meaning of life is subjective and can vary depending on individual beliefs and philosophies.\"\n}\n```\n"
  },
  {
    "path": "docs/examples/single_classification.md",
    "content": "---\ntitle: Single-Label Text Classification - SPAM Detection Example\ndescription: Implement single-label text classification with Instructor. Classify text as SPAM or NOT_SPAM with chain-of-thought reasoning.\n---\n\n# Single-Label Classification\n\nThis example demonstrates how to perform single-label classification using the OpenAI API. The example uses the `gpt-3.5-turbo` model to classify text as either `SPAM` or `NOT_SPAM`.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\nimport instructor\n\n# Apply the patch to the OpenAI client\n# enables response_model keyword\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ClassificationResponse(BaseModel):\n    \"\"\"\n    A few-shot example of text classification:\n\n    Examples:\n    - \"Buy cheap watches now!\": SPAM\n    - \"Meeting at 3 PM in the conference room\": NOT_SPAM\n    - \"You've won a free iPhone! Click here\": SPAM\n    - \"Can you pick up some milk on your way home?\": NOT_SPAM\n    - \"Increase your followers by 10000 overnight!\": SPAM\n    \"\"\"\n\n    label: Literal[\"SPAM\", \"NOT_SPAM\"] = Field(\n        ...,\n        description=\"The predicted class label.\",\n    )\n\n\ndef classify(data: str) -> ClassificationResponse:\n    \"\"\"Perform single-label classification on the input text.\"\"\"\n    return client.create(\n        model=\"gpt-4o-mini\",\n        response_model=ClassificationResponse,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: <text>{data}</text>\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    for text, label in [\n        (\"Hey Jason! You're awesome\", \"NOT_SPAM\"),\n        (\"I am a nigerian prince and I need your help.\", \"SPAM\"),\n    ]:\n        prediction = classify(text)\n        assert prediction.label == label\n        print(f\"Text: {text}, Predicted Label: {prediction.label}\")\n        #> Text: Hey Jason! You're awesome, Predicted Label: NOT_SPAM\n        #> Text: I am a nigerian prince and I need your help., Predicted Label: SPAM\n```\n"
  },
  {
    "path": "docs/examples/sqlmodel.md",
    "content": "---\ntitle: SQLModel with Instructor - Complete Guide to AI-Powered Database Operations\ndescription: Master SQLModel integration with Instructor for AI-powered database operations, FastAPI APIs, and production-ready applications. Learn advanced patterns, performance optimization, and best practices.\nkeywords: SQLModel, Instructor AI, Python ORM, FastAPI integration, database automation, AI data generation, Pydantic models, SQLAlchemy, OpenAI GPT, structured data extraction\n---\n\n# SQLModel with Instructor: Complete Integration Guide\n\n[SQLModel](https://sqlmodel.tiangolo.com/) is a modern Python library that combines the power of SQLAlchemy's database operations with Pydantic's data validation. Created by Sebastian Ramirez (the creator of FastAPI), SQLModel provides a unified approach to database modeling and API development.\n\nWhen integrated with Instructor, SQLModel becomes a powerful tool for AI-driven database operations, allowing you to generate structured data directly from language models and seamlessly store it in your database.\n\n## Why SQLModel + Instructor?\n\nThe combination of SQLModel and Instructor offers several key advantages:\n\n- **Single Model Definition**: Write one model that works for database tables, API schemas, and AI data generation\n- **Type Safety**: Full type checking and editor support throughout your application\n- **AI-Powered Data Generation**: Generate realistic database records using large language models\n- **FastAPI Integration**: Seamless API development with automatic documentation\n- **Production Ready**: Built on proven technologies (SQLAlchemy + Pydantic)\n\n## Quick Start Example\n\nHere's a simple example to get you started:\n\n```python\nimport instructor\nfrom typing import Optional\nfrom uuid import UUID, uuid4\nfrom pydantic.json_schema import SkipJsonSchema\nfrom sqlmodel import Field, SQLModel, create_engine, Session\n\n# Initialize the Instructor client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Hero(SQLModel, instructor.OpenAISchema, table=True):\n    id: SkipJsonSchema[UUID] = Field(default_factory=lambda: uuid4(), primary_key=True)\n    name: str\n    secret_name: str\n    age: Optional[int] = None\n    power_level: Optional[int] = Field(default=None, ge=1, le=100)\n\n\n# Generate AI-powered data\ndef create_hero() -> Hero:\n    return client.create(\n        model=\"gpt-4\",\n        response_model=Hero,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a superhero with a power level between 1-100\",\n            },\n        ],\n    )\n\n\n# Database setup and insertion\nengine = create_engine(\"sqlite:///heroes.db\")\nSQLModel.metadata.create_all(engine)\n\nhero = create_hero()\nwith Session(engine) as session:\n    session.add(hero)\n    session.commit()\n    print(f\"Created hero: {hero.name} with power level {hero.power_level}\")\n```\n\n# Core Concepts and Best Practices\n\n## Model Definition Strategies\n\n### Using SkipJsonSchema for Auto-Generated Fields\n\nThe `SkipJsonSchema` annotation is crucial for fields that should be generated by your application rather than the AI:\n\n```python\nfrom pydantic.json_schema import SkipJsonSchema\nfrom sqlmodel import Field, SQLModel\nimport instructor\nfrom uuid import UUID, uuid4\nfrom datetime import datetime\n\n\nclass Product(SQLModel, instructor.OpenAISchema, table=True):\n    # Auto-generated fields excluded from AI generation\n    id: SkipJsonSchema[UUID] = Field(default_factory=uuid4, primary_key=True)\n    created_at: SkipJsonSchema[datetime] = Field(default_factory=datetime.utcnow)\n    updated_at: SkipJsonSchema[datetime] = Field(default_factory=datetime.utcnow)\n\n    # AI-generated fields\n    name: str = Field(description=\"Product name\")\n    description: str = Field(description=\"Detailed product description\")\n    price: float = Field(gt=0, description=\"Product price in USD\")\n    category: str = Field(description=\"Product category\")\n```\n\n### Field Validation and Constraints\n\nSQLModel supports Pydantic's validation features, ensuring data quality:\n\n```python\nfrom typing import Optional\nfrom sqlmodel import Field, SQLModel\nimport instructor\nfrom pydantic import validator\n\n\nclass Customer(SQLModel, instructor.OpenAISchema, table=True):\n    id: Optional[int] = Field(default=None, primary_key=True)\n    name: str = Field(min_length=2, max_length=100)\n    email: str = Field(regex=r'^[\\w\\.-]+@[\\w\\.-]+\\.\\w+$')\n    age: Optional[int] = Field(default=None, ge=18, le=120)\n    credit_score: Optional[int] = Field(default=None, ge=300, le=850)\n\n    @validator('email')\n    def validate_email_domain(cls, v):\n        allowed_domains = ['gmail.com', 'yahoo.com', 'outlook.com']\n        domain = v.split('@')[1]\n        if domain not in allowed_domains:\n            raise ValueError(f'Email domain must be one of {allowed_domains}')\n        return v\n```\n\n## Advanced Integration Patterns\n\n### Relationship Modeling with AI Generation\n\nSQLModel supports relationships between tables, which can be populated using AI:\n\n```python\nfrom typing import List, Optional\nfrom sqlmodel import Field, SQLModel, Relationship\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Team(SQLModel, table=True):\n    id: Optional[int] = Field(default=None, primary_key=True)\n    name: str\n    city: str\n\n    # Relationship to heroes\n    heroes: List[\"Hero\"] = Relationship(back_populates=\"team\")\n\n\nclass Hero(SQLModel, instructor.OpenAISchema, table=True):\n    id: Optional[int] = Field(default=None, primary_key=True)\n    name: str\n    secret_name: str\n    age: Optional[int] = None\n\n    # Foreign key to team\n    team_id: Optional[int] = Field(default=None, foreign_key=\"team.id\")\n    team: Optional[Team] = Relationship(back_populates=\"heroes\")\n\n\ndef create_hero_for_team(team_name: str) -> Hero:\n    return client.create(\n        model=\"gpt-4\",\n        response_model=Hero,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Create a superhero for the {team_name} team\"},\n        ],\n    )\n```\n\n### Bulk Data Generation\n\nGenerate multiple records efficiently:\n\n```python\nfrom typing import List\nimport instructor\nfrom sqlmodel import Session\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef create_hero_team(team_size: int = 5) -> List[Hero]:\n    return client.create(\n        model=\"gpt-4\",\n        response_model=List[Hero],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Create a team of {team_size} diverse superheroes\",\n            },\n        ],\n    )\n\n\n# Bulk insert\nheroes = create_hero_team(10)\nwith Session(engine) as session:\n    for hero in heroes:\n        session.add(hero)\n    session.commit()\n    print(f\"Created {len(heroes)} heroes\")\n```\n\n# FastAPI Integration\n\n## Building Production APIs\n\nSQLModel's tight integration with FastAPI makes it perfect for building production APIs:\n\n```python\nfrom fastapi import FastAPI, HTTPException, Depends\nfrom sqlmodel import Session, select\nfrom typing import List\nimport instructor\n\napp = FastAPI(title=\"Hero Management API\")\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\ndef get_session():\n    with Session(engine) as session:\n        yield session\n\n\nsession_dep = Depends(get_session)\n\n\n# Create hero endpoint\n@app.post(\"/heroes/\", response_model=Hero)\nasync def create_hero_endpoint(prompt: str, session: Session = session_dep):\n    hero = await client.create(\n        model=\"gpt-4\",\n        response_model=Hero,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Create a superhero: {prompt}\"},\n        ],\n    )\n    session.add(hero)\n    session.commit()\n    session.refresh(hero)\n    return hero\n\n\n# List heroes endpoint\n@app.get(\"/heroes/\", response_model=List[Hero])\ndef list_heroes(limit: int = 10, offset: int = 0, session: Session = session_dep):\n    statement = select(Hero).offset(offset).limit(limit)\n    heroes = session.exec(statement).all()\n    return heroes\n\n\n# Get specific hero\n@app.get(\"/heroes/{hero_id}\", response_model=Hero)\ndef get_hero(hero_id: int, session: Session = session_dep):\n    hero = session.get(Hero, hero_id)\n    if not hero:\n        raise HTTPException(status_code=404, detail=\"Hero not found\")\n    return hero\n```\n\n## API Response Models\n\nCreate specialized models for different API operations:\n\n```python\nfrom sqlmodel import SQLModel\nfrom typing import Optional\n\n\n# Base model for database\nclass HeroBase(SQLModel):\n    name: str\n    secret_name: str\n    age: Optional[int] = None\n\n\n# Database model\nclass Hero(HeroBase, table=True):\n    id: Optional[int] = Field(default=None, primary_key=True)\n\n\n# API models\nclass HeroCreate(HeroBase):\n    pass\n\n\nclass HeroRead(HeroBase):\n    id: int\n\n\nclass HeroUpdate(SQLModel):\n    name: Optional[str] = None\n    secret_name: Optional[str] = None\n    age: Optional[int] = None\n```\n\n# Performance Optimization\n\n## Database Connection Management\n\nOptimize database connections for production:\n\n```python\nfrom sqlmodel import create_engine\nfrom sqlalchemy.pool import QueuePool\n\n# Production database configuration\nengine = create_engine(\n    \"postgresql://user:password@localhost/dbname\",\n    poolclass=QueuePool,\n    pool_size=20,\n    max_overflow=0,\n    pool_pre_ping=True,\n    echo=False,  # Set to True for debugging\n)\n```\n\n## Efficient AI Data Generation\n\nOptimize AI calls for better performance:\n\n```python\nimport asyncio\nfrom typing import List\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nasync def create_heroes_batch(prompts: List[str]) -> List[Hero]:\n    \"\"\"Generate multiple heroes concurrently\"\"\"\n    tasks = []\n    for prompt in prompts:\n        task = client.create(\n            model=\"gpt-4\",\n            response_model=Hero,\n            messages=[{\"role\": \"user\", \"content\": prompt}],\n        )\n        tasks.append(task)\n\n    return await asyncio.gather(*tasks)\n\n\n# Usage\nprompts = [\n    \"Create a fire-based superhero\",\n    \"Create a water-based superhero\",\n    \"Create an earth-based superhero\",\n]\nheroes = await create_heroes_batch(prompts)\n```\n\n# Testing Strategies\n\n## Unit Testing with SQLModel\n\nTest your models and AI integration:\n\n```python\nimport pytest\nfrom sqlmodel import Session, SQLModel, create_engine\nfrom sqlalchemy.pool import StaticPool\n\n\n@pytest.fixture\ndef session():\n    engine = create_engine(\n        \"sqlite://\",\n        connect_args={\"check_same_thread\": False},\n        poolclass=StaticPool,\n    )\n    SQLModel.metadata.create_all(engine)\n    with Session(engine) as session:\n        yield session\n\n\ndef test_hero_creation(session):\n    hero = Hero(name=\"Test Hero\", secret_name=\"Test Identity\", age=25)\n    session.add(hero)\n    session.commit()\n\n    assert hero.id is not None\n    assert hero.name == \"Test Hero\"\n\n\n@pytest.mark.asyncio\nasync def test_ai_hero_generation():\n    # Mock the AI response for testing\n    mock_hero = Hero(name=\"AI Hero\", secret_name=\"AI Identity\", age=30)\n\n    # Test the generated hero meets requirements\n    assert len(mock_hero.name) > 0\n    assert len(mock_hero.secret_name) > 0\n    assert mock_hero.age is None or mock_hero.age > 0\n```\n\n## Integration Testing\n\nTest the full stack including AI generation:\n\n```python\nfrom fastapi.testclient import TestClient\n\nclient = TestClient(app)\n\n\ndef test_create_hero_endpoint():\n    response = client.post(\"/heroes/\", params={\"prompt\": \"Create a test superhero\"})\n    assert response.status_code == 200\n    hero_data = response.json()\n    assert \"name\" in hero_data\n    assert \"secret_name\" in hero_data\n\n\ndef test_list_heroes():\n    response = client.get(\"/heroes/\")\n    assert response.status_code == 200\n    heroes = response.json()\n    assert isinstance(heroes, list)\n```\n\n# Production Deployment\n\n## Environment Configuration\n\nSet up proper configuration for different environments:\n\n```python\nfrom pydantic import BaseSettings\nfrom sqlmodel import create_engine\n\n\nclass Settings(BaseSettings):\n    database_url: str = \"sqlite:///./app.db\"\n    openai_api_key: str\n    debug: bool = False\n\n    class Config:\n        env_file = \".env\"\n\n\nsettings = Settings()\nengine = create_engine(settings.database_url)\n```\n\n## Error Handling and Logging\n\nImplement robust error handling:\n\n```python\nimport logging\nfrom fastapi import HTTPException\nimport instructor\n\nlogger = logging.getLogger(__name__)\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nasync def safe_create_hero(prompt: str) -> Hero:\n    try:\n        hero = await client.create(\n            model=\"gpt-4\",\n            response_model=Hero,\n            messages=[{\"role\": \"user\", \"content\": prompt}],\n            max_retries=3,\n        )\n        logger.info(f\"Successfully created hero: {hero.name}\")\n        return hero\n    except Exception as e:\n        logger.error(f\"Failed to create hero: {str(e)}\")\n        raise HTTPException(\n            status_code=500, detail=\"Failed to generate hero data\"\n        ) from e\n```\n\n# Advanced Use Cases\n\n## Data Migration and Seeding\n\nUse AI to generate realistic seed data:\n\n```python\nfrom sqlmodel import Session\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef seed_database():\n    \"\"\"Generate realistic seed data for development\"\"\"\n    engine = create_engine(\"sqlite:///seed.db\")\n    SQLModel.metadata.create_all(engine)\n\n    # Generate diverse heroes\n    hero_types = [\n        \"tech-based superhero\",\n        \"magic-based superhero\",\n        \"strength-based superhero\",\n        \"speed-based superhero\",\n        \"psychic superhero\",\n    ]\n\n    with Session(engine) as session:\n        for hero_type in hero_types:\n            for _ in range(5):  # 5 heroes of each type\n                hero = client.create(\n                    model=\"gpt-4\",\n                    response_model=Hero,\n                    messages=[\n                        {\"role\": \"user\", \"content\": f\"Create a unique {hero_type}\"}\n                    ],\n                )\n                session.add(hero)\n\n        session.commit()\n        print(\"Database seeded successfully!\")\n\n\nif __name__ == \"__main__\":\n    seed_database()\n```\n\n## Real-time Data Processing\n\nCombine SQLModel with streaming for real-time applications:\n\n```python\nfrom fastapi import FastAPI\nfrom fastapi.responses import StreamingResponse\nimport instructor\nimport json\n\napp = FastAPI()\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\n@app.post(\"/heroes/stream\")\nasync def stream_hero_creation(prompts: List[str]):\n    async def generate_heroes():\n        for prompt in prompts:\n            try:\n                hero = await client.create(\n                    model=\"gpt-4\",\n                    response_model=Hero,\n                    messages=[{\"role\": \"user\", \"content\": prompt}],\n                )\n\n                # Save to database\n                with Session(engine) as session:\n                    session.add(hero)\n                    session.commit()\n                    session.refresh(hero)\n\n                yield f\"data: {hero.model_dump_json()}\\n\\n\"\n            except Exception as e:\n                yield f\"data: {json.dumps({'error': str(e)})}\\n\\n\"\n\n    return StreamingResponse(generate_heroes(), media_type=\"text/plain\")\n```\n\n# Troubleshooting Common Issues\n\n## Model Inheritance Issues\n\nWhen using both SQLModel and instructor.OpenAISchema:\n\n```python\n# Correct way to inherit from both\nclass Hero(SQLModel, instructor.OpenAISchema, table=True):\n    __table_args__ = {'extend_existing': True}  # Prevents table conflicts\n    # ... model fields\n```\n\n## JSON Schema Conflicts\n\nHandle conflicts between database and AI schema requirements:\n\n```python\nfrom pydantic import Field\nfrom pydantic.json_schema import SkipJsonSchema\n\n\nclass Hero(SQLModel, instructor.OpenAISchema, table=True):\n    # Database-only fields\n    id: SkipJsonSchema[int] = Field(default=None, primary_key=True)\n    created_at: SkipJsonSchema[datetime] = Field(default_factory=datetime.utcnow)\n\n    # AI-generated fields with database constraints\n    name: str = Field(description=\"Hero name for AI\", max_length=100)  # DB constraint\n    power_level: int = Field(description=\"Power level 1-100\", ge=1, le=100)\n```\n\n## Performance Monitoring\n\nMonitor AI generation performance:\n\n```python\nimport time\nfrom functools import wraps\n\n\ndef monitor_ai_calls(func):\n    @wraps(func)\n    async def wrapper(*args, **kwargs):\n        start_time = time.time()\n        result = await func(*args, **kwargs)\n        duration = time.time() - start_time\n        logger.info(f\"AI call took {duration:.2f} seconds\")\n        return result\n\n    return wrapper\n\n\n@monitor_ai_calls\nasync def create_hero(prompt: str) -> Hero:\n    return await client.create(\n        model=\"gpt-4\",\n        response_model=Hero,\n        messages=[{\"role\": \"user\", \"content\": prompt}],\n    )\n```\n\n# Conclusion\n\nSQLModel with Instructor provides a powerful foundation for building AI-powered applications with robust database integration. The combination offers:\n\n- **Developer Productivity**: Single model definition for multiple use cases\n- **Type Safety**: Full type checking and validation\n- **AI Integration**: Seamless integration with language models\n- **Production Ready**: Built on proven, scalable technologies\n- **FastAPI Compatible**: Perfect for modern API development\n\nBy following the patterns and best practices outlined in this guide, you can build sophisticated applications that leverage AI for data generation while maintaining data integrity and performance.\n\n## Next Steps\n\n- Explore the [FastAPI integration guide](../concepts/fastapi.md) for advanced API patterns\n- Check out [validation techniques](../concepts/validation.md) for robust data handling\n- Learn about [streaming responses](partial_streaming.md) for real-time applications\n\n![Database screenshot showing AI-generated hero records stored in SQLite database](db.png)\n\n*Example of AI-generated hero data stored in SQLite database*\n"
  },
  {
    "path": "docs/examples/tables_from_vision.md",
    "content": "---\ntitle: Extracting Tables from Images Using OpenAI GPT-4\ndescription: Learn how to convert images into markdown tables using OpenAI's GPT-4 Vision model for data extraction and analysis.\n---\n\n# Extracting Tables from Images with OpenAI's GPT-4 Vision Model\n\nFirst, we define a custom type, `MarkdownDataFrame`, to handle pandas DataFrames formatted in markdown. This type uses Python's `Annotated` and `InstanceOf` types, along with decorators `BeforeValidator` and `PlainSerializer`, to process and serialize the data.\n\n## Defining the Table Class\n\nThe `Table` class is essential for organizing the extracted data. It includes a caption and a dataframe, processed as a markdown table. Since most of the complexity is handled by the `MarkdownDataFrame` type, the `Table` class is straightforward!\n\nThis requires additional dependencies `pip install pandas tabulate`.\n\n```python\nfrom io import StringIO\nfrom typing import Annotated, Any, List\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n    PlainSerializer,\n    InstanceOf,\n    WithJsonSchema,\n)\nimport instructor\nimport pandas as pd\nfrom rich.console import Console\n\nconsole = Console()\nclient = instructor.from_provider(\"openai/gpt-4o\", mode=instructor.Mode.TOOLS)\n\n\ndef md_to_df(data: Any) -> Any:\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Get rid of whitespaces\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .map(lambda x: x.strip())\n        )  # type: ignore\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda x: x.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n                The markdown representation of the table,\n                each one should be tidy, do not try to join tables\n                that should be seperate\"\"\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\nclass MultipleTables(BaseModel):\n    tables: List[Table]\n\n\nexample = MultipleTables(\n    tables=[\n        Table(\n            caption=\"This is a caption\",\n            dataframe=pd.DataFrame(\n                {\n                    \"Chart A\": [10, 40],\n                    \"Chart B\": [20, 50],\n                    \"Chart C\": [30, 60],\n                }\n            ),\n        )\n    ]\n)\n\n\ndef extract(url: str) -> MultipleTables:\n    return client.create(\n        model=\"gpt-4-turbo\",\n        max_tokens=4000,\n        response_model=MultipleTables,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                            First, analyze the image to determine the most appropriate headers for the tables.\n                            Generate a descriptive h1 for the overall image, followed by a brief summary of the data it contains.\n                            For each identified table, create an informative h2 title and a concise description of its contents.\n                            Finally, output the markdown representation of each table.\n                            Make sure to escape the markdown table properly, and make sure to include the caption and the dataframe.\n                            including escaping all the newlines and quotes. Only return a markdown table in dataframe, nothing else.\n                        \"\"\",\n                    },\n                ],\n            }\n        ],\n    )\n\n\nurls = [\n    \"https://a.storyblok.com/f/47007/2400x1260/f816b031cb/uk-ireland-in-three-charts_chart_a.png/m/2880x0\",\n    \"https://a.storyblok.com/f/47007/2400x2000/bf383abc3c/231031_uk-ireland-in-three-charts_table_v01_b.png/m/2880x0\",\n]\n\nfor url in urls:\n    for table in extract(url).tables:\n        console.print(table.caption, \"\\n\", table.dataframe)\n```\n"
  },
  {
    "path": "docs/examples/tracing_with_langfuse.md",
    "content": "---\ntitle: Observability & Tracing with Langfuse\ndescription: Learn how to trace and monitor Instructor API calls using Langfuse for comprehensive observability in your LLM applications.\n---\n\n# Observability & Tracing with Langfuse\n\n**What is Langfuse?**\n\n> **What is Langfuse?** [Langfuse](https://langfuse.com) ([GitHub](https://github.com/langfuse/langfuse)) is an open source LLM engineering platform that helps teams trace API calls, monitor performance, and debug issues in their AI applications.\n\n![Instructor Trace in Langfuse showing structured output monitoring and observability](https://langfuse.com/images/docs/instructor-trace.png)\n\nThis cookbook shows how to use Langfuse to trace and monitor model calls made with the Instructor library.\n\n## Setup\n\n> **Note** : Before continuing with this section, make sure that you've signed up for an account with [Langfuse](https://langfuse.com). You'll need your private and public key to start tracing with Langfuse.\n\nFirst, let's start by installing the necessary dependencies.\n\n```python\npip install langfuse instructor\n```\n\nIt is easy to use instructor with Langfuse. We use the [Langfuse OpenAI Integration](https://langfuse.com/docs/integrations/openai) and simply patch the client with instructor. This works with both synchronous and asynchronous clients.\n\n### Langfuse-Instructor integration with synchronous OpenAI client\n\n```python\nimport instructor\nfrom langfuse.openai import openai\nfrom pydantic import BaseModel\nimport os\n\n# Set your API keys Here\nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-...\"\nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-...\"\nos.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\"\nos.environ[\"OPENAI_API_KEY] = \"sk-...\"\n\n# Patch Langfuse wrapper of synchronous OpenAI client with instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass WeatherDetail(BaseModel):\n    city: str\n    temperature: int\n\n\n# Run synchronous OpenAI client\nweather_info = client.create(\n    model=\"gpt-4o\",\n    response_model=WeatherDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"The weather in Paris is 18 degrees Celsius.\"},\n    ],\n)\n\nprint(weather_info.model_dump_json(indent=2))\n\"\"\"\n{\n  \"city\": \"Paris\",\n  \"temperature\": 18\n}\n\"\"\"\n```\n\nOnce we've run this request succesfully, we'll see that we have a trace avaliable in the Langfuse dashboard for you to look at.\n\n### Langfuse-Instructor integration with asychnronous OpenAI client\n\n```python\nimport instructor\nfrom langfuse.openai import openai\nfrom pydantic import BaseModel\nimport os\nimport asyncio\n\n# Set your API keys Here\nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-\"\nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-\"\nos.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\"\nos.environ[\"OPENAI_API_KEY] = \"sk-...\"\n\n\n# Patch Langfuse wrapper of synchronous OpenAI client with instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass WeatherDetail(BaseModel):\n    city: str\n    temperature: int\n\n\nasync def main():\n    # Run synchronous OpenAI client\n    weather_info = await client.create(\n        model=\"gpt-4o\",\n        response_model=WeatherDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": \"The weather in Paris is 18 degrees Celsius.\"},\n        ],\n    )\n\n    print(weather_info.model_dump_json(indent=2))\n    \"\"\"\n    {\n    \"city\": \"Paris\",\n    \"temperature\": 18\n    }\n    \"\"\"\n\n\nasyncio.run(main())\n\n```\n\nHere's a [public link](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/0da3f599-b807-4e14-9888-cf68fa53d976?timestamp=2025-03-31T16:12:40.076Z&display=details) to the trace that we generated which you can view in Langfuse.\n\n## Example\n\nIn this example, we first classify customer feedback into categories like `PRAISE`, `SUGGESTION`, `BUG` and `QUESTION`, and further scores the relevance of each feedback to the business on a scale of 0.0 to 1.0. In this case, we use the asynchronous OpenAI client `AsyncOpenAI` to classify and evaluate the feedback.\n\n```python\nfrom enum import Enum\n\nimport asyncio\nimport instructor\n\nfrom langfuse import Langfuse\nfrom langfuse.openai import AsyncOpenAI\nfrom langfuse.decorators import langfuse_context, observe\n\nfrom pydantic import BaseModel, Field, field_validator\nimport os\n\n# Set your API keys Here\nos.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"pk-...\"\nos.environ[\"LANGFUSE_SECRET_KEY\"] = \"sk-...\"\nos.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\"\nos.environ[\"OPENAI_API_KEY] = \"sk-...\"\n\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n# Initialize Langfuse (needed for scoring)\nlangfuse = Langfuse()\n\n# Rate limit the number of requests\nsem = asyncio.Semaphore(5)\n\n\n# Define feedback categories\nclass FeedbackType(Enum):\n    PRAISE = \"PRAISE\"\n    SUGGESTION = \"SUGGESTION\"\n    BUG = \"BUG\"\n    QUESTION = \"QUESTION\"\n\n\n# Model for feedback classification\nclass FeedbackClassification(BaseModel):\n    feedback_text: str = Field(...)\n    classification: list[FeedbackType] = Field(\n        description=\"Predicted categories for the feedback\"\n    )\n    relevance_score: float = Field(\n        default=0.0,\n        description=\"Score of the query evaluating its relevance to the business between 0.0 and 1.0\",\n    )\n\n    # Make sure feedback type is list\n    @field_validator(\"classification\", mode=\"before\")\n    def validate_classification(cls, v):\n        if not isinstance(v, list):\n            v = [v]\n        return v\n\n\n@observe()  # Langfuse decorator to automatically log spans to Langfuse\nasync def classify_feedback(feedback: str):\n    \"\"\"\n    Classify customer feedback into categories and evaluate relevance.\n    \"\"\"\n    async with sem:  # simple rate limiting\n        response = await client.create(\n            model=\"gpt-4o\",\n            response_model=FeedbackClassification,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Classify and score this feedback: {feedback}\",\n                },\n            ],\n        )\n\n        # Retrieve observation_id of current span\n        observation_id = langfuse_context.get_current_observation_id()\n\n        return feedback, response, observation_id\n\n\ndef score_relevance(trace_id: str, observation_id: str, relevance_score: float):\n    \"\"\"\n    Score the relevance of a feedback query in Langfuse given the observation_id.\n    \"\"\"\n    langfuse.score(\n        trace_id=trace_id,\n        observation_id=observation_id,\n        name=\"feedback-relevance\",\n        value=relevance_score,\n    )\n\n\n@observe()  # Langfuse decorator to automatically log trace to Langfuse\nasync def main(feedbacks: list[str]):\n    tasks = [classify_feedback(feedback) for feedback in feedbacks]\n    results = []\n\n    for task in asyncio.as_completed(tasks):\n        feedback, classification, observation_id = await task\n        result = {\n            \"feedback\": feedback,\n            \"classification\": [c.value for c in classification.classification],\n            \"relevance_score\": classification.relevance_score,\n        }\n        results.append(result)\n\n        # Retrieve trace_id of current trace\n        trace_id = langfuse_context.get_current_trace_id()\n\n        # Score the relevance of the feedback in Langfuse\n        score_relevance(trace_id, observation_id, classification.relevance_score)\n\n    # Flush observations to Langfuse\n    langfuse_context.flush()\n    return results\n\n\nfeedback_messages = [\n    \"The chat bot on your website does not work.\",\n    \"Your customer service is exceptional!\",\n    \"Could you add more features to your app?\",\n    \"I have a question about my recent order.\",\n]\n\nfeedback_classifications = asyncio.run(main(feedback_messages))\n\nfor classification in feedback_classifications:\n    print(f\"Feedback: {classification['feedback']}\")\n    print(f\"Classification: {classification['classification']}\")\n    print(f\"Relevance Score: {classification['relevance_score']}\")\n\n\n\"\"\"\nFeedback: I have a question about my recent order.\nClassification: ['QUESTION']\nRelevance Score: 0.0\nFeedback: Could you add more features to your app?\nClassification: ['SUGGESTION']\nRelevance Score: 0.0\nFeedback: The chat bot on your website does not work.\nClassification: ['BUG']\nRelevance Score: 0.9\nFeedback: Your customer service is exceptional!\nClassification: ['PRAISE']\nRelevance Score: 0.9\n\"\"\"\n```\n\nWe can see that with Langfuse, we were able to generate these different completions and view them with our own UI. Click here to see the [public trace](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/ba27e7b1-e23e-4f50-87de-420cf038190f?timestamp=2025-03-31T16:12:57.041Z&display=details) for the 5 completions that we generated.\n"
  },
  {
    "path": "docs/examples/using_decimals.md",
    "content": "---\ntitle: Working with Decimal Types in Instructor\ndescription: Learn how to use Python Decimal types for precise financial calculations and numeric data extraction with Instructor.\n---\n\n## See Also\n\n- [Types](../concepts/types.md) - Working with different data types\n- [Fields](../concepts/fields.md) - Customizing field validation\n- [Field Validation](../learning/patterns/field_validation.md) - Field-level validation patterns\n- [Validation](../concepts/validation.md) - Core validation concepts\n\n# Using Decimals\n\nExtract precise decimal values for financial calculations using Python's `Decimal` type.\n\n```python\nfrom decimal import Decimal\nfrom pydantic import BaseModel, field_validator\nimport instructor\n\n\nclass Receipt(BaseModel):\n    item: str\n    price: Decimal\n\n    @field_validator('price', mode='before')\n    @classmethod\n    def parse_price(cls, v):\n        if isinstance(v, str):\n            return Decimal(v)\n        return v\n\n\nclient = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\nreceipt = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Coffee costs $4.99\"}],\n    response_model=Receipt,\n)\n\nprint(f\"Item: {receipt.item}\")\nprint(f\"Price: {receipt.price}\")  # Decimal('4.99')\nprint(f\"Type: {type(receipt.price)}\")  # <class 'decimal.Decimal'>\n```\n\nThe `field_validator` ensures string values from LLM responses are properly converted to Decimal objects for precise financial calculations.\n"
  },
  {
    "path": "docs/examples/watsonx.md",
    "content": "---\ntitle: IBM watsonx.ai Integration - Enterprise LLM Inference\ndescription: Use IBM watsonx.ai with Instructor through LiteLLM for enterprise-grade structured outputs. Setup, authentication, and production examples.\n---\n\n# Structured Outputs with IBM watsonx.ai\n\nYou can use IBM watsonx.ai for inference using [LiteLLM](https://docs.litellm.ai/docs/providers/watsonx).\n\n## Prerequisites\n\n- IBM Cloud Account\n- API Key from IBM Cloud IAM: https://cloud.ibm.com/iam/apikeys\n- Project ID (from watsonx.ai instance URL: https://dataplatform.cloud.ibm.com/projects/<WATSONX_PROJECT_ID>/)\n\n## Install\n\n```bash\npoetry install instructor --with litellm\n```\n\n## Example\n\n```python\nimport os\n\nimport litellm\nfrom litellm import completion\nfrom pydantic import BaseModel, Field\n\nimport instructor\nfrom instructor import Mode\n\nlitellm.drop_params = True  # watsonx.ai doesn't support `json_mode`\n\nos.environ[\"WATSONX_URL\"] = \"https://us-south.ml.cloud.ibm.com\"\nos.environ[\"WATSONX_API_KEY\"] = \"\"\nos.environ[\"WATSONX_PROJECT_ID\"] = \"\"\n# Additional options: https://docs.litellm.ai/docs/providers/watsonx\n\n\nclass Company(BaseModel):\n    name: str = Field(description=\"name of the company\")\n    year_founded: int = Field(description=\"year the company was founded\")\n\n\nclient = instructor.from_litellm(completion, mode=Mode.JSON)\n\nresp = client.create(\n    model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n    max_tokens=1024,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\\\nGiven the following text, create a Company object:\n\nIBM was founded in 1911 as the Computing-Tabulating-Recording Company (CTR), a holding company of manufacturers of record-keeping and measuring systems.\n\"\"\",\n        }\n    ],\n    project_id=os.environ[\"WATSONX_PROJECT_ID\"],\n    response_model=Company,\n)\n\nprint(resp.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"IBM\",\n  \"year_founded\": 1911\n}\n\"\"\"\n```\n"
  },
  {
    "path": "docs/examples/youtube_clips.md",
    "content": "---\ntitle: Generating YouTube Clips from Transcripts Using Instructor\ndescription: Learn to create concise YouTube clips from video transcripts with `instructor` and OpenAI, enhancing your content engagement.\n---\n\n# Generating YouTube Clips from Transcripts\n\nThis guide demonstrates how to generate concise, informative clips from YouTube video transcripts using the `instructor` library. By leveraging the power of OpenAI's models, we can extract meaningful segments from a video's transcript, which can then be recut into smaller, standalone videos. This process involves identifying key moments within a transcript and summarizing them into clips with specific titles and descriptions.\n\nFirst, install the necessary packages:\n\n```bash\npip install youtube_transcript_api instructor rich\n```\n\n![YouTube clip streaming demonstration showing real-time video segment extraction](../img/youtube.gif)\n\n```python\nfrom youtube_transcript_api import YouTubeTranscriptApi\nfrom pydantic import BaseModel, Field\nfrom typing import List, Generator, Iterable\nimport instructor\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef extract_video_id(url: str) -> str | None:\n    import re\n\n    match = re.search(r\"v=([a-zA-Z0-9_-]+)\", url)\n    if match:\n        return match.group(1)\n\n\nclass TranscriptSegment(BaseModel):\n    source_id: int\n    start: float\n    text: str\n\n\ndef get_transcript_with_timing(\n    video_id: str,\n) -> Generator[TranscriptSegment, None, None]:\n    \"\"\"\n    Fetches the transcript of a YouTube video along with the start and end times\n    for each text segment, and returns them as a list of Pydantic models.\n    \"\"\"\n    transcript = YouTubeTranscriptApi.get_transcript(video_id)\n    for ii, segment in enumerate(transcript):\n        yield TranscriptSegment(\n            source_id=ii, start=segment[\"start\"], text=segment[\"text\"]\n        )\n\n\nclass YoutubeClip(BaseModel):\n    title: str = Field(description=\"Specific and informative title for the clip.\")\n    description: str = Field(\n        description=\"A detailed description of the clip, including notable quotes or phrases.\"\n    )\n    start: float\n    end: float\n\n\nclass YoutubeClips(BaseModel):\n    clips: List[YoutubeClip]\n\n\ndef yield_clips(segments: Iterable[TranscriptSegment]) -> Iterable[YoutubeClips]:\n    return client.create(\n        model=\"gpt-4-turbo-preview\",\n        stream=True,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are given a sequence of YouTube transcripts and your job\n                is to return notable clips that can be recut as smaller videos. Give very\n                specific titles and descriptions. Make sure the length of clips is proportional\n                to the length of the video. Note that this is a transcript and so there might\n                be spelling errors. Note that and correct any spellings. Use the context to\n                make sure you're spelling things correctly.\"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Let's use the following transcript segments.\\n{segments}\",\n            },\n        ],\n        response_model=instructor.Partial[YoutubeClips],\n        context={\"segments\": segments},\n    )  # type: ignore\n\n\n# Example usage\nif __name__ == \"__main__\":\n    from rich.table import Table\n    from rich.console import Console\n    from rich.prompt import Prompt\n\n    console = Console()\n    url = Prompt.ask(\"Enter a YouTube URL\")\n\n    with console.status(\"[bold green]Processing YouTube URL...\") as status:\n        video_id = extract_video_id(url)\n\n        if video_id is None:\n            raise ValueError(\"Invalid YouTube video URL\")\n\n        transcript = list(get_transcript_with_timing(video_id))\n        status.update(\"[bold green]Generating clips...\")\n\n        for clip in yield_clips(transcript):\n            console.clear()\n\n            table = Table(title=\"Extracted YouTube Clips\", padding=(0, 1))\n\n            table.add_column(\"Title\", style=\"cyan\")\n            table.add_column(\"Description\", style=\"magenta\")\n            table.add_column(\"Start\", justify=\"right\", style=\"green\")\n            table.add_column(\"End\", justify=\"right\", style=\"green\")\n            for youtube_clip in clip.clips or []:\n                table.add_row(\n                    youtube_clip.title,\n                    youtube_clip.description,\n                    str(youtube_clip.start),\n                    str(youtube_clip.end),\n                )\n            console.print(table)\n```\n"
  },
  {
    "path": "docs/faq.md",
    "content": "---\ntitle: Frequently Asked Questions\ndescription: Common questions and answers about using Instructor\n---\n\n# Frequently Asked Questions\n\nThis page answers common questions about using Instructor with various LLM providers.\n\n## General Questions\n\n### What is Instructor?\n\nInstructor is a library that makes it easy to get structured data from Large Language Models (LLMs). It uses Pydantic to define output schemas and provides a consistent interface across different LLM providers.\n\n### How does Instructor work?\n\nInstructor \"patches\" LLM clients to add a `response_model` parameter that accepts a Pydantic model. When you make a request, Instructor:\n\n1. Converts your Pydantic model to a schema the LLM can understand\n2. Formats the prompt appropriately for the provider\n3. Validates the LLM's response against your model\n4. Retries automatically if validation fails\n5. Returns a properly typed Pydantic object\n\n### Which LLM providers does Instructor support?\n\nInstructor supports many providers, including:\n\n- OpenAI (GPT models)\n- Anthropic (Claude models)\n- Google (Gemini models)\n- Cohere\n- Mistral AI\n- Groq\n- LiteLLM (meta-provider)\n- TrueFoundry AI Gateway\n- Various open-source models via Ollama, llama.cpp, etc.\n\nSee the [Integrations](./integrations/index.md) section for the complete list.\n\n### What's the difference between various modes?\n\nInstructor supports generic modes across providers:\n\n- `Mode.TOOLS` - Tool/function calling when supported\n- `Mode.JSON` - JSON generation for providers that support it (GenAI)\n- `Mode.JSON_SCHEMA` - JSON schema enforcement (OpenAI, Mistral, Cohere)\n- `Mode.MD_JSON` - JSON embedded in markdown\n- `Mode.PARALLEL_TOOLS` - Parallel tool calls where supported\n\nThe optimal mode depends on your provider and use case. See [Patching](./concepts/patching.md) for details.\n\n## Installation and Setup\n\n### How do I install Instructor?\n\nBasic installation:\n```bash\npip install instructor\n```\n\nFor specific providers:\n```bash\npip install \"instructor[anthropic]\"  # For Anthropic\npip install \"instructor[google-generativeai]\"  # For Google/Gemini\n```\n\n### What environment variables do I need?\n\nThis depends on your provider:\n\n- OpenAI: `OPENAI_API_KEY`\n- Anthropic: `ANTHROPIC_API_KEY`\n- Google: `GOOGLE_API_KEY`\n\nEach provider has specific requirements documented in their integration guide.\n\n## Common Issues\n\n### Why is my model not returning structured data?\n\nCommon reasons include:\n\n1. Using the wrong mode for your provider\n2. Complex schema that confuses the model\n3. Insufficient context in your prompt\n4. Using a model that doesn't support function/tool calling\n\nTry simplifying your schema or providing clearer instructions in your prompt.\n\n### How do I handle validation errors?\n\nInstructor automatically retries when validation fails. You can customize this behavior:\n\n```python\nfrom tenacity import stop_after_attempt\n\nresult = client.create(\n    response_model=MyModel,\n    max_retries=stop_after_attempt(5),  # Retry up to 5 times\n    messages=[...]\n)\n```\n\n### Can I see the raw response from the LLM?\n\nYes, use `create_with_completion`:\n\n```python\nresult, completion = client.create_with_completion(\n    response_model=MyModel,\n    messages=[...]\n)\n```\n\n`result` is your Pydantic model, and `completion` is the raw response.\n\n### How do I stream large responses?\n\nUse `create_partial` for partial updates as the response is generated:\n\n```python\nstream = client.create_partial(\n    response_model=MyModel,\n    messages=[...]\n)\n\nfor partial in stream:\n    print(partial)  # Partial model with fields filled in as they're generated\n```\n\n## Performance and Costs\n\n### How can I optimize token usage?\n\n1. Use concise prompts\n2. Use smaller models for simpler tasks\n3. Use the `MD_JSON` or `JSON` mode for simple schemas\n4. Cache responses for repeated queries\n\n### How do I handle rate limits?\n\nInstructor uses the `tenacity` library for retries, which you can configure:\n\n```python\nfrom tenacity import retry_if_exception_type, wait_exponential\nfrom openai.error import RateLimitError\n\nresult = client.create(\n    response_model=MyModel,\n    max_retries=retry_if_exception_type(RateLimitError),\n    messages=[...],\n)\n```\n\n## Advanced Usage\n\n### How do I use Instructor with FastAPI?\n\nInstructor works seamlessly with FastAPI:\n\n```python\nfrom fastapi import FastAPI\nfrom pydantic import BaseModel\nimport instructor\napp = FastAPI()\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n@app.post(\"/extract\")\nasync def extract_user_info(text: str) -> UserInfo:\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": text}]\n    )\n```\n\n### How do I use Instructor with async code?\n\nUse the async client:\n\n```python\nimport instructor\nimport asyncio\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\nasync def extract_data():\n    result = await client.create(\n        response_model=MyModel,\n        messages=[...]\n    )\n    return result\n\nasyncio.run(extract_data())\n```\n\n### Where can I get more help?\n\n- [Discord community](https://discord.gg/bD9YE9JArw)\n- [GitHub issues](https://github.com/jxnl/instructor/issues)\n- [Twitter @jxnl](https://twitter.com/jxnlco)\n"
  },
  {
    "path": "docs/getting-started.md",
    "content": "---\ntitle: Getting Started\ndescription: A step-by-step guide to getting started with Instructor for structured outputs from LLMs\n---\n\n# Getting Started with Instructor\n\nThis guide will walk you through the basics of using Instructor to extract structured data from language models. By the end, you'll understand how to:\n\n1. Install and set up Instructor\n2. Extract basic structured data\n3. Handle validation and errors\n4. Work with streaming responses\n5. Use different LLM providers\n\n## Installation\n\nFirst, install Instructor:\n\n```bash\npip install instructor\n```\n\nTo use a specific provider, install the appropriate extras:\n\n> Instructor's core install contains only required dependencies. Provider SDKs are optional and must be added explicitly.\n\n```bash\n# For OpenAI (included by default)\npip install instructor\n\n# For Anthropic\npip install \"instructor[anthropic]\"\n\n# For other providers\npip install \"instructor[google-genai]\"         # For Google/Gemini\npip install \"instructor[vertexai]\"             # For Vertex AI\npip install \"instructor[cohere]\"               # For Cohere\npip install \"instructor[litellm]\"              # For LiteLLM (multiple providers)\npip install \"instructor[mistralai]\"            # For Mistral\npip install \"instructor[xai]\"                  # For xAI\n```\n\n## Setting Up Environment\n\nSet your API keys as environment variables:\n\n```bash\n# For OpenAI\nexport OPENAI_API_KEY=your_openai_api_key\n\n# For Anthropic\nexport ANTHROPIC_API_KEY=your_anthropic_api_key\n\n# For other providers, set relevant API keys\n```\n\n## Your First Structured Output\n\nLet's start with a simple example using OpenAI:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Define your output structure\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n# Create an instructor client with from_provider\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n# Extract structured data\nuser_info = client.create(\n    response_model=UserInfo,\n    messages=[\n        {\"role\": \"user\", \"content\": \"John Doe is 30 years old.\"}\n    ],\n)\n\nprint(f\"Name: {user_info.name}, Age: {user_info.age}\")\n# Output: Name: John Doe, Age: 30\n```\n\nThis example demonstrates the core workflow:\n1. Define a Pydantic model for your output structure\n2. Create an Instructor client with `from_provider`\n3. Request structured output using the `response_model` parameter\n\n## Validation and Error Handling\n\nInstructor leverages Pydantic's validation to ensure your data meets requirements:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator\n\nclass User(BaseModel):\n    name: str\n    age: int = Field(gt=0, lt=120)  # Age must be between 0 and 120\n\n    @field_validator('name')\n    def name_must_have_space(cls, v):\n        if ' ' not in v:\n            raise ValueError('Name must include first and last name')\n        return v\n\n# This will make the LLM retry if validation fails\nuser = client.create(\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Tom is 25 years old.\"}\n    ],\n)\n```\n\n## Working with Complex Models\n\nInstructor works seamlessly with nested Pydantic models:\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n\nperson = client.create(\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Extract: John Smith is 35 years old.\n        He has homes at 123 Main St, Springfield, IL 62704 and\n        456 Oak Ave, Chicago, IL 60601.\n        \"\"\"}\n    ],\n)\n```\n\n## Streaming Responses\n\nFor larger responses or better user experience, use streaming:\n\n```python\nfrom instructor import Partial\n\n# Stream the response as it's being generated\nstream = client.create_partial(\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract a detailed person profile for John Smith, 35, who lives in Chicago and Springfield.\"}\n    ],\n)\n\nfor partial in stream:\n    # This will incrementally show the response being built\n    print(partial)\n```\n\n## Using Different Providers\n\nInstructor supports multiple LLM providers. Here's how to use Anthropic:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n# Create an instructor client with from_provider\nclient = instructor.from_provider(\"anthropic/claude-3-opus-20240229\")\n\nuser_info = client.create(\n    response_model=UserInfo,\n    messages=[\n        {\"role\": \"user\", \"content\": \"John Doe is 30 years old.\"}\n    ],\n)\n\nprint(f\"Name: {user_info.name}, Age: {user_info.age}\")\n```\n\n## Frequently Asked Questions\n\n### What's the difference between `start-here.md` and `getting-started.md`?\n\n- **[Start Here](./start-here.md)**: Explains what Instructor is and why you'd use it (conceptual overview)\n- **Getting Started**: This guide - shows you how to install and use Instructor (practical steps)\n\n### Which provider should I start with?\n\nOpenAI is the most popular choice for beginners due to reliability and wide support. Once comfortable, you can explore Anthropic Claude, Google Gemini, or open-source models.\n\n### Do I need to understand Pydantic?\n\nBasic knowledge helps, but you can start with simple models. Instructor works with any Pydantic BaseModel. Learn more advanced features as you need them.\n\n### Can I use Instructor with async code?\n\nYes! Use `async_client=True` when creating your client: `client = instructor.from_provider(\"openai/gpt-4o\", async_client=True)`, then use `await client.create()`.\n\n### What if validation fails?\n\nInstructor automatically retries with validation feedback. You can configure retry behavior with `max_retries` parameter. See [retry mechanisms](./learning/validation/retry_mechanisms.md) for details.\n\n[View all FAQs →](./faq.md)\n\n## Next Steps\n\nNow that you've mastered the basics, here are some next steps:\n\n- Learn about [client setup with from_provider](./concepts/from_provider.md) for different LLM providers\n- Explore [advanced validation](./concepts/reask_validation.md) to ensure data quality\n- Check out the [Cookbook examples](./examples/index.md) for real-world applications\n- See how to [use hooks](./concepts/hooks.md) for monitoring and debugging\n\n**Using older patterns?** If you're using `instructor.patch()` or provider-specific functions like `from_openai()`, check out the [Migration Guide](./concepts/migration.md) to modernize your code.\n\n**New to Instructor?** Start with [Start Here](./start-here.md) for a conceptual overview.\n\nFor more detailed information on any topic, visit the [Concepts](./concepts/index.md) section.\n\nIf you have questions or need help, join our [Discord community](https://discord.gg/bD9YE9JArw) or check the [GitHub repository](https://github.com/jxnl/instructor).\n"
  },
  {
    "path": "docs/help.md",
    "content": "---\ntitle: Getting Started with Instructor: Help and Resources\ndescription: Explore key resources for getting help with Instructor, including Discord, blog, concepts, cookbooks, and GitHub discussions.\n---\n\n# Getting help with Instructor\n\nIf you need help getting started with Instructor or with advanced usage, the following sources may be useful.\n\n## :material-discord: Discord\n\nThe [Discord](https://discord.gg/bD9YE9JArw) is a great place to ask questions and get help from the community.\n\n## :material-creation: Concepts\n\nThe [concepts](concepts/prompting.md) section explains the core concepts of Instructor and how to prompt with models.\n\n## :material-chef-hat: Cookbooks\n\nThe [cookbooks](examples/index.md) are a great place to start. They contain a variety of examples that demonstrate how to use Instructor in different scenarios.\n\n## :material-book: Blog\n\nThe [blog](blog/index.md) contains articles that explain how to use Instructor in different scenarios.\n\n## :material-github: GitHub Discussions\n\n[GitHub discussions](https://github.com/jxnl/instructor/discussions) are useful for asking questions, your question and the answer will help everyone.\n\n## :material-github: GitHub Issues\n\n[GitHub issues](https://github.com/jxnl/instructor/issues) are useful for reporting bugs or requesting new features.\n\n## :material-twitter: Twitter\n\nYou can also reach out to me on [Twitter](https://twitter.com/jxnlco) if you have any questions or ideas.\n"
  },
  {
    "path": "docs/hooks/hide_lines.py",
    "content": "from typing import Any\nimport mkdocs.plugins\nfrom pymdownx import highlight  # type: ignore\n\n\n@mkdocs.plugins.event_priority(0)\n# pylint: disable=unused-argument\ndef on_startup(command: str, dirty: bool) -> None:  # noqa: ARG001\n    \"\"\"Monkey patch Highlight extension to hide lines in code blocks.\"\"\"\n    original = highlight.Highlight.highlight  # type: ignore\n\n    def patched(self: Any, src: str, *args: Any, **kwargs: Any) -> Any:\n        lines = src.splitlines(keepends=True)\n\n        final_lines = []\n\n        remove_lines = False\n        for line in lines:\n            if line.strip() == \"# <%hide%>\":\n                remove_lines = not remove_lines\n            elif not remove_lines:\n                final_lines.append(line)\n\n        return original(self, \"\".join(final_lines), *args, **kwargs)\n\n    highlight.Highlight.highlight = patched\n"
  },
  {
    "path": "docs/index.md",
    "content": "---\ntitle: \"Instructor - Multi-Language Library for Structured LLM Outputs | Python, TypeScript, Go, Ruby\"\ndescription: \"Get structured, validated data from any LLM with Instructor - the #1 library for LLM data extraction. Supports 15+ providers (OpenAI, Anthropic, Google, Ollama, DeepSeek) in 6 languages. Built on type-safe schemas with automatic retries, streaming, and nested object support.\"\nkeywords: \"LLM structured outputs, structured data extraction, OpenAI structured data, Pydantic LLM validation, Python LLM library, TypeScript LLM, Go LLM, Ruby LLM, Anthropic structured outputs, GPT structured data extraction, LLM response validation, AI data extraction, Ollama structured outputs, open source LLM, DeepSeek validation, Instructor vs Guardrails, LLM validation library, JSON schema validation, nested LLM schemas\"\n---\n\n# Instructor: Top Multi-Language Library for Structured LLM Outputs\n\n_Extract structured data from any LLM with type safety, validation, and automatic retries. Available in Python, TypeScript, Go, Ruby, Elixir, and Rust._\n\n[![PyPI - Version](https://img.shields.io/pypi/v/instructor?style=flat-square&logo=pypi&logoColor=white&label=PyPI)](https://pypi.org/project/instructor/)\n[![License](https://img.shields.io/github/license/instructor-ai/instructor?style=flat-square&color=blue)](https://github.com/instructor-ai/instructor/blob/main/LICENSE)\n[![GitHub Repo stars](https://img.shields.io/github/stars/instructor-ai/instructor?style=flat-square&logo=github&logoColor=white)](https://github.com/instructor-ai/instructor)\n[![Downloads](https://img.shields.io/pypi/dm/instructor?style=flat-square&logo=pypi&logoColor=white&label=Downloads)](https://pypi.org/project/instructor/)\n[![Discord](https://img.shields.io/discord/1192334452110659664?style=flat-square&logo=discord&logoColor=white&label=Discord)](https://discord.gg/bD9YE9JArw)\n[![Twitter Follow](https://img.shields.io/twitter/follow/jxnlco?style=flat-square&logo=twitter&logoColor=white)](https://twitter.com/jxnlco)\n\n> **Instructor for extraction, PydanticAI for agents.** Instructor shines when you need fast, schema-first extraction without extra agents. When your project needs quality gates, shareable runs, or built-in observability, try [PydanticAI](https://ai.pydantic.dev/). PydanticAI is the official agent runtime from the Pydantic team: it adds typed tools, dataset replays, and production dashboards while keeping your existing Instructor models. Read the [PydanticAI docs](https://ai.pydantic.dev/) to see how to bring those capabilities into your stack.\n\n## What is Instructor?\n\nInstructor is the **most popular Python library** for extracting structured data from Large Language Models (LLMs). With over **3 million monthly downloads, 11k stars, and 100+ contributors**, it's the go-to solution for developers who need reliable, validated outputs from AI models.\n\nBuilt on top of **Pydantic**, Instructor provides type-safe data extraction with automatic validation, retries, and streaming support. Whether you're using OpenAI's GPT models, Anthropic's Claude, Google's Gemini, **open source models with Ollama**, **DeepSeek**, or any of 15+ supported providers, Instructor ensures your LLM outputs are always structured and validated.\n\n## Key Features for LLM Data Extraction\n\n- **Structured Outputs**: Define Pydantic models to specify exactly what data you want from your LLM\n- **Automatic Retries**: Built-in retry logic when validation fails - no more manual error handling\n- **Data Validation**: Leverage Pydantic's powerful validation to ensure response quality\n- **Streaming Support**: Real-time processing of partial responses and lists\n- **Multi-Provider**: Works with OpenAI, Anthropic, Google, Mistral, Cohere, Ollama, DeepSeek, and 15+ LLM providers\n- **Type Safety**: Full IDE support with proper type inference and autocompletion\n- **Open Source Support**: Run any open source model locally with Ollama, llama-cpp-python, or vLLM\n\n## Quick Start\n\nInstall Instructor and start extracting structured data in minutes:\n\n=== \"pip\"\n    ```bash\n    pip install instructor\n    ```\n\n=== \"uv\"\n    ```bash\n    uv add instructor\n    ```\n\n=== \"poetry\"\n    ```bash\n    poetry add instructor\n    ```\n\n### Extract Structured Data\n\nInstructor's **`from_provider`** function provides a unified interface to work with any LLM provider. Switch between OpenAI, Anthropic, Google, Ollama, DeepSeek, and 15+ providers with the same code:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n\n\n# Works with any provider - same interface everywhere\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n# Or: instructor.from_provider(\"anthropic/claude-3\")\n# Or: instructor.from_provider(\"google/gemini-pro\")\n# Or: instructor.from_provider(\"ollama/llama3\")  # local\n\n# Extract structured data from natural language\nperson = client.create(\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: John is a 30-year-old software engineer\"}\n    ],\n)\nprint(person)  # Person(name='John', age=30, occupation='software engineer')\n```\n\nThe **`from_provider`** API supports both sync and async usage (`async_client=True`) and automatically handles provider-specific configurations. [See all supported providers →](./integrations/index.md)\n\n## Complex Schemas & Validation\n\nInstructor excels at extracting complex, nested data structures with custom validation rules. Here's a concise example:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field, field_validator\nfrom typing import List, Optional\nfrom enum import Enum\n\n\nclass Priority(str, Enum):\n    LOW = \"low\"\n    MEDIUM = \"medium\"\n    HIGH = \"high\"\n    CRITICAL = \"critical\"\n\n\nclass Ticket(BaseModel):\n    title: str = Field(..., min_length=5, max_length=100)\n    priority: Priority\n    estimated_hours: Optional[float] = Field(None, gt=0, le=100)\n\n    @field_validator('estimated_hours')\n    @classmethod\n    def validate_hours(cls, v):\n        if v is not None and v % 0.5 != 0:\n            raise ValueError('Hours must be in 0.5 increments')\n        return v\n\n\nclass CustomerSupport(BaseModel):\n    customer_name: str\n    tickets: List[Ticket] = Field(..., min_items=1)\n\n\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\nsupport_case = client.create(\n    response_model=CustomerSupport,\n    messages=[{\"role\": \"user\", \"content\": \"Extract support case details...\"}],\n    max_retries=3,\n)\n```\n\n**Key Features:**\n- Deep nesting with nested models and lists\n- Custom validation with Pydantic validators\n- Automatic retries on validation failures\n- Type-safe extraction with full IDE support\n\n[Learn more about validation and complex schemas →](./concepts/reask_validation.md)\n\n## Supported LLM Providers\n\nInstructor works seamlessly with **15+ popular LLM providers**, giving you the flexibility to use any model while maintaining consistent structured output handling. From OpenAI's GPT models to **open source alternatives with Ollama**, **DeepSeek models**, and local inference, get validated data extraction everywhere.\n\nIt stands out for its simplicity, transparency, and user-centric design, built on top of Pydantic. Instructor helps you manage [validation context](./concepts/reask_validation.md), retries with [Tenacity](./concepts/retrying.md), and streaming [Lists](./concepts/lists.md) and [Partial](./concepts/partial.md) responses.\n\n[:material-star: Star the Repo](https://github.com/jxnl/instructor){: .md-button .md-button--primary } [:material-book-open-variant: Cookbooks](./examples/index.md){: .md-button } [:material-lightbulb: Prompting Guide](./prompting/index.md){: .md-button }\n\nIf you ever get stuck, you can always run `instructor docs` to open the documentation in your browser. It even supports searching for specific topics.\n\n```bash\ninstructor docs [QUERY]\n```\n\n### Provider Examples\n\nAll providers use the same simple interface. Here are quick examples for the most popular providers:\n\n=== \"OpenAI\"\n    ```python\n    import instructor\n    from pydantic import BaseModel\n\n\n    class ExtractUser(BaseModel):\n        name: str\n        age: int\n\n\n    client = instructor.from_provider(\"openai/gpt-5-nano\")\n    res = client.create(\n        response_model=ExtractUser,\n        messages=[{\"role\": \"user\", \"content\": \"John Doe is 30 years old.\"}],\n    )\n    ```\n\n    [Full OpenAI docs →](./integrations/openai.md)\n\n=== \"Anthropic\"\n    ```python\n    import instructor\n    from pydantic import BaseModel\n\n\n    class ExtractUser(BaseModel):\n        name: str\n        age: int\n\n\n    client = instructor.from_provider(\"anthropic/claude-3-5-sonnet-20240620\")\n    resp = client.create(\n        response_model=ExtractUser,\n        messages=[{\"role\": \"user\", \"content\": \"Extract Jason is 25 years old.\"}],\n    )\n    ```\n\n    [Full Anthropic docs →](./integrations/anthropic.md)\n\n=== \"Google Gemini\"\n    ```python\n    import instructor\n    from pydantic import BaseModel\n\n\n    class ExtractUser(BaseModel):\n        name: str\n        age: int\n\n\n    client = instructor.from_provider(\"google/gemini-2.5-flash\")\n    resp = client.create(\n        response_model=ExtractUser,\n        messages=[{\"role\": \"user\", \"content\": \"Extract Jason is 25 years old.\"}],\n    )\n    ```\n\n    [Full Google docs →](./integrations/google.md)\n\n=== \"Ollama (Local)\"\n    ```python\n    import instructor\n    from pydantic import BaseModel\n\n\n    class ExtractUser(BaseModel):\n        name: str\n        age: int\n\n\n    client = instructor.from_provider(\"ollama/llama3\")\n    resp = client.create(\n        response_model=ExtractUser,\n        messages=[{\"role\": \"user\", \"content\": \"Extract Jason is 25 years old.\"}],\n    )\n    ```\n\n    [Full Ollama docs →](./integrations/ollama.md)\n\n[View all 15+ providers →](./integrations/index.md)\n\n## Citation\n\nIf you use Instructor in your research or project, please cite it using:\n\n```bibtex\n@software{liu2024instructor,\n  author = {Jason Liu and Contributors},\n  title = {Instructor: A library for structured outputs from large language models},\n  url = {https://github.com/instructor-ai/instructor},\n  year = {2024},\n  month = {3}\n}\n```\n\n## Why use Instructor?\n\n<div class=\"grid cards\" markdown>\n\n- :material-code-tags: **Simple API with Full Prompt Control**\n\n    Instructor provides a straightforward API that gives you complete ownership and control over your prompts. This allows for fine-tuned customization and optimization of your LLM interactions.\n\n    [:octicons-arrow-right-16: Explore Concepts](./concepts/models.md)\n\n- :material-translate: **Multi-Language Support**\n\n    Simplify structured data extraction from LLMs with type hints and validation.\n\n    [:simple-python: Python](https://python.useinstructor.com) · [:simple-typescript: TypeScript](https://js.useinstructor.com) · [:simple-ruby: Ruby](https://ruby.useinstructor.com) · [:simple-go: Go](https://go.useinstructor.com) · [:simple-elixir: Elixir](https://hex.pm/packages/instructor) · [:simple-rust: Rust](https://rust.useinstructor.com)\n\n- :material-refresh: **Reasking and Validation**\n\n    Automatically reask the model when validation fails, ensuring high-quality outputs. Leverage Pydantic's validation for robust error handling.\n\n    [:octicons-arrow-right-16: Learn about Reasking](./concepts/reask_validation.md)\n\n- :material-repeat-variant: **Streaming Support**\n\n    Stream partial results and iterables with ease, allowing for real-time processing and improved responsiveness in your applications.\n\n    [:octicons-arrow-right-16: Learn about Streaming](./concepts/partial.md)\n\n- :material-code-braces: **Powered by Type Hints**\n\n    Leverage Pydantic for schema validation, prompting control, less code, and IDE integration.\n\n    [:octicons-arrow-right-16: Learn more](https://docs.pydantic.dev/)\n\n- :material-lightning-bolt: **Simplified LLM Interactions**\n\n    Support for [OpenAI](./integrations/openai.md), [Anthropic](./integrations/anthropic.md), [Google](./integrations/google.md), [Vertex AI](./integrations/vertex.md), [Mistral/Mixtral](./integrations/together.md), [Ollama](./integrations/ollama.md), [llama-cpp-python](./integrations/llama-cpp-python.md), [Cohere](./integrations/cohere.md), [LiteLLM](./integrations/litellm.md).\n\n    [:octicons-arrow-right-16: See Hub](./integrations/index.md)\n\n</div>\n\n\n### Using Hooks\n\nInstructor's hooks system lets you intercept and handle events during LLM interactions. Use hooks for logging, monitoring, or custom error handling:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\n# Attach hooks for logging and error handling\nclient.on(\"completion:kwargs\", lambda **kw: print(\"Called with:\", kw))\nclient.on(\"completion:error\", lambda e: print(f\"Error: {e}\"))\n\nuser_info = client.create(\n    response_model=UserInfo,\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 20 years old\"}],\n)\n```\n\n[Learn more about hooks →](./concepts/hooks.md)\n\n## Type Inference & Advanced Methods\n\nInstructor provides full type inference for better IDE support and type safety. The client includes specialized methods for different use cases:\n\n**Basic extraction:**\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\nuser = client.create(response_model=User, messages=[...])  # Type: User\n```\n\n**Async support:**\n```python\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", async_client=True)\nuser = await client.create(...)  # Type: User\n```\n\n**Access original completion:**\n```python\nuser, completion = client.create_with_completion(...)  # Returns tuple\n```\n\n**Stream partial objects:**\n```python\nfor partial in client.create_partial(...):  # Type: Generator[User, None]\n    print(partial)\n```\n\n**Stream multiple objects:**\n```python\nfor user in client.create_iterable(...):  # Type: Generator[User, None]\n    print(user)\n```\n\nAll methods provide full type inference for better IDE autocomplete and type checking.\n\n## Frequently Asked Questions\n\n### What is Instructor?\n\nInstructor is a Python library that extracts structured, validated data from Large Language Models (LLMs). It uses Pydantic models to define output schemas and automatically handles validation, retries, and error handling.\n\n### Which LLM providers does Instructor support?\n\nInstructor supports 15+ providers including OpenAI, Anthropic, Google Gemini, Mistral, Cohere, Ollama, DeepSeek, and many more. See our [integrations page](./integrations/index.md) for the complete list.\n\n### Do I need to know Pydantic to use Instructor?\n\nBasic Pydantic knowledge helps, but you can get started with simple models. Instructor works with any Pydantic BaseModel, and you can learn advanced features as you need them.\n\n### How does Instructor compare to other libraries?\n\nInstructor focuses specifically on structured outputs with automatic validation and retries. Unlike larger frameworks, Instructor does one thing very well: getting reliable, validated data from LLMs.\n\n### Can I use Instructor with open source models?\n\nYes! Instructor works with Ollama, llama-cpp-python, and other local models. See our [Ollama integration guide](./integrations/ollama.md) to get started.\n\n### Does Instructor work with async code?\n\nYes, Instructor fully supports async/await. Use `async_client=True` when creating your client, then use `await client.create()`.\n\n[View all FAQs →](./faq.md)\n\n## Templating\n\nInstructor supports templating with Jinja, which lets you create dynamic prompts. This is useful when you want to fill in parts of a prompt with data. Here's a simple example:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create a completion using a Jinja template in the message content\nresponse = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"Extract the information from the\n            following text: {{ data }}`\"\"\",\n        },\n    ],\n    response_model=User,\n    context={\"data\": \"John Doe is thirty years old\"},\n)\n\nprint(response)\n#> User(name='John Doe', age=30)\n```\n\n[Learn more about templating :octicons-arrow-right:](./concepts/templating.md){: .md-button .md-button-primary }\n## Validation\n\nYou can also use Pydantic to validate your outputs and get the llm to retry on failure. Check out our docs on [retrying](./concepts/retrying.md) and [validation context](./concepts/reask_validation.md).\n\n```python\nimport instructor\nfrom pydantic import BaseModel, ValidationError, BeforeValidator\nfrom typing_extensions import Annotated\nfrom instructor import llm_validator\n\n# Create instructor client\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\n\nclass QuestionAnswer(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(llm_validator(\"don't say objectionable things\", client=client)),\n    ]\n\n\ntry:\n    qa = QuestionAnswer(\n        question=\"What is the meaning of life?\",\n        answer=\"The meaning of life is to be evil and steal\",\n    )\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for QuestionAnswer\n    answer\n      Assertion failed, The statement promotes objectionable behavior by encouraging evil and stealing. [type=assertion_error, input_value='The meaning of life is to be evil and steal', input_type=str]\n    \"\"\"\n```\n\n## Contributing\n\nIf you want to help out, checkout some of the issues marked as `good-first-issue` or `help-wanted`. Found [here](https://github.com/jxnl/instructor/labels/good%20first%20issue). They could be anything from code improvements, a guest blog post, or a new cook book.\n\n## License\n\nThis project is licensed under the terms of the MIT License.\n"
  },
  {
    "path": "docs/installation.md",
    "content": "---\ntitle: Installing Instructor with Pip\ndescription: Learn how to install Instructor and its dependencies using pip for Python 3.9+. Simple setup guide included.\n---\n\nInstallation is as simple as:\n\n```bash\npip install instructor\n```\n\nInstructor has a few dependencies:\n\n- [`openai`](https://pypi.org/project/openai/): OpenAI's Python client.\n- [`typer`](https://pypi.org/project/typer/): Build great CLIs. Easy to code. Based on Python type hints.\n- [`docstring-parser`](https://pypi.org/project/docstring-parser/): A parser for Python docstrings, to improve the experience of working with docstrings in jsonschema.\n- [`pydantic`](https://pypi.org/project/pydantic/): Data validation and settings management using python type annotations.\n\nIf you've got Python 3.9+ and `pip` installed, you're good to go.\n"
  },
  {
    "path": "docs/integrations/anthropic.md",
    "content": "---\ntitle: \"Anthropic Claude Tutorial: Structured Outputs with Instructor\"\ndescription: \"Complete guide to using Anthropic's Claude models with Instructor for structured data extraction. Learn how to use Claude Haiku for type-safe outputs in Python.\"\n---\n\n## See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n- [Mode Comparison](../modes-comparison.md) - Using Anthropic's tool calling\n\n# Anthropic Claude Tutorial: Structured Outputs with Instructor\n\nLearn how to use Anthropic's Claude Haiku models with Instructor to extract structured, validated data. This tutorial covers everything from basic setup to advanced patterns for production use.\n\n## Quick Start: Install Instructor for Claude\n\nGet started with Claude and Instructor for structured outputs:\n\n```\npip install \"instructor[anthropic]\"\n```\n\nOnce we've done so, getting started is as simple as using our `from_provider` method to patch the client up.\n\n### Basic Usage\n\n```python\n# Standard library imports\nimport os\nfrom typing import List\n\n# Third-party imports\nimport anthropic\nimport instructor\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"ANTHROPIC_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Define your models with proper type annotations\nclass Properties(BaseModel):\n    \"\"\"Model representing a key-value property.\"\"\"\n    name: str = Field(description=\"The name of the property\")\n    value: str = Field(description=\"The value of the property\")\n\n\nclass User(BaseModel):\n    \"\"\"Model representing a user with properties.\"\"\"\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n    properties: List[Properties] = Field(description=\"List of user properties\")\n\nclient = instructor.from_provider(\n    \"anthropic/claude-4-5-haiku-latest\",\n    mode=instructor.Mode.TOOLS\n)\n\ntry:\n    # Extract structured data\n    user_response = client.create(\n        max_tokens=1024,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Extract structured information based on the user's request.\"\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a user for a model with a name, age, and properties.\",\n            }\n        ],\n        response_model=User,\n    )\n\n    # Print the result as formatted JSON\n    print(user_response.model_dump_json(indent=2))\n\n    # Expected output:\n    # {\n    #   \"name\": \"John Doe\",\n    #   \"age\": 35,\n    #   \"properties\": [\n    #     {\n    #       \"name\": \"City\",\n    #       \"value\": \"New York\"\n    #     },\n    #     {\n    #       \"name\": \"Occupation\",\n    #       \"value\": \"Software Engineer\"\n    #     }\n    #   ]\n    # }\nexcept instructor.exceptions.InstructorError as e:\n    print(f\"Validation error: {e}\")\nexcept Exception as e:\n    print(f\"Unexpected error: {e}\")\n```\n\n### Async Example\n\n```python\nimport asyncio\n\nasync_client = instructor.from_provider(\n    \"anthropic/claude-4-5-haiku-latest\",\n    async_client=True,\n    mode=instructor.Mode.TOOLS,\n)\n\nasync def extract_user():\n    return await async_client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n        response_model=User,\n    )\n\nuser = asyncio.run(extract_user())\nprint(user)\n```\n\n### Parallel Tool Calling\n\nParallel tool mode is automatically detected when your response model is `Iterable[Union[Model1, Model2, ...]]`. Just use `Mode.TOOLS` (or let it default) and the handler will automatically:\n- Set tool_choice to \"auto\" (required for parallel)\n- Generate schemas for all union members\n- Return a generator yielding each tool result\n\n```python\nfrom typing import Iterable, Literal\nfrom pydantic import BaseModel\nimport instructor\n\n\nclass Weather(BaseModel):\n    location: str\n    units: Literal[\"imperial\", \"metric\"]\n\n\nclass GoogleSearch(BaseModel):\n    query: str\n\n\n# No need to specify Mode.PARALLEL_TOOLS - it's auto-detected!\nclient = instructor.from_provider(\n    \"anthropic/claude-3-5-haiku-latest\",\n    mode=instructor.Mode.TOOLS,  # or just omit and use default\n)\n\nresults = client.create(\n    messages=[\n        {\"role\": \"system\", \"content\": \"You must always use tools\"},\n        {\n            \"role\": \"user\",\n            \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n        },\n    ],\n    response_model=Iterable[Weather | GoogleSearch],  # Auto-detects parallel mode\n)\n\nfor item in results:\n    print(item)\n```\n\n**How it works**: When Instructor detects `Iterable[Union[...]]`, it automatically:\n1. Sets `tool_choice` to `\"auto\"` (allows model to call any tool)\n2. Generates tool schemas from all union members\n3. Returns a generator that yields each extracted tool call\n4. Each yielded item is validated against its corresponding Pydantic model\n\n## Multimodal\n\n> We've provided a few different sample files for you to use to test out these new features. All examples below use these files.\n>\n> - (Image) : An image of some blueberry plants [image.jpg](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg)\n> - (PDF) : A sample PDF file which contains a fake invoice [invoice.pdf](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf)\n\nInstructor provides a unified, provider-agnostic interface for working with multimodal inputs like images, PDFs, and audio files. With Instructor's multimodal objects, you can easily load media from URLs, local files, or base64 strings using a consistent API that works across different AI providers (OpenAI, Anthropic, Mistral, etc.).\n\nInstructor handles all the provider-specific formatting requirements behind the scenes, ensuring your code remains clean and future-proof as provider APIs evolve.\n\nLet's see how to use the Image and PDF classes.\n\n### Image\n\n> For a more in-depth walkthrough of the Image component, check out the [docs here](../concepts/multimodal.md)\n\nInstructor makes it easy to analyse and extract semantic information from images using Anthropic's claude models. [Click here](https://docs.anthropic.com/en/docs/about-claude/models/all-models) to check if the model you'd like to use has vison capabilities.\n\nLet's see an example below with the sample image above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path` and the `from_base64` class methods.\n\n```python\nfrom instructor.processing.multimodal import Image\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom anthropic import Anthropic\n\n\nclass ImageDescription(BaseModel):\n    objects: list[str] = Field(..., description=\"The objects in the image\")\n    scene: str = Field(..., description=\"The scene of the image\")\n    colors: list[str] = Field(..., description=\"The colors in the image\")\n\n\nclient = instructor.from_provider(\"anthropic/claude-4-5-haiku-latest\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n# Multiple ways to load an image:\nresponse = client.create(\n    response_model=ImageDescription,\n    max_tokens=1000,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What is in this image?\",\n                # Option 1: Direct URL with autodetection\n                Image.from_url(url),\n                # Option 2: Local file\n                # Image.from_path(\"path/to/local/image.jpg\")\n                # Option 3: Base64 string\n                # Image.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # Image.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nprint(response)\n# Example output:\n# ImageDescription(\n#     objects=['blueberries', 'leaves'],\n#     scene='A blueberry bush with clusters of ripe blueberries and some unripe ones against a cloudy sky',\n#     colors=['green', 'blue', 'purple', 'white']\n# )\n\n```\n\n### PDF\n\nInstructor makes it easy to analyse and extract semantic information from PDFs using Anthropic's Claude line of models.\n\nLet's see an example below with the sample PDF above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path` and the `from_base64` class methods.\n\n```python\nfrom instructor.processing.multimodal import PDF\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom anthropic import Anthropic\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\nclient = instructor.from_provider(\"anthropic/claude-4-5-haiku-latest\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n# Multiple ways to load an PDF:\nresponse = client.create(\n    response_model=Receipt,\n    max_tokens=1000,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract out the total and line items from the invoice\",\n                # Option 1: Direct URL\n                PDF.from_url(url),\n                # Option 2: Local file\n                # PDF.from_path(\"path/to/local/invoice.pdf\"),\n                # Option 3: Base64 string\n                # PDF.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # PDF.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nprint(response)\n# > Receipt(total=220, items=['English Tea', 'Tofu'])\n```\n\nIf you'd like to cache the PDF and use it across multiple different requests, we support that with the `PdfWithCacheControl` class which we can see below.\n\n```python\nfrom instructor.processing.multimodal import PdfWithCacheControl\nfrom pydantic import BaseModel\nimport instructor\nfrom anthropic import Anthropic\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\nclient = instructor.from_provider(\"anthropic/claude-4-5-haiku-latest\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n# Multiple ways to load an PDF:\nresponse, completion = client.create_with_completion(\n    response_model=Receipt,\n    max_tokens=1000,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract out the total and line items from the invoice\",\n                # Option 1: Direct URL\n                PdfWithCacheControl.from_url(url),\n                # Option 2: Local file\n                # PDF.from_path(\"path/to/local/invoice.pdf\"),\n                # Option 3: Base64 string\n                # PDF.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # PDF.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nassert (\n    completion.usage.cache_creation_input_tokens > 0\n    or completion.usage.cache_read_input_tokens > 0\n)\nprint(response)\n# > Receipt(total=220, items=['English Tea', 'Tofu'])\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\n### Partials\n\nYou can use our `create_partial` method to stream a single object. Note that validators should not be declared in the response model when streaming objects because it will break the streaming process.\n\n```python\n# Standard library imports\nimport os\n\n# Third-party imports\nimport anthropic\nimport instructor\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"ANTHROPIC_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Initialize client with explicit mode\nclient = instructor.from_provider(\n    \"anthropic/claude-4-5-haiku-latest\",\n    mode=instructor.Mode.TOOLS,\n)\n\n# Define your model with proper annotations\nclass User(BaseModel):\n    \"\"\"Model representing a user profile.\"\"\"\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n    bio: str = Field(description=\"A biographical description of the user\")\n\ntry:\n    # Stream partial objects as they're generated\n    for partial_user in client.create_partial(\n        messages=[\n            {\"role\": \"system\", \"content\": \"Create a detailed user profile based on the information provided.\"},\n            {\"role\": \"user\", \"content\": \"Create a user profile for Jason, age 25\"},\n        ],\n        response_model=User,\n        max_tokens=4096,\n    ):\n        print(f\"Current state: {partial_user}\")\n\n    # Expected output:\n    # > Current state: name='Jason' age=None bio=None\n    # > Current state: name='Jason' age=25 bio='Jason is a 25-year-old with an adventurous spirit and a love for technology. He is'\n    # > Current state: name='Jason' age=25 bio='Jason is a 25-year-old with an adventurous spirit and a love for technology. He is always on the lookout for new challenges and opportunities to grow both personally and professionally.'\nexcept Exception as e:\n    print(f\"Error during streaming: {e}\")\n```\n\n### Iterable Example\n\nYou can also use our `create_iterable` method to stream a list of objects. This is helpful when you'd like to extract multiple instances of the same response model from a single prompt.\n\n```python\n# Standard library imports\nimport os\n\n# Third-party imports\nimport anthropic\nfrom instructor import from_provider\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"ANTHROPIC_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Initialize client with explicit mode\nclient = from_provider(\n    mode=instructor.Mode.TOOLS\n)\n\n# Define your model with proper annotations\nclass User(BaseModel):\n    \"\"\"Model representing a basic user.\"\"\"\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n\ntry:\n    # Create an iterable of user objects\n    users = client.create_iterable(\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Extract all users from the provided text into structured format.\"\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n                Extract users:\n                1. Jason is 25 years old\n                2. Sarah is 30 years old\n                3. Mike is 28 years old\n                \"\"\",\n            },\n        ],\n        max_tokens=4096,\n        response_model=User,\n    )\n\n    # Process each user as it's extracted\n    for user in users:\n        print(user)\n\n    # Expected output:\n    # > name='Jason' age=25\n    # > name='Sarah' age=30\n    # > name='Mike' age=28\nexcept Exception as e:\n    print(f\"Error during iteration: {e}\")\n```\n\n## Instructor Modes\n\nWe provide several modes to make it easy to work with the different response models that Anthropic supports\n\n1. `instructor.Mode.JSON` : This uses the text completion API from the Anthropic API and then extracts out the desired response model from the text completion model\n2. `instructor.Mode.TOOLS` : This uses Anthropic's [tools calling API](https://docs.anthropic.com/en/docs/build-with-claude/tool-use) to return structured outputs. Automatically detects parallel tools from `Iterable[Union[...]]` response models.\n3. `instructor.Mode.PARALLEL_TOOLS` : **Deprecated** - Use `Mode.TOOLS` with `Iterable[Union[Model1, Model2, ...]]` instead. Auto-detected automatically.\n\n### Mode Auto-Detection\n\n`Mode.TOOLS` now intelligently adapts based on your response model and parameters:\n\n| Response Model | Parameters | Behavior |\n|---|---|---|\n| `Model` | Regular | Single tool (forced) |\n| `Model` | `thinking={...}` | Single tool with extended thinking (auto) |\n| `Iterable[Union[Model1, Model2]]` | Regular | Parallel tools (auto) |\n| `Iterable[Union[Model1, Model2]]` | `thinking={...}` | Parallel with thinking |\n\nIn general, we recommend using `Mode.TOOLS` because it automatically handles all these cases and is the best way to ensure you have the desired response schema.\n\n## Caching\n\nIf you'd like to use caching with the Anthropic Client, we also support it for images and text input.\n\n### Caching Text Input\n\nHere's how you can implement caching for text input ( assuming you have a giant `book.txt` file that you read in).\n\nWe've written a comprehensive walkthrough of how to use caching to implement Anthropic's new Contextual Retrieval method that gives a significant bump to retrieval accuracy.\n\n```python\n# Standard library imports\nimport os\n\n# Third-party imports\nimport instructor\nfrom anthropic import Anthropic\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"ANTHROPIC_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Define your Pydantic model with proper annotations\nclass Character(BaseModel):\n    \"\"\"Model representing a character extracted from text.\"\"\"\n    name: str = Field(description=\"The character's full name\")\n    description: str = Field(description=\"A description of the character\")\n\n# Initialize client with explicit mode and prompt caching\nclient = instructor.from_provider(\n    \"anthropic/claude-4-5-haiku-latest\",\n    mode=instructor.Mode.TOOLS,\n)\n\ntry:\n    # Load your large context\n    with open(\"./book.txt\", \"r\") as f:\n        book = f.read()\n\n    # Make multiple calls using the cached context\n    for _ in range(2):\n        # The first time processes the large text, subsequent calls use the cache\n        resp, completion = client.create_with_completion(\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"Extract character information from the provided text.\"\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": [\n                        {\n                            \"type\": \"text\",\n                            \"text\": \"<book>\" + book + \"</book>\",\n                            \"cache_control\": {\"type\": \"ephemeral\"},  # Mark for caching\n                        },\n                        {\n                            \"type\": \"text\",\n                            \"text\": \"Extract a character from the text given above\",\n                        },\n                    ],\n                },\n            ],\n            response_model=Character,\n            max_tokens=1000,\n        )\n\n        # Process the result\n        print(f\"Character: {resp.name}\")\n        print(f\"Description: {resp.description}\")\n\n        # The completion contains the raw response\n        print(f\"Raw completion length: {len(completion)}\")\n\n    # Note: Second iteration should be faster due to cache hit\n\nexcept Exception as e:\n    print(f\"Error: {e}\")\n```\n\n### Caching Images\n\nWe also support caching for images. This helps significantly, especially if you're using images repeatedly to save on costs. Read more about it [here](../concepts/caching.md)\n\n```python\n# Standard library imports\nimport os\n\n# Third-party imports\nimport instructor\nfrom anthropic import Anthropic\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"ANTHROPIC_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Define your model for image analysis\nclass ImageAnalyzer(BaseModel):\n    \"\"\"Model for analyzing image content.\"\"\"\n    content_description: str = Field(description=\"Description of what appears in the images\")\n    objects: list[str] = Field(description=\"List of objects visible in the images\")\n    scene_type: str = Field(description=\"Type of scene shown in the images (indoor, outdoor, etc.)\")\n\n# Initialize client with explicit mode and image caching enabled\nclient = instructor.from_provider(\n    \"anthropic/claude-4-5-haiku-latest\",\n    mode=instructor.Mode.TOOLS,\n)\n\ntry:\n    # Configure cache control for images\n    cache_control = {\"type\": \"ephemeral\"}\n\n    # Make a request with cached images\n    response = client.create(\n        response_model=ImageAnalyzer,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Analyze the content of the provided images in detail.\"\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"What is in these two images?\",\n                    # Remote image with caching\n                    {\n                        \"type\": \"image\",\n                        \"source\": \"https://example.com/image.jpg\",\n                        \"cache_control\": cache_control\n                    },\n                    # Local image with caching\n                    {\n                        \"type\": \"image\",\n                        \"source\": \"path/to/image.jpg\",\n                        \"cache_control\": cache_control\n                    },\n                ]\n            }\n        ],\n        autodetect_images=True  # Automatically handle image content\n    )\n\n    # Process the results\n    print(f\"Description: {response.content_description}\")\n    print(f\"Objects: {', '.join(response.objects)}\")\n    print(f\"Scene type: {response.scene_type}\")\n\n    # Subsequent identical requests will use cached images\n\nexcept Exception as e:\n    print(f\"Error during image analysis: {e}\")\n```\n\n## Thinking (Extended Thinking)\n\nAnthropic supports extended thinking with their Claude models, enabling the model to think through complex problems before providing structured outputs. In Instructor, use `Mode.TOOLS` with the `thinking` parameter to enable this feature.\n\n### Using Extended Thinking with TOOLS\n\n```python\nfrom anthropic import Anthropic\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Answer(BaseModel):\n    answer: float\n\n\nclient = instructor.from_provider(\"anthropic/claude-3-5-haiku-latest\")\nresponse = client.create(\n    response_model=Answer,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Which is larger, 9.11 or 9.8?\",\n        },\n    ],\n    temperature=1,\n    max_tokens=2000,\n    thinking={\"type\": \"enabled\", \"budget_tokens\": 1024},\n)\n\n# Response is a validated Answer object\nassert isinstance(response, Answer)\nassert response.answer == 9.8\n```\n\n### How It Works\n\nWhen you provide the `thinking` parameter with `type: \"enabled\"`:\n\n1. **Automatic Mode Detection**: `Mode.TOOLS` automatically detects the thinking parameter and adjusts the tool choice strategy to `auto` (required by Anthropic's API when thinking is enabled)\n2. **Model Reasoning**: Claude uses the allocated `budget_tokens` to reason about the problem\n3. **Structured Output**: After reasoning, the model returns a valid tool call with your response model\n4. **Validation**: The response is automatically validated against your Pydantic model\n\n### Deprecation Notice\n\n`Mode.ANTHROPIC_REASONING_TOOLS` is deprecated. Use `Mode.TOOLS` with the `thinking` parameter instead. Both modes now support thinking, but using the standard `TOOLS` mode is preferred and more flexible.\n"
  },
  {
    "path": "docs/integrations/anyscale.md",
    "content": "---\ntitle: Anyscale\ndescription: Guide to using instructor with Anyscale\n---\n\n# Structured outputs with Anyscale, a complete guide w/ instructor\n\n[Anyscale](https://www.anyscale.com/) is a platform that provides access to various open-source LLMs like Mistral and Llama models. This guide shows how to use instructor with Anyscale to get structured outputs from these models.\n\n## Quick Start\n\nFirst, install the required packages:\n\n```bash\npip install instructor\n```\n\nYou'll need an Anyscale API key which you can set as an environment variable:\n\n```bash\nexport ANYSCALE_API_KEY=your_api_key_here\n```\n\n## Basic Example\n\nHere's how to extract structured data from Anyscale models:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize the client with Anyscale base URL\nclient = instructor.from_provider(\n    \"anyscale/Mixtral-8x7B-Instruct-v0.1\",\n    mode=instructor.Mode.JSON_SCHEMA,\n)\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n# Extract structured data\nuser = client.create(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nprint(user)\n# Output: UserExtract(name='Jason', age=25)\n```\n\n### Async Example\n\n```python\nimport asyncio\nimport instructor\nfrom pydantic import BaseModel\n\nasync_client = instructor.from_provider(\n    \"anyscale/Mixtral-8x7B-Instruct-v0.1\",\n    async_client=True,\n    mode=instructor.Mode.JSON_SCHEMA,\n)\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\nasync def fetch_user():\n    return await async_client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"}],\n        response_model=UserExtract,\n    )\n\nuser = asyncio.run(fetch_user())\nprint(user)\n```\n\n## Supported Modes\n\nAnyscale supports the following instructor modes:\n\n- `Mode.TOOLS`\n- `Mode.JSON`\n- `Mode.JSON_SCHEMA`\n- `Mode.MD_JSON`\n\n## Models\n\nAnyscale provides access to various models, including:\n\n- Mistral models (e.g., `mistralai/Mixtral-8x7B-Instruct-v0.1`)\n- Llama models\n- Other open-source LLMs available through their platform\n\n"
  },
  {
    "path": "docs/integrations/azure.md",
    "content": "---\ntitle: Structured outputs with Azure OpenAI, a complete guide w/ instructor\ndescription: Learn how to use Azure OpenAI with instructor for structured outputs, including async/sync implementations, streaming, and validation.\n---\n\n# Structured Outputs with Azure OpenAI\n\nThis guide demonstrates how to use Azure OpenAI with instructor for structured outputs. Azure OpenAI provides the same powerful models as OpenAI but with enterprise-grade security and compliance features through Microsoft Azure.\n\n## Installation\n\nWe can use the same installation as we do for OpenAI since the default `openai` client ships with an AzureOpenAI client.\n\nFirst, install the required dependencies:\n\n```bash\npip install instructor\n```\n\nNext, make sure that you've enabled Azure OpenAI in your Azure account and have a deployment for the model you'd like to use. [Here is a guide to get started](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal)\n\nOnce you've done so, you'll have an endpoint and a API key to be used to configure the client.\n\n```bash\ninstructor.exceptions.InstructorRetryException: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.'}\n```\n\nIf you see an error like the one above, make sure you've set the correct endpoint and API key in the client.\n\n## Authentication\n\nTo use Azure OpenAI, you'll need:\n\n1. Azure OpenAI endpoint\n2. API key\n3. Deployment name\n\n```python\nimport os\nfrom openai import AzureOpenAI\nimport instructor\n\n# Configure Azure OpenAI client\nclient = AzureOpenAI(\n    api_key=os.environ[\"AZURE_OPENAI_API_KEY\"],\n    api_version=\"2024-02-01\",\n    azure_endpoint=os.environ[\"AZURE_OPENAI_ENDPOINT\"]\n)\n\n# Patch the client with instructor\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n```\n\n## Using Auto Client (Recommended)\n\nThe easiest way to get started with Azure OpenAI is using the `from_provider` method:\n\n```python\nimport instructor\nimport os\n\n# Set your Azure OpenAI credentials\nos.environ[\"AZURE_OPENAI_API_KEY\"] = \"your-api-key\"\nos.environ[\"AZURE_OPENAI_ENDPOINT\"] = \"https://your-resource.openai.azure.com/\"\n\n# Create client using the provider string\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n\n# Or async client\nasync_client = instructor.from_provider(\"azure_openai/gpt-4o-mini\", async_client=True)\n```\n\nYou can also pass credentials as parameters:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\n    \"azure_openai/gpt-4o-mini\",\n    api_key=\"your-api-key\",\n    azure_endpoint=\"https://your-resource.openai.azure.com/\",\n    api_version=\"2024-02-01\"  # Optional, defaults to 2024-02-01\n)\n```\n\n## Basic Usage\n\nHere's a simple example using a Pydantic model:\n\n```python\nimport os\nimport instructor\nfrom openai import AzureOpenAI\nfrom pydantic import BaseModel\n\nclient = AzureOpenAI(\n    api_key=os.environ[\"AZURE_OPENAI_API_KEY\"],\n    api_version=\"2024-02-01\",\n    azure_endpoint=os.environ[\"AZURE_OPENAI_ENDPOINT\"],\n)\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Synchronous usage\nuser = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"John is 30 years old\"}],\n    response_model=User,\n)\n\nprint(user)\n# > name='John' age=30\n```\n\n## Async Implementation\n\nAzure OpenAI supports async operations:\n\n```python\nimport os\nimport instructor\nimport asyncio\nfrom openai import AsyncAzureOpenAI\nfrom pydantic import BaseModel\n\nclient = AsyncAzureOpenAI(\n    api_key=os.environ[\"AZURE_OPENAI_API_KEY\"],\n    api_version=\"2024-02-15-preview\",\n    azure_endpoint=os.environ[\"AZURE_OPENAI_ENDPOINT\"],\n)\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def get_user_async():\n    return await client.create(\n        messages=[{\"role\": \"user\", \"content\": \"John is 30 years old\"}],\n        response_model=User,\n    )\n\n\n# Run async function\nuser = asyncio.run(get_user_async())\nprint(user)\n# > name='John' age=30\n```\n\n## Nested Models\n\nAzure OpenAI handles complex nested structures:\n\n```python\nimport os\nimport instructor\nfrom openai import AzureOpenAI\nfrom pydantic import BaseModel\n\nclient = AzureOpenAI(\n    api_key=os.environ[\"AZURE_OPENAI_API_KEY\"],\n    api_version=\"2024-02-01\",\n    azure_endpoint=os.environ[\"AZURE_OPENAI_ENDPOINT\"],\n)\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass UserWithAddress(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n        John is 30 years old and has two addresses:\n        1. 123 Main St, New York, USA\n        2. 456 High St, London, UK\n        \"\"\",\n        }\n    ],\n    response_model=UserWithAddress,\n)\n\nprint(resp)\n# {\n#     'name': 'John',\n#     'age': 30,\n#     'addresses': [\n#         {\n#             'street': '123 Main St',\n#             'city': 'New York',\n#             'country': 'USA'\n#         },\n#         {\n#             'street': '456 High St',\n#             'city': 'London',\n#             'country': 'UK'\n#         }\n#     ]\n# }\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\n### Partials\n\nYou can use our `create_partial` method to stream a single object. Note that validators should not be declared in the response model when streaming objects because it will break the streaming process.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    bio: str\n\n\n# Stream partial objects as they're generated\nuser = client.create_partial(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create a user profile for Jason, age 25\"},\n    ],\n    response_model=User,\n)\n\nfor user_partial in user:\n    print(user_partial)\n\n# > name='Jason' age=None bio='None'\n# > name='Jason' age=25 bio='A tech'\n# > name='Jason' age=25 bio='A tech enthusiast'\n# > name='Jason' age=25 bio='A tech enthusiast who loves coding, gaming, and exploring new'\n# > name='Jason' age=25 bio='A tech enthusiast who loves coding, gaming, and exploring new technologies'\n\n```\n\n## Iterable Responses\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"azure_openai/gpt-4o-mini\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract multiple users from text\nusers = client.create_iterable(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract users:\n            1. Jason is 25 years old\n            2. Sarah is 30 years old\n            3. Mike is 28 years old\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user in users:\n    print(user)\n#> name='Jason' age=25\n# > name='Sarah' age=30\n# > name='Mike' age=28\n\n```\n\n## Instructor Modes\n\nWe provide several modes to make it easy to work with the different response models that OpenAI supports\n\n1. `instructor.Mode.TOOLS` : This uses the [tool calling API](https://platform.openai.com/docs/guides/function-calling) to return structured outputs to the client\n2. `instructor.Mode.JSON` : This forces the model to return JSON by using [OpenAI's JSON mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode).\n3. `instructor.Mode.FUNCTIONS` : This uses OpenAI's function calling API to return structured outputs and will be deprecated in the future.\n4. `instructor.Mode.PARALLEL_TOOLS` : This uses the [parallel tool calling API](https://platform.openai.com/docs/guides/function-calling#configuring-parallel-function-calling) to return structured outputs to the client. This allows the model to generate multiple calls in a single response.\n5. `instructor.Mode.MD_JSON` : This makes a simple call to the OpenAI chat completion API and parses the raw response as JSON.\n6. `instructor.Mode.TOOLS_STRICT` : This uses the new Open AI structured outputs API to return structured outputs to the client using constrained grammar sampling. This restricts users to a subset of the JSON schema.\n7. `instructor.Mode.JSON_O1` : This is a mode for the `O1` model. We created a new mode because `O1` doesn't support any system messages, tool calling or streaming so you need to use this mode to use Instructor with `O1`.\n\nIn general, we recommend using `Mode.Tools` because it's the most flexible and future-proof mode. It has the largest set of features that you can specify your schema in and makes things significantly easier to work with.\n\n## Best Practices\n\n## Additional Resources\n\n- [Azure OpenAI Documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/)\n- [Instructor Documentation](https://instructor-ai.github.io/instructor/)\n- [Azure OpenAI Pricing](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/)\n"
  },
  {
    "path": "docs/integrations/bedrock.md",
    "content": "---\ntitle: Structured Outputs with AWS Bedrock and Pydantic\ndescription: Learn how to use AWS Bedrock with Instructor for structured JSON outputs using Pydantic models. Create type-safe, validated responses from AWS Bedrock LLMs with Python.\n---\n\n# Structured Outputs with AWS Bedrock\n\nThis guide demonstrates how to use AWS Bedrock with Instructor to generate structured outputs. You'll learn how to use AWS Bedrock's LLM models with Pydantic to create type-safe, validated responses.\n\n## Prerequisites\n\nYou'll need to have an AWS account with access to Bedrock and the appropriate permissions. You'll also need to set up your AWS credentials.\n\n```bash\npip install \"instructor[bedrock]\"\n```\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Mode Migration Guide](../concepts/mode-migration.md) - Move to core modes\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n- [AWS Integration Guide](../examples/index.md#aws-integration) - More AWS examples\n\n# AWS Bedrock\n\nAWS Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API.\n\n## Auto Client Setup\n\nFor simplified setup, you can use the auto client pattern:\n\n```python\nimport instructor\n\n# Auto client with model specification\nclient = instructor.from_provider(\"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\")\n\n# The auto client automatically handles:\n# - AWS credential detection from environment\n# - Region configuration (defaults to us-east-1)\n# - Mode selection based on model (Claude models use TOOLS)\n```\n\n## Deprecation Notice\n\n> **Deprecation Notice:**\n>\n> The `_async` argument to `instructor.from_bedrock` is deprecated. Please use `async_client=True` for async clients instead. Support for `_async` may be removed in a future release. All new code and examples should use `async_client`.\n\n### Environment Configuration\n\nSet your AWS credentials and region:\n\n```bash\nexport AWS_ACCESS_KEY_ID=your_access_key\nexport AWS_SECRET_ACCESS_KEY=your_secret_key\nexport AWS_DEFAULT_REGION=us-east-1\n```\n\nOr configure using AWS CLI:\n\n```bash\naws configure\n```\n\n## Sync Example\n\n```python\nimport boto3\nimport instructor\nfrom pydantic import BaseModel\n\nbedrock_client = boto3.client('bedrock-runtime')\nclient = instructor.from_provider(\"bedrock/claude-3-5-sonnet-20241022\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nuser = client.create(\n    modelId=\"anthropic.claude-3-sonnet-20240229-v1:0\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > User(name='Jason', age=25)\n```\n\n## Async Example\n\n> **Warning:**\n> AWS Bedrock's official SDK (`boto3`) does not support async natively. If you need to call Bedrock from async code, you can use `asyncio.to_thread` to run synchronous Bedrock calls in a non-blocking way.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nclient = instructor.from_provider(\"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\ndef get_user():\n    return client.create(\n        modelId=\"anthropic.claude-3-sonnet-20240229-v1:0\",\n        messages=[{\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"}],\n        response_model=User,\n    )\n\nasync def get_user_async():\n    return await asyncio.to_thread(get_user)\n\nuser = asyncio.run(get_user_async())\nprint(user)\n```\n\n## Supported Modes\n\nAWS Bedrock supports the following **core** modes:\n\n- `TOOLS`: Uses function calling for models that support it (like Claude models)\n- `MD_JSON`: Direct JSON response generation (text extraction fallback)\n\n> Legacy modes (`BEDROCK_TOOLS`, `BEDROCK_JSON`) are deprecated and map to `Mode.TOOLS` and `Mode.MD_JSON`.\n> modes above. Use `TOOLS` or `MD_JSON` in new code.\n\n```python\nimport boto3\nimport instructor\nfrom instructor import Mode\nfrom pydantic import BaseModel\n\n# Use from_provider for simplified setup\nclient = instructor.from_provider(\"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\", mode=Mode.TOOLS)\n\n# Or if you need to use a custom boto3 client:\n# bedrock_client = boto3.client('bedrock-runtime')\n# client = instructor.from_provider(\"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\", client=bedrock_client, mode=Mode.TOOLS)\n\nclass User(BaseModel):\n    name: str\n    age: int\n```\n\n## OpenAI Compatibility: Flexible Input Format and Model Parameter\n\nInstructor’s Bedrock integration supports both OpenAI-style and Bedrock-native message formats, as well as any mix of the two. You can use either:\n\n- **OpenAI-style**:  \n  `{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}`\n\n- **Bedrock-native**:  \n  `{\"role\": \"user\", \"content\": [{\"text\": \"Extract: Jason is 25 years old\"}]}`\n\n- **Mixed**:  \n  You can freely mix OpenAI-style and Bedrock-native messages in the same request. The integration will automatically convert OpenAI-style messages to the correct Bedrock format, while preserving any Bedrock-native fields you provide.\n\nThis flexibility also applies to other keyword arguments, such as the model name:\n\n- You can use either `model` (OpenAI-style) or `modelId` (Bedrock-native) as a keyword argument.  \n- If you provide `model`, Instructor will automatically convert it to `modelId` for Bedrock.\n- If you provide both, `modelId` takes precedence.\n\n**Example:**\n\n```python\nimport instructor\n\nmessages = [\n    {\"role\": \"system\", \"content\": \"Extract the name and age.\"},  # OpenAI-style\n    {\"role\": \"user\", \"content\": [{\"text\": \"Extract: Jason is 25 years old\"}]},  # Bedrock-native\n    {\"role\": \"assistant\", \"content\": \"Sure! Jason is 25.\"},  # OpenAI-style\n]\n\n# Both of these are valid:\nuser = client.create(\n    model=\"anthropic.claude-3-sonnet-20240229-v1:0\",  # OpenAI-style\n    messages=messages,\n    response_model=User,\n)\n\nuser = client.create(\n    modelId=\"anthropic.claude-3-sonnet-20240229-v1:0\",  # Bedrock-native\n    messages=messages,\n    response_model=User,\n)\n```\n\nAll of the above will work seamlessly with Instructor’s Bedrock integration.\n\n## Multimodal: Images and Documents\n\nInstructor will convert OpenAI-style image parts into Bedrock image blocks automatically. For documents (PDFs), Bedrock expects a native `document` block, so you should either pass a Bedrock-native document dict directly or build one with the `PDF` helper.\n\n```python\nimport instructor\nfrom instructor.processing.multimodal import PDF\n\nclient = instructor.from_provider(\"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\")\n\npdf = PDF.from_url(\"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\")\n\nresponse = client.create(\n    modelId=\"anthropic.claude-3-sonnet-20240229-v1:0\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Analyze this document\",\n                pdf.to_bedrock(),\n            ],\n        }\n    ],\n)\n```\n\nBedrock document blocks also support S3 URIs (for example, `s3://bucket/key.pdf`) and local files; `PDF.to_bedrock()` will load the bytes and sanitize the document name for you.\n\n## Nested Objects\n\n```python\nimport boto3\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize the Bedrock client\nbedrock_client = boto3.client('bedrock-runtime')\n\n# Enable instructor patches for Bedrock client\nclient = instructor.from_provider(\"bedrock/claude-3-5-sonnet-20241022\")\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Create structured output with nested objects\nuser = client.create(\n    modelId=\"anthropic.claude-3-sonnet-20240229-v1:0\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> User(\n#>     name='Jason',\n#>     age=25,\n#>     addresses=[\n#>         Address(street='123 Main St', city='New York', country='USA'),\n#>         Address(street='456 Beach Rd', city='Miami', country='USA')\n#>     ]\n#> )\n```\n\n## Modern Models and Features\n\n### Latest Model Support\n\nAWS Bedrock supports many modern foundation models:\n\n```python\nimport instructor\n\n# Claude 3.5 models (latest)\nclient = instructor.from_provider(\"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\")\n# or\nclient = instructor.from_provider(\"bedrock/anthropic.claude-3-5-haiku-20241022-v1:0\")\n\n# Amazon Nova models (multimodal)\nclient = instructor.from_provider(\"bedrock/amazon.nova-micro-v1:0\")\n\n# Meta Llama 3 models\nclient = instructor.from_provider(\"bedrock/meta.llama3-70b-instruct-v1:0\")\n\n# Mistral models\nclient = instructor.from_provider(\"bedrock/mistral.mistral-large-2402-v1:0\")\n```\n\n### Advanced Configuration\n\n```python\nimport boto3\nimport instructor\n\n# Custom AWS configuration\nbedrock_client = boto3.client(\n    'bedrock-runtime',\n    region_name='us-west-2',\n    aws_access_key_id='your_key',\n    aws_secret_access_key='your_secret'\n)\n\n# Use from_provider with custom client\nclient = instructor.from_provider(\n    \"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\",\n    client=bedrock_client,\n    mode=instructor.Mode.TOOLS\n)\n\n# Advanced inference configuration\nuser = client.create(\n    modelId=\"anthropic.claude-3-5-sonnet-20241022-v2:0\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract user info\"}],\n    response_model=User,\n    inferenceConfig={\n        \"maxTokens\": 2048,\n        \"temperature\": 0.1,\n        \"topP\": 0.9,\n        \"stopSequences\": [\"STOP\"]\n    }\n)\n```\n"
  },
  {
    "path": "docs/integrations/cerebras.md",
    "content": "---\ntitle: \"Structured outputs with Cerebras, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with Cerebras's hardware-accelerated AI models. Learn how to generate structured, type-safe outputs with high-performance computing.\"\n---\n\n# Structured outputs with Cerebras, a complete guide w/ instructor\n\nCerebras provides hardware-accelerated AI models optimized for high-performance computing environments. This guide shows you how to use Instructor with Cerebras's models for type-safe, validated responses.\n\n## Quick Start\n\nInstall Instructor with Cerebras support:\n\n```bash\npip install \"instructor[cerebras_cloud_sdk]\"\n```\n\n## Simple User Example (Sync)\n\n```python\nimport instructor\nfrom cerebras.cloud.sdk import Cerebras\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"cerebras/llama3.1-70b\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Create structured output\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the name and age of the person in this sentence: John Smith is 29 years old.\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(resp)\n#> User(name='John Smith', age=29)\n```\n\n## Simple User Example (Async)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nclient = instructor.from_provider(\n    \"cerebras/llama3.1-70b\",\n    async_client=True,\n)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def extract_user():\n    resp = await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract the name and age of the person in this sentence: John Smith is 29 years old.\",\n            }\n        ],\n        response_model=User,\n    )\n    return resp\n\n# Run async function\nresp = asyncio.run(extract_user())\nprint(resp)\n#> User(name='John Smith', age=29)\n```\n\n## Nested Example\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom cerebras.cloud.sdk import Cerebras\n\nclient = instructor.from_provider(\"cerebras/llama3.1-70b\")\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n        Extract: Jason is 25 years old.\n        He lives at 123 Main St, New York, USA\n        and has a summer house at 456 Beach Rd, Miami, USA\n    \"\"\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\nWe currently support partial streaming for Cerebras by parsing the raw text completion. We have not implemented streaming for function calling at this point in time yet. Please make sure you have `mode=instructor.Mode.MD_JSON` set when using partial streaming.\n\n```python\nimport instructor\nfrom cerebras.cloud.sdk import Cerebras\nfrom pydantic import BaseModel\nfrom typing import Iterable\n\nclient = instructor.from_provider(\n    \"cerebras/llama3.1-70b\",\n    mode=instructor.Mode.MD_JSON,\n)\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create_partial(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Ivan is 27 and lives in Singapore\",\n        }\n    ],\n    response_model=Person,\n    stream=True,\n)\n\nfor person in resp:\n    print(person)\n    # > name=None age=None\n    # > name='Ivan' age=None\n    # > name='Ivan' age=27\n\n```\n\n## Iterable Example\n\n```python\nimport instructor\nfrom cerebras.cloud.sdk import Cerebras\nfrom pydantic import BaseModel\nfrom typing import Iterable\n\nclient = instructor.from_provider(\n    \"cerebras/llama3.1-70b\",\n    mode=instructor.Mode.MD_JSON,\n)\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create_iterable(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract all users from this sentence : Chris is 27 and lives in San Francisco, John is 30 and lives in New York while their college roomate Jessica is 26 and lives in London\",\n        }\n    ],\n    response_model=Person,\n    stream=True,\n)\n\nfor person in resp:\n    print(person)\n    # > Person(name='Chris', age=27)\n    # > Person(name='John', age=30)\n    # > Person(name='Jessica', age=26)\n\n```\n\n## Instructor Hooks\n\nInstructor provides several hooks to customize behavior:\n\n### Validation Hook\n\n```python\nfrom instructor import Instructor\n\ndef validation_hook(value, retry_count, exception):\n    print(f\"Validation failed {retry_count} times: {exception}\")\n    return retry_count < 3  # Retry up to 3 times\n\ninstructor.patch(client, validation_hook=validation_hook)\n```\n\n## Instructor Modes\n\nWe provide serveral modes to make it easy to work with the different response models that Cerebras Supports\n\n1. `instructor.Mode.MD_JSON` : This parses the raw completions as a valid JSON object.\n2. `instructor.Mode.TOOLS` : This uses Cerebras's tool calling mode to return structured outputs to the client.\n\nIn general, we recommend using `Mode.TOOLS` because it's the most flexible and future-proof mode. It has the largest set of features that you can specify your schema in and makes things significantly easier to work with.\n"
  },
  {
    "path": "docs/integrations/cohere.md",
    "content": "---\ntitle: Structured outputs with Cohere, a complete guide w/ instructor\ndescription: Learn how to leverage Cohere's command models with Python's instructor library for structured data outputs.\n---\n\n# Structured outputs with Cohere, a complete guide w/ instructor\n\nThis guide demonstrates how to use Cohere with Instructor to generate structured outputs. You'll learn how to use Cohere's command models to create type-safe responses.\n\nYou can now use any of the Cohere's [command models](https://docs.cohere.com/docs/models) with the `instructor` library to get structured outputs.\n\nYou'll need a cohere API key which can be obtained by signing up [here](https://dashboard.cohere.com/) and gives you [free](https://cohere.com/pricing), rate-limited usage for learning and prototyping.\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Document Segmentation](../examples/document_segmentation.md) - Cohere example for document processing\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n\n# Cohere V2 API Support\n\nAs of version 1.12.0, Instructor supports both Cohere V1 and V2 SDK clients. The V2 API provides an OpenAI-compatible interface with support for the latest Cohere models.\n\n**Key differences:**\n- **V2 API** (recommended): Uses `cohere.ClientV2` / `cohere.AsyncClientV2` with OpenAI-compatible message format\n- **V1 API** (legacy): Uses `cohere.Client` / `cohere.AsyncClient` with Cohere-specific message format\n\nThe V2 API is recommended for new projects as it provides better compatibility with the OpenAI SDK interface and supports the latest models like `command-a-03-2025`.\n\n## Setup\n\n```\npip install \"instructor[cohere]\"\n\n```\n\nThis installs `cohere>=5.1.8`, which includes both V1 and V2 client support.\n\nExport your key:\n\n```\nexport CO_API_KEY=<YOUR_COHERE_API_KEY>\n```\n\n## Example (V2 API - Recommended)\n\nThe easiest way to use Cohere with Instructor is through the `from_provider` factory, which automatically uses the V2 API:\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List\nimport instructor\n\n\n# Using from_provider automatically uses Cohere V2 API\nclient = instructor.from_provider(\n    \"cohere/command-a-03-2025\",\n    max_tokens=1000,\n)\n\n\nclass Person(BaseModel):\n    name: str = Field(description=\"name of the person\")\n    country_of_origin: str = Field(description=\"country of origin of the person\")\n\n\nclass Group(BaseModel):\n    group_name: str = Field(description=\"name of the group\")\n    members: List[Person] = Field(description=\"list of members in the group\")\n\n\ntask = \"\"\"\\\nGiven the following text, create a Group object for 'The Beatles' band\n\nText:\nThe Beatles were an English rock band formed in Liverpool in 1960. With a line-up comprising John Lennon, Paul McCartney, George Harrison and Ringo Starr, they are regarded as the most influential band of all time. The group were integral to the development of 1960s counterculture and popular music's recognition as an art form.\n\"\"\"\ngroup = client.create(\n    response_model=Group,\n    messages=[{\"role\": \"user\", \"content\": task}],\n    temperature=0,\n)\n\nprint(group.model_dump_json(indent=2))\n\"\"\"\n{\n  \"group_name\": \"The Beatles\",\n  \"members\": [\n    {\n      \"name\": \"John Lennon\",\n      \"country_of_origin\": \"England\"\n    },\n    {\n      \"name\": \"Paul McCartney\",\n      \"country_of_origin\": \"England\"\n    },\n    {\n      \"name\": \"George Harrison\",\n      \"country_of_origin\": \"England\"\n    },\n    {\n      \"name\": \"Ringo Starr\",\n      \"country_of_origin\": \"England\"\n    }\n  ]\n}\n\"\"\"\n```\n\n### Async Example\n\n```python\nimport instructor\n\nasync_client = instructor.from_provider(\n    \"cohere/command-a-03-2025\",\n    async_client=True,\n    max_tokens=1000,\n)\n```\n\n## Using Cohere SDK Directly\n\nYou can also explicitly create a Cohere client and patch it with Instructor:\n\n### V2 API (Recommended)\n\n```python\nimport cohere\nimport instructor\n\n# Use from_provider for simplified setup\nclient = instructor.from_provider(\"cohere/command-a-03-2025\", mode=instructor.Mode.TOOLS)\n\n# Now use it with structured outputs\nresponse = client.create(\n    response_model=YourModel,\n    model=\"command-a-03-2025\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract...\"}],\n)\n```\n\n### V1 API (Legacy Support)\n\nThe V1 API is still supported for backward compatibility:\n\n```python\nimport cohere\nimport instructor\n\n# Use from_provider for simplified setup (works with both V1 and V2)\nclient = instructor.from_provider(\"cohere/command-a-03-2025\", mode=instructor.Mode.TOOLS)\n\n# V1 uses different message format internally but instructor handles the conversion\nresponse = client.create(\n    response_model=YourModel,\n    model=\"command-r-plus\",\n    messages=[{\"role\": \"user\", \"content\": \"Extract...\"}],\n)\n```\n\n**Note**: Instructor automatically detects whether you're using V1 or V2 client and handles message format conversion accordingly. The V2 API uses OpenAI-compatible message format (`messages`), while V1 uses Cohere's legacy format (`message` + `chat_history`).\n"
  },
  {
    "path": "docs/integrations/cortex.md",
    "content": "---\ntitle: \"Structured outputs with Cortex, a complete guide w/ instructor\"\ndescription: \"Learn how to use Cortex with Instructor for structured outputs. Complete guide with examples and best practices.\"\n---\n\n# Structured outputs with Cortex\n\nCortex.cpp is a runtime that helps you run open source LLMs out of the box. It supports a wide variety of models and powers their [Jan](https://jan.ai) platform. This guide provides a quickstart on how to use Cortex with instructor for structured outputs.\n\n## Quick Start\n\nInstructor comes with support for the OpenAI client out of the box, so you don't need to install anything extra.\n\n```bash\npip install \"instructor\"\n```\n\nOnce you've done so, make sure to pull the model that you'd like to use. In this example, we'll be using a quantized llama3.2 model.\n\n```bash\ncortex run llama3.2:3b-gguf-q4-km\n```\n\nLet's start by initializing the client below - note that we need to provide a base URL and an API key here. The API key isn't important, it's just so the OpenAI client doesn't throw an error.\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\n    \"cortex/llama3.2:3b-gguf-q4-km\",\n    base_url=\"http://localhost:39281/v1\",\n    api_key=\"this is a fake api key that doesn't matter\",\n)\n```\n\n## Simple User Example (Sync)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\n    \"cortex/llama3.2:3b-gguf-q4-km\",\n    base_url=\"http://localhost:39281/v1\",\n    api_key=\"this is a fake api key that doesn't matter\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Ivan is 27 and lives in Singapore\"}],\n    response_model=User,\n)\n\nprint(resp)\n# > name='Ivan', age=27\n```\n\n## Simple User Example (Async)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n# Initialize with API key\nclient = instructor.from_provider(\n    \"cortex/llama3.2:3b-gguf-q4-km\",\n    async_client=True,\n    base_url=\"http://localhost:39281/v1\",\n    api_key=\"this is a fake api key that doesn't matter\",\n)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n#> User(name='Jason', age=25)\n```\n\n## Nested Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\n    \"cortex/llama3.2:3b-gguf-q4-km\",\n    base_url=\"http://localhost:39281/v1\",\n    api_key=\"this is a fake api key that doesn't matter\",\n)\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\nIn this tutorial we've seen how we can run local models with Cortex while simplifying a lot of the logic around managing retries and function calling with our simple interface.\n\nWe'll be publishing a lot more content on Cortex and how to work with local models moving forward so do keep an eye out for that.\n\n## Related Resources\n\n- [Cortex Documentation](https://cortex.so/docs/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with the latest OpenAI API versions and models. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n"
  },
  {
    "path": "docs/integrations/databricks.md",
    "content": "---\ntitle: Databricks\ndescription: Guide to using instructor with Databricks models\n---\n\n# Structured outputs with Databricks, a complete guide w/ instructor\n\n[Databricks](https://www.databricks.com/) provides an AI platform with access to various models. This guide shows how to use instructor with Databricks to get structured outputs.\n\n## Quick Start\n\nFirst, install the required packages:\n\n```bash\nuv pip install instructor openai\n```\n\nSet your Databricks workspace URL and token as environment variables:\n\n```bash\nexport DATABRICKS_TOKEN=\"your_personal_access_token\"\nexport DATABRICKS_HOST=\"https://your-workspace.cloud.databricks.com\"\n```\n\n`DATABRICKS_API_KEY` and `DATABRICKS_WORKSPACE_URL` are also supported if you prefer those names. The provider appends `/serving-endpoints` automatically, so the host only needs the base workspace URL.\n\n## Basic Example\n\nHere's how to extract structured data from Databricks models:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize the client; host and token are read from the environment\nclient = instructor.from_provider(\n    \"databricks/dbrx-instruct\",\n    mode=instructor.Mode.TOOLS,\n)\n\n# Define your data structure\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n# Extract structured data\nuser = client.create(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nprint(user)\n# Output: UserExtract(name='Jason', age=25)\n```\n\nIf you need to point at a different workspace or testing endpoint, pass `base_url=\"https://alt-workspace.cloud.databricks.com/serving-endpoints\"`. The helper will use that value as-is without adding another suffix.\n\n### Async Example\n\n```python\nasync_client = instructor.from_provider(\n    \"databricks/dbrx-instruct\",\n    async_client=True,\n    mode=instructor.Mode.TOOLS,\n)\n```\n\n## Supported Modes\n\nDatabricks supports the same modes as OpenAI:\n\n- `Mode.TOOLS`\n- `Mode.JSON`\n- `Mode.FUNCTIONS`\n- `Mode.PARALLEL_TOOLS`\n- `Mode.MD_JSON`\n- `Mode.TOOLS_STRICT`\n- `Mode.JSON_O1`\n\n## Models\n\nDatabricks provides access to various models depending on your setup, including:\n\n- Foundation models hosted on Databricks\n- Custom fine-tuned models\n- Open source models deployed on Databricks\n\n"
  },
  {
    "path": "docs/integrations/deepseek.md",
    "content": "---\ntitle: \"Structured outputs with DeepSeek, a complete guide with instructor\"\ndescription: \"Learn how to use Instructor with DeepSeek's models for type-safe, structured outputs.\"\n---\n\n# Structured outputs with DeepSeek, a complete guide with instructor\n\nDeepSeek is a Chinese company that provides AI models and services. They're most notable for the deepseek coder and chat model and most recently, the R1 reasoning model.\n\nThis guide covers everything you need to know about using DeepSeek with Instructor for type-safe, validated responses.\n\n## Quick Start\n\nInstructor comes with support for the OpenAI Client out of the box, so you don't need to install anything extra.\n\n```bash\npip install \"instructor\"\n```\n\n⚠️ **Important**: You must set your DeepSeek API key before using the client. You can do this in two ways:\n\n1. Set the environment variable:\n\n```bash\nexport DEEPSEEK_API_KEY='your-api-key-here'\n```\n\n2. Or provide it directly to the client:\n\n```python\nimport os\nfrom openai import OpenAI\n\nclient = OpenAI(api_key=os.getenv('DEEPSEEK_API_KEY'), base_url=\"https://api.deepseek.com\")\n```\n\n## Simple User Example (Sync)\n\n```python\nimport os\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport instructor\n\nclient = instructor.from_provider(\n    \"deepseek/deepseek-chat\",\n    base_url=\"https://api.deepseek.com\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > name='Jason' age=25\n```\n\n## Simple User Example (Async)\n\n```python\nimport os\nimport asyncio\nfrom pydantic import BaseModel\nimport instructor\n\nclient = instructor.from_provider(\n    \"deepseek/deepseek-chat\",\n    async_client=True,\n    base_url=\"https://api.deepseek.com\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n# > name='Jason' age=25\n\n```\n\n## Nested Example\n\n```python\nfrom pydantic import BaseModel\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Initialize with API key\nclient = instructor.from_provider(\n    \"deepseek/deepseek-chat\",\n    base_url=\"https://api.deepseek.com\",\n)\n\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\n### Partials\n\n```python\nfrom pydantic import BaseModel\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Initialize with API key\nclient = instructor.from_provider(\n    \"deepseek/deepseek-chat\",\n    base_url=\"https://api.deepseek.com\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    bio: str\n\n\nuser = client.create_partial(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Create a user profile for Jason and a one sentence bio, age 25\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user_partial in user:\n    print(user_partial)\n\n\n# > name='Jason' age=None bio='None'\n# > name='Jason' age=25 bio='A tech'\n# > name='Jason' age=25 bio='A tech enthusiast'\n# > name='Jason' age=25 bio='A tech enthusiast who loves coding, gaming, and exploring new'\n# > name='Jason' age=25 bio='A tech enthusiast who loves coding, gaming, and exploring new technologies'\n\n```\n\n### Iterable Example\n\n```python\nfrom pydantic import BaseModel\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Initialize with API key\nclient = instructor.from_provider(\n    \"deepseek/deepseek-chat\",\n    base_url=\"https://api.deepseek.com\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract multiple users from text\nusers = client.create_iterable(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract users:\n            1. Jason is 25 years old\n            2. Sarah is 30 years old\n            3. Mike is 28 years old\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user in users:\n    print(user)\n\n    #> name='Jason' age=25\n    #> name='Sarah' age=30\n    #> name='Mike' age=28\n```\n\n## Reasoning Models\n\nBecause Instructor is built on top of the OpenAI API, we can get our reasoning traces from the `deepseek-reasoner` model. Make sure to configure the `MD_JSON` mode here to get the best experience.\n\n```python\nimport os\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport instructor\nfrom rich import print\n\nclient = instructor.from_provider(\n    \"deepseek/deepseek-chat\",\n    base_url=\"https://api.deepseek.com\",\n    mode=instructor.Mode.MD_JSON,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\ncompletion, raw_completion = client.create_with_completion(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(completion)\n# > User(name='Jason', age=25)\nprint(raw_completion.choices[0].message.reasoning_content)\n# > Okay, let's see. The user wants me to extract information from the sentence \"Jason is 25 years old\" and format it into a JSON object that matches the given schema. The schema requires a \"name\" and an \"age\", both of which are required.\n# >\n# > First, I need to identify the name. The sentence starts with \"Jason\", so that's the name. Then the age is given as \"25 years old\". The age should be an integer, so I need to convert \"25\" from a string to a number.\n# >\n# > So putting that together, the JSON should have \"name\": \"Jason\" and \"age\": 25. Let me double-check the schema to make sure there are no other requirements. The properties are \"name\" (string) and \"age\" (integer), both required. Yep, that's all.\n# >\n# > I need to make sure the JSON is correctly formatted, with commas and braces. Also, the user specified to return it in a json codeblock, not the schema itself. So the final answer should be a JSON object with those key-value pairs.\n```\n\n## Instructor Modes\n\nWe suggest using the `Mode.Tools` mode for Deepseek which is the default when initializing via `from_provider`.\n\n## Related Resources\n\n- [DeepSeek Documentation](https://api-docs.deepseek.com/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with the latest OpenAI API versions and models. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n"
  },
  {
    "path": "docs/integrations/fireworks.md",
    "content": "---\ntitle: \"Structured outputs with Fireworks, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with Fireworks AI models. Learn how to generate structured, type-safe outputs with high-performance, cost-effective AI capabilities.\"\n---\n\n# Structured outputs with Fireworks, a complete guide w/ instructor\n\nFireworks provides efficient and cost-effective AI models with enterprise-grade reliability. This guide shows you how to use Instructor with Fireworks's models for type-safe, validated responses.\n\n## Quick Start\n\nInstall Instructor with Fireworks support:\n\n```bash\npip install \"instructor[fireworks-ai]\"\n```\n\n## Simple User Example (Sync)\n\n```python\nfrom fireworks.client import Fireworks\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize the client\nclient = Fireworks()\n\n# Enable instructor patches\nclient = instructor.from_provider(\"fireworks/llama-v3-70b-instruct\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract: Jason is 25 years old\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > User(name='Jason', age=25)\n\n```\n\n## Simple User Example (Async)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nclient = instructor.from_provider(\n    \"fireworks/llama-v3-70b-instruct\",\n    async_client=True,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract: Jason is 25 years old\",\n            }\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)  # User(name='Jason', age=25)\n\n```\n\n## Nested Example\n\n```python\nfrom fireworks.client import Fireworks\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Enable instructor patches\nclient = instructor.from_provider(\"fireworks/llama-v3-70b-instruct\")\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n                Extract: Jason is 25 years old.\n                He lives at 123 Main St, New York, USA\n                and has a summer house at 456 Beach Rd, Miami, USA\n            \"\"\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\n### Partial Streaming Example\n\n```python\nfrom fireworks.client import Fireworks\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Enable instructor patches\nclient = instructor.from_provider(\"fireworks/llama-v3-70b-instruct\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    bio: str\n\n\nuser = client.create_partial(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Create a user profile for Jason + 1 sentence bio, age 25\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user_partial in user:\n    print(user_partial)\n    # name=None age=None bio=None\n    # name='Jason' age=None bio=None\n    # name='Jason' age=25 bio=\"When he's\"\n    # name='Jason' age=25 bio=\"When he's not working as a graphic designer, Jason can usually be found trying out new craft beers or attempting to cook something other than ramen noodles.\"\n\n```\n\n## Iterable Example\n\n```python\nfrom fireworks.client import Fireworks\nimport instructor\nfrom pydantic import BaseModel\n\n\n# Enable instructor patches\nclient = instructor.from_provider(\"fireworks/llama-v3-70b-instruct\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract multiple users from text\nusers = client.create_iterable(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract users:\n            1. Jason is 25 years old\n            2. Sarah is 30 years old\n            3. Mike is 28 years old\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user in users:\n    print(user)\n\n    # name='Jason' age=25\n    # name='Sarah' age=30\n    # name='Mike' age=28\n```\n\n## Instructor Modes\n\nWe provide several modes to make it easy to work with the different response models that Fireworks supports\n\n1. `instructor.Mode.MD_JSON` : This parses the raw text completion into a pydantic object\n2. `instructor.Mode.TOOLS` : This uses Fireworks's tool calling API to return structured outputs to the client\n\n## Related Resources\n\n- [Fireworks Documentation](https://docs.fireworks.ai/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with Fireworks's latest API versions. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n\nNote: Always verify model-specific features and limitations before implementing streaming functionality in production environments.\n"
  },
  {
    "path": "docs/integrations/genai.md",
    "content": "---\ndraft: False\ndate: 2025-03-15\ntitle: \"Structured outputs with Google's genai SDK\"\ndescription: \"Learn how to use Instructor with Google's Generative AI SDK to extract structured data from Gemini models.\"\nslug: genai\ntags:\n  - patching\nauthors:\n  - instructor\n---\n\n# Structured Outputs with Google's genai SDK\n\n!!! info \"Recommended SDK\"\n\n    The `genai` SDK is Google's recommended Python client for working with Gemini models. It provides a unified interface for both the Gemini API and Vertex AI. For detailed setup instructions, including how to use it with Vertex AI, please refer to the [official Google AI documentation for the GenAI SDK](https://googleapis.github.io/python-genai/).\n\nThis guide demonstrates how to use Instructor with Google's `genai` SDK to extract structured data from Gemini models.\n\nWe currently have two modes for Gemini\n\n- `Mode.TOOLS` : This leverages function calling under the hood and returns a structured response\n- `Mode.JSON` : This provides Gemini with a JSON Schema that it will use to respond in a structured format with\n\n!!! info \"Gemini Thought Parts Filtering\"\n\n    When using `Mode.TOOLS`, Instructor automatically filters out thought parts from Gemini responses. Gemini 2.5 models include internal reasoning parts with `thought: true` by default, which cannot be disabled. Instructor removes these thought parts before processing the structured output to prevent runtime errors.\n\n    This filtering happens automatically and requires no additional configuration. For more information about Gemini's thinking feature, see the [official documentation](https://ai.google.dev/gemini-api/docs/thinking).\n\n!!! note \"Backwards Compatibility\"\n\n    The provider-specific modes (`Mode.TOOLS`, `Mode.JSON`, `Mode.JSON`) are still supported but emit deprecation warnings and map to the generic modes (`Mode.TOOLS`, `Mode.JSON`).\n\n## Installation\n\n```bash\npip install \"instructor[google-genai]\"\n```\n\n## Basic Usage\n\n!!! warning \"Unions and Optionals\"\n\n    Gemini doesn't have support for Union and Optional types in the structured outputs and tool calling integrations. We currently throw an error when we detect these in your response model.\n\nGetting started with Instructor and the genai SDK is straightforward. Just create a Pydantic model defining your output structure, patch the genai client, and make your request with a response_model parameter:\n\n```python\nfrom google import genai\nimport instructor\nfrom pydantic import BaseModel\n\n# Define your Pydantic model\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Initialize and patch the client\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n# Extract structured data\nresponse = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n    response_model=User,\n)\n\nprint(response)  # User(name='Jason', age=25)\n```\n\n## Alternative: Using the v2 GenAI client\n\n!!! note \"Recommended: Use `from_provider`\"\n\n    The `from_provider` approach shown above is recommended for most use cases. The `from_genai` helper below is available if you need to work directly with the native `google.genai.Client` and keep the Google request format intact.\n\n```python\nfrom google.genai import Client\nfrom instructor import Mode\nfrom instructor.v2 import from_genai\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nraw_client = Client(api_key=\"YOUR_KEY\")\nclient = from_genai(raw_client, mode=Mode.TOOLS)\n\nresult = client.chat.completions.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n    response_model=User,\n)\n\nprint(result)\n```\n\nBehind the scenes the v2 client registers the correct mode handler, converts OpenAI-style messages to the GenAI `contents` format, and parses the response while filtering Gemini thought parts.\n\n## Message Formatting\n\nGenai supports multiple message formats, and Instructor seamlessly works with all of them. This flexibility allows you to use whichever format is most convenient for your application:\n\n```python\nfrom google import genai\nimport instructor\nfrom pydantic import BaseModel\nfrom google.genai import types\n\n# Define your Pydantic model\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Initialize and patch the client\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n# Single string (converted to user message)\nresponse = client.create(\n    messages=\"Jason is 25 years old\",\n    response_model=User,\n)\n\nprint(response)\n# > name='Jason' age=25\n\n# Standard format\nresponse = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Jason is 25 years old\"}\n    ],\n    response_model=User,\n)\n\nprint(response)\n# > name='Jason' age=25\n\n# Using genai's Content type\nresponse = client.create(\n    messages=[\n        genai.types.Content(\n            role=\"user\",\n            parts=[genai.types.Part.from_text(text=\"Jason is 25 years old\")]\n        )\n    ],\n    response_model=User,\n)\n\nprint(response)\n# > name='Jason' age=25\n```\n\n### System Messages\n\nSystem messages help set context and instructions for the model. With Gemini models, you can provide system messages in two different ways:\n\n```python\nfrom google import genai\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n# As a parameter\nresponse = client.create(\n    system=\"Jason is 25 years old\",\n    messages=[{\"role\": \"user\", \"content\": \"You are a data extraction assistant\"}],\n    response_model=User,\n)\n\nprint(response)\n# > name='Jason' age=25\n\n# Or as a message with role \"system\"\nresponse = client.create(\n    messages=[\n        {\"role\": \"system\", \"content\": \"Jason is 25 years old\"},\n        {\"role\": \"user\", \"content\": \"You are a data extraction assistant\"},\n    ],\n    response_model=User,\n)\n\nprint(response)\n# > name='Jason' age=25\n\n```\n\n## Template Variables\n\nTemplate variables make it easy to reuse prompts with different values. This is particularly useful for dynamic content or when testing different inputs:\n\n```python\nfrom google import genai\nimport instructor\nfrom pydantic import BaseModel\nfrom google.genai import types\n\n\n# Define your Pydantic model\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Initialize and patch the client\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n# Single string (converted to user message)\nresponse = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"{{ name }} is {{ age }} years old\"}],\n    response_model=User,\n    context={\n        \"name\": \"Jason\",\n        \"age\": 25,\n    },\n)\n\nprint(response)\n# > name='Jason' age=25\n\n# Standard format\nresponse = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"{{ name }} is {{ age }} years old\"}],\n    response_model=User,\n    context={\n        \"name\": \"Jason\",\n        \"age\": 25,\n    },\n)\n\nprint(response)\n# > name='Jason' age=25\n\n# Using genai's Content type\nresponse = client.create(\n    messages=[\n        genai.types.Content(\n            role=\"user\",\n            parts=[genai.types.Part.from_text(text=\"{{name}} is {{age}} years old\")],\n        )\n    ],\n    response_model=User,\n    context={\n        \"name\": \"Jason\",\n        \"age\": 25,\n    },\n)\n\nprint(response)\n# > name='Jason' age=25\n```\n\n## Validation and Retries\n\nInstructor can automatically retry requests when validation fails, ensuring you get properly formatted data. This is especially helpful when enforcing specific data requirements:\n\n```python\nfrom typing import Annotated\nfrom pydantic import AfterValidator, BaseModel\nimport instructor\nfrom google import genai\n\n\ndef uppercase_validator(v: str) -> str:\n    if v.islower():\n        raise ValueError(\"Name must be ALL CAPS\")\n    return v\n\n\nclass UserDetail(BaseModel):\n    name: Annotated[str, AfterValidator(uppercase_validator)]\n    age: int\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\nresponse = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: jason is 25 years old\"}],\n    response_model=UserDetail,\n    max_retries=3,\n)\n\nprint(response)  # UserDetail(name='JASON', age=25)\n```\n\n## Multimodal Capabilities\n\n> We've provided a few different sample files for you to use to test out these new features. All examples below use these files.\n>\n> - (Audio) : A Recording of the Original Gettysburg Address : [gettysburg.wav](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav)\n> - (Image) : An image of some blueberry plants [image.jpg](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg)\n> - (PDF) : A sample PDF file which contains a fake invoice [invoice.pdf](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf)\n\nInstructor provides a unified, provider-agnostic interface for working with multimodal inputs like images, PDFs, and audio files. With Instructor's multimodal objects, you can easily load media from URLs, local files, or base64 strings using a consistent API that works across different AI providers (OpenAI, Anthropic, Mistral, etc.).\n\nInstructor handles all the provider-specific formatting requirements behind the scenes, ensuring your code remains clean and future-proof as provider APIs evolve.\n\nLet's see how to use the Image, Audio and PDF classes.\n\n### Image Processing\n\n!!! info \"Autodetect Images\"\n\n    For convenient handling of images, you can enable automatic image conversion using the `autodetect_images` parameter. When enabled, Instructor will automatically detect and convert file paths and HTTP URLs provided as strings into the appropriate format required by the Google GenAI SDK. This makes working with images seamless and straightforward. ( see examples below )\n\nInstructor makes it easy to analyse and extract semantic information from images using the Gemini series of models. [Click here](https://ai.google.dev/gemini-api/docs/models) to check if the model you'd like to use has vison capabilities.\n\nLet's see an example below with the sample image above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path` and the `from_base64` class methods.\n\n```python\nfrom instructor.processing.multimodal import Image\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom google.genai import Client\n\n\nclass ImageDescription(BaseModel):\n    objects: list[str] = Field(..., description=\"The objects in the image\")\n    scene: str = Field(..., description=\"The scene of the image\")\n    colors: list[str] = Field(..., description=\"The colors in the image\")\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n# Multiple ways to load an image:\nresponse = client.create(\n    response_model=ImageDescription,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What is in this image?\",\n                # Option 1: Direct URL with autodetection\n                Image.from_url(url),\n                # Option 2: Local file\n                # Image.from_path(\"path/to/local/image.jpg\")\n                # Option 3: Base64 string\n                # Image.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # Image.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nprint(response)\n# Example output:\n# ImageDescription(\n#     objects=['blueberries', 'leaves'],\n#     scene='A blueberry bush with clusters of ripe blueberries and some unripe ones against a cloudy sky',\n#     colors=['green', 'blue', 'purple', 'white']\n# )\n\n```\n\n### Audio Processing\n\nInstructor makes it easy to analyse and extract semantic information from Audio files using the Gemini series of models. Let's see an example below with the sample Audio file above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path`\n\n```python\nfrom instructor.processing.multimodal import Audio\nfrom pydantic import BaseModel\nimport instructor\nfrom google.genai import Client\n\n\nclass AudioDescription(BaseModel):\n    transcript: str\n    summary: str\n    speakers: list[str]\n    key_points: list[str]\n\n\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav\"\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\nresponse = client.create(\n    response_model=AudioDescription,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Please transcribe and analyze this audio:\",\n                # Multiple loading options:\n                Audio.from_url(url),\n                # Option 2: Local file\n                # Audio.from_path(\"path/to/local/audio.mp3\")\n            ],\n        },\n    ],\n)\n\nprint(response)\n# > transcript='Four score and seven years ago our fathers...\"]\n```\n\n### PDF\n\nInstructor makes it easy to analyse and extract semantic information from PDFs using Gemini's new models.\n\nLet's see an example below with the sample PDF above where we'll load it in using our `from_url` method. With this integration that we're passing in the raw bytes to gemini itself, we also support using the Files api with the `PDFWithGenaiFile` class.\n\nNote that we support local files and base64 strings using this method too with the `from_path` and the `from_base64` class methods.\n\n```python\nfrom instructor.processing.multimodal import PDF\nfrom pydantic import BaseModel\nimport instructor\nfrom google.genai import Client\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n# Multiple ways to load an PDF:\nresponse = client.create(\n    response_model=Receipt,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract out the total and line items from the invoice\",\n                # Option 1: Direct URL\n                PDF.from_url(url),\n                # Option 2: Local file\n                # PDF.from_path(\"path/to/local/invoice.pdf\"),\n                # Option 3: Base64 string\n                # PDF.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # PDF.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nprint(response)\n# > Receipt(total=220, items=['English Tea', 'Tofu'])\n```\n\nWe also support the use of PDFs with the Gemini `Files` api with the `PDFWithGenaiFile` that allows you to use existing uploaded files or local files.\n\nNote that the `PdfWithGenaiFile.from_new_genai_file` operation is blocking and you can set the timeout and retry delay that we'll call while we await the upload to be registered as completed.\n\n```python\nPDFWithGenaiFile.from_new_genai_file(\n    \"./invoice.pdf\",\n    retry_delay=1,  # Time to wait before checking if file is ready to use\n    max_retries=20 # Number of times to check before throwing an error\n),\n```\n\nThis makes it easier for you to work with the Gemini files API. You can use this in a normal chat completion as seen below\n\n```python\nfrom instructor.processing.multimodal import PDFWithGenaiFile\nfrom pydantic import BaseModel\nimport instructor\nfrom google.genai import Client\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n# Multiple ways to load an PDF:\nresponse = client.create(\n    response_model=Receipt,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract out the total and line items from the invoice\",\n                # Option 1: Direct URL\n                PDFWithGenaiFile.from_new_genai_file(\"./invoice.pdf\"),\n\n                # Option 2 : Existing Genai File\n                # PDFWithGenaiFile.from_existing_genai_file(\"invoice.pdf\"),\n            ],\n        },\n    ],\n)\n\nprint(response)\n```\n\nIf you'd like more fine-grained control over the files used, you can also use the `Files` api directly as seen below.\n\n## Using Files\n\nOur API integration also supports the use of files\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Summary(BaseModel):\n    summary: str\n\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\nfile1 = client.files.upload(\n    file=\"./gettysburg.wav\",\n)\n\n# As a parameter\nresponse = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Summarise the audio file.\",\n                file1,\n            ]\n        }\n    ],\n    response_model=Summary,\n)\n\nprint(response)\n# > summary=\"Abraham Lincoln's Gettysburg Address commences by stating that 87 years prior, the founding fathers created a new nation based on liberty and equality. It goes on to say that the Civil War is testing whether a nation so conceived can survive.\"\n```\n\n## Streaming Responses\n\n!!! warning \"Streaming Limitations\"\n\n    **As of July 11, 2025, Google GenAI does not support streaming with tool/function calling or structured outputs for regular models.** \n    \n    - `Mode.TOOLS` and `Mode.JSON` do not support streaming with regular models\n    - To use streaming, you must use `Partial[YourModel]` explicitly or switch to other modes like `Mode.JSON`\n    - Alternatively, set `stream=False` to disable streaming\n\nStreaming allows you to process responses incrementally rather than waiting for the complete result. This is extremely useful for making UI changes feel instant and responsive.\n\n### Partial Streaming\n\nReceive a stream of complete, validated objects as they're generated:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\n\nclient = instructor.from_provider(\n    \"google/gemini-2.5-flash\",\n    mode=instructor.Mode.JSON,\n)\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nclass PersonList(BaseModel):\n    people: list[Person]\n\n\nstream = client.create_partial(\n    model=\"gemini-2.5-flash\",\n    response_model=PersonList,\n    stream=True,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Ivan is 20 years old, Jason is 25 years old, and John is 30 years old\",\n        }\n    ],\n)\n\nfor extraction in stream:\n    print(extraction)\n    # > people=[PartialPerson(name='Ivan', age=None)]\n    # > people=[PartialPerson(name='Ivan', age=20), PartialPerson(name='Jason', age=25), PartialPerson(name='John', age=None)]\n    # > people=[PartialPerson(name='Ivan', age=20), PartialPerson(name='Jason', age=25), PartialPerson(name='John', age=30)]\n```\n\n### Iterable Streaming\n\nFor extracting multiple objects from a single response, use `create_iterable`:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Extract multiple users from a single response\nstream = client.create_iterable(\n    model=\"gemini-2.5-flash\",\n    response_model=User,\n    stream=True,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Jason is 25 years old, Sarah is 30 years old, and Mike is 28 years old\",\n        }\n    ],\n)\n\nfor user in stream:\n    print(user)\n    # > User(name='Jason', age=25)\n    # > User(name='Sarah', age=30)\n    # > User(name='Mike', age=28)\n```\n\n### Async Streaming\n\nBoth partial and iterable streaming work with async clients:\n\n```python\nimport asyncio\nfrom pydantic import BaseModel\nimport instructor\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def async_partial_example():\n    client = instructor.from_provider(\"google/gemini-2.5-flash\", async_client=True)\n    \n    stream = client.create_partial(\n        model=\"gemini-2.5-flash\",\n        response_model=User,\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Jason is 25 years old\"}\n        ],\n    )\n    \n    async for chunk in stream:\n        print(chunk)\n\nasync def async_iterable_example():\n    client = instructor.from_provider(\"google/gemini-2.5-flash\", async_client=True)\n    \n    stream = client.create_iterable(\n        model=\"gemini-2.5-flash\",\n        response_model=User,\n        stream=True,\n        messages=[\n            {\n                \"role\": \"user\", \n                \"content\": \"Jason is 25, Sarah is 30, Mike is 28\"\n            }\n        ],\n    )\n    \n    async for user in stream:\n        print(user)\n\n# Run async examples\nasyncio.run(async_partial_example())\nasyncio.run(async_iterable_example())\n```\n\n## Async Support\n\nInstructor provides full async support for the genai SDK, allowing you to make non-blocking requests in async applications:\n\n```python\nimport asyncio\n\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    client = instructor.from_provider(\n        \"google/gemini-2.5-flash\",\n        async_client=True,\n    )\n\n    response = await client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n        response_model=User,\n    )\n    return response\n\n\nprint(asyncio.run(extract_user()))\n#> name = Jason age= 25\n```\n"
  },
  {
    "path": "docs/integrations/google.md",
    "content": "---\ntitle: \"Google Gemini Tutorial: Structured Outputs with Instructor\"\ndescription: \"Learn how to use Google's Gemini models (Pro, Flash, Ultra) with Instructor for structured data extraction. Complete tutorial with examples for multimodal AI and type-safe outputs.\"\n---\n\n## See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Multi-Modal Examples](../examples/multi_modal_gemini.md) - Vision and multi-modal processing\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n\n# Google Gemini Tutorial: Structured Outputs with Instructor\n\nMaster structured data extraction using Google's Gemini models with Instructor. This comprehensive tutorial covers Gemini Pro, Flash, and Ultra models, including multimodal capabilities for processing text, images, and more.\n\n## Google GenAI SDK\n\nGoogle's GenAI SDK is the recommended way to access Gemini models. It provides a unified interface for both the Gemini API and Vertex AI. This guide shows you how to use Instructor with Google's GenAI SDK for type-safe, validated responses.\n\n```bash\npip install \"instructor[google-genai]\"\n```\n\n## Simple User Example (Sync)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Using from_provider (recommended)\nclient = instructor.from_provider(\n    \"google/gemini-3-flash\",\n)\n\nresp = client.create(\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract Jason is 25 years old.\",\n        }\n    ],\n)\n\nprint(resp)  # User(name='Jason', age=25)\n```\n\n## Simple User Example (Async)\n\n!!! info \"Async Support\"\n\n    Instructor supports async mode for the Google GenAI SDK. If you're using the async client, make sure that your client is declared within the same event loop as the function that calls it. If not you'll get a bunch of errors.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    client = instructor.from_provider(\n        \"google/gemini-3-flash\",\n        async_client=True,\n    )\n\n    user = await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract Jason is 25 years old.\",\n            }\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)  # User(name='Jason', age=25)\n```\n\n## Configuration Options\n\nYou can customize the model's behavior using generation configuration parameters. These parameters control aspects like temperature, token limits, and sampling methods. Pass these parameters as a dictionary to the `generation_config` parameter when creating the response.\n\nThe most common parameters include:\n- `temperature`: Controls randomness in the output (0.0 to 1.0)\n- `max_tokens`: Maximum number of tokens to generate\n- `top_p`: Nucleus sampling parameter\n- `top_k`: Number of highest probability tokens to consider\n\nFor more details on configuration options, see [Google's documentation on Gemini configuration parameters](https://cloud.google.com/vertex-ai/generative-ai/docs/samples/generativeaionvertexai-gemini-pro-config-example){target=\"_blank\"}.\n\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\n    \"google/gemini-3-flash\",\n    mode=instructor.Mode.JSON,\n)\n\nresp = client.create(\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract Jason is 25 years old.\",\n        },\n    ],\n    generation_config={\n        \"temperature\": 0.5,\n        \"max_tokens\": 1000,\n        \"top_p\": 1,\n        \"top_k\": 32,\n    },\n)\n\nprint(resp)\n```\n\n## Safety settings with images\n\nGoogle GenAI uses a different set of harm categories for image inputs (for example, `HARM_CATEGORY_IMAGE_HATE`).\n\nWhen your request includes image content, Instructor will:\n\n- Use the image-specific categories in the request config\n- Map thresholds you pass for text categories (like `HARM_CATEGORY_HATE_SPEECH`) to the matching image category (like `HARM_CATEGORY_IMAGE_HATE`)\n\nThis avoids `400 INVALID_ARGUMENT` errors when you combine `safety_settings` with images.\n\n```python\nimport instructor\nfrom google.genai.types import HarmBlockThreshold, HarmCategory\nfrom instructor.processing.multimodal import Image\nfrom pydantic import BaseModel\n\n\nclass Result(BaseModel):\n    summary: str\n\n\nclient = instructor.from_provider(\"google/gemini-3-flash\")\n\nresult = client.create(\n    response_model=Result,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Describe the image in one sentence.\",\n                Image.autodetect(\"path/to/image.png\"),\n            ],\n        }\n    ],\n    # You can still pass text categories. Instructor will map them for image inputs.\n    safety_settings={\n        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n    },\n)\n\nprint(result)\n```\n\n## Nested Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\nclient = instructor.from_provider(\n    \"google/gemini-3-flash\",\n)\n\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\n### Partials\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\n    \"google/gemini-3-flash\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    bio: str\n\n\nuser = client.create_partial(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Create a user profile for Jason and 1 sentence bio, age 25\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user_partial in user:\n    print(user_partial)\n    # > name=None age=None bio=None\n    # > name=None age=25 bio='Jason is a great guy'\n    # > name='Jason' age=25 bio='Jason is a great guy'\n```\n\n### Iterable Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n\nclient = instructor.from_provider(\n    \"google/gemini-3-flash\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract multiple users from text\nusers = client.create_iterable(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract users:\n            1. Jason is 25 years old\n            2. Sarah is 30 years old\n            3. Mike is 28 years old\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nfor user in users:\n    print(user)\n    #> name='Jason' age=25\n    #> name='Sarah' age=30\n    #> name='Mike' age=28\n```\n\n## Known Limitations (as of Nov 12, 2024)\n\nGoogle Gemini has the following known limitations when used with Instructor:\n\n1. **Union Types**: Gemini does not support Union types (except for Optional). Use separate response models or Literal types instead.\n2. **Enum Types**: Gemini returns string values instead of properly typed Enum instances. You may need to manually convert strings to enums after extraction.\n3. **Union Streaming**: Streaming is not supported for Union types with Iterable.\n\nThese limitations are specific to Google Gemini and do not affect other providers like OpenAI or Anthropic. Tests automatically skip these features for Google to prevent failures.\n\n## Instructor Modes\n\nWe provide several modes to make it easy to work with the different response models that Gemini supports:\n\n1. `instructor.Mode.TOOLS` : This uses Gemini's tool calling API to return structured outputs (default)\n2. `instructor.Mode.JSON` : This uses Gemini's JSON schema mode for structured outputs\n\n!!! note \"Backwards Compatibility\"\n\n    Legacy provider-specific modes (for example `Mode.TOOLS`, `Mode.JSON`, `Mode.JSON`, `Mode.TOOLS`) are deprecated. They emit warnings and map to the generic modes.\n\n!!! info \"Mode Selection\"\n    When using `from_provider`, the appropriate mode is automatically selected based on the provider and model capabilities.\n\n## Available Models\n\nGoogle offers several Gemini models:\n\n- Gemini Flash (General purpose)\n- Gemini Pro (Multimodal)\n- Gemini Flash-8b (Coming soon)\n\n## Using Gemini's Multimodal Capabilities\n\nWe've written an extensive list of guides on how to use gemini's multimodal capabilities with instructor.\n\n- [Using Geminin To Extract Travel Video Recomendations](../blog/posts/multimodal-gemini.md)\n- [Parsing PDFs with Gemini](../blog/posts/chat-with-your-pdf-with-gemini.md)\n- [Generating Citations with Gemini](../blog/posts/generating-pdf-citations.md)\n\nStay tuned to the blog for more guides on using Gemini with instructor.\n\n## Related Resources\n\n- [Google AI Documentation](https://ai.google.dev/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Migration from google-generativeai\n\nIf you're currently using the legacy `google-generativeai` package with Instructor, here's how to migrate:\n\n### Old Way (Deprecated)\n```python\nimport instructor\nimport google.generativeai as genai\n\nclient = instructor.from_provider(\n    \"google/gemini-2.5-flash\",\n    mode=instructor.Mode.JSON,\n)\n```\n\n### New Way (Recommended)\n```python\nimport instructor\n\n# Option 1: Using from_provider (recommended)\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n\n# Option 2: Using from_genai directly (legacy/advanced)\nfrom google import genai\nfrom instructor import from_genai\n\nclient = from_genai(genai.Client())\n```\n\n### Vertex AI Migration\n\nFor Vertex AI users, the migration is similar:\n\n#### Old Way (Deprecated)\n```python\nimport instructor\nimport vertexai\nfrom vertexai.generative_models import GenerativeModel\n\nvertexai.init(project=\"your-project\", location=\"us-central1\")\nclient = instructor.from_provider(\"google/gemini-2.5-flash\", vertexai=True),\n    mode=instructor.Mode.TOOLS,\n)\n```\n\n#### New Way (Recommended)\n```python\nimport instructor\n\n# Option 1: Using from_provider (recommended)\nclient = instructor.from_provider(\n    \"vertexai/gemini-3-flash\",\n    project=\"your-project\",\n    location=\"us-central1\"\n)\n\n# Option 2: Using from_genai with vertexai=True (legacy/advanced)\nfrom google import genai\nfrom instructor import from_genai\n\nclient = from_genai(\n    genai.Client(\n        vertexai=True,\n        project=\"your-project\",\n        location=\"us-central1\"\n    )\n)\n```\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with Google's latest API versions. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n"
  },
  {
    "path": "docs/integrations/groq.md",
    "content": "---\ntitle: Structured Outputs with Groq AI and Pydantic\ndescription: Learn how to use Groq AI for structured outputs with Pydantic in Python and enhance API interactions.\n---\n\n# Structured Outputs with Groq AI\n\nThis guide demonstrates how to use Groq AI with Instructor to generate structured outputs. You'll learn how to use Groq's LLM models to create type-safe responses.\n\nyou'll need to sign up for an account and get an API key. You can do that [here](https://console.groq.com/docs/quickstart).\n\n```bash\nexport GROQ_API_KEY=<your-api-key-here>\npip install \"instructor[groq]\"\n```\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [Groq Examples](../examples/groq.md) - Practical Groq examples\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n\n# Groq AI\n\nGroq supports structured outputs with their new `llama-3-groq-70b-8192-tool-use-preview` model.\n\n### Sync Example\n\n```python\nimport os\nfrom groq import Groq\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize with API key\nclient = Groq(api_key=os.getenv(\"GROQ_API_KEY\"))\n\n# Enable instructor patches for Groq client\nclient = instructor.from_provider(\"groq/llama3-8b-8192\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > User(name='Jason', age=25)\n```\n\n### Async Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n# Initialize async client using provider string\nclient = instructor.from_provider(\n    \"groq/llama3-8b-8192\",\n    async_client=True,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n# > User(name='Jason', age=25)\n\n```\n\n### Nested Object\n\n```python\nimport os\nfrom groq import Groq\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize with API key\nclient = Groq(api_key=os.getenv(\"GROQ_API_KEY\"))\n\n# Enable instructor patches for Groq client\nclient = instructor.from_provider(\"groq/llama3-8b-8192\")\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n"
  },
  {
    "path": "docs/integrations/index.md",
    "content": "---\ntitle: \"LLM Provider Integration Tutorials - Instructor\"\ndescription: \"Complete tutorials for integrating Instructor with 15+ LLM providers. Learn structured data extraction with OpenAI, Anthropic Claude, Google Gemini, local models with Ollama, and more.\"\n---\n\n# LLM Provider Integration Tutorials\n\nLearn how to integrate Instructor with various AI model providers. These comprehensive tutorials cover everything from cloud-based services like OpenAI and Anthropic to local open-source models, helping you extract structured outputs from any LLM.\n\n<div class=\"grid cards\" markdown>\n\n- :material-cloud: **Major Cloud Providers**\n\n    Leading AI providers with comprehensive features\n\n    [:octicons-arrow-right-16: OpenAI](./openai.md)          ·\n    [:octicons-arrow-right-16: OpenAI Responses](./openai-responses.md)          ·\n    [:octicons-arrow-right-16: Azure](./azure.md)            ·\n    [:octicons-arrow-right-16: Anthropic](./anthropic.md)    ·\n    [:octicons-arrow-right-16: Google.GenerativeAI](./google.md)          ·\n    [:octicons-arrow-right-16: Vertex AI](./vertex.md)       ·\n    [:octicons-arrow-right-16: AWS Bedrock](./bedrock.md)    ·\n    [:octicons-arrow-right-16: Google.GenAI](./genai.md)     ·\n    [:octicons-arrow-right-16: xAI](./xai.md)\n\n- :material-cloud-outline: **Additional Cloud Providers**\n\n    Other commercial AI providers with specialized offerings\n\n    [:octicons-arrow-right-16: Cohere](./cohere.md)          ·\n    [:octicons-arrow-right-16: Mistral](./mistral.md)        ·\n    [:octicons-arrow-right-16: DeepSeek](./deepseek.md)      ·\n    [:octicons-arrow-right-16: Together AI](./together.md)    ·\n    [:octicons-arrow-right-16: Groq](./groq.md)              ·\n    [:octicons-arrow-right-16: Fireworks](./fireworks.md)    ·\n    [:octicons-arrow-right-16: Cerebras](./cerebras.md)      ·\n    [:octicons-arrow-right-16: Writer](./writer.md)          ·\n    [:octicons-arrow-right-16: Perplexity](./perplexity.md)\n    [:octicons-arrow-right-16: SambaNova](./sambanova.md)\n\n- :material-open-source-initiative: **Open Source**\n\n    Run open-source models locally or in the cloud\n\n    [:octicons-arrow-right-16: Ollama](./ollama.md)                  ·\n    [:octicons-arrow-right-16: llama-cpp-python](./llama-cpp-python.md)\n\n- :material-router-wireless: **Routing**\n\n    Unified interfaces for multiple providers\n\n    [:octicons-arrow-right-16: LiteLLM](./litellm.md)\n    [:octicons-arrow-right-16: OpenRouter](./openrouter.md)\n\n</div>\n\n## Common Features\n\nAll integrations support these core features:\n\n| Feature | Description | Documentation |\n|---------|-------------|---------------|\n| **Model Patching** | Enhance provider clients with structured output capabilities | [Patching](../concepts/patching.md) |\n| **Response Models** | Define expected response schema with Pydantic | [Models](../concepts/models.md) |\n| **Validation** | Ensure responses match your schema definition | [Validation](../concepts/validation.md) |\n| **Streaming** | Stream partial or iterative responses | [Partial](../concepts/partial.md), [Iterable](../concepts/iterable.md) |\n| **Hooks** | Add callbacks for monitoring and debugging | [Hooks](../concepts/hooks.md) |\n\nHowever, each provider has different capabilities and limitations. Refer to the specific provider documentation for details.\n\n## Provider Modes\n\nProviders support different methods for generating structured outputs:\n\n| Mode | Description | Providers |\n|------|-------------|-----------|\n| `TOOLS` | Uses OpenAI-style tools/function calling | OpenAI, Anthropic, Mistral |\n| `PARALLEL_TOOLS` | Multiple simultaneous tool calls | OpenAI |\n| `JSON` | Direct JSON response generation | OpenAI, Gemini, Cohere, GenAI |\n| `MD_JSON` | JSON embedded in markdown | Most providers |\n\nSee the [Modes Comparison](../modes-comparison.md) guide for details.\n\n## Getting Started\n\nThere are two ways to use providers with Instructor:\n\n### 1. Using Provider Initialization (Recommended)\n\nThe simplest way to get started is using the provider initialization:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n\n# Initialize any provider with a simple string\nclient = instructor.from_provider(\"openai/gpt-4\")\n# Or use async client\nasync_client = instructor.from_provider(\"anthropic/claude-3-sonnet\", async_client=True)\n\n# Use the same interface for all providers\nresponse = client.create(\n    response_model=UserInfo,\n    messages=[{\"role\": \"user\", \"content\": \"Your prompt\"}]\n)\n```\n\nSupported provider strings:\n- `openai/model-name`: OpenAI models\n- `anthropic/model-name`: Anthropic models\n- `google/model-name`: Google models\n- `mistral/model-name`: Mistral models\n- `cohere/model-name`: Cohere models\n- `perplexity/model-name`: Perplexity models\n- `groq/model-name`: Groq models\n- `writer/model-name`: Writer models\n- `bedrock/model-name`: AWS Bedrock models\n- `cerebras/model-name`: Cerebras models\n- `fireworks/model-name`: Fireworks models\n- `vertexai/model-name`: Vertex AI models\n- `genai/model-name`: Google GenAI models\n- `ollama/model-name`: Ollama models\n\n### Provider Checklist\n\nUse these example strings with `from_provider` to quickly get started:\n\n- [x] `instructor.from_provider(\"openai/gpt-5-nano\")`\n- [x] `instructor.from_provider(\"anthropic/claude-3-sonnet\")`\n- [x] `instructor.from_provider(\"google/gemini-2.5-flash\")`\n- [x] `instructor.from_provider(\"mistral/mistral-large-latest\")`\n- [x] `instructor.from_provider(\"cohere/command-r\")`\n- [x] `instructor.from_provider(\"perplexity/sonar-small\")`\n- [x] `instructor.from_provider(\"groq/llama3-8b-8192\")`\n- [x] `instructor.from_provider(\"writer/palmyra-x-004\")`\n- [x] `instructor.from_provider(\"bedrock/anthropic.claude-3-sonnet-20240229-v1:0\")`\n- [x] `instructor.from_provider(\"cerebras/llama3.1-70b\")`\n- [x] `instructor.from_provider(\"fireworks/llama-v3-70b-instruct\")`\n- [x] `instructor.from_provider(\"vertexai/gemini-3-flash\")`\n- [x] `instructor.from_provider(\"genai/gemini-3-flash\")`\n- [x] `instructor.from_provider(\"ollama/llama3\")`\n\n### 2. Manual Client Setup\n\nAlternatively, you can manually set up the client:\n\n1. Install the required dependencies:\n   ```bash\n   pip install \"instructor[provider]\"  # e.g., instructor[anthropic]\n   ```\n\n2. Import the provider client and patch it with Instructor:\n   ```python\n   import instructor\n   from provider_package import Client\n\n   client = instructor.from_provider(Client())\n   ```\n\n3. Use the patched client with your Pydantic model:\n   ```python\n   response = client.create(\n       response_model=YourModel,\n       messages=[{\"role\": \"user\", \"content\": \"Your prompt\"}]\n   )\n   ```\n\nFor provider-specific setup and examples, visit each provider's documentation page.\n\n## Need Help?\n\nIf you need assistance with a specific integration:\n\n1. Check the provider-specific documentation\n2. Browse the [examples](../examples/index.md) and [cookbooks](../examples/index.md)\n3. Search existing [GitHub issues](https://github.com/jxnl/instructor/issues)\n4. Join our [Discord community](https://discord.gg/bD9YE9JArw)\n"
  },
  {
    "path": "docs/integrations/litellm.md",
    "content": "---\ntitle: \"Structured outputs with LiteLLM, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with LiteLLM's unified interface. Learn how to generate structured, type-safe outputs across multiple LLM providers.\"\n---\n\n# Structured outputs with LiteLLM, a complete guide w/ instructor\n\nLiteLLM provides a unified interface for multiple LLM providers, making it easy to switch between different models and providers. This guide shows you how to use Instructor with LiteLLM for type-safe, validated responses across various LLM providers.\n\n## Quick Start\n\nInstall Instructor with LiteLLM support:\n\n```bash\npip install \"instructor[litellm]\"\n```\n\n## Simple User Example (Sync)\n\n```python\nfrom litellm import completion\nimport instructor\nfrom pydantic import BaseModel\n\n# Enable instructor patches\nclient = instructor.from_provider(\"litellm/gpt-3.5-turbo\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)  # User(name='Jason', age=25)\n```\n\n## Simple User Example (Async)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nclient = instructor.from_provider(\n    \"litellm/gpt-3.5-turbo\",\n    async_client=True,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)  # User(name='Jason', age=25)\n\n```\n\n## Cost Calculation\n\nIn order to calculate the cost of the response, LiteLLM provides a simple `response_cost` attribute on the response object's `_hidden_params` attribute. This is recorded in their documentation [here](https://docs.litellm.ai/docs/completion/token_usage#6-completion_cost).\n\nHere is a code snippet using instructor to calculate the cost of the response:\n\n```python\nimport instructor\nfrom litellm import completion\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\"litellm/gpt-3.5-turbo\")\ninstructor_resp, raw_completion = client.create_with_completion(\n    max_tokens=1024,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract Jason is 25 years old.\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(raw_completion._hidden_params[\"response_cost\"])\n#> 0.00189\n```\n\n## Related Resources\n\n- [LiteLLM Documentation](https://docs.litellm.ai/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with LiteLLM's latest releases. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n\nNote: Always verify provider-specific features and limitations in their respective documentation before implementation.\n"
  },
  {
    "path": "docs/integrations/llama-cpp-python.md",
    "content": "---\ndraft: False\ndate: 2024-02-12\ntitle: \"Structured outputs with llama-cpp-python, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with llama-cpp-python. Learn how to generate structured, type-safe outputs with llama-cpp-python.\"\nslug: llama-cpp-python\ntags:\n  - patching\nauthors:\n  - jxnl\n---\n\n# Structured outputs with llama-cpp-python, a complete guide w/ instructor\n\nThis guide demonstrates how to use llama-cpp-python with Instructor to generate structured outputs. You'll learn how to use JSON schema mode and speculative decoding to create type-safe responses from local LLMs.\n\nOpen-source LLMS are gaining popularity, and llama-cpp-python has made the `llama-cpp` model available to obtain structured outputs using JSON schema via a mixture of [constrained sampling](https://llama-cpp-python.readthedocs.io/en/latest/#json-schema-mode) and [speculative decoding](https://llama-cpp-python.readthedocs.io/en/latest/#speculative-decoding).\n\nThey also support a [OpenAI compatible client](https://llama-cpp-python.readthedocs.io/en/latest/#openai-compatible-web-server), which can be used to obtain structured output as a in process mechanism to avoid any network dependency.\n\n<!-- more -->\n\n## Patching\n\nInstructor's patch enhances an create call it with the following features:\n\n- `response_model` in `create` calls that returns a pydantic model\n- `max_retries` in `create` calls that retries the call if it fails by using a backoff strategy\n\n!!! note \"Learn More\"\n\n    To learn more, please refer to the [docs](../index.md). To understand the benefits of using Pydantic with Instructor, visit the tips and tricks section of the [why use Pydantic](../why.md) page. If you want to check out examples of using Pydantic with Instructor, visit the [examples](../examples/index.md) page.\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [Ollama Integration](./ollama.md) - Alternative local model setup\n- [Local Classification](../examples/local_classification.md) - Classification with local models\n- [Open Source Models](../examples/open_source.md) - More open-source model examples\n\n# llama-cpp-python\n\nRecently llama-cpp-python added support for structured outputs via JSON schema mode. This is a time-saving alternative to extensive prompt engineering and can be used to obtain structured outputs.\n\nIn this example we'll cover a more advanced use case of JSON_SCHEMA mode to stream out partial models. To learn more [partial streaming](https://github.com/jxnl/instructor/concepts/partial.md) check out partial streaming.\n\n## Quick Start with `from_provider`\n\nIf you run the `llama-cpp-python` server in OpenAI compatible mode, you can use the unified `from_provider` API to patch the client. Simply point the base URL at your local server:\n\n```python\nimport instructor\n\n# Sync client\nclient = instructor.from_provider(\n    \"ollama/openhermes\", base_url=\"http://localhost:8080/v1\"\n)\n\n# Async client\nasync_client = instructor.from_provider(\n    \"ollama/openhermes\", async_client=True, base_url=\"http://localhost:8080/v1\"\n)\n```\n\nYou can then call `chat.completions.create` just like with any other provider.\n\n```python\nimport llama_cpp\nimport instructor\nfrom llama_cpp.llama_speculative import LlamaPromptLookupDecoding\nfrom pydantic import BaseModel\n\n\nllama = llama_cpp.Llama(\n    model_path=\"../../models/OpenHermes-2.5-Mistral-7B-GGUF/openhermes-2.5-mistral-7b.Q4_K_M.gguf\",\n    n_gpu_layers=-1,\n    chat_format=\"chatml\",\n    n_ctx=2048,\n    draft_model=LlamaPromptLookupDecoding(num_pred_tokens=2),\n    logits_all=True,\n    verbose=False,\n)\n\n\ncreate = instructor.patch(\n    create=llama.create_chat_completion_openai_v1,\n    mode=instructor.Mode.JSON_SCHEMA,\n)\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\nuser = create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract `Jason is 30 years old`\",\n        }\n    ],\n    response_model=UserDetail,\n)\n\nprint(user)\n#> name='Jason' age=30\n```\n"
  },
  {
    "path": "docs/integrations/mistral.md",
    "content": "---\ndraft: False\ndate: 2025-03-11\ntitle: \"Structured outputs with Mistral, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with Mistral. Learn how to generate structured, type-safe outputs with Mistral.\"\nslug: mistral\ntags:\n  - patching\nauthors:\n  - shanktt\n  - ivanleomk\n---\n\n# Structured outputs with Mistral, a complete guide w/ instructor\n\nThis guide demonstrates how to use Mistral with Instructor to generate structured outputs. You'll learn how to use function calling with Mistral Large to create type-safe responses.\n\nMistral Large is the flagship model from Mistral AI, supporting 32k context windows and functional calling abilities. Mistral Large's addition of [function calling](https://docs.mistral.ai/guides/function-calling/) makes it possible to obtain structured outputs using JSON schema.\n\n## Quick Start\n\nTo get started with Instructor and Mistral, you'll need to install the required packages:\n\n```bash\npip install \"instructor[mistral]\"\n```\n\n⚠️ **Important**: You must set your Mistral API key by setting it explicitly on the client\n\n```python\nimport os\nfrom mistralai import Mistral\nclient = Mistral(api_key='your-api-key-here')\n```\n\n## Available Modes\n\nInstructor provides two modes for working with Mistral:\n\n1. `instructor.Mode.TOOLS`: Uses Mistral's function calling API to return structured outputs (default)\n2. `instructor.Mode.JSON_SCHEMA`: Uses Mistral's structured output capabilities\n\nTo set the mode for your mistral client, simply use the code snippet below\n\n```python\nimport os\nfrom pydantic import BaseModel\nimport instructor\n\n\n# Initialize with API key\ninstructor_client = instructor.from_provider(\n    \"mistral/mistral-large-latest\",\n    mode=Mode.TOOLS,\n)\n```\n\n## Simple User Example (Sync)\n\n```python\nimport os\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor import Mode\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\n# Initialize the client\ninstructor_client = instructor.from_provider(\n    \"mistral/mistral-large-latest\",\n    mode=Mode.TOOLS,\n)\n\n# Extract a single user\nuser = instructor_client.create(\n    response_model=UserDetails,\n    messages=[{\"role\": \"user\", \"content\": \"Jason is 25 years old\"}],\n    temperature=0,\n)\n\nprint(user)\n# Output: UserDetails(name='Jason', age=25)\n```\n\n## Async Example\n\nFor asynchronous operations, you can use the `use_async=True` parameter when creating the client:\n\n```python\nimport os\nimport asyncio\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor import Mode\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Initialize the async client\ninstructor_client = instructor.from_provider(\n    \"mistral/mistral-large-latest\",\n    async_client=True,\n    mode=Mode.TOOLS,\n)\n\nasync def extract_user():\n    user = await instructor_client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jack is 28 years old.\"}],\n        temperature=0,\n    )\n    return user\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n# Output: User(name='Jack', age=28)\n```\n\n## Nested Example\n\nYou can also work with nested models:\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\nimport os\nimport instructor\nfrom instructor import Mode\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n\n# Initialize the client\ninstructor_client = instructor.from_provider(\n    \"mistral/mistral-large-latest\",\n    mode=Mode.TOOLS,\n)\n\n# Create structured output with nested objects\nuser = instructor_client.create(\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\"}\n    ],\n    temperature=0,\n)\n\nprint(user)\n# Output:\n# User(\n#     name='Jason',\n#     age=25,\n#     addresses=[\n#         Address(street='123 Main St', city='New York', country='USA'),\n#         Address(street='456 Beach Rd', city='Miami', country='USA')\n#     ]\n# )\n```\n\n## Streaming Support\n\nInstructor now supports streaming capabilities with Mistral! You can use both `create_partial` for incremental model building and `create_iterable` for streaming collections.\n\n### Streaming Partial Responses\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom mistralai import Mistral\nfrom instructor.dsl.partial import Partial\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n# Initialize with API key\nclient = Mistral(api_key=os.environ.get(\"MISTRAL_API_KEY\"))\n\n# Enable instructor patches for Mistral client\ninstructor_client = instructor.from_provider(\"mistral/mistral-small\")\n\n# Stream partial responses\nmodel = instructor_client.create(\n    response_model=Partial[UserExtract],\n    stream=True,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Jason Liu is 25 years old\"},\n    ],\n)\n\nfor partial_user in model:\n    print(f\"Received update: {partial_user}\")\n# Output might show:\n# Received update: UserExtract(name='Jason', age=None)\n# Received update: UserExtract(name='Jason Liu', age=None)\n# Received update: UserExtract(name='Jason Liu', age=25)\n```\n\n### Streaming Iterable Collections\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom mistralai import Mistral\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n# Initialize with API key\nclient = Mistral(api_key=os.environ.get(\"MISTRAL_API_KEY\"))\n\n# Enable instructor patches for Mistral client\ninstructor_client = instructor.from_provider(\"mistral/mistral-small\")\n\n# Stream iterable responses\nusers = instructor_client.create_iterable(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Make up two people\"},\n    ],\n)\n\nfor user in users:\n    print(f\"Generated user: {user}\")\n# Output:\n# Generated user: UserExtract(name='Emily Johnson', age=32)\n# Generated user: UserExtract(name='Michael Chen', age=28)\n```\n\n### Async Streaming\n\nYou can also use async versions of both streaming approaches:\n\n```python\nimport asyncio\nfrom pydantic import BaseModel\nimport instructor\nfrom mistralai import Mistral\nfrom instructor.dsl.partial import Partial\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n# Initialize client with async support\nclient = Mistral(api_key=os.environ.get(\"MISTRAL_API_KEY\"))\ninstructor_client = instructor.from_provider(\"mistral/mistral-small\")\n\nasync def stream_partial():\n    model = await instructor_client.create(\n        response_model=Partial[UserExtract],\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Jason Liu is 25 years old\"},\n        ],\n    )\n\n    async for partial_user in model:\n        print(f\"Received update: {partial_user}\")\n\nasync def stream_iterable():\n    users = instructor_client.create_iterable(\n        response_model=UserExtract,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Make up two people\"},\n        ],\n    )\n\n    async for user in users:\n        print(f\"Generated user: {user}\")\n\n# Run async functions\nasyncio.run(stream_partial())\nasyncio.run(stream_iterable())\n```\n\n## Related Resources\n\n- [Mistral AI Documentation](https://docs.mistral.ai/)\n- [Mistral Function Calling Guide](https://docs.mistral.ai/guides/function-calling/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with the latest Mistral API versions and models. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates on Mistral integration features.\n\n## Multimodal\n\nInstructor makes it easy to analyse and extract semantic information from PDFs using Mistral's models. Let's see an example below with the sample PDF above where we'll load it in using our `from_url` method. Note that for now Mistral only supports document URLs.\n\n```\nfrom instructor.processing.multimodal import PDF\nfrom pydantic import BaseModel\nimport instructor\nfrom mistralai import Mistral\nimport os\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\nclient = instructor.from_provider(\"mistral/mistral-small\")\n\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n\nresponse = client.create(\n    response_model=Receipt,\n    max_tokens=1000,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract out the total and line items from the invoice\",\n                PDF.from_url(\n                    url\n                ),  # Also supports PDF.from_path() and PDF.from_base64()\n            ],\n        },\n    ],\n)\n\nprint(response)\n# > Receipt(total=220, items=['English Tea', 'Tofu'])\n```\n"
  },
  {
    "path": "docs/integrations/ollama.md",
    "content": "---\ndraft: False\ndate: 2024-02-08\ntitle: \"Structured outputs with Ollama, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with Ollama. Learn how to generate structured, type-safe outputs with Ollama.\"\nslug: ollama\ntags:\n  - patching\n  - open source\nauthors:\n  - jxnl\n---\n\n# Structured outputs with Ollama, a complete guide w/ instructor\n\nThis guide demonstrates how to use Ollama with Instructor to generate structured outputs. You'll learn how to use JSON schema mode with local LLMs to create type-safe responses.\n\nOpen-source LLMS are gaining popularity, and the release of Ollama's OpenAI compatibility later it has made it possible to obtain structured outputs using JSON schema.\n\nBy the end of this blog post, you will learn how to effectively utilize instructor with ollama. But before we proceed, let's first explore the concept of patching.\n\n<!-- more -->\n\n## Patching\n\nInstructor's patch enhances a openai api it with the following features:\n\n- `response_model` in `create` calls that returns a pydantic model\n- `max_retries` in `create` calls that retries the call if it fails by using a backoff strategy\n- `timeout` parameter for controlling total retry duration (especially important for Ollama)\n\n!!! note \"Learn More\"\n\n    To learn more, please refer to the [docs](../index.md). To understand the benefits of using Pydantic with Instructor, visit the tips and tricks section of the [why use Pydantic](../why.md) page.\n\n## Timeout Handling with Ollama\n\nOllama integration now properly supports timeout parameters to ensure reliable request handling:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n\nclass Character(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\n    \"ollama/llama2\",\n    mode=instructor.Mode.JSON,\n)\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Tell me about Harry Potter\",\n        }\n    ],\n    response_model=Character,\n    max_retries=2,\n    timeout=10.0,  # Total timeout across all retry attempts\n)\n```\n\nThe timeout parameter ensures that:\n\n- **Total timeout control**: Limits the total time spent across all retry attempts, not per individual attempt\n- **Ollama compatibility**: Prevents timeout issues where retries would multiply the total wait time\n- **Predictable behavior**: A 3-second timeout stays 3 seconds total, not 9+ seconds when retrying\n\n!!! tip \"Timeout Best Practices\"\n\n    When using Ollama, especially with larger models, set appropriate timeout values based on your model's response time. The timeout applies to the total retry duration, making response times more predictable.\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [Ollama Examples](../examples/ollama.md) - Practical Ollama examples\n- [Open Source Models](../examples/open_source.md) - More open-source model examples\n- [Local Deployment](../examples/index.md#local-deployment) - Local model deployment guide\n\n# Ollama\n\nStart by downloading [Ollama](https://ollama.ai/download), and then pull a model such as Llama 2 or Mistral.\n\n!!! tip \"Make sure you update your `ollama` to the latest version!\"\n\n```\nollama pull llama2\n```\n\n## Quick Start with Auto Client\n\nYou can use Ollama with Instructor's auto client for a simple setup:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclass Character(BaseModel):\n    name: str\n    age: int\n\n# Simple setup - automatically configured for Ollama\nclient = instructor.from_provider(\"ollama/llama2\")\n\nresp = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Tell me about Harry Potter\"}],\n    response_model=Character,\n)\n```\n\n### Async Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nasync_client = instructor.from_provider(\n    \"ollama/llama2\",\n    async_client=True,\n)\n\nclass Character(BaseModel):\n    name: str\n    age: int\n\nasync def get_character():\n    return await async_client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Tell me about Harry Potter\"}],\n        response_model=Character,\n    )\n\nprint(asyncio.run(get_character()))\n```\n\n### Intelligent Mode Selection\n\nThe auto client automatically selects the best mode based on your model:\n\n- **Function Calling Models** (llama3.1, llama3.2, llama4, mistral-nemo, qwen2.5, etc.): Uses `TOOLS` mode for enhanced function calling support\n- **Other Models**: Uses `JSON` mode for structured output\n\n```python\n# These models automatically use TOOLS mode\nclient = instructor.from_provider(\"ollama/llama3.1\")\nclient = instructor.from_provider(\"ollama/qwen2.5\")\n\n# Other models use JSON mode\nclient = instructor.from_provider(\"ollama/llama2\")\n```\n\nYou can also override the mode manually:\n\n```python\nimport instructor\n\n# Force JSON mode\nclient = instructor.from_provider(\"ollama/llama3.1\", mode=instructor.Mode.JSON)\n\n# Force TOOLS mode\nclient = instructor.from_provider(\"ollama/llama2\", mode=instructor.Mode.TOOLS)\n```\n\n## Manual Setup\n\n```python\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import List\n\nimport instructor\n\n\nclass Character(BaseModel):\n    name: str\n    age: int\n    fact: List[str] = Field(..., description=\"A list of facts about the character\")\n\n\n# enables `response_model` in create call\nclient = instructor.from_provider(\n    \"ollama/llama2\",\n    mode=instructor.Mode.JSON,\n)\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Tell me about the Harry Potter\",\n        }\n    ],\n    response_model=Character,\n)\nprint(resp.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"Harry James Potter\",\n  \"age\": 37,\n  \"fact\": [\n    \"He is the chosen one.\",\n    \"He has a lightning-shaped scar on his forehead.\",\n    \"He is the son of James and Lily Potter.\",\n    \"He attended Hogwarts School of Witchcraft and Wizardry.\",\n    \"He is a skilled wizard and sorcerer.\",\n    \"He fought against Lord Voldemort and his followers.\",\n    \"He has a pet owl named Snowy.\"\n  ]\n}\n\"\"\"\n```\n"
  },
  {
    "path": "docs/integrations/openai-responses.md",
    "content": "---\ntitle: \"OpenAI Responses API Guide\"\ndescription: \"Learn how to use Instructor's new Responses API with OpenAI models for structured outputs. Complete guide with examples and best practices.\"\n---\n\n# OpenAI Responses API Guide\n\nThe Responses API provides a more streamlined way to work with OpenAI models through Instructor. This guide covers everything you need to know about using the new Responses API for type-safe, validated outputs.\n\n## Quick Start\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize the client\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\", mode=instructor.Mode.RESPONSES_TOOLS\n)\n\n\n# Define your response model\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nprofile = client.responses.create(\n    input=\"Extract out Ivan is 28 years old\",\n    response_model=User,\n)\n\nprint(profile)\n#> name='Ivan' age=28\n```\n\n## Response Modes\n\nThe Responses API supports two main modes:\n\n1. `instructor.Mode.RESPONSES_TOOLS`: Standard mode for structured outputs\n2. `instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS`: Enhanced mode that includes built-in tools like web search and file search\n\n```python\n# Initialize the client\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\", mode=instructor.Mode.RESPONSES_TOOLS\n)\n```\n\n## Core Methods\n\nThe Responses API provides several methods for creating structured outputs. Here's how to use each one:\n\n### Basic Creation\n\nThe `create` method is the simplest way to get a structured output:\n\n=== \"Sync\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS\n    )\n\n    profile = client.responses.create(\n        input=\"Extract: Jason is 25 years old\",\n        response_model=User,\n    )\n    print(profile)  # User(name='Jason', age=25)\n    ```\n\n=== \"Async\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n    import asyncio\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS,\n        async_client=True\n    )\n\n    async def main():\n        profile = await client.responses.create(\n            input=\"Extract: Jason is 25 years old\",\n            response_model=User,\n        )\n        print(profile)  # User(name='Jason', age=25)\n\n    asyncio.run(main())\n    ```\n\n### Create with Completion\n\nIf you need the original completion object from OpenAI, you can do so with the `create_with_completion` method. This is useful when you have specific methods and data that you need to work from.\n\n=== \"Sync\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS\n    )\n\n    response, completion = client.responses.create_with_completion(\n        input=\"Extract: Jason is 25 years old\",\n        response_model=User,\n    )\n    print(response)  # User(name='Jason', age=25)\n    print(completion)  # Raw completion object\n    ```\n\n=== \"Async\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n    import asyncio\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS,\n        async_client=True\n    )\n\n    async def main():\n        response, completion = await client.responses.create_with_completion(\n            input=\"Extract: Jason is 25 years old\",\n            response_model=User,\n        )\n        print(response)  # User(name='Jason', age=25)\n        print(completion)  # Raw completion object\n\n    asyncio.run(main())\n    ```\n\n### Iterable Creation\n\nIf you're interested in extracting multiple instances of the same object, we provide a convinient wrapper to be able to do so.\n\n=== \"Sync\"\n\n    ```python\n    from pydantic import BaseModel\n    from typing import Iterable\n    import instructor\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS\n    )\n\n    profiles = client.responses.create(\n        input=\"Generate three fake profiles\",\n        response_model=Iterable[User],\n    )\n\n    for profile in profiles:\n        print(profile)\n\n    ```\n\n=== \"Async\"\n\n    ```python\n    from pydantic import BaseModel\n    from typing import Iterable\n    import instructor\n    import asyncio\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS,\n        async_client=True\n    )\n\n    async def main():\n        profiles = await client.responses.create_iterable(\n            input=\"Generate three fake profiles\",\n            response_model=User,\n        )\n\n        async for profile in profiles:\n            print(profile)\n\n    asyncio.run(main())\n    ```\n\n### Partial Creation\n\nWe also provide validated outputs that you can stream in real time. This is incredibly useful for working with dynamic generative UI.\n\n=== \"Sync\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS\n    )\n\n    resp = client.responses.create_partial(\n        input=\"Generate a fake profile\",\n        response_model=User,\n    )\n\n    for user in resp:\n        print(user)  # Will show partial updates as they come in\n    ```\n\n=== \"Async\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n    import asyncio\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS,\n        async_client=True\n    )\n\n    async def main():\n        resp = client.responses.create_partial(\n            input=\"Generate a fake profile\",\n            response_model=User,\n        )\n\n        async for user in resp:\n            print(user)  # Will show partial updates as they come in\n\n    asyncio.run(main())\n    ```\n\n## Built-In Tools\n\nThe Responses API comes with powerful built-in tools that enhance the model's capabilities. These tools are managed by OpenAI, so you don't need to implement any additional code to use them.\n\nFor the most up-to-date documentation on how to use these tools, please refer to the [OpenAI Documentation](https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses)\n\n### Web Search\n\nThe web search tool allows models to search the internet for real-time information. This is particularly useful for getting up-to-date information or verifying facts.\n\nModel responses that use the web search tool will include two parts:\n\n- A web_search_call output item with the ID of the search call.\n- A message output item containing:\n    1. The text result in message.content[0].text\n    2. Annotations message.content[0].annotations for the cited URLs\n\nBy default, the model's response will include inline citations for URLs found in the web search results.\n\nIn addition to this, the url_citation annotation object will contain the URL, title and location of the cited source. You can extract this information using the `create_with_completion` method.\n\n=== \"Sync\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n\n\n    class Citation(BaseModel):\n        id: int\n        url: str\n\n\n    class Summary(BaseModel):\n        citations: list[Citation]\n        summary: str\n\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        async_client=False,\n    )\n\n    response, completion = client.responses.create_with_completion(\n        input=\"What are some of the best places to visit in New York for Latin American food?\",\n        tools=[{\"type\": \"web_search_preview\"}],\n        response_model=Summary,\n    )\n\n    print(response)\n    # > citations=[Citation(id=1,url=....)]\n    # > summary = New York City offers a rich variety of ...\n    ```\n\n=== \"Async\"\n\n    ```python\n    from pydantic import BaseModel\n    import instructor\n    import asyncio\n\n\n    class Citation(BaseModel):\n        id: int\n        url: str\n\n\n    class Summary(BaseModel):\n        citations: list[Citation]\n        summary: str\n\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        async_client=True,\n    )\n\n\n    async def main():\n        response = await client.responses.create(\n            input=\"What are some of the best places to visit in New York for Latin American food?\",\n            tools=[{\"type\": \"web_search_preview\"}],\n            response_model=Summary,\n        )\n        print(response)\n\n\n    asyncio.run(main())\n    # > citations=[Citation(id=1,url=....)]\n    # > summary = New York City offers a rich variety of ...\n    ```\n\nYou can customize the web search behavior with additional parameters:\n\n```python\nresponse = client.responses.create(\n    input=\"What are the best restaurants around Granary Square?\",\n    tools=[{\n        \"type\": \"web_search_preview\",\n        \"user_location\": {\n            \"type\": \"approximate\",\n            \"country\": \"GB\",\n            \"city\": \"London\",\n            \"region\": \"London\",\n        }\n    }],\n    response_model=Summary,\n)\n```\n\n### File Search\n\nThe file search tool enables models to retrieve information from your knowledge base through semantic and keyword search. This is useful for augmenting the model's knowledge with your own documents.\n\nThis makes it easy to build RAG applications out of the box\n\n=== \"Sync\"\n    ```python\n    from pydantic import BaseModel\n    import instructor\n\n    class Citation(BaseModel):\n        file_id: int\n        file_name: str\n        excerpt: str\n\n    class Response(BaseModel):\n        citations: list[Citation]\n        response: str\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS\n    )\n\n    response = client.responses.create(\n        input=\"How much does the Kyoto itinerary cost?\",\n        tools=[{\n            \"type\": \"file_search\",\n            \"vector_store_ids\": [\"your_vector_store_id\"],\n            \"max_num_results\": 2,\n        }],\n        response_model=Response,\n    )\n    ```\n\n=== \"Async\"\n    ```python\n    from pydantic import BaseModel\n    import instructor\n    import asyncio\n\n    class Citation(BaseModel):\n        file_id: int\n        file_name: str\n        excerpt: str\n\n    class Response(BaseModel):\n        citations: list[Citation]\n        response: str\n\n    client = instructor.from_provider(\n        \"openai/gpt-4.1-mini\",\n        mode=instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        async_client=True\n    )\n\n    async def main():\n        response = await client.responses.create(\n            input=\"How much does the Kyoto itinerary cost?\",\n            tools=[{\n                \"type\": \"file_search\",\n                \"vector_store_ids\": [\"your_vector_store_id\"],\n                \"max_num_results\": 2,\n            }],\n            response_model=Response,\n        )\n\n    asyncio.run(main())\n    ```\n\n## Related Resources\n\n- [OpenAI Documentation](https://platform.openai.com/docs)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n"
  },
  {
    "path": "docs/integrations/openai.md",
    "content": "---\ntitle: \"Structured outputs with OpenAI, a complete guide with instructor\"\ndescription: \"Learn how to use Instructor with OpenAI's models for type-safe, structured outputs. Complete guide with examples and best practices for GPT-4 and other OpenAI models.\"\n---\n\n# Structured outputs with OpenAI, a complete guide with instructor\n\nOpenAI is the primary integration for Instructor, offering robust support for structured outputs with GPT-3.5, GPT-4, and future models. This guide covers everything you need to know about using OpenAI with Instructor for type-safe, validated responses.\n\n## Quick Start\n\nInstructor comes with support for OpenAI out of the box, so you don't need to install anything extra.\n\n```bash\npip install \"instructor\"\n```\n\n⚠️ **Important**: You must set your OpenAI API key before using the client. You can do this in two ways:\n\n1. Set the environment variable:\n\n```bash\nexport OPENAI_API_KEY='your-api-key-here'\n```\n\n2. Or provide it directly to the client:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\n    \"openai/gpt-5-nano\",\n    api_key='your-api-key-here',\n)\n```\n\n## Simple User Example (Sync)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize client using provider string\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> User(name='Jason', age=25)\n```\n\n## Simple User Example (Async)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n# Initialize async client using provider string\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n#> User(name='Jason', age=25)\n```\n\n## Responses API Mode\n\nOpenAI now recommends the Responses API for new builds. Instructor exposes this API through two modes so you can keep the same interface while gaining better caching, stateful context, and optional built-in tools. Pass `mode=instructor.Mode.RESPONSES_TOOLS` when you want Instructor to call the Responses API instead of Chat Completions. Use `instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS` if you plan to use OpenAI-managed tools like web search or file search.\n\n```python\nimport asyncio\nfrom pydantic import BaseModel\nimport instructor\n\n\nclass SupportTicket(BaseModel):\n    issue: str\n    priority: str\n\n\nclient = instructor.from_provider(\n    \"openai/gpt-4.1-mini\",\n    mode=instructor.Mode.RESPONSES_TOOLS,\n    async_client=True,\n)\n\n\nasync def create_ticket() -> SupportTicket:\n    return await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Log a high priority bug about failed password resets.\",\n            }\n        ],\n        response_model=SupportTicket,\n    )\n\n\nticket = asyncio.run(create_ticket())\nprint(ticket)\n```\n\nSee the [OpenAI Responses API guide](./openai-responses.md) for a deeper walkthrough that includes built-in tool usage, streaming, and best practices.\n\n## Nested Example\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n\n# Initialize client\nclient = instructor.from_provider(\n    \"openai/gpt-5-nano\",\n    api_key=os.getenv('OPENAI_API_KEY'),\n)\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Multimodal\n\n> We've provided a few different sample files for you to use to test out these new features. All examples below use these files.\n>\n> - (Audio) : A Recording of the Original Gettysburg Address : [gettysburg.wav](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav)\n> - (Image) : An image of some blueberry plants [image.jpg](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg)\n> - (PDF) : A sample PDF file which contains a fake invoice [invoice.pdf](https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf)\n\nInstructor provides a unified, provider-agnostic interface for working with multimodal inputs like images, PDFs, and audio files. With Instructor's multimodal objects, you can easily load media from URLs, local files, or base64 strings using a consistent API that works across different AI providers (OpenAI, Anthropic, Mistral, etc.).\n\nInstructor handles all the provider-specific formatting requirements behind the scenes, ensuring your code remains clean and future-proof as provider APIs evolve.\n\nLet's see how to use the Image, Audio and PDF classes.\n\n### Image\n\n> For a more in-depth walkthrough of the Image component, check out the [docs here](../concepts/multimodal.md)\n\nInstructor makes it easy to analyse and extract semantic information from images using OpenAI's GPT-4o models. [Click here](https://platform.openai.com/docs/models) to check if the model you'd like to use has vision capabilities.\n\nLet's see an example below with the sample image above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path` and the `from_base64` class methods.\n\n```python\nfrom instructor.processing.multimodal import Image\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\n\nclass ImageDescription(BaseModel):\n    objects: list[str] = Field(..., description=\"The objects in the image\")\n    scene: str = Field(..., description=\"The scene of the image\")\n    colors: list[str] = Field(..., description=\"The colors in the image\")\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n# Multiple ways to load an image:\nresponse = client.create(\n    response_model=ImageDescription,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"What is in this image?\",\n                # Option 1: Direct URL with autodetection\n                Image.from_url(url),\n                # Option 2: Local file\n                # Image.from_path(\"path/to/local/image.jpg\")\n                # Option 3: Base64 string\n                # Image.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # Image.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nprint(response)\n# Example output:\n# ImageDescription(\n#     objects=['blueberries', 'leaves'],\n#     scene='A blueberry bush with clusters of ripe blueberries and some unripe ones against a cloudy sky',\n#     colors=['green', 'blue', 'purple', 'white']\n# )\n```\n\n### PDF\n\nInstructor makes it easy to analyse and extract semantic information from PDFs using OpenAI's GPT-4o models.\n\nLet's see an example below with the sample PDF above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path` and the `from_base64` class methods.\n\n```python\nfrom instructor.processing.multimodal import PDF\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n# Multiple ways to load an PDF:\nresponse = client.create(\n    response_model=Receipt,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract out the total and line items from the invoice\",\n                # Option 1: Direct URL\n                PDF.from_url(url),\n                # Option 2: Local file\n                # PDF.from_path(\"path/to/local/invoice.pdf\"),\n                # Option 3: Base64 string\n                # PDF.from_base64(\"base64_encoded_string_here\")\n                # Option 4: Autodetect\n                # PDF.autodetect(<url|path|base64>)\n            ],\n        },\n    ],\n)\n\nprint(response)\n# > Receipt(total=220, items=['English Tea', 'Tofu'])\n```\n\n### Audio\n\nInstructor makes it easy to analyse and extract semantic information from Audio files using OpenAI's GPT-4o models. Let's see an example below with the sample Audio file above where we'll load it in using our `from_url` method.\n\nNote that we support local files and base64 strings too with the `from_path`\n\n```python\nfrom instructor.processing.multimodal import Audio\nfrom pydantic import BaseModel\nimport instructor\nfrom openai import OpenAI\n\n\nclass AudioDescription(BaseModel):\n    transcript: str\n    summary: str\n    speakers: list[str]\n    key_points: list[str]\n\n\nurl = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav\"\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nresponse = client.create(\n    response_model=AudioDescription,\n    modalities=[\"text\"],\n    audio={\"voice\": \"alloy\", \"format\": \"wav\"},\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Please transcribe and analyze this audio:\",\n                # Multiple loading options:\n                Audio.from_url(url),\n                # Option 2: Local file\n                # Audio.from_path(\"path/to/local/audio.mp3\")\n            ],\n        },\n    ],\n)\n\nprint(response)\n# > transcript='Four score and seven years ago our fathers...\"]\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\n### Partials\n\n```python\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    bio: str\n\n\nuser = client.create_partial(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create a user profile for Jason, age 25\"},\n    ],\n    response_model=User,\n)\n\nfor user_partial in user:\n    print(user_partial)\n\n# > name='Jason' age=None bio='None'\n# > name='Jason' age=25 bio='A tech'\n# > name='Jason' age=25 bio='A tech enthusiast'\n# > name='Jason' age=25 bio='A tech enthusiast who loves coding, gaming, and exploring new'\n# > name='Jason' age=25 bio='A tech enthusiast who loves coding, gaming, and exploring new technologies'\n\n```\n\n### Iterable Example\n\n```python\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Extract multiple users from text\nusers = client.create_iterable(\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n            Extract users:\n            1. Jason is 25 years old\n            2. Sarah is 30 years old\n            3. Mike is 28 years old\n        \"\"\"},\n    ],\n    response_model=User,\n)\n\nfor user in users:\n    print(user)\n    #> name='Jason' age=25\n    #> name='Sarah' age=30\n    #> name='Mike' age=28\n```\n\n## Instructor Modes\n\nWe provide several modes to make it easy to work with the different response models that OpenAI supports\n\n1. `instructor.Mode.RESPONSES_TOOLS` : Calls the OpenAI Responses API while keeping Instructor's familiar API. Best for new builds that want lower latency, better caching, and the new stateful context features.\n2. `instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS` : Same as above, but automatically enables OpenAI's built-in tools (web search, file search, etc.) inside the Responses API.\n3. `instructor.Mode.TOOLS` : This uses the [tool calling API](https://platform.openai.com/docs/guides/function-calling) to return structured outputs to the client.\n4. `instructor.Mode.JSON` : This forces the model to return JSON by using [OpenAI's JSON mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode).\n5. `instructor.Mode.FUNCTIONS` : This uses OpenAI's function calling API to return structured outputs and will be deprecated in the future.\n6. `instructor.Mode.PARALLEL_TOOLS` : This uses the [parallel tool calling API](https://platform.openai.com/docs/guides/function-calling#configuring-parallel-function-calling) to return structured outputs to the client. This allows the model to generate multiple calls in a single response.\n7. `instructor.Mode.MD_JSON` : This makes a simple call to the OpenAI chat completion API and parses the raw response as JSON.\n8. `instructor.Mode.TOOLS_STRICT` : This uses the new Open AI structured outputs API to return structured outputs to the client using constrained grammar sampling. This restricts users to a subset of the JSON schema.\n9. `instructor.Mode.JSON_O1` : This is a mode for the `O1` model. We created a new mode because `O1` doesn't support any system messages, tool calling or streaming so you need to use this mode to use Instructor with `O1`.\n\nIn general, choose `Mode.RESPONSES_TOOLS` (or the built-in tools variant) when you're targeting the Responses API, and stick with `Mode.TOOLS` for classic Chat Completions integrations. Both modes keep schema handling identical, so switching between them is a single-line change.\n\n## Batch API\n\nWe also support batching requests using the `create_batch` method. This is helpful if your request is not time sensitive because you'll get a 50% discount on the token cost.\n\nRead more about how to use it [here](../examples/batch_job_oai.md)\n\n## Best Practices\n\n1. **Model Selection** : We recommend using gpt-4o-mini for simpler use cases because it's cheap and works well with a clearly defined objective for structured outputs. When the task is more ambigious, consider upgrading to `4o` or even `O1` depending on your needs\n\n2. **Performance Optimization** : Streaming a response model is faster and should be done from the get-go. This is especially true if you're using a simple response model.\n\n## Common Use Cases\n\n- Data Extraction\n- Form Parsing\n- API Response Structuring\n- Document Analysis\n- Configuration Generation\n\n## Related Resources\n\n- [OpenAI Documentation](https://platform.openai.com/docs)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n- [OpenAI Responses API Guide](./openai-responses.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with the latest OpenAI API versions and models. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n"
  },
  {
    "path": "docs/integrations/openrouter.md",
    "content": "---\ntitle: \"Structured outputs with OpenRouter, a complete guide with instructor\"\ndescription: \"Learn how to use Instructor with OpenRouter to access multiple LLM providers through a unified API. Get type-safe, structured outputs from various models including Qwen, Gemini, Mistral, and Cohere.\"\n---\n\n# Structured outputs with OpenRouter, a complete guide with instructor\n\nOpenRouter provides a unified API to access multiple LLM providers, allowing you to easily switch between different models. This guide shows you how to use Instructor with OpenRouter for type-safe, validated responses across various LLM providers.\n\nTo set Provider specific configuration on the `openai` client, make sure to use the `extra_body` kwarg.\n\n## Quick Start\n\n⚠️ **Important**: Make sure that the model you're using has support for `Tool Calling` and/or `Structured Outputs` in the [OpenRouter models listing](https://openrouter.ai/models)\n\nInstructor works with OpenRouter through the OpenAI client, so you don't need to install anything extra beyond the base package.\n\n## Simple User Example (Sync)\n\nWe support simple tool calling with this\n\n```python\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\n    \"openrouter/google/gemini-2.0-flash-lite-001\",\n    base_url=\"https://openrouter.ai/api/v1\",\n    async_client=False\n)\n\nresp = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Ivan is 28 years old\",\n        },\n    ],\n    response_model=User,\n    extra_body={\"provider\": {\"require_parameters\": True}},\n)\n\nprint(resp)\n#> name='Ivan' age=20\n```\n\n## Simple User Example ( Async )\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\n    \"openrouter/google/gemini-2.0-flash-lite-001\",\n    async_client=True,\n)\n\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n        extra_body={\"provider\": {\"require_parameters\": True}},\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n```\n\n## Nested Object Example ( Sync )\n\n```python\nfrom pydantic import BaseModel\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Initialize with API key\n# Initialize client with base URL\nclient = instructor.from_provider(\n    \"openrouter/google/gemini-2.0-flash-lite-001\",\n    base_url=\"https://openrouter.ai/api/v1\",\n    async_client=False\n)\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    extra_body={\"provider\": {\"require_parameters\": True}},\n    response_model=User,\n)\n\nprint(user)\n#> name='Jason' age=25 addresses=[Address(street='123 Main St', city='New York', country='USA'), Address(street='456 Beach Rd', city='Miami', country='USA')]\n```\n\n## Structured Outputs (Sync)\n\n⚠️ **Important**: Check that your chosen model supports `Structured Outputs` in the [OpenRouter models listing](https://openrouter.ai/models). Structured Outputs is a subset of Tool Calling that constrains the model's output to match your schema in order to produce valid JSON Schema.\n\nInstructor also supports Structured Outputs with OpenRouter as documented in their API [here](https://openrouter.ai/docs/features/structured-outputs). Note that the following User model will throw an error if we use the OpenAI GPT-4o model like `openai/gpt-4o-2024-11-20` because OpenAI does not support using a regex pattern as part of their structured output schema.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom openai import OpenAI\nimport instructor\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    phone_number: str = Field(\n        pattern=r\"^\\+?1?\\s*\\(?(\\d{3})\\)?[-.\\s]*(\\d{3})[-.\\s]*(\\d{4})$\"\n    )\n\n\n# Initialize with API key\n# Initialize client with base URL\nclient = instructor.from_provider(\n    \"openrouter/google/gemini-2.0-flash-lite-001\",\n    base_url=\"https://openrouter.ai/api/v1\",\n    async_client=False\n)\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old and his number is 1-212-456-7890\n        \"\"\",\n        },\n    ],\n    response_model=User,\n    extra_body={\"provider\": {\"require_parameters\": True}},\n)\n\nprint(user)\n# > name='Jason' age=25 phone_number='+1 (212) 456-7890'\n```\n\n## JSON Mode\n\nIn the event that your model doesn't support tool calling, you will see the following error when you try to use `mode.TOOLS`\n\n> instructor.exceptions.InstructorRetryException: Error code: 404 - {'error': {'message': 'No endpoints found that support tool use. To learn more about provider routing, visit: https://openrouter.ai/docs/provider-routing', 'code': 404}}\n\nIn this case, we recommend using the `JSON` mode instead as seen below.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom openai import OpenAI\nimport instructor\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    phone_number: str = Field(\n        pattern=r\"^\\+?1?\\s*\\(?(\\d{3})\\)?[-.\\s]*(\\d{3})[-.\\s]*(\\d{4})$\"\n    )\n\n\n# Initialize with API key\n# Initialize client with base URL\nclient = instructor.from_provider(\n    \"openrouter/google/gemini-2.0-flash-lite-001\",\n    base_url=\"https://openrouter.ai/api/v1\",\n    async_client=False\n)\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old and his number is 1-212-456-7890\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n```\n\n## Streaming\n\nYou can also use streaming with as seen below using the `create_partial` method. While we're using JSON mode here, this should work with tool calling and structured outputs too.\n\n```python\nfrom pydantic import BaseModel, Field\nfrom openai import OpenAI\nimport instructor\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Initialize with API key\n# Initialize client with base URL\nclient = instructor.from_provider(\n    \"openrouter/google/gemini-2.0-flash-lite-001\",\n    base_url=\"https://openrouter.ai/api/v1\",\n)\n\n# Create structured output with nested objects\nuser = client.create_partial(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old and his number is 1-212-456-7890\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nfor chunk in user:\n    print(chunk)\n    # > name=None age=None\n    # > name='Jason' age=None\n    # > name='Jason' age=25\n```\n"
  },
  {
    "path": "docs/integrations/perplexity.md",
    "content": "---\ntitle: Structured Outputs with Perplexity AI and Pydantic\ndescription: Learn how to use Perplexity AI with Instructor for structured JSON outputs using Pydantic models. Create type-safe, validated responses from Perplexity's Sonar models with Python.\n---\n\n# Structured Outputs with Perplexity AI\n\nThis guide demonstrates how to use Perplexity AI with Instructor to generate structured outputs. You'll learn how to use Perplexity's Sonar models with Pydantic to create type-safe, validated responses.\n\n## Prerequisites\n\nYou'll need to sign up for a Perplexity account and get an API key. You can do that [here](https://www.perplexity.ai/).\n\n```bash\nexport PERPLEXITY_API_KEY=<your-api-key-here>\npip install \"instructor[perplexity]\"\n```\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n- [Search Examples](../examples/search.md) - Search query processing examples\n\n# Perplexity AI\n\nPerplexity AI provides access to powerful language models through their API. Instructor supports structured outputs with Perplexity's models using the OpenAI-compatible API.\n\n### Sync Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\n    \"perplexity/sonar-small-online\",\n    api_key=os.getenv(\"PERPLEXITY_API_KEY\"),\n    base_url=\"https://api.perplexity.ai\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > User(name='Jason', age=25)\n```\n\n### Async Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nasync_client = instructor.from_provider(\n    \"perplexity/sonar-small-online\",\n    async_client=True,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n# > User(name='Jason', age=25)\n```\n\n### Nested Objects\n\n```python\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom pydantic import BaseModel\n\n# Initialize with API key\nclient = instructor.from_provider(\n    \"perplexity/sonar-small-online\",\n    api_key=os.getenv(\"PERPLEXITY_API_KEY\"),\n    base_url=\"https://api.perplexity.ai\",\n)\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\n\nprint(user)\n#> User(\n#>     name='Jason',\n#>     age=25,\n#>     addresses=[\n#>         Address(street='123 Main St', city='New York', country='USA'),\n#>         Address(street='456 Beach Rd', city='Miami', country='USA')\n#>     ]\n#> )\n```\n\n## Supported Modes\n\nPerplexity AI currently supports the following mode with Instructor:\n\n- `PERPLEXITY_JSON`: Direct JSON response generation\n\n```python\nimport os\nfrom openai import OpenAI\nimport instructor\nfrom instructor import Mode\nfrom pydantic import BaseModel\n\n# Initialize client with base URL\nclient = instructor.from_provider(\n    \"perplexity/sonar-small-online\",\n    api_key=os.getenv(\"PERPLEXITY_API_KEY\"),\n    base_url=\"https://api.perplexity.ai\",\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Create structured output\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > User(name='Jason', age=25)\n```\n\n## Additional Resources\n\n- [Perplexity API Documentation](https://docs.perplexity.ai/)\n- [Perplexity API Reference](https://docs.perplexity.ai/reference/post_chat_completions)"
  },
  {
    "path": "docs/integrations/sambanova.md",
    "content": "---\ntitle: SambaNova\ndescription: Use Instructor with SambaNova's LLM API for structured outputs.\n---\n\n## See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n- [Enterprise Integration](../examples/index.md#enterprise-integration) - More enterprise examples\n\n# SambaNova Integration\n\nInstructor supports SambaNova's LLM API, allowing you to use structured outputs with their models.\n\n## Installation\n\n```bash\npip install \"instructor[openai]\"\n```\n\n## Basic Usage\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"sambanova/Meta-Llama-3.1-405B-Instruct\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nuser = client.create(\n    messages=[\n        {\"role\": \"user\", \"content\": \"Ivan is 28\"},\n    ],\n    response_model=User,\n)\n\nprint(user)\n# > User(name='Ivan', age=28)\n```\n\n## Async Usage\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\n    \"sambanova/Meta-Llama-3.1-405B-Instruct\",\n    async_client=True,\n)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def get_user():\n    user = await client.create(\n        messages=[\n            {\"role\": \"user\", \"content\": \"Ivan is 28\"},\n        ],\n        response_model=User,\n    )\n    return user\n\n# Run with asyncio\nimport asyncio\nuser = asyncio.run(get_user())\nprint(user)\n# > User(name='Ivan', age=28)\n```\n\n## Available Models\n\nCheck the [SambaNova documentation](https://docs.sambanova.ai/cloud/docs/get-started/supported-models) for the latest model offerings and capabilities.\n"
  },
  {
    "path": "docs/integrations/together.md",
    "content": "---\ndraft: False\ndate: 2024-01-27\nslug: together\ntitle: \"Structured outputs with Together AI, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with Together AI. Learn how to generate structured, type-safe outputs with Together AI.\"\ntags:\n  - patching\n  - open source\nauthors:\n  - jxnl\n---\n\n# Structured outputs with Together AI, a complete guide with instructor\n\nThis guide demonstrates how to use Together AI with Instructor to generate structured outputs. You'll learn how to use function calling with Together's models to create type-safe responses.\n\nOpen-source LLMS are gaining popularity, and with the release of Together's Function calling models, its been easier than ever to get structured outputs.\n\nBy the end of this blog post, you will learn how to effectively utilize instructor with Together AI. But before we proceed, let's first explore the concept of patching.\n\n!!! note \"Other Languages\"\n\n    This blog post is written in Python, but the concepts are applicable to other languages as well, as we currently have support for [Javascript](https://instructor-ai.github.io/instructor-js), [Elixir](https://hexdocs.pm/instructor/Instructor.html) and [PHP](https://github.com/cognesy/instructor-php/).\n\n<!-- more -->\n\n## Patching\n\nInstructor's patch enhances the openai api it with the following features:\n\n- `response_model` in `create` calls that returns a pydantic model\n- `max_retries` in `create` calls that retries the call if it fails by using a backoff strategy\n\n!!! note \"Learn More\"\n\n    To learn more, please refer to the [docs](../index.md). To understand the benefits of using Pydantic with Instructor, visit the tips and tricks section of the [why use Pydantic](../why.md) page.\n\n### See Also\n\n- [Getting Started](../getting-started.md) - Quick start guide\n- [from_provider Guide](../concepts/from_provider.md) - Detailed client configuration\n- [Provider Examples](../index.md#provider-examples) - Quick examples for all providers\n- [Open Source Models](../examples/open_source.md) - More open-source model examples\n\n# Together AI\n\nThe good news is that Together employs the same OpenAI client, and its models support some of these output modes too!\n\n!!! note \"Getting access\"\n\n    If you want to try this out for yourself check out the [Together AI](https://www.together.ai/) website. You can get started [here](http://api.together.ai/).\n\n```python\nimport os\nfrom pydantic import BaseModel\nimport instructor\n\nclient = instructor.from_provider(\n    \"together/Mixtral-8x7B-Instruct-v0.1\",\n    api_key=os.environ[\"TOGETHER_API_KEY\"],\n    base_url=\"https://api.together.xyz/v1\",\n)\n\n# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.create methods to support the response_model parameter\n\n\n# Now, we can use the response_model parameter using only a base model\n# rather than having to use the OpenAISchema class\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.create(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nassert isinstance(user, UserExtract), \"Should be instance of UserExtract\"\nassert user.name.lower() == \"jason\"\nassert user.age == 25\n\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"jason\",\n  \"age\": 25\n}\n\"\"\"\n{\n    \"name\": \"Jason\",\n    \"age\": 25,\n}\n```\n\n### Async Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport os\nimport asyncio\n\nasync_client = instructor.from_provider(\n    \"together/Mixtral-8x7B-Instruct-v0.1\",\n    async_client=True,\n    api_key=os.environ[\"TOGETHER_API_KEY\"],\n    base_url=\"https://api.together.xyz/v1\",\n)\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\nasync def extract_user():\n    return await async_client.create(\n        response_model=UserExtract,\n        messages=[{\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"}],\n    )\n\nprint(asyncio.run(extract_user()))\n```\n\nYou can find more information about Together's function calling support [here](https://docs.together.ai/docs/function-calling).\n"
  },
  {
    "path": "docs/integrations/truefoundry.md",
    "content": "---\ntitle: \"TrueFoundry\"\n---\n\nThis guide provides instructions for integrating Instructor with the [TrueFoundry AI Gateway](https://www.truefoundry.com/ai-gateway) for structured data extraction from LLMs.\n\n## What is TrueFoundry?\n\nTrueFoundry provides an enterprise-ready [AI Gateway](https://www.truefoundry.com/ai-gateway) and integrates seamlessly with libraries like instructor, providing enterprise-grade AI features including cost tracking, security guardrails, and access controls.\n\n## Prerequisites\n\nBefore integrating Instructor with TrueFoundry, ensure you have:\n\n1. **TrueFoundry Account**: Create a [TrueFoundry account](https://www.truefoundry.com/register) with at least one model provider and generate a Personal Access Token by following the instructions in [Generating Tokens](https://docs.truefoundry.com/gateway/authentication). For a quick setup guide, see our [Gateway Quick Start](https://docs.truefoundry.com/gateway/quick-start)\n2. **Instructor Installation**: Install Instructor using pip: `pip install instructor`\n3. **OpenAI Library**: Install the OpenAI Python library: `pip install openai`\n4. **Pydantic**: Install Pydantic for data validation: `pip install pydantic`\n\n## Setup Process\n\n### Step 1: Install Dependencies\n\n```bash\npip install instructor openai pydantic\n```\n\n### Step 2: Configure Instructor with TrueFoundry Gateway\n\nGet your TrueFoundry Gateway API key, base URL, and model name from the unified code snippet in your TrueFoundry playground:\n\n<Frame>\n  <img src=\"../img/new-code-snippet.png\" />\n</Frame>\n\nHere's how to configure Instructor to use TrueFoundry's AI Gateway:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom openai import OpenAI\n\n# Configure OpenAI client to use TrueFoundry Gateway\nclient = OpenAI(\n    api_key=\"your-truefoundry-api-key\",  # Your TrueFoundry Personal Access Token\n    base_url=\"your-truefoundry-base-url\",  # Your TrueFoundry Gateway URL\n)\n\n# Patch the client with Instructor\ninstructor_client = instructor.from_provider(\"openai/gpt-4o\")\n\n# Define your Pydantic model for structured output\nclass User(BaseModel):\n    name: str\n    age: int\n    email: str\n\n# Extract structured data\nuser_info = instructor_client.create(\n    model=\"openai-main/gpt-4o\",  # Your TrueFoundry model ID\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract user information: John Doe is 30 years old and his email is john@example.com\"}\n    ],\n)\n\nprint(f\"Name: {user_info.name}\")\nprint(f\"Age: {user_info.age}\")\nprint(f\"Email: {user_info.email}\")\n```\n\n## Usage Examples\n\n### Basic Structured Data Extraction\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom openai import OpenAI\n\n# Configure TrueFoundry Gateway\nclient = OpenAI(\n    api_key=\"your-truefoundry-api-key\",\n    base_url=\"your-truefoundry-base-url\",\n)\ninstructor_client = instructor.from_provider(\"openai/gpt-4o\")\n\n# Define response structure\nclass ProductInfo(BaseModel):\n    name: str\n    price: float\n    category: str\n    in_stock: bool\n\n# Extract product information\nproduct = instructor_client.create(\n    model=\"openai-main/gpt-4o\",\n    response_model=ProductInfo,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract product details: The iPhone 15 Pro costs $999, it's in the Electronics category and is currently available in stock.\"}\n    ],\n)\n\nprint(f\"Product: {product.name}\")\nprint(f\"Price: ${product.price}\")\nprint(f\"Category: {product.category}\")\nprint(f\"In Stock: {product.in_stock}\")\n```\n\n### Complex Data Structures with Lists\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import List\nfrom openai import OpenAI\n\n# Configure TrueFoundry Gateway\nclient = OpenAI(\n    api_key=\"your-truefoundry-api-key\",\n    base_url=\"your-truefoundry-base-url\",\n)\ninstructor_client = instructor.from_provider(\"openai/gpt-4o\")\n\nclass Task(BaseModel):\n    title: str\n    description: str\n    priority: str\n    estimated_hours: int\n\nclass ProjectPlan(BaseModel):\n    project_name: str\n    total_duration_weeks: int\n    tasks: List[Task]\n\n# Extract complex project structure\nproject = instructor_client.create(\n    model=\"openai-main/gpt-4o\",\n    response_model=ProjectPlan,\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Create a project plan for building a mobile app:\n        \n        Project: Food Delivery App (8 weeks total)\n        Tasks:\n        1. UI/UX Design - Create user interface mockups and wireframes - High priority - 2 weeks\n        2. Backend Development - Build API and database - High priority - 3 weeks  \n        3. Frontend Development - Build mobile app frontend - Medium priority - 2 weeks\n        4. Testing & QA - Test all features and fix bugs - Medium priority - 1 week\n        \"\"\"}\n    ],\n)\n\nprint(f\"Project: {project.project_name}\")\nprint(f\"Duration: {project.total_duration_weeks} weeks\")\nprint(\"\\nTasks:\")\nfor task in project.tasks:\n    print(f\"- {task.title}: {task.description} ({task.priority} priority, {task.estimated_hours} weeks)\")\n```\n\n\nThat's it! You're now ready to use Instructor with TrueFoundry Gateway for robust, production-ready structured data extraction from LLMs.\n"
  },
  {
    "path": "docs/integrations/vertex.md",
    "content": "---\ntitle: \"Structured outputs with Vertex AI, a complete guide w/ instructor\"\ndescription: \"Complete guide to using Instructor with Google Cloud's Vertex AI. Learn how to generate structured, type-safe outputs with enterprise-grade AI capabilities.\"\n---\n\n# Structured outputs with Vertex AI, a complete guide w/ instructor\n\nGoogle Cloud's Vertex AI provides enterprise-grade AI capabilities with robust scaling and security features. This guide shows you how to use Instructor with Vertex AI for type-safe, validated responses.\n\n!!! warning \"Migration Notice\"\n    The direct `from_vertexai` integration is being deprecated in favor of the unified `google-genai` SDK. \n    Please use `from_provider` or `from_genai` with `vertexai=True` for new projects. \n    See the [migration guide](#migration-to-google-genai) below.\n\n## Quick Start\n\nInstall Instructor with Google GenAI support (which includes Vertex AI):\n\n```bash\npip install \"instructor[google-genai]\"\n```\n\n## Simple User Example (Sync)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport os\n\n# Set your project ID and location\nos.environ[\"GOOGLE_CLOUD_PROJECT\"] = \"your-project-id\"\nos.environ[\"GOOGLE_CLOUD_LOCATION\"] = \"us-central1\"\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Using from_provider (recommended)\nclient = instructor.from_provider(\n    \"vertexai/gemini-3-flash\",\n)\n\nresp = client.create(\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract Jason is 25 years old.\",\n        }\n    ],\n)\n\nprint(resp)\n#> User(name='Jason', age=25)\n```\n\n## Simple User Example (Async)\n\n```python\nimport asyncio\nimport instructor\nimport vertexai  # type: ignore\nfrom vertexai.generative_models import GenerativeModel  # type: ignore\nfrom pydantic import BaseModel\n\nvertexai.init()\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclient = instructor.from_provider(\n    \"vertex_ai/gemini-1.5-pro-preview-0409\",\n    async_client=True,\n    mode=instructor.Mode.TOOLS,\n)\n\nasync def extract_user():\n    user = await client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract Jason is 25 years old.\",\n            }\n        ],\n        response_model=User,\n    )\n    return user\n\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)  # User(name='Jason', age=25)\n```\n\n## Streaming Support\n\nInstructor now supports streaming capabilities with Vertex AI! You can use both `create_partial` for incremental model building and `create_iterable` for streaming collections.\n\n### Streaming Partial Responses\n\n```python\nimport vertexai  # type: ignore\nfrom vertexai.generative_models import GenerativeModel  # type: ignore\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.dsl.partial import Partial\n\nvertexai.init()\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\n    \"vertex_ai/gemini-1.5-pro-preview-0409\",\n    mode=instructor.Mode.TOOLS,\n)\n\n# Stream partial responses\nresponse_stream = client.create(\n    response_model=Partial[UserExtract],\n    stream=True,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Anibal is 23 years old\"},\n    ],\n)\n\nfor partial_user in response_stream:\n    print(f\"Received update: {partial_user}\")\n# Output might show:\n# Received update: UserExtract(name='Anibal', age=None)\n# Received update: UserExtract(name='Anibal', age=23)\n```\n\n### Streaming Iterable Collections\n\n```python\nimport vertexai  # type: ignore\nfrom vertexai.generative_models import GenerativeModel  # type: ignore\nimport instructor\nfrom pydantic import BaseModel\n\nvertexai.init()\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\n    \"vertex_ai/gemini-1.5-pro-preview-0409\",\n    mode=instructor.Mode.TOOLS,\n)\n\n# Stream iterable responses\nresponse_stream = client.create_iterable(\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Make up two people\"},\n    ],\n)\n\nfor user in response_stream:\n    print(f\"Generated user: {user}\")\n# Output:\n# Generated user: UserExtract(name='Sarah Johnson', age=32)\n# Generated user: UserExtract(name='David Chen', age=27)\n```\n\n### Async Streaming\n\nYou can also use async versions of both streaming approaches:\n\n```python\nimport asyncio\nimport vertexai  # type: ignore\nfrom vertexai.generative_models import GenerativeModel  # type: ignore\nimport instructor\nfrom pydantic import BaseModel\nfrom instructor.dsl.partial import Partial\n\nvertexai.init()\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\n    \"vertex_ai/gemini-1.5-pro-preview-0409\",\n    async_client=True,\n    mode=instructor.Mode.TOOLS,\n)\n\nasync def stream_partial():\n    response_stream = await client.create(\n        response_model=Partial[UserExtract],\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Anibal is 23 years old\"},\n        ],\n    )\n\n    async for partial_user in response_stream:\n        print(f\"Received update: {partial_user}\")\n\nasync def stream_iterable():\n    response_stream = client.create_iterable(\n        response_model=UserExtract,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Make up two people\"},\n        ],\n    )\n\n    async for user in response_stream:\n        print(f\"Generated user: {user}\")\n\n# Run async functions\nasyncio.run(stream_partial())\nasyncio.run(stream_iterable())\n```\n\n## Related Resources\n\n- [Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Migration to Google GenAI\n\nThe legacy `from_vertexai` method is being deprecated in favor of the unified Google GenAI SDK. Here's how to migrate:\n\n### Old Way (Deprecated)\n```python\nimport instructor\nimport vertexai\nfrom vertexai.generative_models import GenerativeModel\n\nvertexai.init(project=\"your-project\", location=\"us-central1\")\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\", vertexai=True),\n    mode=instructor.Mode.TOOLS,\n)\n```\n\n### New Way (Recommended)\n```python\nimport instructor\n\n# Option 1: Using from_provider (simplest)\nclient = instructor.from_provider(\n    \"vertexai/gemini-3-flash\",\n    project=\"your-project\",  # Optional if set in environment\n    location=\"us-central1\"   # Optional, defaults to us-central1\n)\n\n# Option 2: Using from_genai with Google GenAI SDK\nfrom google import genai\nfrom instructor import from_genai\n\nclient = from_genai(\n    genai.Client(\n        vertexai=True,\n        project=\"your-project\",\n        location=\"us-central1\",\n        model=\"gemini-3-flash\"\n    )\n)\n```\n\n### Environment Variables\n\nYou can also set these environment variables to avoid passing project/location each time:\n```bash\nexport GOOGLE_CLOUD_PROJECT=\"your-project-id\"\nexport GOOGLE_CLOUD_LOCATION=\"us-central1\"\n```\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with Vertex AI's latest API versions. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n\nStreaming support has been added for both partial responses and iterable collections, with both synchronous and asynchronous interfaces.\n"
  },
  {
    "path": "docs/integrations/writer.md",
    "content": "---\ntitle: Structured Outputs with Writer, a complete guide with instructor\ndescription: Learn how to use Writer for structured outputs using their latest Palmyra-X-004 model for more reliable system outputs\n---\n\n# Structured Outputs with Writer, a complete guide with instructor\n\nThis guide demonstrates how to use Writer for structured outputs using their latest Palmyra-X-004 model for more reliable system outputs.\n\nYou'll need to sign up for an account and get an API key. You can do that [here](https://writer.com).\n\n```bash\nexport WRITER_API_KEY=<your-api-key-here>\npip install \"instructor[writer]\"\n```\n\n## Palmyra-X-004\n\nWriter supports structured outputs with their latest Palmyra-X-004 model that introduces tool calling functionality\n\n### Sync Example\n\n```python\nimport instructor\nfrom writerai import Writer\nfrom pydantic import BaseModel\n\n# Initialize Writer client\nclient = instructor.from_provider(\"writer/palmyra-x-004\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n# Extract structured data\nuser = client.create(\n    messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old\"}],\n    response_model=User,\n)\n\nprint(user)\n#> name='John' age=30\n```\n\n### Async Example\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\nclient = instructor.from_provider(\n    \"writer/palmyra-x-004\",\n    async_client=True,\n)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_user():\n    # Extract structured data\n    user = await client.create(\n        messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old\"}],\n        response_model=User,\n    )\n\n    print(user)\n    # > name='John' age=30\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    asyncio.run(extract_user())\n```\n\n## Nested Objects\n\nWriter also supports nested objects, which is useful for extracting data from more complex responses.\n\n```python\nimport instructor\nfrom writerai import Writer\nfrom pydantic import BaseModel\n\n# Initialize Writer client\nclient = instructor.from_provider(\"writer/palmyra-x-004\")\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: list[Address]\n\n\n# Create structured output with nested objects\nuser = client.create(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\",\n        },\n    ],\n    response_model=User,\n)\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Streaming Support\n\nInstructor has two main ways that you can use to stream responses out\n\n1. **Iterables**: These are useful when you'd like to stream a list of objects of the same type (Eg. use structured outputs to extract multiple users)\n2. **Partial Streaming**: This is useful when you'd like to stream a single object and you'd like to immediately start processing the response as it comes in.\n\nWe currently support streaming for Writer with native tool for both methods listed above.\n\n### Partial Streaming\n\n```python\nimport instructor\nfrom writerai import Writer\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"writer/palmyra-x-004\")\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nresp = client.create_partial(\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Ivan is 27 and lives in Singapore\",\n        }\n    ],\n    response_model=Person,\n)\n\nfor person in resp:\n    print(person)\n    # > name=None age=None\n    # > name='Ivan' age=None\n    # > name='Ivan' age=27\n```\n"
  },
  {
    "path": "docs/integrations/xai.md",
    "content": "---\ntitle: \"Structured outputs with xAI, a complete guide with instructor\"\ndescription: \"Learn how to use Instructor with xAI's Grok models for type-safe, structured outputs. Complete guide with examples and best practices.\"\n---\n\n# Structured outputs with xAI, a complete guide with instructor\n\nxAI provides access to Grok models through the `xai-sdk` package, enabling structured outputs with Instructor. This guide covers everything you need to know about using xAI's Grok models with Instructor for type-safe, validated responses.\n\n## Quick Start\n\nInstructor is distributed without xAI dependencies by default. Install xAI support with the optional `xai` extra:\n\n```bash\npip install \"instructor[xai]\"\n```\n\nOr using uv:\n\n```bash\nuv pip install \"instructor[xai]\"\n```\n\n⚠️ **Important**: You must set your xAI API key before using the client. You can do this in two ways:\n\n1. Set the environment variable:\n\n```bash\nexport XAI_API_KEY='your-api-key-here'\n```\n\n2. The xAI SDK will use this environment variable automatically.\n\n## Simple User Example (Sync)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Auto-configure xAI client\nclient = instructor.from_provider(\"xai/grok-3-mini\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Create structured output\nuser = client.create(\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n    ],\n)\n\nprint(user)\n#> User(name='Jason', age=25)\n```\n\n## Simple User Example (Async)\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\n\n# Auto-configure async xAI client\nclient = instructor.from_provider(\"xai/grok-3-mini\", async_client=True)\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nasync def extract_user():\n    user = await client.create(\n        response_model=User,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"},\n        ],\n    )\n    return user\n\n# Run async function\nuser = asyncio.run(extract_user())\nprint(user)\n#> User(name='Jason', age=25)\n```\n\n## Nested Example\n\n```python\nfrom pydantic import BaseModel\nfrom typing import List\nimport instructor\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n\n# Auto-configure xAI client\nclient = instructor.from_provider(\"xai/grok-3-mini\")\n\n# Create structured output with nested objects\nuser = client.create(\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n            Extract: Jason is 25 years old.\n            He lives at 123 Main St, New York, USA\n            and has a summer house at 456 Beach Rd, Miami, USA\n        \"\"\"},\n    ],\n)\n\nprint(user)\n#> {\n#>     'name': 'Jason',\n#>     'age': 25,\n#>     'addresses': [\n#>         {\n#>             'street': '123 Main St',\n#>             'city': 'New York',\n#>             'country': 'USA'\n#>         },\n#>         {\n#>             'street': '456 Beach Rd',\n#>             'city': 'Miami',\n#>             'country': 'USA'\n#>         }\n#>     ]\n#> }\n```\n\n## Instructor Modes\n\nxAI supports the following modes:\n\n1. `instructor.Mode.JSON` : Forces the model to return JSON output (default)\n2. `instructor.Mode.TOOLS` : Uses function calling for structured outputs\n\n```python\nimport instructor\nfrom instructor import Mode\n\n# Using JSON mode (default)\nclient = instructor.from_provider(\"xai/grok-3-mini\", mode=Mode.JSON)\n\n# Using TOOLS mode\nclient = instructor.from_provider(\"xai/grok-3-mini\", mode=Mode.TOOLS)\n```\n\n## Available Models\n\nxAI provides access to the following models:\n\n- **grok-3** - The most capable Grok model for complex reasoning tasks\n- **grok-3-mini** - A smaller, faster version optimized for speed and cost\n\n## Limitations\n\n### Streaming Support\n\n⚠️ **Note**: Streaming responses (`create_iterable` and `create_partial`) are not yet supported due to differences in xAI's streaming API. See [issue #1663](https://github.com/567-labs/instructor/issues/1663) for updates.\n\n### Python Version\n\n⚠️ **Requires Python 3.10+**: The xAI SDK requires Python 3.10 or higher.\n\n## Best Practices\n\n### 1. API Key Management\n\nStore your xAI API key securely using environment variables:\n\n```bash\nexport XAI_API_KEY=\"your-api-key-here\"\n```\n\n### 2. Model Selection\n\n- Use `grok-3-mini` for:\n  - Simple extraction tasks\n  - High-volume processing\n  - Cost-sensitive applications\n\n- Use `grok-3` for:\n  - Complex reasoning tasks\n  - Multi-step analysis\n  - Higher accuracy requirements\n\n### 3. Error Handling\n\nAlways handle potential API errors gracefully:\n\n```python\ntry:\n    user = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract user data\"}],\n    )\nexcept Exception as e:\n    print(f\"Error: {e}\")\n```\n\n## Common Use Cases\n\n- Data Extraction from unstructured text\n- Form parsing and validation\n- Content classification\n- Entity recognition\n- Structured data generation\n\n## Related Resources\n\n- [xAI Documentation](https://docs.x.ai/)\n- [Instructor Core Concepts](../concepts/index.md)\n- [Type Validation Guide](../concepts/validation.md)\n- [Advanced Usage Examples](../examples/index.md)\n\n## Updates and Compatibility\n\nInstructor maintains compatibility with the latest xAI SDK versions. Check the [changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md) for updates.\n"
  },
  {
    "path": "docs/javascripts/katex.js",
    "content": "document$.subscribe(({ body }) => { \n    renderMathInElement(body, {\n      delimiters: [\n        { left: \"$$\",  right: \"$$\",  display: true },\n        { left: \"$\",   right: \"$\",   display: false },\n        { left: \"\\\\(\", right: \"\\\\)\", display: false },\n        { left: \"\\\\[\", right: \"\\\\]\", display: true }\n      ],\n    })\n  })"
  },
  {
    "path": "docs/jobs.md",
    "content": ""
  },
  {
    "path": "docs/learning/getting_started/first_extraction.md",
    "content": "---\ntitle: Your First LLM Extraction with Instructor\ndescription: Step-by-step tutorial for your first structured data extraction from language models using Instructor and Pydantic.\n---\n\n# Your First LLM Extraction: Structured Outputs Tutorial\n\nLearn how to extract structured data from LLMs using Instructor in this hands-on tutorial. We'll build a simple yet powerful example that demonstrates how to transform unstructured text into validated Python objects using GPT-4, Claude, or any supported LLM.\n\n## Quick Start: Extract Structured Data from LLMs\n\nThis LLM tutorial shows you how to extract structured information from natural language. We'll parse a person's name and age - a perfect starting point for understanding Instructor's power:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n# 1. Define your data model for LLM extraction\nclass Person(BaseModel):\n    name: str\n    age: int\n\n# 2. Initialize Instructor with your LLM provider\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n# 3. Extract structured data from LLM\nperson = client.create(\n    response_model=Person,   # Type-safe extraction\n    messages=[\n        {\"role\": \"user\", \"content\": \"John Doe is 30 years old\"}\n    ]\n)\n\n# 4. Use validated, structured data from LLM\nprint(f\"Name: {person.name}, Age: {person.age}\")\n# Output: Name: John Doe, Age: 30\n```\n\n## How Instructor LLM Extraction Works\n\n```\n┌─────────────┐    ┌──────────────┐    ┌─────────────┐\n│ Define      │ -> │ Instruct LLM │ -> │ Get Typed   │\n│ Structure   │    │ to Extract   │    │ Response    │\n└─────────────┘    └──────────────┘    └─────────────┘\n```\n\nUnderstanding the LLM structured output pipeline:\n\n### Step 1: Define Your LLM Output Schema\n\n```python\nclass Person(BaseModel):\n    name: str\n    age: int\n```\n\nPydantic models define the structure for LLM outputs:\n- `name`: String field for extracting names from LLM\n- `age`: Integer field with automatic type validation\n\n### Step 2: Configure Your LLM Client\n\n```python\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n```\n\nInstructor enhances your LLM client with structured output capabilities. Works with OpenAI, Anthropic, Google, and 15+ providers.\n\n### Step 3: Execute LLM Extraction\n\n```python\nperson = client.create(\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"John Doe is 30 years old\"}\n    ]\n)\n```\n\nKey parameters for structured LLM outputs:\n- `response_model`: Pydantic model for type-safe extraction\n- `messages`: Input text for the LLM to process\n\nNote: The model is already specified when creating the client with `from_provider()`, so you don't need to pass it again.\n\n### Step 4: Work with Validated LLM Data\n\n```python\nprint(f\"Name: {person.name}, Age: {person.age}\")\n```\n\nGet back a fully validated Python object from your LLM - no JSON parsing, no validation errors, just clean data ready to use.\n\n## Enhance LLM Extraction with Field Descriptions\n\nImprove LLM accuracy by providing clear field descriptions:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass Person(BaseModel):\n    name: str = Field(description=\"Person's full name\")\n    age: int = Field(description=\"Person's age in years\")\n```\n\nField descriptions act as prompts, guiding the LLM to extract exactly what you need.\n\n## Handle Optional Data in LLM Responses\n\nReal-world LLM extractions often have missing data. Handle it gracefully:\n\n```python\nfrom typing import Optional\n\nclass Person(BaseModel):\n    name: str\n    age: Optional[int] = None  # Now age is optional\n```\n\n## Continue Your LLM Tutorial Journey\n\nYou've successfully extracted structured data from an LLM! Next steps:\n\n1. **[Advanced Response Models](response_models.md)** - Complex schemas for LLM outputs\n2. **[Multi-Provider Setup](../../concepts/from_provider.md)** - Use GPT-4, Claude, Gemini interchangeably\n3. **[Production Patterns](../patterns/simple_object.md)** - Real-world LLM extraction examples\n\n## Common LLM Extraction Patterns\n\n- **Entity Extraction**: Names, dates, locations from unstructured text\n- **Sentiment Analysis**: Structured sentiment scores with reasoning\n- **Data Classification**: Categorize text into predefined schemas\n- **Information Parsing**: Convert documents into structured databases\n\nReady to build more complex LLM extractions? Continue to [Response Models](response_models.md) →\n"
  },
  {
    "path": "docs/learning/getting_started/installation.md",
    "content": "---\ntitle: Installing Instructor for LLM Structured Outputs\ndescription: Complete installation guide for Instructor with support for OpenAI, Anthropic, Google, and 15+ LLM providers. Get started in minutes.\n---\n\n# Instructor Installation Guide: Setup for LLM Structured Outputs\n\nLearn how to install Instructor, the leading Python library for extracting structured data from LLMs like GPT-4, Claude, and Gemini. This comprehensive installation tutorial covers all major LLM providers and gets you ready for production use.\n\n## Quick Start: Install Instructor for LLM Development\n\nGet started with structured LLM outputs in seconds. Install Instructor using pip:\n\n```shell\npip install instructor\n```\n\nInstructor leverages Pydantic for type-safe LLM data extraction:\n\n```shell\npip install pydantic\n```\n\n> **Pro Tip**: Use `uv` for faster installation: `uv pip install instructor`\n\n## LLM Provider Installation Guide\n\nInstructor supports 15+ LLM providers. Here's how to install and configure each:\n\n### OpenAI (GPT-4, GPT-3.5)\n\nOpenAI is the default LLM provider for Instructor. Perfect for GPT-4 and GPT-3.5-turbo structured outputs:\n\n```shell\npip install instructor\n```\n\nConfigure your OpenAI API key for LLM access:\n\n```shell\nexport OPENAI_API_KEY=your_openai_key\n```\n\n### Anthropic Claude LLM Setup\n\nExtract structured data from Claude 3 models (Opus, Sonnet, Haiku) with native tool support:\n\n```shell\npip install \"instructor[anthropic]\"\n```\n\nConfigure Claude API access:\n\n```shell\nexport ANTHROPIC_API_KEY=your_anthropic_key\n```\n\n### Google Gemini LLM Integration\n\nUse Gemini Pro and Flash models for structured outputs with function calling:\n\n```shell\npip install \"instructor[google-genai]\"\n```\n\nSet up Gemini API access:\n\n```shell\nexport GOOGLE_API_KEY=your_google_key\n```\n\n### Cohere\n\nTo use with Cohere's models:\n\n```shell\npip install \"instructor[cohere]\"\n```\n\nSet up your Cohere API key:\n\n```shell\nexport COHERE_API_KEY=your_cohere_key\n```\n\n### Mistral\n\nTo use with Mistral AI's models:\n\n```shell\npip install \"instructor[mistralai]\"\n```\n\nSet up your Mistral API key:\n\n```shell\nexport MISTRAL_API_KEY=your_mistral_key\n```\n\n### LiteLLM (Multiple Providers)\n\nTo use LiteLLM for accessing multiple providers:\n\n```shell\npip install \"instructor[litellm]\"\n```\n\nSet up API keys for the providers you want to use.\n\n## Verify Your Instructor LLM Setup\n\nTest your Instructor installation with this simple LLM structured output example:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nclass Person(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nperson = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"John Doe is 30 years old\"}\n    ]\n)\n\nprint(f\"Name: {person.name}, Age: {person.age}\")\n```\n\n## Next Steps in Your LLM Tutorial Journey\n\nWith Instructor installed, you're ready to build powerful LLM applications:\n\n1. **[Create Your First LLM Extraction](first_extraction.md)** - Build structured outputs with any LLM\n2. **[Master Response Models](response_models.md)** - Learn Pydantic models for LLM data validation\n3. **[Configure LLM Clients](../../concepts/from_provider.md)** - Set up OpenAI, Anthropic, Google, and more\n\n## Common Installation Issues\n\n- **Import Errors**: Ensure you've installed the provider-specific extras (e.g., `instructor[anthropic]`)\n- **API Key Issues**: Verify your environment variables are set correctly\n- **Version Conflicts**: Use `pip install --upgrade instructor` to get the latest version\n\nReady to extract structured data from LLMs? Continue to [Your First Extraction](first_extraction.md) →"
  },
  {
    "path": "docs/learning/getting_started/response_models.md",
    "content": "---\ntitle: Understanding Response Models in Instructor\ndescription: Learn how to create response models with Pydantic to define structure, validation rules, and extract complex data from LLMs.\n---\n\n# Understanding Response Models\n\nResponse models are at the core of Instructor's functionality. They define the structure of the data you want to extract and provide validation rules. This guide explains how to create different types of response models for various use cases.\n\n## Basic Models\n\nLet's start with a simple model similar to what we've seen before:\n\n```python\nfrom pydantic import BaseModel\n\nclass User(BaseModel):\n    name: str\n    age: int\n```\n\nThis defines a model with two required fields: `name` (a string) and `age` (an integer).\n\n## Adding Field Metadata\n\nYou can add metadata to fields using the `Field` class:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass WeatherForecast(BaseModel):\n    \"\"\"Weather forecast for a specific location\"\"\"\n\n    temperature: float = Field(\n        description=\"Current temperature in Celsius\"\n    )\n    condition: str = Field(\n        description=\"Weather condition (sunny, cloudy, rainy, etc.)\"\n    )\n    humidity: int = Field(\n        description=\"Humidity percentage from 0-100\"\n    )\n```\n\nField descriptions help the LLM understand what information to extract for each field.\n\n## Field Validation\n\nYou can add validation rules to ensure the extracted data meets your requirements:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass Product(BaseModel):\n    name: str = Field(min_length=3)\n    price: float = Field(gt=0)  # greater than 0\n    quantity: int = Field(ge=0)  # greater than or equal to 0\n    description: str = Field(max_length=500)\n```\n\nCommon validation parameters include:\n- `min_length`/`max_length`: For strings\n- `ge`/`gt`/`le`/`lt`: For numbers (greater/less than or equal/than)\n- `pattern`: For regex pattern matching\n\nFor more on validation, see the [Field Validation](../patterns/field_validation.md) and [Validation Basics](../validation/basics.md) guides.\n\n## Nested Models\n\nYou can create complex data structures with nested models:\n\n```python\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: Optional[str] = None\n    country: str\n\nclass User(BaseModel):\n    name: str\n    age: int\n    addresses: List[Address]\n```\n\nThis allows you to extract hierarchical data structures. For more examples, check out the [Simple Nested Structure](../patterns/nested_structure.md) guide.\n\n## Using Enums\n\nEnums help when you want to restrict a field to a set of specific values:\n\n```python\nfrom enum import Enum\nfrom pydantic import BaseModel\n\nclass UserType(str, Enum):\n    ADMIN = \"admin\"\n    REGULAR = \"regular\"\n    GUEST = \"guest\"\n\nclass User(BaseModel):\n    name: str\n    user_type: UserType\n```\n\n## Optional Fields\n\nFor fields that might not be present in the source text:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel\n\nclass Contact(BaseModel):\n    name: str\n    email: str\n    phone: Optional[str] = None\n    address: Optional[str] = None\n```\n\nFor more about working with optional fields, see the [Optional Fields](../patterns/optional_fields.md) guide.\n\n## Lists and Arrays\n\nTo extract multiple items of the same type:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel\n\nclass BlogPost(BaseModel):\n    title: str\n    content: str\n    tags: List[str]\n```\n\nFor more about working with lists, see the [List Extraction](../patterns/list_extraction.md) guide.\n\n## Using Your Models with Instructor\n\nOnce you've defined your model, you can use it for extraction:\n\n```python\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nforecast = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=WeatherForecast,\n    messages=[\n        {\"role\": \"user\", \"content\": \"What's the weather in New York today?\"}\n    ]\n)\n\nprint(forecast.model_dump_json(indent=2))\n```\n\n## Model Documentation\n\nYou can add documentation to your models using docstrings and field descriptions:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass Investment(BaseModel):\n    \"\"\"Represents an investment opportunity with risk and return details.\"\"\"\n\n    name: str = Field(description=\"Name of the investment\")\n    amount: float = Field(description=\"Investment amount in USD\")\n    expected_return: float = Field(description=\"Expected annual return percentage\")\n    risk_level: str = Field(description=\"Risk level (low, medium, high)\")\n```\n\nThis documentation helps both the LLM understand what to extract and makes your code more maintainable.\n\n## Advanced Validation with Validators\n\nFor more complex validation rules, you can use validator methods:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator\nfrom datetime import date\n\nclass Reservation(BaseModel):\n    check_in: date\n    check_out: date\n    guests: int = Field(ge=1)\n\n    @field_validator(\"check_out\")\n    def check_dates(cls, v, values):\n        if \"check_in\" in values.data and v <= values.data[\"check_in\"]:\n            raise ValueError(\"check_out must be after check_in\")\n        return v\n```\n\nFor more advanced validation techniques, check out the [Custom Validators](../validation/custom_validators.md) guide.\n\n## Next Steps\n\nIn the next section, learn about [from_provider](../../concepts/from_provider.md) to configure different LLM providers and understand the various modes of operation."
  },
  {
    "path": "docs/learning/getting_started/structured_outputs.md",
    "content": "---\ntitle: Getting Started with Structured LLM Outputs\ndescription: Learn the basics of extracting structured data from language models using Instructor. Understand the difference between unstructured and structured outputs.\n---\n\n# Getting Started with Structured Outputs\n\nLarge language models (LLMs) are powerful tools for generating text, but extracting structured data from their outputs can be challenging. Structured outputs solve this problem by having LLMs return data in consistent, machine-readable formats.\n\n## The Problem with Unstructured Outputs\n\nLet's look at what happens when we ask an LLM to extract information without any structure:\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI()\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract customer: John Doe, age 35, email: john@example.com\",\n        }\n    ],\n)\n\nprint(response.choices[0].message.content)\n```\n\nThe output might look like:\n```\nCustomer Name: John Doe\nAge: 35\nEmail: john@example.com\n```\n\nOr it could be:\n```\nI found the following customer information:\n- Name: John Doe\n- Age: 35\n- Email address: john@example.com\n```\n\nThis inconsistency makes it difficult to reliably parse the information in downstream applications.\n\n## The Solution: Structured Outputs with Instructor\n\nInstructor solves this problem by using Pydantic models to define the expected structure of the output:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field, EmailStr\nclass Customer(BaseModel):\n    name: str = Field(description=\"Customer's full name\")\n    age: int = Field(description=\"Customer's age in years\", ge=0, le=120)\n    email: EmailStr = Field(description=\"Customer's email address\")\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\ncustomer = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract customer: John Doe, age 35, email: john@example.com\",\n        }\n    ],\n    response_model=Customer,  # This is the key part\n)\n\nprint(customer)  # Customer(name='John Doe', age=35, email='john@example.com')\nprint(f\"Name: {customer.name}, Age: {customer.age}, Email: {customer.email}\")\n```\n\nThe benefits of this approach include:\n\n1. **Consistency**: Always get data in the same format\n2. **Validation**: Age must be between 0 and 120, email must be valid\n3. **Type Safety**: `age` is always an integer, not a string\n4. **Documentation**: Model fields are self-documenting with descriptions\n\n## Complex Example: Nested Structures\n\nInstructor shines with complex data structures:\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass Contact(BaseModel):\n    email: Optional[str] = None\n    phone: Optional[str] = None\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n    address: Address\n    contact: Contact\n    skills: List[str] = Field(description=\"List of professional skills\")\n\nperson = client.create(\n    model=\"gpt-4\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\n        Extract detailed information for this person:\n        John Smith is a 42-year-old software engineer living at 123 Main St, San Francisco, CA 94105.\n        His email is john.smith@example.com and phone is 555-123-4567.\n        John is skilled in Python, JavaScript, and cloud architecture.\n        \"\"\",\n        }\n    ],\n    response_model=Person,\n)\n\nprint(f\"Name: {person.name}\")\nprint(f\"Location: {person.address.city}, {person.address.state}\")\nprint(f\"Skills: {', '.join(person.skills)}\")\n```\n\n## Installation\n\nTo get started with Instructor, install it via pip:\n\n```shell\npip install instructor pydantic\n```\n\nYou'll also need to set up your API keys for the LLM provider you're using.\n\n## Next Steps\n\nIn the next sections, you'll learn how to:\n\n1. Create your [first extraction](first_extraction.md)\n2. Understand the different [response models](response_models.md) you can create\n3. Set up [clients for various LLM providers](../../concepts/from_provider.md)"
  },
  {
    "path": "docs/learning/index.md",
    "content": "# Instructor LLM Tutorial: Complete Guide to Structured Outputs\n\nLearn how to use Instructor for LLM structured outputs with this comprehensive tutorial. Instructor is the leading Python library for extracting structured, validated data from large language models (LLMs) like GPT-4, Claude, and Gemini.\n\n## What You'll Learn in This LLM Tutorial\n\nThis Instructor tutorial covers everything from basic LLM integration to advanced structured output patterns. Whether you're building AI applications, automating data extraction, or creating LLM-powered APIs, this guide provides practical, production-ready examples.\n\n## Getting Started with Instructor LLM Tutorial\n\nStart your journey with these beginner-friendly tutorials for LLM integration:\n\n* [**Installation Guide**](getting_started/installation.md) - Install Instructor for Python LLM development\n* [**Your First LLM Extraction**](getting_started/first_extraction.md) - Build your first structured output with OpenAI, Anthropic, or Google LLMs\n* [**Response Models Tutorial**](getting_started/response_models.md) - Master Pydantic models for LLM outputs\n* [**LLM Client Setup**](../concepts/from_provider.md) - Configure Instructor for OpenAI, Anthropic, Gemini, and 15+ LLM providers\n\n## LLM Data Extraction Patterns\n\nLearn essential patterns for extracting structured data from language models:\n\n* [**Simple Object Extraction**](patterns/simple_object.md) - Extract structured objects from LLM responses\n* [**List Extraction Tutorial**](patterns/list_extraction.md) - Generate lists and arrays with LLMs\n* [**Nested Data Structures**](patterns/nested_structure.md) - Handle complex, hierarchical LLM outputs\n* [**Optional Fields**](patterns/optional_fields.md) - Manage missing data in LLM responses\n* [**Field Validation**](patterns/field_validation.md) - Validate LLM outputs with Pydantic\n* [**Prompt Engineering Templates**](patterns/prompt_templates.md) - Optimize prompts for better LLM extraction\n\n## LLM Output Validation Tutorial\n\nEnsure reliability with these validation tutorials:\n\n* [**Validation Fundamentals**](validation/basics.md) - Core concepts for validating LLM outputs\n* [**Field-Level Validation**](validation/field_level_validation.md) - Granular validation for LLM data\n* [**Custom Validators**](validation/custom_validators.md) - Build domain-specific LLM validators\n* [**Retry Strategies**](validation/retry_mechanisms.md) - Handle LLM failures gracefully\n\n## Streaming LLM Responses\n\nReal-time LLM output processing tutorials:\n\n* [**Streaming Basics**](streaming/basics.md) - Stream LLM responses for better UX\n* [**Streaming Lists**](streaming/lists.md) - Process LLM arrays in real-time\n\n## Why This Instructor LLM Tutorial?\n\n- **Production-Ready Examples**: Real-world LLM integration patterns used by thousands of developers\n- **Multi-Provider Support**: Works with OpenAI, Anthropic, Google, Cohere, and more\n- **Type-Safe Outputs**: Leverage Python's type system for reliable LLM applications\n- **Progressive Learning Path**: From basic LLM calls to advanced extraction techniques\n\nReady to master structured outputs with LLMs? Start with our [installation guide](getting_started/installation.md) and build your first LLM-powered application today!"
  },
  {
    "path": "docs/learning/patterns/field_validation.md",
    "content": "# Field Validation\n\nThis guide covers how to add validation to fields when extracting structured data with Instructor. Field validation ensures that your extracted data meets specific criteria and constraints.\n\n## Why Field Validation Matters\n\nField validation helps you:\n\n1. Ensure data quality and consistency\n2. Enforce business rules\n3. Prevent errors in downstream processing\n4. Provide clear feedback for invalid data\n\nInstructor uses Pydantic's validation system, which is applied automatically during extraction.\n\n## Basic Field Constraints\n\nYou can add basic constraints to fields using Pydantic's `Field` function:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass User(BaseModel):\n    name: str = Field(..., min_length=2, max_length=50)\n    age: int = Field(..., ge=0, le=120)  # greater than or equal to 0, less than or equal to 120\n    email: str = Field(..., pattern=r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$')\n\n# Extract with validation\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"I'm John Smith, 35 years old, with email john@example.com\"}\n    ],\n    response_model=User\n)\n```\n\nCommon Field constraints include:\n\n| Constraint | Description | Example |\n|------------|-------------|---------|\n| `min_length` | Minimum string length | `min_length=2` |\n| `max_length` | Maximum string length | `max_length=50` |\n| `pattern` | Regex pattern to match | `pattern=r'^[0-9]+$'` |\n| `gt` | Greater than | `gt=0` (for numbers) |\n| `ge` | Greater than or equal | `ge=18` |\n| `lt` | Less than | `lt=100` |\n| `le` | Less than or equal | `le=120` |\n| `min_items` | Minimum list items | `min_items=1` |\n| `max_items` | Maximum list items | `max_items=10` |\n\nFor more information on field definitions, see the [Fields](../../concepts/fields.md) concepts page.\n\n## Validation with Field Validators\n\nFor more complex validation logic, use Pydantic's `field_validator` decorator:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator\nimport instructor\nfrom openai import OpenAI\nimport re\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Product(BaseModel):\n    name: str\n    sku: str\n    price: float\n\n    @field_validator('name')\n    @classmethod\n    def validate_name(cls, v):\n        if len(v.strip()) < 3:\n            raise ValueError(\"Product name must be at least 3 characters\")\n        return v.strip()\n\n    @field_validator('sku')\n    @classmethod\n    def validate_sku(cls, v):\n        if not re.match(r'^[A-Z]{3}-\\d{4}$', v):\n            raise ValueError(\"SKU must be in format XXX-0000\")\n        return v\n\n    @field_validator('price')\n    @classmethod\n    def validate_price(cls, v):\n        if v <= 0:\n            raise ValueError(\"Price must be greater than zero\")\n        if v > 10000:\n            raise ValueError(\"Price exceeds maximum allowed value\")\n        return v\n\n# Extract validated data\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Product: Wireless Headphones, SKU: ABC-1234, Price: $79.99\"}\n    ],\n    response_model=Product\n)\n```\n\nField validators can:\n- Perform complex validation logic\n- Clean and normalize data\n- Transform values\n- Check values against external data sources\n\nFor more on custom validators, see the [Custom Validators](../validation/custom_validators.md) guide.\n\n## Model-level Validation\n\nSometimes validation needs to check relationships between fields. For this, use `model_validator`:\n\n```python\nfrom pydantic import BaseModel, Field, model_validator\nimport instructor\nfrom openai import OpenAI\nfrom datetime import date\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass DateRange(BaseModel):\n    start_date: date\n    end_date: date\n\n    @model_validator(mode='after')\n    def validate_date_range(self):\n        if self.end_date < self.start_date:\n            raise ValueError(\"End date must be after start date\")\n        return self\n```\n\n## Validation in Nested Structures\n\nYou can apply validation at any level in nested structures:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator\nimport instructor\nfrom openai import OpenAI\nfrom typing import List\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\n    @field_validator('state')\n    @classmethod\n    def validate_state(cls, v):\n        valid_states = {\"CA\", \"NY\", \"TX\", \"FL\"}  # Example: just a few states\n        if v not in valid_states:\n            raise ValueError(f\"State must be one of: {', '.join(valid_states)}\")\n        return v\n\n    @field_validator('zip_code')\n    @classmethod\n    def validate_zip(cls, v):\n        if not v.isdigit() or len(v) != 5:\n            raise ValueError(\"ZIP code must be 5 digits\")\n        return v\n\nclass Person(BaseModel):\n    name: str\n    addresses: List[Address]  # Nested structure with validation\n```\n\nFor more on nested structures, see the [Nested Structure](nested_structure.md) guide.\n\n## List Item Validation\n\nYou can validate items in a list:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field, field_validator\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass TagList(BaseModel):\n    tags: List[str] = Field(..., min_items=1, max_items=5)\n\n    @field_validator('tags')\n    @classmethod\n    def validate_tags(cls, tags):\n        # Convert all tags to lowercase\n        tags = [tag.lower() for tag in tags]\n\n        # Check for minimum length of each tag\n        for tag in tags:\n            if len(tag) < 2:\n                raise ValueError(\"Each tag must be at least 2 characters\")\n\n        # Check for duplicates\n        if len(tags) != len(set(tags)):\n            raise ValueError(\"Tags must be unique\")\n\n        return tags\n```\n\nFor more on lists, see the [List Extraction](list_extraction.md) guide.\n\n## Using Enumerations for Validation\n\nEnums provide a way to validate fields against a predefined set of values:\n\n```python\nfrom enum import Enum\nfrom pydantic import BaseModel\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Status(str, Enum):\n    PENDING = \"pending\"\n    APPROVED = \"approved\"\n    REJECTED = \"rejected\"\n\nclass Priority(str, Enum):\n    LOW = \"low\"\n    MEDIUM = \"medium\"\n    HIGH = \"high\"\n\nclass Task(BaseModel):\n    title: str\n    description: str\n    status: Status  # Must be one of the enum values\n    priority: Priority  # Must be one of the enum values\n\n# Extract with enum validation\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Task: Update website, Description: Refresh content on homepage, Status: pending, Priority: high\"}\n    ],\n    response_model=Task\n)\n```\n\nFor more information on enums, see the [Enums](../../concepts/enums.md) concepts page.\n\n## Custom Error Messages\n\nYou can customize validation error messages for better feedback:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass CreditCard(BaseModel):\n    number: str = Field(\n        ...,\n        pattern=r'^\\d{16}$',\n        json_schema_extra={\"error_msg\": \"Credit card number must be exactly 16 digits\"}\n    )\n    expiry_month: int = Field(\n        ...,\n        ge=1,\n        le=12,\n        json_schema_extra={\"error_msg\": \"Expiry month must be between 1 and 12\"}\n    )\n    expiry_year: int = Field(\n        ...,\n        ge=2023,\n        le=2030,\n        json_schema_extra={\"error_msg\": \"Expiry year must be between 2023 and 2030\"}\n    )\n    cvv: str = Field(\n        ...,\n        pattern=r'^\\d{3,4}$',\n        json_schema_extra={\"error_msg\": \"CVV must be 3 or 4 digits\"}\n    )\n```\n\n## Handling Validation Failures\n\nWhen validation fails, Instructor will:\n\n1. Capture the validation error\n2. Add the error message to the context\n3. Retry the request with this feedback (if retries are enabled)\n\nTo control retry behavior:\n\n```python\nclient = instructor.from_provider(\n    \"openai/gpt-4o\",\n    max_retries=2,  # Number of retries after the initial attempt\n    throw_error=True  # Whether to raise an exception on validation failure\n)\n```\n\nFor more on retries, see the [Retry Mechanisms](../validation/retry_mechanisms.md) guide.\n\n## Real-world Example: Form Data Validation\n\nHere's a more complete example validating form inputs:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator, model_validator\nimport instructor\nimport re\nfrom datetime import date, datetime\nfrom typing import Optional\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass RegistrationForm(BaseModel):\n    username: str = Field(..., min_length=3, max_length=20)\n    email: str\n    password: str\n    confirm_password: str\n    birth_date: date\n\n    @field_validator('username')\n    @classmethod\n    def validate_username(cls, v):\n        if not re.match(r'^[a-zA-Z0-9_]+$', v):\n            raise ValueError(\"Username can only contain letters, numbers, and underscores\")\n        return v\n\n    @field_validator('email')\n    @classmethod\n    def validate_email(cls, v):\n        if not re.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$', v):\n            raise ValueError(\"Invalid email format\")\n        return v\n\n    @field_validator('password')\n    @classmethod\n    def validate_password(cls, v):\n        if len(v) < 8:\n            raise ValueError(\"Password must be at least 8 characters\")\n        if not re.search(r'[A-Z]', v):\n            raise ValueError(\"Password must contain at least one uppercase letter\")\n        if not re.search(r'[a-z]', v):\n            raise ValueError(\"Password must contain at least one lowercase letter\")\n        if not re.search(r'[0-9]', v):\n            raise ValueError(\"Password must contain at least one number\")\n        return v\n\n    @field_validator('birth_date')\n    @classmethod\n    def validate_age(cls, v):\n        today = date.today()\n        age = today.year - v.year - ((today.month, today.day) < (v.month, v.day))\n        if age < 18:\n            raise ValueError(\"You must be at least 18 years old to register\")\n        return v\n\n    @model_validator(mode='after')\n    def passwords_match(self):\n        if self.password != self.confirm_password:\n            raise ValueError(\"Passwords do not match\")\n        return self\n```\n\n## Related Resources\n\n- [Validation Basics](../validation/basics.md) - Core validation concepts\n- [Custom Validators](../validation/custom_validators.md) - Creating custom validation logic\n- [Field-level Validation](../validation/field_level_validation.md) - Advanced field validation\n- [Retry Mechanisms](../validation/retry_mechanisms.md) - Handling validation failures\n- [Fields](../../concepts/fields.md) - Understanding field definitions\n- [Enums](../../concepts/enums.md) - Using enumeration types\n\n## Next Steps\n\n- Learn about [Optional Fields](optional_fields.md) for handling missing data\n- Explore [Custom Validators](../validation/custom_validators.md) for complex validation\n- Check out [Nested Structure](nested_structure.md) for complex data relationships"
  },
  {
    "path": "docs/learning/patterns/list_extraction.md",
    "content": "---\ntitle: List Extraction from LLMs Tutorial\ndescription: Master extracting multiple structured objects from language models using Instructor with type-safe list validation.\n---\n\n# List Extraction Tutorial: Extract Multiple Objects from LLMs\n\nMaster the art of extracting lists and arrays from LLMs in this comprehensive tutorial. Learn how to use Instructor to extract multiple structured objects from language models like GPT-4, Claude, and Gemini with type-safe validation.\n\n## Basic List Extraction\n\nTo extract a list of items, you define a model for a single item and then use Python's typing system to specify you want a list of that type:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nimport instructor\n# Initialize the client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n# Define a single item model\nclass Person(BaseModel):\n    name: str = Field(..., description=\"The person's full name\")\n    age: int = Field(..., description=\"The person's age in years\")\n\n# Define a wrapper model for the list\nclass PeopleList(BaseModel):\n    people: List[Person] = Field(..., description=\"List of people mentioned in the text\")\n\n# Extract the list\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Here's information about some people:\n        - John Smith is 35 years old\n        - Mary Johnson is 28 years old\n        - Robert Davis is 42 years old\n        \"\"\"}\n    ],\n    response_model=PeopleList\n)\n\n# Access the extracted data\nfor i, person in enumerate(response.people):\n    print(f\"Person {i+1}: {person.name}, {person.age} years old\")\n```\n\nThis example shows how to:\n1. Define a model for a single item (`Person`)\n2. Create a wrapper model that contains a list of items (`PeopleList`)\n3. Access each item in the list through the response\n\n## Direct List Extraction\n\nYou can also extract a list directly without a wrapper model:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Book(BaseModel):\n    title: str\n    author: str\n    publication_year: int\n\n# Extract a list directly\nbooks = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Classic novels:\n        1. To Kill a Mockingbird by Harper Lee (1960)\n        2. 1984 by George Orwell (1949)\n        3. The Great Gatsby by F. Scott Fitzgerald (1925)\n        \"\"\"}\n    ],\n    response_model=List[Book]  # Direct list extraction\n)\n\n# Access the extracted data\nfor book in books:\n    print(f\"{book.title} by {book.author} ({book.publication_year})\")\n```\n\n## Nested Lists\n\nYou can extract nested lists by combining list types:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Author(BaseModel):\n    name: str\n    nationality: str\n\nclass Book(BaseModel):\n    title: str\n    authors: List[Author]  # Nested list of authors\n    publication_year: int\n\n# Extract data with nested lists\nbooks = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Book 1: \"Good Omens\" (1990)\n        Authors: Terry Pratchett (British), Neil Gaiman (British)\n\n        Book 2: \"The Talisman\" (1984)\n        Authors: Stephen King (American), Peter Straub (American)\n        \"\"\"}\n    ],\n    response_model=List[Book]\n)\n\n# Access the nested data\nfor book in books:\n    author_names = \", \".join([author.name for author in book.authors])\n    print(f\"{book.title} ({book.publication_year}) by {author_names}\")\n```\n\n## Using Streaming with Lists\n\nYou can stream list extraction results using Instructor's streaming capabilities:\n\n```python\nfrom typing import List\nimport instructor\nfrom pydantic import BaseModel, Field\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Task(BaseModel):\n    description: str\n    priority: str\n    deadline: str\n\n# Stream a list of tasks\nfor task in client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Generate a list of 5 sample tasks for a project manager\"}\n    ],\n    response_model=List[Task],\n    stream=True\n):\n    print(f\"Received task: {task.description} (Priority: {task.priority}, Deadline: {task.deadline})\")\n```\n\nFor more information on streaming, see the [Streaming Basics](../streaming/basics.md) and [Streaming Lists](../streaming/lists.md) guides.\n\n## List Validation\n\nYou can add validation for both individual items and the entire list:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field, field_validator, model_validator\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Product(BaseModel):\n    name: str\n    price: float\n\n    @field_validator('price')\n    @classmethod\n    def validate_price(cls, v):\n        if v <= 0:\n            raise ValueError(\"Price must be greater than zero\")\n        return v\n\nclass ProductList(BaseModel):\n    products: List[Product] = Field(..., min_items=1)\n\n    @model_validator(mode='after')\n    def validate_unique_names(self):\n        names = [p.name for p in self.products]\n        if len(names) != len(set(names)):\n            raise ValueError(\"All product names must be unique\")\n        return self\n\n# Extract list with validation\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"List of products: Headphones ($50), Speakers ($80), Earbuds ($30)\"}\n    ],\n    response_model=ProductList\n)\n```\n\nFor more on validation, see [Field Validation](./field_validation.md) and [Validation Basics](../validation/basics.md).\n\n## List Constraints\n\nYou can add constraints to lists using Pydantic's Field:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Ingredient(BaseModel):\n    name: str\n    amount: str\n\nclass Recipe(BaseModel):\n    title: str\n    ingredients: List[Ingredient] = Field(\n        ...,\n        min_items=2,         # Minimum 2 ingredients\n        max_items=10,        # Maximum 10 ingredients\n        description=\"List of ingredients needed for the recipe\"\n    )\n    steps: List[str] = Field(\n        ...,\n        min_items=1,\n        description=\"Step-by-step instructions to prepare the recipe\"\n    )\n```\n\n## Real-world Example: Task Extraction\n\nHere's a more complete example for extracting a list of tasks from a meeting transcript:\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom datetime import date\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Assignee(BaseModel):\n    name: str\n    email: Optional[str] = None\n\nclass ActionItem(BaseModel):\n    description: str = Field(..., description=\"The task that needs to be completed\")\n    assignee: Assignee = Field(..., description=\"The person responsible for the task\")\n    due_date: Optional[date] = Field(None, description=\"The deadline for the task\")\n    priority: str = Field(..., description=\"Priority level: Low, Medium, or High\")\n\n# Extract action items from meeting notes\naction_items = client.create(\n    model=\"gpt-4\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Meeting Notes - Project Kickoff\n        Date: 2023-05-15\n\n        Attendees: John (john@example.com), Sarah (sarah@example.com), Mike\n\n        Discussion points:\n        1. John will prepare the project timeline by next Friday. This is high priority.\n        2. Sarah needs to contact the client for requirements clarification by Wednesday. Medium priority.\n        3. Mike is responsible for setting up the development environment. Due by tomorrow, high priority.\n        \"\"\"}\n    ],\n    response_model=List[ActionItem]\n)\n\n# Process the extracted action items\nfor item in action_items:\n    due_str = item.due_date.isoformat() if item.due_date else \"Not specified\"\n    print(f\"Task: {item.description}\")\n    print(f\"Assignee: {item.assignee.name} ({item.assignee.email or 'No email'})\")\n    print(f\"Due: {due_str}, Priority: {item.priority}\")\n    print(\"---\")\n```\n\nFor a more detailed example, see the [Action Items Extraction](../../examples/action_items.md) example.\n\n## Related Resources\n\n- [Simple Object Extraction](./simple_object.md) - Extracting single objects\n- [Nested Structure](./nested_structure.md) - Working with complex nested data\n- [Streaming Lists](../streaming/lists.md) - Streaming list results\n- [Lists and Arrays](../../concepts/lists.md) - Concepts related to list extraction\n\n## Next Steps\n\n- Learn about [Nested Structure](./nested_structure.md) for complex data\n- Explore [Streaming Lists](../streaming/lists.md) for handling large lists\n- Check out [Field Validation](./field_validation.md) for validation techniques"
  },
  {
    "path": "docs/learning/patterns/nested_structure.md",
    "content": "---\ntitle: Nested Structure Extraction with Instructor\ndescription: Learn how to extract complex nested data structures from LLMs using hierarchical Pydantic models.\n---\n\n# Simple Nested Structure\n\nThis guide explains how to extract nested structured data using Instructor. Nested structures allow you to represent complex, hierarchical data relationships.\n\n## Understanding Nested Structures\n\nNested structures are objects that contain other objects as fields. They're useful for representing:\n\n1. Parent-child relationships\n2. Complex entities with sub-components\n3. Hierarchical data\n4. Related data that belongs together\n\n## Basic Nested Structure Example\n\nHere's a simple example of extracting a nested structure:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom typing import List, Optional\n# Initialize the client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n# Define nested models\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    address: Address  # Nested structure\n\n# Extract the nested data\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        John Smith is 35 years old.\n        He lives at 123 Main Street, Boston, MA 02108.\n        \"\"\"}\n    ],\n    response_model=Person\n)\n\n# Access the nested data\nprint(f\"Name: {response.name}\")\nprint(f\"Age: {response.age}\")\nprint(f\"Address: {response.address.street}, {response.address.city}, \"\n      f\"{response.address.state} {response.address.zip_code}\")\n```\n\n## Multiple Levels of Nesting\n\nYou can use multiple levels of nesting for more complex structures:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom typing import List, Optional\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass EmployeeDetails(BaseModel):\n    department: str\n    position: str\n    start_date: str\n\nclass ContactInfo(BaseModel):\n    phone: str\n    email: str\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    contact: ContactInfo  # First level nesting\n    address: Address      # First level nesting\n    employment: Optional[EmployeeDetails] = None  # Optional nested structure\n\n# Extract deeply nested data\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Employee Profile:\n        Name: Jane Doe\n        Age: 32\n        Phone: (555) 123-4567\n        Email: jane.doe@example.com\n        Address: 456 Oak Avenue, Chicago, IL 60601\n        Department: Engineering\n        Position: Senior Developer\n        Start Date: 2021-03-15\n        \"\"\"}\n    ],\n    response_model=Person\n)\n```\n\n## Nested Lists\n\nYou can combine nesting with lists to represent complex collections:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom typing import List\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Ingredient(BaseModel):\n    name: str\n    amount: str\n    unit: str\n\nclass Recipe(BaseModel):\n    title: str\n    description: str\n    ingredients: List[Ingredient]  # Nested list of ingredients\n    steps: List[str]  # List of strings\n\n# Extract nested list data\nresponse = client.create(\n    model=\"gpt-4\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Recipe: Chocolate Chip Cookies\n\n        Description: Classic homemade chocolate chip cookies that are soft in the middle and crispy on the edges.\n\n        Ingredients:\n        - 2 1/4 cups all-purpose flour\n        - 1 teaspoon baking soda\n        - 1 teaspoon salt\n        - 1 cup butter\n        - 3/4 cup white sugar\n        - 3/4 cup brown sugar\n        - 2 eggs\n        - 2 teaspoons vanilla extract\n        - 2 cups chocolate chips\n\n        Instructions:\n        1. Preheat oven to 375°F (190°C)\n        2. Mix flour, baking soda, and salt\n        3. Cream butter and sugars, then add eggs and vanilla\n        4. Gradually add dry ingredients\n        5. Stir in chocolate chips\n        6. Drop by rounded tablespoons onto ungreased baking sheets\n        7. Bake for 9 to 11 minutes or until golden brown\n        8. Cool on wire racks\n        \"\"\"}\n    ],\n    response_model=Recipe\n)\n```\n\nFor more information on working with lists, see the [List Extraction](list_extraction.md) guide.\n\n## Handling Optional Nested Fields\n\nSometimes parts of a nested structure might be missing. Use Optional to handle this:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom typing import Optional\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass SocialMedia(BaseModel):\n    twitter: Optional[str] = None\n    linkedin: Optional[str] = None\n    instagram: Optional[str] = None\n\nclass ContactInfo(BaseModel):\n    email: str\n    phone: Optional[str] = None\n    social: Optional[SocialMedia] = None  # Optional nested structure\n\nclass Person(BaseModel):\n    name: str\n    contact: ContactInfo\n```\n\nFor more information on optional fields, see the [Optional Fields](optional_fields.md) guide.\n\n## Nested Structure Validation\n\nYou can add validation to nested structures at any level:\n\n```python\nfrom pydantic import BaseModel, Field, field_validator, model_validator\nimport instructor\nimport re\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass EmailContact(BaseModel):\n    email: str\n\n    @field_validator('email')\n    @classmethod\n    def validate_email(cls, v):\n        pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'\n        if not re.match(pattern, v):\n            raise ValueError(\"Invalid email format\")\n        return v\n\nclass Customer(BaseModel):\n    name: str\n    contact: EmailContact  # Nested structure with its own validation\n\n    @model_validator(mode='after')\n    def validate_name_email_match(self):\n        name_part = self.name.lower().split()[0]\n        if name_part not in self.contact.email.lower():\n            print(f\"Warning: Email {self.contact.email} may not match name {self.name}\")\n        return self\n```\n\nFor more on validation, see [Field Validation](field_validation.md) and [Validation Basics](../validation/basics.md).\n\n## Working with Recursive Structures\n\nFor more complex hierarchical data, you can use recursive structures:\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Comment(BaseModel):\n    text: str\n    author: str\n    replies: List[\"Comment\"] = []  # Recursive structure\n\n# Update the Comment class reference for Pydantic\nComment.model_rebuild()\n\nclass Post(BaseModel):\n    title: str\n    content: str\n    author: str\n    comments: List[Comment] = []\n\n# Extract recursive nested data\nresponse = client.create(\n    model=\"gpt-4\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Blog Post: \"Python Tips and Tricks\"\n        Author: John Smith\n        Content: Here are some helpful Python tips for beginners...\n\n        Comments:\n        1. Alice: \"Great post! Very helpful.\"\n           - Bob: \"I agree, I learned a lot.\"\n             - Alice: \"Bob, did you try the last example?\"\n           - Charlie: \"Thanks for sharing this.\"\n        2. David: \"Could you explain the second tip more?\"\n           - John: \"Sure, I'll add more details.\"\n        \"\"\"}\n    ],\n    response_model=Post\n)\n```\n\nFor more advanced recursive structures, see the [Recursive Structures](../../examples/recursive.md) guide.\n\n## Real-world Example: Organization Structure\n\nHere's a more complete example extracting an organization structure:\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Employee(BaseModel):\n    name: str\n    title: str\n\nclass Department(BaseModel):\n    name: str\n    head: Employee\n    employees: List[Employee]\n    sub_departments: List[\"Department\"] = []\n\n# Update for Pydantic's recursive model support\nDepartment.model_rebuild()\n\nclass Organization(BaseModel):\n    name: str\n    ceo: Employee\n    departments: List[Department]\n\n# Extract organization structure\nresponse = client.create(\n    model=\"gpt-4\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Acme Corporation\n        CEO: Jane Smith, Chief Executive Officer\n\n        Departments:\n\n        1. Engineering\n           Head: Bob Johnson, CTO\n           Employees:\n           - Sarah Lee, Senior Engineer\n           - Tom Brown, Software Developer\n\n           Sub-departments:\n           - Frontend Team\n             Head: Lisa Wang, Frontend Lead\n             Employees:\n             - Mike Chen, UI Developer\n             - Ana Garcia, UX Designer\n\n           - Backend Team\n             Head: David Kim, Backend Lead\n             Employees:\n             - James Wright, Database Engineer\n             - Rachel Patel, API Developer\n\n        2. Marketing\n           Head: Michael Davis, CMO\n           Employees:\n           - Jennifer Miller, Marketing Specialist\n           - Robert Chen, Content Creator\n        \"\"\"}\n    ],\n    response_model=Organization\n)\n```\n\n\n## Related Resources\n\n- [Simple Object Extraction](./simple_object.md) - Extracting basic objects\n- [List Extraction](./list_extraction.md) - Working with lists of objects\n- [Optional Fields](./optional_fields.md) - Handling optional data\n- [Recursive Structures](../../examples/recursive.md) - Building more complex hierarchies\n- [Field Validation](./field_validation.md) - Adding validation to your fields\n\n"
  },
  {
    "path": "docs/learning/patterns/optional_fields.md",
    "content": "---\ntitle: Working with Optional Fields in Instructor\ndescription: Learn how to use optional fields in Pydantic models to handle missing or uncertain information from LLM outputs.\n---\n\n# Optional Fields\n\nThis guide explains how to work with optional fields in your data models. Optional fields allow the model to skip fields when information is unavailable or uncertain.\n\n## Why Use Optional Fields?\n\nOptional fields are useful when:\n\n1. Some information is missing from the input text\n2. Certain fields are only relevant in specific contexts\n3. The LLM can't confidently extract all fields\n4. You want to allow partial success instead of complete failure\n\n## Basic Optional Fields\n\nTo make a field optional, use Python's `Optional` type and provide a default value:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Person(BaseModel):\n    name: str  # Required field\n    age: Optional[int] = None  # Optional field with None default\n    occupation: Optional[str] = None  # Optional field with None default\n```\n\nHere, `name` is required, while `age` and `occupation` are optional and will default to `None` if not found.\n\n## Using Default Values\n\nYou can provide meaningful default values for optional fields:\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Product(BaseModel):\n    name: str\n    price: float\n    currency: str = \"USD\"  # Default value\n    in_stock: bool = True  # Default value\n    tags: List[str] = []  # Default empty list\n```\n\n## Optional Fields with Validation\n\nYou can add the `Field` class for more control and validation:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass UserProfile(BaseModel):\n    username: str\n    email: str\n    bio: Optional[str] = Field(\n        None,  # Default value\n        max_length=200,  # Validation applies if present\n        description=\"User's biography, limited to 200 characters\"\n    )\n```\n\n## Optional Nested Structures\n\nEntire nested structures can be optional:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass Contact(BaseModel):\n    email: str\n    phone: Optional[str] = None\n    address: Optional[Address] = None  # Optional nested structure\n\nclass Person(BaseModel):\n    name: str\n    contact: Contact\n```\n\nWhen using nested optional structures, check if they exist before accessing:\n\n```python\n# Access nested data safely\nif person.contact.address:\n    print(f\"Address: {person.contact.address.city}\")\nelse:\n    print(\"No address information available\")\n```\n\n## Using `Maybe` for Uncertain Fields\n\nInstructor provides a `Maybe` type for uncertain or ambiguous fields:\n\n```python\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor.types import Maybe\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass PersonInfo(BaseModel):\n    name: str\n    age: Maybe[int] = None  # Maybe type for uncertain fields\n```\n\nCheck if a `Maybe` field contains uncertain information:\n\n```python\nif person.age and person.age.is_uncertain:\n    print(f\"Uncertain age: approximately {person.age.value}\")\nelif person.age:\n    print(f\"Age: {person.age.value}\")\nelse:\n    print(\"Age: Unknown\")\n```\n\nFor more about the `Maybe` type, see the [Missing Concepts](../../concepts/maybe.md) page.\n\n## Handling Optional Values\n\nAlways handle the possibility of `None` values in your code:\n\n```python\n# Check for None before using\nif person.age is not None:\n    drinking_age = \"Legal\" if person.age >= 21 else \"Underage\"\nelse:\n    drinking_age = \"Unknown\"\n\n# Use conditional expressions\nprice_display = f\"${product.price}\" if product.price is not None else \"Price unavailable\"\n\n# Provide defaults with 'or'\ndisplay_name = user.nickname or user.username\n```\n\n## Validation with Optional Fields\n\nOptional fields can still have validation when they're present:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel, field_validator\nimport instructor\nimport re\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass ContactInfo(BaseModel):\n    email: str\n    phone: Optional[str] = None\n\n    @field_validator('phone')\n    @classmethod\n    def validate_phone(cls, v):\n        if v is not None and not re.match(r'^\\+?[1-9]\\d{1,14}$', v):\n            raise ValueError(\"Invalid phone format\")\n        return v\n```\n\n## Related Resources\n\n- [Simple Object Extraction](./simple_object.md) - Extracting basic objects\n- [Field Validation](./field_validation.md) - Adding validation to fields\n- [Nested Structure](./nested_structure.md) - Working with complex data\n- [Missing Concepts](../../concepts/maybe.md) - Using the Maybe type for uncertain fields\n\n## Next Steps\n\n- Learn about [Field Validation](./field_validation.md)\n- Explore [Nested Structure](./nested_structure.md) for complex data\n- Check out [Prompt Templates](./prompt_templates.md) for crafting prompts"
  },
  {
    "path": "docs/learning/patterns/prompt_templates.md",
    "content": "---\ntitle: Using Prompt Templates with Instructor\ndescription: Learn how to create reusable prompt templates for consistent structured output extraction across different use cases.\n---\n\n# Prompt Templates\n\nThis guide covers how to use prompt templates with Instructor to create reusable, parameterized prompts for structured data extraction.\n\n## Why Prompt Templates Matter\n\nGood prompts are essential for effective structured data extraction. Prompt templates help you:\n\n1. Create consistent and reusable prompts\n2. Parameterize prompts with dynamic values\n3. Separate prompt engineering from application logic\n4. Standardize prompt patterns for different use cases\n\n## Basic Prompt Templates\n\nThe simplest form of a prompt template is a string with placeholders for variables:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n\n# Define a template with parameters\nprompt_template = \"\"\"\nExtract information about the person mentioned in the following {document_type}:\n\n{content}\n\nPlease provide their name, age, and occupation.\n\"\"\"\n\n# Use the template with specific values\ndocument_type = \"email\"\ncontent = \"Hi team, I'm introducing our new project manager, Sarah Johnson. She's 34 and has been in project management for 8 years.\"\n\nprompt = prompt_template.format(\n    document_type=document_type,\n    content=content\n)\n\n# Extract structured data using the formatted prompt\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": prompt}\n    ],\n    response_model=Person\n)\n```\n\n## Using f-strings for Simple Templates\n\nFor simple cases, you can use f-strings to create prompt templates:\n\n```python\ndef extract_person(content, document_type=\"text\"):\n    prompt = f\"\"\"\n    Extract information about the person mentioned in the following {document_type}:\n\n    {content}\n\n    Please provide their name, age, and occupation.\n    \"\"\"\n\n    return client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\"role\": \"user\", \"content\": prompt}\n        ],\n        response_model=Person\n    )\n\n# Use the function\nperson = extract_person(\n    \"According to his resume, John Smith (42) works as a software developer.\",\n    document_type=\"resume\"\n)\n```\n\n## Template Functions\n\nFor more complex templates, create dedicated template functions:\n\n```python\nfrom typing import List, Optional\nfrom pydantic import BaseModel\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass ProductReview(BaseModel):\n    product_name: str\n    rating: int\n    pros: List[str]\n    cons: List[str]\n    summary: str\n\ndef create_review_extraction_prompt(\n    review_text: str,\n    product_category: str,\n    include_sentiment: bool = False\n) -> str:\n    sentiment_instruction = \"\"\"\n    Also include a brief sentiment analysis of the review.\n    \"\"\" if include_sentiment else \"\"\n\n    return f\"\"\"\n    Extract product review information from the following {product_category} review:\n\n    {review_text}\n\n    Please identify:\n    - The name of the product being reviewed\n    - The numerical rating (1-5)\n    - A list of pros/positive points\n    - A list of cons/negative points\n    - A brief summary of the review\n    {sentiment_instruction}\n    \"\"\"\n\n# Use the template function\nreview_text = \"\"\"\nI recently purchased the UltraSound X300 headphones, and I'm mostly satisfied.\nThe sound quality is amazing and the battery lasts for days. They're also very\ncomfortable to wear for long periods. However, they're a bit pricey at $299, and\nthe Bluetooth occasionally disconnects. Overall, I'd give them 4 out of 5 stars.\n\"\"\"\n\nprompt = create_review_extraction_prompt(\n    review_text=review_text,\n    product_category=\"headphone\",\n    include_sentiment=True\n)\n\nreview = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": prompt}\n    ],\n    response_model=ProductReview\n)\n```\n\n## Best Practices for Prompt Templates\n\n1. **Be explicit about the output format**: Clearly specify what fields you need and in what format\n2. **Use consistent language**: Maintain consistent terminology throughout the template\n3. **Keep it concise**: Avoid unnecessary verbosity that could confuse the model\n4. **Parameterize only what varies**: Only make template parameters for parts that need to change\n5. **Include examples for complex tasks**: Provide few-shot examples for more complex extractions\n6. **Test with different inputs**: Ensure your template works well with a variety of inputs\n\n## Related Resources\n\n- [Simple Object Extraction](./simple_object.md) - Extracting basic objects\n- [List Extraction](./list_extraction.md) - Working with lists of objects\n- [Optional Fields](./optional_fields.md) - Handling optional data\n- [Prompting](../../concepts/prompting.md) - General prompting concepts\n- [Templating](../../concepts/templating.md) - Advanced template techniques\n\n## Next Steps\n\n- Explore [Field Validation](./field_validation.md) for ensuring data quality\n- Try [List Extraction](./list_extraction.md) for extracting multiple items\n- Learn about [Nested Structure](./nested_structure.md) for complex data"
  },
  {
    "path": "docs/learning/patterns/simple_object.md",
    "content": "---\ntitle: Simple Object Extraction Pattern\ndescription: Learn the fundamental pattern of extracting simple objects from text using Instructor with type-safe validation.\n---\n\n# Simple Object Extraction: LLM Tutorial for Structured Data\n\nLearn how to extract structured objects from text using LLMs in this comprehensive tutorial. We'll cover the fundamental pattern of transforming unstructured text into validated Python objects using Instructor with GPT-4, Claude, and other language models.\n\n## Basic LLM Object Extraction Tutorial\n\n```python\nfrom pydantic import BaseModel\nimport instructor\n# Define your LLM extraction schema\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n\n# Extract structured data from LLM\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nperson = client.create(\n    model=\"gpt-3.5-turbo\",  # Works with GPT-4, Claude, Gemini\n    messages=[\n        {\"role\": \"user\", \"content\": \"John Smith is a 35-year-old software engineer.\"}\n    ],\n    response_model=Person  # Type-safe LLM extraction\n)\n\nprint(f\"Name: {person.name}\")\nprint(f\"Age: {person.age}\")\nprint(f\"Occupation: {person.occupation}\")\n```\n\n```\n┌───────────────┐            ┌───────────────┐\n│ Define Model  │            │ Extracted     │\n│ name: str     │  Extract   │ name: \"John\"  │\n│ age: int      │ ─────────> │ age: 35       │\n│ occupation: str│            │ occupation:   │\n└───────────────┘            │ \"software...\" │\n                             └───────────────┘\n```\n\n## Enhance LLM Extraction with Field Descriptions\n\nGuide your LLM with clear field descriptions for more accurate extraction:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass Book(BaseModel):\n    title: str = Field(description=\"The full title of the book\")\n    author: str = Field(description=\"The author's full name\")\n    publication_year: int = Field(description=\"The year the book was published\")\n```\n\nField descriptions serve as prompts for the LLM, improving extraction accuracy and reducing errors in your structured outputs.\n\n## Handle Missing Data in LLM Responses\n\nReal-world LLM extractions often encounter missing information. Here's how to handle it gracefully:\n\n```python\nfrom typing import Optional\nfrom pydantic import BaseModel\n\nclass MovieReview(BaseModel):\n    title: str\n    director: Optional[str] = None  # Optional field\n    rating: float\n```\n\nUsing `Optional` fields ensures your LLM extraction remains robust when dealing with incomplete or partial information.\n\n## Validate LLM Outputs with Pydantic\n\nEnsure LLM outputs meet your requirements with built-in validation:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass Product(BaseModel):\n    name: str\n    price: float = Field(gt=0, description=\"The product price in USD\")\n    in_stock: bool\n```\n\nPydantic validation ensures your LLM outputs are not just structured, but also correct and business-rule compliant.\n\n## Production-Ready LLM Extraction Example\n\nHere's a complete example showing nested object extraction from LLMs:\n\n```python\nfrom pydantic import BaseModel\nfrom typing import Optional\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass ContactInfo(BaseModel):\n    name: str\n    email: str\n    phone: Optional[str] = None\n    address: Optional[Address] = None\n\n# Extract structured data\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\ncontact = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Contact information:\n        Name: Sarah Johnson\n        Email: sarah.j@example.com\n        Phone: (555) 123-4567\n        Address: 123 Main St, Boston, MA 02108\n        \"\"\"}\n    ],\n    response_model=ContactInfo\n)\n\nprint(f\"Name: {contact.name}\")\nprint(f\"Email: {contact.email}\")\n```\n\n## Common LLM Object Extraction Use Cases\n\n- **Contact Information**: Extract names, emails, phones from unstructured text\n- **Product Details**: Parse product descriptions into structured catalogs\n- **Event Information**: Extract dates, locations, attendees from event descriptions\n- **Entity Recognition**: Identify and structure people, places, organizations\n\n## Continue Your LLM Tutorial Journey\n\n- **[List Extraction Tutorial](list_extraction.md)** - Extract multiple objects from LLM responses\n- **[Nested Structures](nested_structure.md)** - Handle complex hierarchical data from LLMs\n- **[Advanced Validation](field_validation.md)** - Implement business rules for LLM outputs\n\nMaster these patterns to build production-ready LLM applications with reliable structured outputs!"
  },
  {
    "path": "docs/learning/streaming/basics.md",
    "content": "---\ntitle: Streaming Basics with Instructor\ndescription: Learn how to use streaming to receive partial structured responses from LLMs as they are generated.\n---\n\n# Streaming Basics\n\nStreaming allows you to receive parts of a structured response as they're generated, rather than waiting for the complete response.\n\n## Why Use Streaming?\n\nStreaming offers several benefits:\n\n1. **Faster Perceived Response**: Users see results immediately\n2. **Progressive UI Updates**: Update your interface as data arrives\n3. **Processing While Generating**: Start using data before the complete response is ready\n\n```\nWithout Streaming:\n┌─────────┐             ┌─────────────────────┐\n│ Request │─── Wait ───>│ Complete Response   │\n└─────────┘             └─────────────────────┘\n\nWith Streaming:\n┌─────────┐    ┌───────┐    ┌───────┐    ┌───────┐\n│ Request │───>│Part 1 │───>│Part 2 │───>│Part 3 │─── ...\n└─────────┘    └───────┘    └───────┘    └───────┘\n```\n\n## Simple Example\n\nHere's how to stream a structured response:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n# Define your data structure\nclass UserProfile(BaseModel):\n    name: str\n    bio: str\n    interests: list[str]\n\n# Set up client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n# Enable streaming\nfor partial in client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Generate a profile for Alex Chen\"}\n    ],\n    response_model=UserProfile,\n    stream=True  # This enables streaming\n):\n    # Print each update as it arrives\n    print(\"\\nUpdate received:\")\n\n    # Access available fields\n    if hasattr(partial, \"name\") and partial.name:\n        print(f\"Name: {partial.name}\")\n    if hasattr(partial, \"bio\") and partial.bio:\n        print(f\"Bio: {partial.bio[:30]}...\")\n    if hasattr(partial, \"interests\") and partial.interests:\n        print(f\"Interests: {', '.join(partial.interests)}\")\n```\n\n## How Streaming Works\n\nWhen streaming with Instructor:\n\n1. Enable streaming with `stream=True`\n2. The method returns an iterator of partial responses\n3. Each partial contains fields that have been completed so far\n4. You check for fields using `hasattr()` since they appear incrementally\n5. The final iteration contains the complete response\n\n## Progress Tracking Example\n\nHere's a simple way to track progress:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Report(BaseModel):\n    title: str\n    summary: str\n    conclusion: str\n\n# Track completed fields\ncompleted = set()\ntotal_fields = 3  # Number of fields in our model\n\nfor partial in client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Generate a report on climate change\"}\n    ],\n    response_model=Report,\n    stream=True\n):\n    # Check which fields are complete\n    for field in [\"title\", \"summary\", \"conclusion\"]:\n        if hasattr(partial, field) and getattr(partial, field) and field not in completed:\n            completed.add(field)\n            percent = (len(completed) / total_fields) * 100\n            print(f\"Received: {field} - {percent:.0f}% complete\")\n```\n\n## Next Steps\n\n- Explore [Streaming Lists](lists.md) for handling collections\n- Learn about [Validation with Streaming](../validation/basics.md)"
  },
  {
    "path": "docs/learning/streaming/lists.md",
    "content": "---\ntitle: Streaming Lists with Instructor\ndescription: Learn how to stream lists of structured objects from LLMs, processing collection items as they are generated for better responsiveness.\n---\n\n# Streaming Lists\n\nThis guide explains how to stream lists of structured data with Instructor. Streaming lists allows you to process collection items as they're generated, improving responsiveness for larger outputs.\n\n## Basic List Streaming\n\nHere's how to stream a list of structured objects:\n\n```python\nfrom typing import Iterable\nimport instructor\nfrom pydantic import BaseModel, Field\n# Initialize the client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Book(BaseModel):\n    title: str = Field(..., description=\"Book title\")\n    author: str = Field(..., description=\"Book author\")\n    year: int = Field(..., description=\"Publication year\")\n\n# Stream a list of books\nfor book in client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"List 5 classic science fiction books\"}\n    ],\n    response_model=Iterable[Book],\n):\n    print(f\"Received: {book.title} by {book.author} ({book.year})\")\n```\n\nThis example shows how to:\n1. Define a Pydantic model for each list item\n2. Use Python's typing system to specify a list\n3. Process each item as it arrives in the stream\n\n## Real-world Example: Task Generation\n\nHere's a practical example of streaming a list of tasks with progress tracking:\n\n```python\nfrom typing import Iterable\nimport instructor\nfrom pydantic import BaseModel, Field\nimport time\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Task(BaseModel):\n    title: str = Field(..., description=\"Task title\")\n    description: str = Field(..., description=\"Detailed task description\")\n    priority: str = Field(..., description=\"Task priority (High/Medium/Low)\")\n    estimated_hours: float = Field(..., description=\"Estimated hours to complete\")\n\n\nprint(\"Generating project tasks...\")\nstart_time = time.time()\nreceived_tasks = 0\n\nfor task in client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Generate a list of 5 tasks for building a personal website\",\n        }\n    ],\n    response_model=Iterable[Task],\n    stream=True,\n):\n    received_tasks += 1\n    print(f\"\\nTask {received_tasks}: {task.title} (Priority: {task.priority})\")\n    print(f\"Description: {task.description[:100]}...\")\n    print(f\"Estimated time: {task.estimated_hours} hours\")\n\n    # Calculate progress percentage based on expected items\n    progress = (received_tasks / 5) * 100\n    print(f\"Progress: {progress:.0f}%\")\n\nelapsed_time = time.time() - start_time\nprint(f\"\\nAll {received_tasks} tasks generated in {elapsed_time:.2f} seconds\")\n\n```\n\n## Related Resources\n\n- [Streaming Basics](./basics.md) - Fundamentals of streaming structured outputs\n- [List Extraction](../../learning/patterns/list_extraction.md) - Core concepts for working with lists\n- [Validation Basics](../../learning/validation/basics.md) - Understanding validation for streaming\n- [Streaming API](../../concepts/partial.md) - Technical details on the streaming implementation\n\n## Next Steps\n\n- Learn about [Validation](../../learning/validation/basics.md) to ensure your streamed data is valid\n- Explore [Field Validation](../../learning/validation/field_level_validation.md) for more control\n- See [Async Support](../../integrations/index.md) for integrating streaming with your specific provider when writing asynchronous code"
  },
  {
    "path": "docs/learning/validation/basics.md",
    "content": "---\ntitle: LLM Validation Basics with Instructor\ndescription: Master the fundamentals of validating LLM outputs to ensure reliable, business-compliant structured data from GPT-4, Claude, and other models.\n---\n\n# LLM Validation Tutorial: Ensure Data Quality with Instructor\n\nMaster the fundamentals of validating LLM outputs in this comprehensive tutorial. Learn how to use Instructor's validation system to ensure GPT-4, Claude, and other language models produce reliable, business-compliant structured data.\n\n## Why LLM Output Validation is Critical\n\nWhen extracting structured data from LLMs, validation ensures:\n\n1. **Data Integrity**: LLM outputs contain all required fields with correct formats\n2. **Business Compliance**: Extracted data adheres to your domain rules and constraints\n3. **Production Reliability**: LLM responses meet quality standards before entering your system\n\n```\n┌─────────────┐    ┌──────────────┐    ┌─────────────┐\n│ LLM         │ -> │ Instructor   │ -> │ Validated   │\n│ Generates   │    │ Validates    │    │ Structured  │\n│ Response    │    │ Structure    │    │ Data        │\n└─────────────┘    └──────────────┘    └─────────────┘\n                          │\n                          │ If validation fails\n                          ▼\n                   ┌─────────────┐\n                   │ Retry with  │\n                   │ Feedback    │\n                   └─────────────┘\n```\n\n## Basic LLM Validation Example\n\nSee how Instructor validates LLM outputs automatically:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\n# Define validation rules for LLM extraction\nclass UserProfile(BaseModel):\n    name: str\n    age: int = Field(ge=13, description=\"User's age in years\")\n\n# Extract and validate LLM output\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",  # Works with GPT-4, Claude, Gemini\n    messages=[\n        {\"role\": \"user\", \"content\": \"My name is Jane Smith and I'm 25 years old.\"}\n    ],\n    response_model=UserProfile  # Automatic validation\n)\n\nprint(f\"User: {response.name}, Age: {response.age}\")\n```\n\nKey validation features in this LLM tutorial:\n- **Constraint Validation**: Age must be ≥ 13 years\n- **Automatic Retry**: If LLM output fails validation, Instructor retries with error context\n- **Type Safety**: Ensures LLM returns proper data types\n\n## Essential LLM Validation Patterns\n\nCommon validation rules for LLM outputs:\n\n| Validation | Example | What It Does |\n|------------|---------|-------------|\n| Type checking | `age: int` | Ensures value is an integer |\n| Required fields | `name: str` | Field must be present |\n| Optional fields | `middle_name: Optional[str] = None` | Field can be missing |\n| Minimum value | `age: int = Field(ge=18)` | Value must be ≥ 18 |\n| Maximum value | `rating: float = Field(le=5.0)` | Value must be ≤ 5.0 |\n| String length | `username: str = Field(min_length=3)` | String must be at least 3 chars |\n\n## How LLM Output Validation Works\n\nThe LLM validation pipeline in Instructor:\n\n1. **LLM Generation**: Language model produces structured output\n2. **Schema Matching**: Instructor maps LLM response to your Pydantic model\n3. **Validation Check**: Pydantic validates against defined constraints\n4. **Smart Retry**: On failure, errors are sent back to the LLM with context\n5. **Success or Timeout**: Process continues until valid output or retry limit\n\n## Enhance LLM Validation with Custom Messages\n\nGuide LLMs with specific error messages for better corrections:\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass Product(BaseModel):\n    name: str\n    price: float = Field(\n        gt=0,\n        description=\"Product price in USD\",\n        json_schema_extra={\"error_msg\": \"Price must be greater than zero\"}\n    )\n```\n\n## Common LLM Validation Use Cases\n\n- **Age Verification**: Ensure extracted ages meet minimum requirements\n- **Price Validation**: Verify LLM-extracted prices are positive numbers\n- **Email Format**: Validate email addresses from unstructured text\n- **Date Constraints**: Ensure dates are within valid ranges\n- **Business Rules**: Enforce domain-specific constraints on LLM outputs\n\n## Continue Your LLM Validation Journey\n\n- **[Custom Validators](custom_validators.md)** - Build complex validation logic for LLM outputs\n- **[Retry Mechanisms](retry_mechanisms.md)** - Configure how Instructor handles validation failures\n- **[Field-Level Validation](field_level_validation.md)** - Validate individual fields in LLM responses\n\nMaster validation to ensure your LLM applications produce reliable, production-ready data!"
  },
  {
    "path": "docs/learning/validation/custom_validators.md",
    "content": "---\ntitle: Custom Validators for LLM Outputs\ndescription: Learn to build custom validators for LLM outputs using rule-based and semantic validation techniques with Instructor.\n---\n\n# Custom LLM Validators Tutorial: Advanced Data Quality Control\n\nLearn how to build custom validators for LLM outputs in this advanced tutorial. Master both rule-based and semantic validation techniques to ensure GPT-4, Claude, and other language models produce data that meets your exact requirements.\n\n## Basic Custom Validator\n\nCustom validators are functions that validate field values and can be applied using Pydantic's field validators.\n\n```python\nfrom pydantic import BaseModel, field_validator\nimport instructor\n\n# Initialize the client\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n    @field_validator('age')\n    @classmethod\n    def validate_age(cls, value):\n        if value < 0 or value > 120:\n            raise ValueError(\"Age must be between 0 and 120\")\n        return value\n\n# Extract data with validation\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"The person's name is John and they are 150 years old.\"}\n    ],\n    response_model=Person\n)\n```\n\nIf the model returns an age outside the valid range, Instructor will retry the request with specific feedback about the validation failure.\n\nFor more information on how Instructor handles validation and retries, see [Validation Basics](../../concepts/validation.md) and the [Retrying](../../concepts/retrying.md) concepts page.\n\n## Complex Validation\n\nYou can create more complex validators that check multiple fields or have conditional logic:\n\n```python\nfrom pydantic import BaseModel, field_validator, model_validator\nimport instructor\nfrom typing import List, Optional\nfrom datetime import date\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Employee(BaseModel):\n    name: str\n    hire_date: date\n    termination_date: Optional[date] = None\n    skills: List[str]\n\n    @field_validator('skills')\n    @classmethod\n    def validate_skills(cls, skills):\n        if len(skills) < 1:\n            raise ValueError(\"Employee must have at least one skill\")\n        return skills\n\n    @model_validator(mode='after')\n    def validate_dates(self):\n        if self.termination_date and self.termination_date < self.hire_date:\n            raise ValueError(\"Termination date cannot be before hire date\")\n        return self\n```\n\nFor more advanced validation approaches, check out [Field-level Validation](../../concepts/fields.md) and the [Validators](../../concepts/reask_validation.md) concepts page.\n\n## Handling Complex Data Types\n\nCustom validators can also process more complex data types and perform transformations:\n\n```python\nfrom pydantic import BaseModel, field_validator\nimport instructor\nimport re\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Contact(BaseModel):\n    name: str\n    email: str\n    phone: str\n\n    @field_validator('email')\n    @classmethod\n    def validate_email(cls, value):\n        pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'\n        if not re.match(pattern, value):\n            raise ValueError(\"Invalid email format\")\n        return value\n\n    @field_validator('phone')\n    @classmethod\n    def validate_phone(cls, value):\n        # Remove non-digit characters and validate\n        digits_only = re.sub(r'\\D', '', value)\n        if len(digits_only) < 10:\n            raise ValueError(\"Phone number must have at least 10 digits\")\n        return digits_only  # Return the cleaned version\n```\n\nFor a practical example of extraction with validation, see the [Contact Information Extraction](../../examples/extract_contact_info.md) example.\n\n## Using External Services for Validation\n\nYou can also use external services or APIs for validation:\n\n```python\nfrom pydantic import BaseModel, field_validator\nimport instructor\nimport requests\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\n    @field_validator('zip_code')\n    @classmethod\n    def validate_zip_code(cls, value):\n        # Example of validation using an external service (simplified)\n        # In a real app, you might use a postal code validation API\n        if not (value.isdigit() and len(value) == 5):\n            raise ValueError(\"Zip code must be 5 digits\")\n        return value\n```\n\n## Semantic Validation with LLMs\n\nFor complex validation scenarios where rule-based validation is difficult, Instructor provides semantic validation capabilities using LLMs via the `llm_validator` function. For a comprehensive guide on this topic, see the dedicated [Semantic Validation](../../concepts/semantic_validation.md) page:\n\n```python\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nimport instructor\nfrom instructor import llm_validator\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass ProductDescription(BaseModel):\n    product_name: str\n    description: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\n                \"The description must be professional, accurate, and free of hyperbole. \"\n                \"It should not make unsubstantiated claims or use superlatives excessively.\",\n                client=client\n            )\n        )\n    ]\n\n# This would fail validation because it uses excessive hyperbole\ntry:\n    product = ProductDescription(\n        product_name=\"SuperClean 3000\",\n        description=\"The absolute BEST cleaning product in the world! Will change your life FOREVER! Makes every other cleaning product completely OBSOLETE!\"\n    )\nexcept ValueError as e:\n    print(e)  # The validation error would explain the issue with the hyperbolic language\n```\n\nSemantic validation is particularly useful for validating against criteria that are:\n\n1. **Subjective** - Such as tone, style, or appropriateness\n2. **Contextual** - Requiring understanding of relationships between elements\n3. **Complex** - Where multiple interrelated factors need to be evaluated together\n4. **Hard to formalize** - When rules would be too numerous or complex to express programmatically\n\nUnlike rule-based validators that check against predefined criteria, semantic validators leverage LLMs to evaluate content based on natural language instructions. They can understand nuance and context in ways that traditional validation cannot.\n\n### When to Use Semantic Validation\n\nConsider using semantic validation when:\n\n- You need to enforce style guidelines or content policies\n- Validating natural language content against subjective criteria\n- Checking for consistency across multiple fields or complex relationships\n- Traditional validation would require hundreds of individual rules\n\nRemember that semantic validation requires additional API calls, which adds cost and latency to your application. Use it strategically for high-value validation needs rather than for simple constraints that can be handled with standard validators.\n\n## Handling Validation Failures\n\nWhen validation fails, Instructor can handle it in different ways. Learn more about:\n\n- [Retry Mechanisms](../../concepts/retrying.md) for automatic retries with feedback\n- [Self-Correction](../../examples/self_critique.md) for AI model self-correction techniques\n\n## Best Practices for Custom Validators\n\n1. **Be specific in error messages**: Provide clear error messages that explain exactly what went wrong\n2. **Validate early**: Apply validators to individual fields when possible before model-level validation\n3. **Keep validators focused**: Each validator should have a single responsibility\n4. **Use type hints**: Proper type hints help both Pydantic and Instructor understand your data better\n5. **Consider both validation and transformation**: Validators can both validate and transform data\n6. **Choose appropriate validation type**: Use rule-based validation for simple, objective criteria and semantic validation for complex, subjective, or context-dependent validation\n7. **Balance cost and benefits**: Consider the additional cost and latency of semantic validation against the value it provides\n\nFor more information on validation in general, check out the [Validation](../../concepts/validation.md) concepts page.\n\n## Related Resources\n\n- [Fields](../../concepts/fields.md) - Learn about field definitions and properties\n- [Models](../../concepts/models.md) - Understand model creation and configuration\n- [Types](../../concepts/types.md) - Explore the different data types you can use\n\nCustom validators are a powerful way to ensure the data you extract meets your specific requirements, improving the reliability and quality of structured outputs from LLMs."
  },
  {
    "path": "docs/learning/validation/field_level_validation.md",
    "content": "---\ntitle: Field-level Validation with Instructor\ndescription: Learn how to create specific validation rules for individual fields in your Pydantic models to ensure data quality.\n---\n\n# Field-level Validation\n\nField-level validation lets you create specific rules for individual fields in your data models. This guide shows how to use field-level validation with Instructor.\n\n## What is Field-level Validation?\n\nField-level validation in Instructor uses Pydantic's validation features to:\n\n1. Check individual fields with custom rules\n2. Transform field values (like formatting or cleaning data)\n3. Apply business rules to specific fields\n4. Give clear feedback when values are invalid\n\nValidation happens when your model is being processed, and if it fails, Instructor will retry with better instructions.\n\n## Basic Field Validation\n\nYou can apply simple validation using Pydantic's Field constraints:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass User(BaseModel):\n    name: str = Field(..., min_length=2, description=\"User's full name\")\n    age: int = Field(..., ge=18, le=120, description=\"User's age in years\")\n    email: str = Field(\n        ...,\n        pattern=r\"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$\",\n        description=\"Valid email address\"\n    )\n```\n\nFor more details, see the [Fields](../../concepts/fields.md) concepts page.\n\n## Custom Field Validators\n\nFor more complex rules, use the `field_validator` decorator:\n\n```python\nfrom pydantic import BaseModel, field_validator\nimport instructor\nimport re\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Product(BaseModel):\n    name: str\n    sku: str\n    price: float\n\n    @field_validator('name')\n    @classmethod\n    def validate_name(cls, v):\n        if len(v.strip()) < 3:\n            raise ValueError(\"Product name must be at least 3 characters long\")\n        return v.strip().title()  # Clean up and format\n\n    @field_validator('sku')\n    @classmethod\n    def validate_sku(cls, v):\n        pattern = r'^[A-Z]{3}-\\d{4}$'\n        if not re.match(pattern, v):\n            raise ValueError(\"SKU must be in format XXX-0000 (3 uppercase letters, dash, 4 digits)\")\n        return v\n```\n\n## Validating Multiple Fields Together\n\nSometimes one field's validity depends on other fields. Use `model_validator` for this:\n\n```python\nfrom pydantic import BaseModel, model_validator\nimport instructor\nfrom datetime import date\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nclass Reservation(BaseModel):\n    check_in: date\n    check_out: date\n    room_type: str\n    guests: int\n\n    @model_validator(mode='after')\n    def validate_dates(self):\n        if self.check_out <= self.check_in:\n            raise ValueError(\"Check-out date must be after check-in date\")\n\n        if self.room_type == \"Standard\" and self.guests > 2:\n            raise ValueError(\"Standard rooms can only fit 2 guests\")\n\n        return self\n```\n\n## How Validation Errors Are Handled\n\nWhen validation fails, Instructor adds error details to help the AI fix the problem:\n\n```\nThe following errors occurred during validation:\n- product_sku: Product not found\n- quantity: Quantity must be at least 1\n\nPlease fix these errors and ensure the response is valid.\n```\n\n## Best Practices\n\n1. **Order matters**: Validators run in the order they're defined\n2. **Clear messages**: Write specific error messages\n3. **Clean first**: Handle data cleaning before validation\n4. **Validate early**: Check fields before model-level validation\n5. **Transform wisely**: Field validators can both check and change values\n\n## Related Resources\n\n- [Fields](../../concepts/fields.md) - Basic field properties\n- [Custom Validators](../../concepts/reask_validation.md) - Creating custom validation logic\n- [Validation Basics](../../concepts/validation.md) - Fundamental validation concepts\n- [Retry Mechanisms](../../concepts/retrying.md) - How validation retries work\n- [Fallback Strategies](../../concepts/error_handling.md) - Handling persistent validation failures\n- [Types](../../concepts/types.md) - Understanding data types in Pydantic models\n\n"
  },
  {
    "path": "docs/learning/validation/retry_mechanisms.md",
    "content": "# Retry Mechanisms\n\nRetry mechanisms in Instructor handle validation failures by giving the LLM another chance to generate valid responses. This guide explains how retries work and how to customize them for your use case.\n\n## How Retries Work\n\nWhen validation fails, Instructor:\n\n1. Captures the validation error(s)\n2. Formats them as feedback\n3. Adds the feedback to the prompt context\n4. Asks the LLM to try again with this new information\n\nThis creates a feedback loop that helps the LLM correct its output until it produces a valid response.\n\n## Basic Retry Example\n\nHere's a simple example showing retries in action:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field, field_validator\n\n# Initialize the client with max_retries\nclient = instructor.from_provider(\n    \"openai/gpt-4o\",\n    max_retries=2  # Will try up to 3 times (initial + 2 retries)\n)\n\nclass Product(BaseModel):\n    name: str\n    price: float = Field(..., gt=0)\n\n    @field_validator('name')\n    @classmethod\n    def validate_name(cls, v):\n        if len(v) < 3:\n            raise ValueError(\"Product name must be at least 3 characters\")\n        return v\n\n# This will automatically retry if validation fails\nresponse = client.create(\n    model=\"gpt-3.5-turbo\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Product: Pen, Price: -5\"}\n    ],\n    response_model=Product\n)\n```\n\nIn this example, the initial response will likely fail validation because:\n- The price is negative (violating the `gt=0` constraint)\n- Instructor will automatically retry with feedback about these issues\n\nFor more details on max_retries configuration, see the [Retrying](../../concepts/retrying.md) concepts page.\n\n## Customizing Retry Behavior\n\nYou can customize retry behavior when initializing the Instructor client:\n\n```python\nimport instructor\n\n# Customize retry behavior\nclient = instructor.from_provider(\n    \"openai/gpt-4o\",\n    max_retries=3,                   # Maximum number of retries\n    retry_if_parsing_fails=True,     # Retry on JSON parsing failures\n    throw_error=True                 # Throw an error if all retries fail\n)\n```\n\n### Retry Configuration Options\n\n| Option | Description | Default |\n|--------|-------------|---------|\n| `max_retries` | Maximum number of retry attempts | 0 |\n| `retry_if_parsing_fails` | Whether to retry if JSON parsing fails | True |\n| `throw_error` | Whether to throw an error if all retries fail | True |\n\n## Handling Retry Failures\n\nWhen all retries fail, Instructor raises an `InstructorRetryException` that contains comprehensive information about all failed attempts:\n\n```python\nfrom instructor.core.exceptions import InstructorRetryException\n\ntry:\n    response = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[{\"role\": \"user\", \"content\": \"Product: Invalid data\"}],\n        response_model=Product,\n        max_retries=3\n    )\nexcept InstructorRetryException as e:\n    print(f\"Failed after {e.n_attempts} attempts\")\n    print(f\"Total usage: {e.total_usage}\")\n    \n    # New: Access detailed information about each failed attempt\n    for attempt in e.failed_attempts:\n        print(f\"Attempt {attempt.attempt_number}: {attempt.exception}\")\n        if attempt.completion:\n            # Analyze the raw completion that failed validation\n            print(f\"Raw response: {attempt.completion}\")\n```\n\nThe `InstructorRetryException` now includes:\n\n- `failed_attempts`: A list of `FailedAttempt` objects containing:\n  - `attempt_number`: The retry attempt number\n  - `exception`: The specific exception that occurred\n  - `completion`: The raw LLM response (when available)\n- `n_attempts`: Total number of attempts made\n- `total_usage`: Total token usage across all attempts\n- `last_completion`: The final failed completion\n- `messages`: The conversation history\n\nThis comprehensive tracking enables better debugging and analysis of retry patterns.\n\nFor more on handling validation failures, see [Fallback Strategies](../../concepts/error_handling.md).\n\n## Error Messages and Feedback\n\nInstructor provides detailed error messages to the LLM during retries:\n\n```\nThe following errors occurred during validation:\n- price: ensure this value is greater than 0\n- name: Product name must be at least 3 characters\n\nPlease fix these errors and ensure the response is valid.\n```\n\nThis feedback helps the LLM understand exactly what needs to be fixed.\n\n## Retry Limitations\n\nWhile retries are powerful, they have some limitations:\n\n1. **Retry Budget**: Each retry consumes tokens and time\n2. **Persistent Errors**: Some errors might not be fixable by the LLM\n3. **Model Limitations**: Some models may consistently struggle with certain validations\n\nFor complex validation scenarios, consider implementing [Custom Validators](custom_validators.md) or [Field-level Validation](field_level_validation.md).\n\n## Advanced Retry Pattern: Progressive Validation\n\nFor complex schemas, you can implement a progressive validation pattern:\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field\n\n# Initialize with moderate retries\nclient = instructor.from_provider(\n    \"openai/gpt-4o\",\n    max_retries=2\n)\n\n# Basic validation first\nclass BasicProduct(BaseModel):\n    name: str\n    price: float = Field(..., gt=0)\n\n# Advanced validation second\nclass DetailedProduct(BasicProduct):\n    description: str = Field(..., min_length=10)\n    category: str\n    in_stock: bool\n\n# Two-step extraction with validation\ntry:\n    # First get basic fields\n    basic = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\"role\": \"user\", \"content\": \"Product: Mini Pen, Price: $2.50\"}\n        ],\n        response_model=BasicProduct\n    )\n\n    # Then get full details with context from the first step\n    detailed = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Provide more details about {basic.name} which costs ${basic.price}\"}\n        ],\n        response_model=DetailedProduct\n    )\nexcept Exception as e:\n    # Handle validation failures\n    print(f\"Validation failed: {e}\")\n```\n\n## Related Resources\n\n- [Retrying](../../concepts/retrying.md) - Core retry concepts\n- [Validation](../../concepts/validation.md) - Main validation documentation\n- [Custom Validators](../../concepts/reask_validation.md) - Creating custom validation logic\n- [Fallback Strategies](../../concepts/error_handling.md) - Handling persistent validation failures\n- [Self Critique](../../examples/self_critique.md) - Example of model self-correction\n\n## Next Steps\n\n- Learn about [Field-level Validation](field_level_validation.md)\n- Implement [Custom Validators](custom_validators.md)"
  },
  {
    "path": "docs/llms.txt",
    "content": "# Instructor: Type-Safe Structured Outputs from LLMs\n\nInstructor is a library for extracting structured outputs from Large Language Models (LLMs) with type safety and validation.\n\n## Table of Contents\n\n- [Instructor: Type-Safe Structured Outputs from LLMs](#instructor-type-safe-structured-outputs-from-llms)\n  - [Table of Contents](#table-of-contents)\n  - [Installation](#installation)\n  - [Core Concept](#core-concept)\n  - [Supported Providers](#supported-providers)\n    - [OpenAI](#openai)\n    - [Anthropic](#anthropic)\n    - [Google (Gemini)](#google-gemini)\n    - [Mistral](#mistral)\n    - [Cohere](#cohere)\n    - [Groq](#groq)\n    - [Other Providers](#other-providers)\n  - [Key Features](#key-features)\n    - [Response Validation](#response-validation)\n    - [Streaming Responses](#streaming-responses)\n    - [Partial Streaming](#partial-streaming)\n    - [Iterables](#iterables)\n    - [Multimodal Support](#multimodal-support)\n    - [Caching](#caching)\n    - [Hooks](#hooks)\n    - [Retries and Error Handling](#retries-and-error-handling)\n  - [Advanced Usage](#advanced-usage)\n    - [Parallel Processing](#parallel-processing)\n    - [Templating](#templating)\n    - [Maybe Responses](#maybe-responses)\n  - [Examples](#examples)\n    - [Simple Extraction](#simple-extraction)\n    - [Classification](#classification)\n    - [Complex Schema](#complex-schema)\n    - [Vision and Multimodal](#vision-and-multimodal)\n    - [Validation Context](#validation-context)\n    - [Validation Context with Jinja Templating](#validation-context-with-jinja-templating)\n\n## Installation\n\n```bash\npip install instructor\n```\n\nFor specific providers:\n\n```bash\n# OpenAI\npip install \"instructor[openai]\"\n\n# Anthropic\npip install \"instructor[anthropic]\"\n\n# Google (Gemini)\npip install \"instructor[gemini]\"\n\n# Mistral\npip install \"instructor[mistral]\"\n\n# Cohere\npip install \"instructor[cohere]\"\n```\n\n## Core Concept\n\nInstructor uses Pydantic models to define structured outputs and patches LLM clients to enable extraction with validation.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Define your output structure\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Create client using from_provider\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\n# Extract structured data\nuser = client.create(\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract the user: John Doe is 30 years old.\"}\n    ]\n)\n\nprint(user.name)  # \"John Doe\"\nprint(user.age)   # 30\n```\n\n## Supported Providers\n\n### OpenAI\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n```\n\nAvailable Modes:\n- `Mode.TOOLS` (default) - Uses OpenAI function calling\n- `Mode.JSON` - Uses JSON mode\n- `Mode.MD_JSON` - Uses Markdown JSON mode\n- `Mode.FUNCTIONS` - Uses legacy function calling\n\n### Anthropic\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"anthropic/claude-3-5-sonnet\")\n```\n\nAvailable Modes:\n- `Mode.ANTHROPIC_TOOLS` (default) - Uses Claude tool calling\n- `Mode.JSON` - Uses JSON mode\n\n### Google (Gemini)\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"google/gemini-2.5-flash\")\n```\n\nAvailable Modes:\n- `Mode.GEMINI_JSON` (default) - Generates JSON responses\n- `Mode.GEMINI_TOOL` - Uses Gemini's function calling\n\n### Mistral\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"mistral/mistral-large-latest\")\n```\n\nAvailable Modes:\n- `Mode.MISTRAL_TOOLS` (default) - Uses tools mode\n- `Mode.JSON` - Uses JSON mode\n\n### Cohere\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"cohere/command-r-plus\")\n```\n\nAvailable Modes:\n- `Mode.COHERE_TOOL` (default) - Uses Cohere's tool calling\n\n### Groq\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\"groq/mixtral-8x7b-32768\")\n```\n\nAvailable Modes:\n- `Mode.TOOLS` (default) - Uses function calling\n\n### Other Providers\n\nInstructor supports many additional providers:\n- Azure OpenAI\n- Vertex AI\n- Fireworks\n- Cerebras\n- Writer\n- Anyscale\n- Databricks\n- Together\n- Perplexity\n- Ollama\n- OpenRouter\n- LiteLLM\n- llama-cpp-python\n\n## Key Features\n\n### Response Validation\n\nInstructor automatically validates responses against your Pydantic models:\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\nclass UserWithValidation(BaseModel):\n    name: str\n    age: int = Field(gt=0, lt=150)  # Age must be between 0 and 150\n    email: str = Field(pattern=r\"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$\")\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\nuser = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserWithValidation,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract the user: John Doe is 30 years old, email is john@example.com\"}\n    ]\n)\n```\n\nIf validation fails, instructor will automatically reattempt the request with error details.\n\n### Streaming Responses\n\nStream partial responses as they're generated:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclass Report(BaseModel):\n    summary: str\n    analysis: str\n    recommendations: list[str]\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\n# Enable streaming\nfor partial in client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Report,\n    stream=True,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Write a detailed report about renewable energy.\"}\n    ]\n):\n    # Process each update\n    print(f\"Received update: {partial.model_dump_json()}\")\n\n# The final response has the complete model\nprint(f\"Final report: {partial}\")\n```\n\n### Partial Streaming\n\nStream specific fields as they complete:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom instructor.dsl import partial\n\nclass LongReport(BaseModel):\n    executive_summary: str = partial()\n    detailed_analysis: str = partial()\n    conclusion: str = partial()\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\nfor chunk in client.create(\n    model=\"gpt-4\",\n    response_model=LongReport,\n    stream=True,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Create a detailed report on climate change impacts.\"}\n    ]\n):\n    # Each chunk will contain completed fields\n    if hasattr(chunk, 'executive_summary') and chunk.executive_summary:\n        print(\"Executive Summary Complete!\")\n    if hasattr(chunk, 'detailed_analysis') and chunk.detailed_analysis:\n        print(\"Analysis Complete!\")\n    if hasattr(chunk, 'conclusion') and chunk.conclusion:\n        print(\"Conclusion Complete!\")\n```\n\n### Iterables\n\nProcess multiple items efficiently:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom instructor.dsl import iterable\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\nclass PeopleList(BaseModel):\n    people: list[Person] = iterable()\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\nfor person in client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=PeopleList,\n    stream=True,\n    messages=[\n        {\"role\": \"user\", \"content\": \"List 5 fictional characters with their ages.\"}\n    ]\n).people:\n    print(f\"Received: {person.name}, {person.age}\")\n```\n\n### Multimodal Support\n\nProcess images and other media:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport base64\n\nclass ImageContent(BaseModel):\n    objects: list[str]\n    description: str\n    dominant_colors: list[str]\n\n# Load image\nwith open(\"image.jpg\", \"rb\") as image_file:\n    base64_image = base64.b64encode(image_file.read()).decode('utf-8')\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\ncontent = client.create(\n    model=\"gpt-4-vision-preview\",\n    response_model=ImageContent,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\"type\": \"text\", \"text\": \"Describe this image in detail\"},\n                {\n                    \"type\": \"image_url\",\n                    \"image_url\": {\n                        \"url\": f\"data:image/jpeg;base64,{base64_image}\"\n                    }\n                }\n            ]\n        }\n    ]\n)\n\nprint(content.model_dump_json(indent=2))\n```\n\n### Caching\n\nCache responses to improve performance and reduce API costs:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport diskcache\n\n# Create a cache\ncache = diskcache.Cache(\"./my_cache_directory\")\n\n# Create client with caching\nclient = instructor.from_provider(\n    \"openai/gpt-3.5-turbo\",\n    cache=cache\n)\n\nclass Summary(BaseModel):\n    points: list[str]\n\n# This will use the cache if the same request was made before\nsummary = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Summary,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Summarize the key benefits of renewable energy.\"}\n    ]\n)\n```\n\n### Hooks\n\nMonitor and customize the processing flow:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport json\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n# Define hooks\ndef log_prompt(prompt, **kwargs):\n    print(f\"PROMPT: {json.dumps(prompt)}\")\n    return prompt\n\ndef log_response(response, **kwargs):\n    print(f\"RESPONSE: {response}\")\n    return response\n\ndef log_parsed(parsed, **kwargs):\n    print(f\"PARSED: {parsed}\")\n    return parsed\n\n# Apply hooks\nclient = instructor.from_provider(\n    \"openai/gpt-3.5-turbo\",\n    mode=instructor.Mode.TOOLS,\n    hooks={\n        \"prompt\": log_prompt,\n        \"response\": log_response, \n        \"parsed\": log_parsed\n    }\n)\n\nuser = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=User,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract the user: John Doe is 30 years old.\"}\n    ]\n)\n```\n\n### Retries and Error Handling\n\nHandle validation failures with customizable retry logic:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\n\nclass StrictUser(BaseModel):\n    name: str\n    age: int = Field(gt=0, lt=150)\n    email: str = Field(pattern=r\"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$\")\n\n# Configure max retries\nclient = instructor.from_provider(\n    \"openai/gpt-3.5-turbo\",\n    max_retries=3  # Will retry up to 3 times if validation fails\n)\n\ntry:\n    user = client.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=StrictUser,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract the user: John Doe is 30 years old.\"}\n        ]\n    )\nexcept instructor.exceptions.ValidationError as e:\n    print(f\"Validation failed: {e}\")\n```\n\n## Advanced Usage\n\n### Parallel Processing\n\nProcess multiple tasks concurrently:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom instructor.dsl.parallel import parallel\n\nclass Data(BaseModel):\n    summary: str\n    entities: list[str]\n    sentiment: str\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\n# Create parallel tasks\ntasks = [\n    {\"text\": \"Apple announces new iPhone with revolutionary features.\"},\n    {\"text\": \"Climate scientists warn of increasing global temperatures.\"},\n    {\"text\": \"Stock market hits record high amid economic recovery.\"}\n]\n\n# Process in parallel\nresults = parallel(\n    client=client,\n    model=\"gpt-3.5-turbo\",\n    response_model=Data,\n    prompts=[\n        [{\"role\": \"user\", \"content\": f\"Analyze this text: {task['text']}\"}]\n        for task in tasks\n    ],\n    max_workers=3\n)\n\nfor i, result in enumerate(results):\n    print(f\"Result {i+1}:\")\n    print(f\"  Summary: {result.summary}\")\n    print(f\"  Entities: {', '.join(result.entities)}\")\n    print(f\"  Sentiment: {result.sentiment}\")\n```\n\n### Templating\n\nInstructor supports Jinja templates directly in message content, automatically applying variables from the `context` parameter:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\nclass Analysis(BaseModel):\n    key_points: list[str]\n    summary: str\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\n# Context will be used to render templates in messages\nanalysis = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Analysis,\n    messages=[\n        {\n            \"role\": \"system\", \n            \"content\": \"You are an expert {{ analyst_type }} analyst.\"\n        },\n        {\n            \"role\": \"user\", \n            \"content\": \"\"\"\n            Please analyze the following {{ document_type }}:\n            \n            {{ content }}\n            \n            Provide a detailed analysis.\n            \"\"\"\n        }\n    ],\n    context={\n        \"analyst_type\": \"financial\",\n        \"document_type\": \"news article\",\n        \"content\": \"Renewable energy investments reached record levels in 2023...\"\n    }\n)\n\nprint(f\"Key points: {analysis.key_points}\")\nprint(f\"Summary: {analysis.summary}\")\n```\n\nThe templating system automatically processes all message content containing Jinja syntax (`{{ variable }}`, `{% if condition %}`, etc.) using the variables provided in the `context` parameter. This same context is also available to validators through `info.context`.\n\n### Maybe Responses\n\nHandle uncertain responses gracefully:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom instructor.dsl.maybe import Maybe\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\n# Use Maybe to handle potential missing information\nresult = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Maybe[Person],\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract info about Jane Doe who is 28 years old.\"}\n    ]\n)\n\nif result.value:\n    print(f\"Name: {result.value.name}, Age: {result.value.age}\")\n    if hasattr(result.value, 'occupation'):\n        print(f\"Occupation: {result.value.occupation}\")\n    else:\n        print(\"Occupation information not available\")\nelse:\n    print(f\"Unable to extract person. Reason: {result.reason}\")\n```\n\n## Examples\n\n### Simple Extraction\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\nclass Contact(BaseModel):\n    name: str\n    email: str\n    phone: str\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\ncontact = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=Contact,\n    messages=[\n        {\"role\": \"user\", \"content\": \"My name is John Doe, email is john@example.com and phone is 555-123-4567\"}\n    ]\n)\n\nprint(contact.model_dump_json(indent=2))\n```\n\n### Classification\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom enum import Enum\n\nclass Sentiment(str, Enum):\n    POSITIVE = \"positive\"\n    NEGATIVE = \"negative\"\n    NEUTRAL = \"neutral\"\n\nclass SentimentAnalysis(BaseModel):\n    sentiment: Sentiment\n    confidence: float\n    explanation: str\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\nanalysis = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=SentimentAnalysis,\n    messages=[\n        {\"role\": \"user\", \"content\": \"I absolutely loved the new movie! It was fantastic!\"}\n    ]\n)\n\nprint(f\"Sentiment: {analysis.sentiment}\")\nprint(f\"Confidence: {analysis.confidence}\")\nprint(f\"Explanation: {analysis.explanation}\")\n```\n\n### Complex Schema\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import List, Optional\nfrom datetime import datetime\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    state: str\n    zip_code: str\n\nclass Experience(BaseModel):\n    company: str\n    position: str\n    start_date: datetime\n    end_date: Optional[datetime] = None\n    description: str\n\nclass Person(BaseModel):\n    name: str\n    age: int = Field(gt=0, lt=150)\n    email: str\n    phone: Optional[str] = None\n    address: Address\n    skills: List[str] = Field(min_items=1)\n    experience: List[Experience] = Field(min_items=0)\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\nperson = client.create(\n    model=\"gpt-4\",\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Extract information about Jane Smith who is 35 years old.\n        Email: jane.smith@example.com\n        Phone: 555-987-6543\n        Address: 123 Main St, Springfield, IL 62701\n        Skills: Python, Data Analysis, Machine Learning, Communication\n        \n        Work Experience:\n        - Data Scientist at TechCorp (2019-01-15 to 2023-04-30)\n          Led data science projects for major clients\n        - Junior Analyst at DataFirm (2015-06-01 to 2018-12-15)\n          Performed statistical analysis and created reports\n        \"\"\"}\n    ]\n)\n\nprint(person.model_dump_json(indent=2))\n```\n\n### Vision and Multimodal\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nimport base64\nfrom typing import List\n\nclass Item(BaseModel):\n    name: str\n    price: float = Field(gt=0)\n    quantity: int = Field(gt=0)\n\nclass Receipt(BaseModel):\n    store_name: str\n    date: str\n    items: List[Item]\n    subtotal: float\n    tax: float\n    total: float\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\n# Load the receipt image\nwith open(\"receipt.jpg\", \"rb\") as image_file:\n    base64_image = base64.b64encode(image_file.read()).decode('utf-8')\n\nreceipt = client.create(\n    model=\"gpt-4-vision-preview\",\n    response_model=Receipt,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\"type\": \"text\", \"text\": \"Extract all information from this receipt\"},\n                {\n                    \"type\": \"image_url\",\n                    \"image_url\": {\n                        \"url\": f\"data:image/jpeg;base64,{base64_image}\"\n                    }\n                }\n            ]\n        }\n    ]\n)\n\nprint(receipt.model_dump_json(indent=2))\n```\n\n### Validation Context\n\nValidation context allows you to pass additional contextual information to validators, enabling sophisticated validation that depends on external data:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, field_validator, ValidationInfo\n\nclass CitationCheck(BaseModel):\n    statement: str\n    citation: str\n    \n    @field_validator('citation')\n    def validate_citation(cls, citation: str, info: ValidationInfo) -> str:\n        # Access the validation context\n        source_text = info.context.get(\"source_document\", \"\")\n        \n        # Check if the citation actually exists in the source document\n        if citation not in source_text:\n            raise ValueError(f\"Citation '{citation}' not found in source document\")\n        return citation\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n\nsource_document = \"The Earth is the third planet from the Sun and the only astronomical object known to harbor life.\"\n\nresult = client.create(\n    model=\"gpt-4o\",\n    response_model=CitationCheck,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Make a statement about Earth and provide a citation from the text.\"}\n    ],\n    context={\"source_document\": source_document}\n)\n\nprint(f\"Statement: {result.statement}\")\nprint(f\"Citation: {result.citation} (verified to exist in source)\")\n```\n\nValidation context is particularly useful for:\n\n1. **Citation validation**: Ensuring quoted text exists in source documents\n2. **Content moderation**: Checking against banned word lists\n3. **LLM-as-validator**: Using one LLM to validate the output of another\n4. **Reference data validation**: Checking responses against reference data\n\nCombined with Instructor's automatic reasking, validation context creates a powerful feedback loop:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, field_validator, ValidationInfo\n\nclass RelevantAnswer(BaseModel):\n    answer: str\n    \n    @field_validator('answer')\n    def check_relevance(cls, answer: str, info: ValidationInfo) -> str:\n        question = info.context.get(\"question\", \"\")\n        if \"climate change\" in question.lower() and \"climate\" not in answer.lower():\n            raise ValueError(\"Answer doesn't address climate change as requested in the question\")\n        return answer\n\nclient = instructor.from_provider(\n    \"openai/gpt-3.5-turbo\",\n    max_retries=2  # Will retry up to 2 times if validation fails\n)\n\nquestion = \"What are the major impacts of climate change?\"\n\nresult = client.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=RelevantAnswer,\n    messages=[\n        {\"role\": \"user\", \"content\": \"\"\"\n        Answer the following question:\n\n        <question>\n        {{ question }}\n        </question>\n        \"\"\"}\n    ],\n    context={\"question\": question}\n)\n\nprint(result.answer)  # Guaranteed to mention climate change\n```\n\nThis mechanism enables powerful templating through validation, where you can enforce that responses meet specific criteria or follow particular formats by providing the necessary context for validation.\n\n### Validation Context with Jinja Templating\n\nValidation context can also be used directly in Jinja templates, creating a powerful combination where you can both template your prompts and validate responses against the same context:\n\n```python\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, field_validator, ValidationInfo\nfrom instructor.templating import template\n\nclass AnswerWithContext(BaseModel):\n    answer: str\n    \n    @field_validator('answer')\n    def validate_answer(cls, answer: str, info: ValidationInfo) -> str:\n        # Access the same context used in the template\n        context_doc = info.context.get(\"document\", \"\")\n        if len(context_doc) > 100 and not any(fact in answer for fact in context_doc.split('.')[:3]):\n            raise ValueError(\"Answer doesn't use key facts from the context document\")\n        return answer\n\nclient = instructor.from_provider(\"openai/gpt-3.5-turbo\", max_retries=2)\n\n# Document to use in both template and validation\ncontext_document = \"\"\"\nThe James Webb Space Telescope (JWST) was launched on December 25, 2021. \nIt is the largest optical telescope in space and can observe objects too \nold, distant, or faint for the Hubble Space Telescope. The telescope is \nnamed after James E. Webb, who was the administrator of NASA from 1961 to 1968.\n\"\"\"\n\n# Use the template with variables from context\nquestion = \"When was the James Webb Space Telescope launched and what can it do?\"\n\nresult = client.create(\n    model=\"gpt-4o\",\n    response_model=AnswerWithContext,\n    messages=[\n        {\n            \"role\": \"user\", \n            \"content\": \"\"\"\n            Please answer the following question based on this information:\n\n            {{ document }}\n\n            Question: {{ question }}\n            \"\"\"\n        }\n    ],\n    # Pass the same context to validation\n    context={\n        \"document\": context_document,\n        \"question\": question\n    }\n)\n\nprint(result.answer)  # Guaranteed to include facts from the context\n```\n\nThis approach creates a seamless flow where:\n\n1. The same context variables are used in your Jinja templates for prompt construction\n2. Those same variables are available to validators to ensure the LLM's response is faithful to the provided information\n3. If validation fails, Instructor will automatically retry with error details\n\nThis pattern is especially useful for:\n- RAG applications where you need to ensure responses are grounded in retrieved documents\n- Q&A systems where answers must be factually consistent with provided context\n- Any scenario where you want to template prompts and validate responses against the same data\n\nThis guide covers the core features and usage patterns of the Instructor library. For more detailed examples and advanced use cases, refer to the official documentation.\n"
  },
  {
    "path": "docs/modes-comparison.md",
    "content": "---\ntitle: Mode Comparison Guide\ndescription: Compare different modes available in Instructor and understand when to use each\n---\n\n## Instructor Mode Comparison Guide\n\nInstructor uses **core modes** that work across providers. Provider-specific\nmodes still work, but they are deprecated and will show warnings.\n\nMode handling now lives in provider handlers. The DSL no longer stores\nmode-specific streaming logic.\n\n## Core Modes\n\n- `TOOLS`: Tool or function calling for structured extraction.\n- `JSON_SCHEMA`: Native schema support when a provider has it.\n- `MD_JSON`: JSON from text or code blocks for simple or fallback cases.\n- `PARALLEL_TOOLS`: Multiple tool calls in one response.\n- `RESPONSES_TOOLS`: OpenAI Responses API tools.\n\n## Legacy Modes (Deprecated)\n\nThese legacy modes map to core modes:\n\n- `FUNCTIONS` -> `TOOLS`\n- `TOOLS_STRICT` -> `TOOLS`\n- `ANTHROPIC_TOOLS` -> `TOOLS`\n- `ANTHROPIC_JSON` -> `MD_JSON`\n- `GENAI_TOOLS` -> `TOOLS`\n- `GENAI_JSON` -> `JSON`\n- `MISTRAL_TOOLS` -> `TOOLS`\n- `MISTRAL_STRUCTURED_OUTPUTS` -> `JSON_SCHEMA`\n- `BEDROCK_TOOLS` -> `TOOLS`\n- `BEDROCK_JSON` -> `MD_JSON`\n- `FIREWORKS_TOOLS` -> `TOOLS`\n- `FIREWORKS_JSON` -> `MD_JSON`\n- `CEREBRAS_TOOLS` -> `TOOLS`\n- `CEREBRAS_JSON` -> `MD_JSON`\n- `WRITER_TOOLS` -> `TOOLS`\n- `WRITER_JSON` -> `MD_JSON`\n- `PERPLEXITY_JSON` -> `MD_JSON`\n- `VERTEXAI_TOOLS` -> `TOOLS`\n- `VERTEXAI_JSON` -> `MD_JSON`\n- `VERTEXAI_PARALLEL_TOOLS` -> `PARALLEL_TOOLS`\n\n## Mode Selection Tips\n\n- Use `TOOLS` for most structured output cases.\n- Use `JSON_SCHEMA` when the provider supports native schema enforcement.\n- Use `MD_JSON` if tools are not supported or outputs are simple.\n- Use `PARALLEL_TOOLS` for multiple tasks in one response.\n\n## Examples\n\n### TOOLS Mode (Recommended)\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"openai/gpt-4o-mini\",\n    mode=Mode.TOOLS,\n)\n```\n\n### MD_JSON Mode (Fallback)\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0\",\n    mode=Mode.MD_JSON,\n)\n```\n\n### JSON_SCHEMA Mode (Native Schema)\n\n```python\nimport instructor\nfrom instructor import Mode\n\nclient = instructor.from_provider(\n    \"openai/gpt-4o-mini\",\n    mode=Mode.JSON_SCHEMA,\n)\n```\n\nSee the [Mode Migration Guide](concepts/mode-migration.md) for more details.\n\n### Google/Gemini\n\nFor complex structures:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\n    \"google/gemini-2.5-flash\",\n    mode=instructor.Mode.TOOLS,\n)\n```\n\nFor structured outputs with JSON:\n\n```python\nimport instructor\n\nclient = instructor.from_provider(\n    \"google/gemini-2.5-flash\",\n    mode=instructor.Mode.JSON,\n)\n```\n\n## Mode Compatibility List\n\nLegacy modes are shown for compatibility only. Prefer core modes in new code.\n\n- OpenAI: TOOLS, TOOLS_STRICT, PARALLEL_TOOLS, FUNCTIONS; JSON, MD_JSON, JSON_O1.\n- Anthropic: ANTHROPIC_TOOLS, ANTHROPIC_PARALLEL_TOOLS; ANTHROPIC_JSON.\n- Gemini: TOOLS; JSON.\n- Vertex AI: VERTEXAI_TOOLS; VERTEXAI_JSON.\n- Cohere: COHERE_TOOLS; JSON, MD_JSON.\n- Mistral: MISTRAL_TOOLS; MISTRAL_STRUCTURED_OUTPUTS.\n- Anyscale: (none); JSON, MD_JSON, JSON_SCHEMA.\n- Databricks: TOOLS; JSON, MD_JSON.\n- Together: (none); JSON, MD_JSON.\n- Fireworks: FIREWORKS_TOOLS; FIREWORKS_JSON.\n- Cerebras: (none); CEREBRAS_JSON.\n- Writer: WRITER_TOOLS; JSON.\n- Perplexity: (none); PERPLEXITY_JSON.\n- GenAI: TOOLS; JSON.\n- LiteLLM: depends on provider for both tool-based and JSON-based modes.\n\n## Best Practices\n\n1. **Start with the recommended mode for your provider**\n    - For OpenAI: `TOOLS`\n    - For Anthropic: `ANTHROPIC_TOOLS` (Claude 3+) or `ANTHROPIC_JSON`\n    - For Gemini: `TOOLS` or `JSON`\n\n2. **Try JSON modes for simple structures or if you encounter issues**\n   - JSON modes often work with simpler schemas\n   - They may be more token-efficient\n   - They work with more models\n\n3. **Use provider-specific modes when available**\n   - Provider-specific modes are optimized for that provider\n   - They handle special cases and requirements\n\n4. **Test and validate**\n   - Different modes may perform differently for your specific use case\n   - Always test with your actual data and models\n"
  },
  {
    "path": "docs/newsletter.md",
    "content": "---\ntitle: Subscribe to Instructor Newsletter for AI Updates\ndescription: Get notified about AI tips, blog posts, and research. Stay informed with Instructor's latest features and community insights.\n---\n\n# Instructor Newsletter\n\nIf you want to be notified of tips, new blog posts, and research, subscribe to our newsletter. Here's what you can expect:\n\n- Updates on Instructor features and releases\n- Blog posts on AI and structured outputs\n- Tips and tricks from our community\n- Research in the field of LLMs and structured outputs\n- Information on AI development skills with Instructor\n\nSubscribe to our newsletter for updates on AI development. We provide content to keep you informed and help you use Instructor in projects.\n\n<iframe src=\"https://embeds.beehiiv.com/2faf420d-8480-4b6e-8d6f-9c5a105f917a?slim=true\" data-test-id=\"beehiiv-embed\" height=\"52\" width=\"80%\" frameborder=\"0\" scrolling=\"no\" style=\"margin: 0; border-radius: 0px !important; background-color: transparent;\"></iframe>\n"
  },
  {
    "path": "docs/overrides/main.html",
    "content": "{% extends \"base.html\" %} \n{% block announce %}\n  🎉 Introducing <strong>Kura</strong>: Turn your chat logs into actionable insights! Discover user patterns, extract intents, and understand conversation flows at scale. \n  <a href=\"https://github.com/567-labs/kura\" style=\"color: #64B5F6; text-decoration: underline;\">\n    <strong>Try it on GitHub →</strong>\n  </a>\n{% endblock %}\n<script>\n  !(function (t, e) {\n    var o, n, p, r;\n    e.__SV ||\n      ((window.posthog = e),\n      (e._i = []),\n      (e.init = function (i, s, a) {\n        function g(t, e) {\n          var o = e.split(\".\");\n          2 == o.length && ((t = t[o[0]]), (e = o[1])),\n            (t[e] = function () {\n              t.push([e].concat(Array.prototype.slice.call(arguments, 0)));\n            });\n        }\n        ((p = t.createElement(\"script\")).type = \"text/javascript\"),\n          (p.async = !0),\n          (p.src = s.api_host + \"/static/array.js\"),\n          (r = t.getElementsByTagName(\"script\")[0]).parentNode.insertBefore(\n            p,\n            r\n          );\n        var u = e;\n        for (\n          void 0 !== a ? (u = e[a] = []) : (a = \"posthog\"),\n            u.people = u.people || [],\n            u.toString = function (t) {\n              var e = \"posthog\";\n              return (\n                \"posthog\" !== a && (e += \".\" + a), t || (e += \" (stub)\"), e\n              );\n            },\n            u.people.toString = function () {\n              return u.toString(1) + \".people (stub)\";\n            },\n            o =\n              \"capture identify alias people.set people.set_once set_config register register_once unregister opt_out_capturing has_opted_out_capturing opt_in_capturing reset isFeatureEnabled onFeatureFlags getFeatureFlag getFeatureFlagPayload reloadFeatureFlags group updateEarlyAccessFeatureEnrollment getEarlyAccessFeatures getActiveMatchingSurveys getSurveys onSessionId\".split(\n                \" \"\n              ),\n            n = 0;\n          n < o.length;\n          n++\n        )\n          g(u, o[n]);\n        e._i.push([i, s, a]);\n      }),\n      (e.__SV = 1));\n  })(document, window.posthog || []);\n  posthog.init(\"phc_bAUjZfg1PI0Ca2IOQCM053Y5873PRZhJ0DvTDbGsN9A\", {\n    api_host: \"https://p.useinstructor.com\",\n  });\n</script>\n"
  },
  {
    "path": "docs/prompting/decomposition/decomp.md",
    "content": "---\ndescription: \"DECOMP involves using a LLM to break down a complicated task into sub tasks that it has been provided with\"\n---\n\nDecomposed Prompting<sup><a href=\"https://arxiv.org/pdf/2210.02406\">1</a></sup> leverages a Language Model (LLM) to deconstruct a complex task into a series of manageable sub-tasks. Each sub-task is then processed by specific functions, enabling the LLM to handle intricate problems more effectively and systematically.\n\nIn the code snippet below, we define a series of data models and functions to implement this approach.\n\nThe `derive_action_plan` function generates an action plan using the LLM, which is then executed step-by-step. Each action can be\n\n1. InitialInput: Which represents the chunk of the original prompt we need to process\n2. Split : An operation to split strings using a given separator\n3. StrPos: An operation to help extract a string given an index\n4. Merge: An operation to join a list of strings together using a given character\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"57-58\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom typing import Union\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Split(BaseModel):\n    split_char: str = Field(\n        description=\"\"\"This is the character to split\n        the string with\"\"\"\n    )\n\n    def split_chars(self, s: str, c: str):\n        return s.split(c)\n\n\nclass StrPos(BaseModel):\n    index: int = Field(\n        description=\"\"\"This is the index of the character\n        we wish to return\"\"\"\n    )\n\n    def get_char(self, s: list[str], i: int):\n        return [c[i] for c in s]\n\n\nclass Merge(BaseModel):\n    merge_char: str = Field(\n        description=\"\"\"This is the character to merge the\n        inputs we plan to pass to this function with\"\"\"\n    )\n\n    def merge_string(self, s: list[str]):\n        return self.merge_char.join(s)\n\n\nclass Action(BaseModel):\n    id: int = Field(\n        description=\"\"\"Unique Incremental id to identify\n        this action with\"\"\"\n    )\n    action: Union[Split, StrPos, Merge]\n\n\nclass ActionPlan(BaseModel):\n    initial_data: str\n    plan: list[Action]\n\n\ndef derive_action_plan(task_description: str) -> ActionPlan:\n    return client.create(\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"Generate an action plan to help you complete\n                the task outlined by the user\"\"\",\n            },\n            {\"role\": \"user\", \"content\": task_description},\n        ],\n        response_model=ActionPlan,\n        max_retries=3,\n        model=\"gpt-4o\",\n    )\n\n\nif __name__ == \"__main__\":\n    task = \"\"\"Concatenate the second letter of every word in Jack\n    Ryan together\"\"\"\n    plan = derive_action_plan(task)\n    print(plan.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"initial_data\": \"Jack Ryan\",\n      \"plan\": [\n        {\n          \"id\": 1,\n          \"action\": {\n            \"split_char\": \" \"\n          }\n        },\n        {\n          \"id\": 2,\n          \"action\": {\n            \"index\": 1\n          }\n        },\n        {\n          \"id\": 3,\n          \"action\": {\n            \"merge_char\": \"\"\n          }\n        }\n      ]\n    }\n    \"\"\"\n\n    curr = plan.initial_data\n    cache = {}\n\n    for action in plan.plan:\n        if isinstance(action.action, Split) and isinstance(curr, str):\n            curr = action.action.split_chars(curr, action.action.split_char)\n        elif isinstance(action.action, StrPos) and isinstance(curr, list):\n            curr = action.action.get_char(curr, action.action.index)\n        elif isinstance(action.action, Merge) and isinstance(curr, list):\n            curr = action.action.merge_string(curr)\n        else:\n            raise ValueError(\"Unsupported Operation\")\n\n        print(action, curr)\n        #> id=1 action=Split(split_char=' ') ['Jack', 'Ryan']\n        #> id=2 action=StrPos(index=1) ['a', 'y']\n        #> id=3 action=Merge(merge_char='') ay\n\n    print(curr)\n    #> ay\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Decomposed Prompting: A Modular Approach for Solving Complex Tasks](https://arxiv.org/pdf/2210.02406)\n"
  },
  {
    "path": "docs/prompting/decomposition/faithful_cot.md",
    "content": "---\ndescription: \"Faithful Chain of Thought aims to use multiple reasoning steps to improve the quality of the final outputs\"\n---\n\nFaithful Chain of Thought<sup><a href=\"https://arxiv.org/pdf/2301.13379\">1</a></sup> improves the faithfulness of reasoning chains generated by Language Models by breaking it up into two stages\n\n1. **Translation** : We first translate a user query into a series of reasoning steps. These are a task specific set of steps that we can execute deterministically.\n2. **Problem Solving**: We execute our steps and arrive at a final answer that we can derive. This ensures that our Chain Of Thought is able to derive a answer that is consistent with the reasoning steps.\n\nThey list a few examples in the paper of what these task-specific steps could be\n\n1. **Math Word Problems** : Python Code that can be executed by an interpreter to derive a final answer\n2. **Multi-Hop QA** : This is a multi-step reasoning process. To solve this, they use a mix of python and Datalog ( which is a relation and log programming language ) to arrive at a final answer\n3. **Planning** : When trying to generate a plan to solve a user query, they generate a list of symbolic goals in a Programming Language and then call a PDDL Planner to obtain a plan to solve the user's query\n\n![](../../img/faithful_cot_example.png)\n\nIn the example below, we show how you can use a LLM to generate python code that can be executed by an Interpreter to arrive at a final answer.\n\nWe can implement it in `instructor` as seen below\n\n```python hl_lines=\"30-45\"\nimport instructor\nfrom pydantic import BaseModel, Field\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ReasoningStep(BaseModel):\n    id: int = Field(description=\"Unique ID\")\n    rationale: list[str] = Field(\n        description=\"\"\"Specific sections from prior reasoning\n        steps or the context that ground this reasoning step\"\"\"\n    )\n    dependencies: list[int] = Field(\n        description=\"\"\"IDs of prior reasoning steps that this\n        reasoning step depends on\"\"\"\n    )\n    eval_string: str = Field(\n        description=\"\"\"Python Code to execute to generate the\n        final evaluation\"\"\"\n    )\n\n\ndef generate_reasoning_steps(query: str) -> list[ReasoningStep]:\n    return client.create(\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                You are a world class AI who excels at\n                generating reasoning steps to answer a\n                question. You will be given a question\n                and you will generate a list of reasoning\n                steps that are needed to answer the\n                question.\n\n                At each point you should either\n                - declare a variable to be referenced\n                later on\n                - combine multiple variables together to\n                generate a new result that you should\n                store in another variable\n\n                The final answer should be stored in a\n                variable called `answer`.\n                \"\"\",\n            },\n            {\"role\": \"user\", \"content\": query},\n        ],\n        model=\"gpt-4o\",\n        response_model=list[ReasoningStep],\n    )\n\n\nif __name__ == \"__main__\":\n    steps = generate_reasoning_steps(\n        \"\"\"If there are 3 cars in the parking lot and 2 more\n        cars arrive, how many cars are in the parking lot\n        after another 2 more arrive?\"\"\"\n    )\n\n    code = \"\\n\".join([step.eval_string for step in steps])\n    print(code)\n    \"\"\"\n    initial_cars = 3\n    arriving_cars = 2\n    cars_after_first_arrival = initial_cars + arriving_cars\n    final_car_count = cars_after_first_arrival + 2\n    answer = final_car_count\n    \"\"\"\n    exec(code)\n\n    local_vars = {}\n    exec(code, {}, local_vars)\n    print(local_vars.get(\"answer\"))\n    #> 7\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Faithful Chain-of-Thought Reasoning](https://arxiv.org/pdf/2301.13379)\n"
  },
  {
    "path": "docs/prompting/decomposition/least_to_most.md",
    "content": "---\ntitle: \"Solve simpler subproblems\"\ndescription: \"Least-to-Most is a prompting technique that breaks a complex problem down into a series of increasingly complex subproblems.\"\n---\n\nGiven a complex problem, how can we encourage an LLM to solve simpler subproblems?\n\nLeast-to-Most is a prompting technique that breaks a complex problem down into a series of increasingly complex subproblems.\n\n!!! example \"Subproblems Example\"\n    **original problem**: Adam is twice as old as Mary. Adam will be 11 in 1 year. How old is Mary?\n\n    **subproblems**: (1) How old is Adam now? (2) What is half of Adam's current age?\n\nThese subproblems are solved sequentially, allowing the answers from earlier (simpler) subproblems to inform the LLM while solving later (more complex) subproblems.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Iterable\n\nclass Subquestion(BaseModel):\n    question: str\n\n\nclass Answer(BaseModel):\n    answer: int\n\n\nclass SubquestionWithAnswers(BaseModel):\n    question: str\n    answer: int\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef decompose(question):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Iterable[Subquestion],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Break this question down into subquestions to solve sequentially: {question}\",\n            }\n        ],\n    )\n\n\ndef solve(question, solved_questions, original_question):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Answer,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                    <original_question>\n                    {original_question}\n                    </original_question>\n\n                    <solved_subquestions>\n                    {solved_questions}\n                    </solved_subquestions>\n\n                    Solve this next subquestion: {question}\n                    \"\"\",\n            }\n        ],\n    ).answer\n\n\nif __name__ == \"__main__\":\n    question = \"Four years ago, Kody was only half as old as Mohamed. If Mohamed is currently twice 30 years old, how old is Kody?\"\n\n    # Stage 1: Decompose Question into Subquestions\n    subquestions = decompose(question)\n\n    # Stage 2: Sequentially Solve Subquestions\n    solved_questions = []\n    for subquestion in subquestions:\n        solved_questions.append(\n            SubquestionWithAnswers(\n                question=subquestion.question,\n                answer=solve(subquestion, solved_questions, question),\n            )\n        )\n\n    # Print\n    for item in solved_questions:\n        print(f\"{item.question} {item.answer}\")\n        #> How old is Mohamed currently? 60\n        #> How old was Mohamed four years ago? 56\n        #> How old was Kody four years ago if he was half as old as Mohamed? 28\n        #> How old is Kody currently? 32\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Least-to-Most Prompting Enables Complex Reasoning in Large Language Models](https://arxiv.org/abs/2205.10625)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)"
  },
  {
    "path": "docs/prompting/decomposition/plan_and_solve.md",
    "content": "---\ndescription: \"Plan and Solve involves the use of an improved zero-shot CoT prompt. This generates more robust reasoning processes than standard Zero-Shot CoT on multiple reasoning datasets\"\n---\n\nPlan and Solve<sup><a href=\"https://arxiv.org/pdf/2305.04091\">1</a></sup> improves the use of an improved Zero-Shot Chain Of Thought (CoT) prompt which adds more detailed instructions to the prompt given to these large language models.\n\n!!! example \"Plan and Solve Prompt\"\n\n    [User Prompt]\n\n    **Let’s first understand the problem, extract relevant variables and their corresponding numerals, and make a complete plan.Then, let’s carry out the plan, calculate intermediate variables (pay attention to correct numerical calculation and commonsense), solve the problem step by step, and show the answer.**\n\n    [Model Response]\n\n    **Therefore the answer(arabic numerals) is**\n\nThis is a two step process which guides the LLM to pay more attention to calculation and intermediate results to ensure that they are correctly performed as much as possible.\n\n1. **Generate Reasoning**: In the first step we prompt the model with the user's query and prime the model using plan and solve prompting to explicitly devise a plan for solving a problem before generating an intermediate reasoning process\n2. **Extract Answer** : Once we've obtained the model's reasoning, we then extract the answer from a new prompt which includes the model's chain of thought.\n\n![](../../img/plan_and_solve.png)\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"26-34 67\"\nimport instructor\nfrom pydantic import BaseModel\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Reasoning(BaseModel):\n    chain_of_thought: str\n\n\nclass Response(BaseModel):\n    correct_answer: str\n\n\ndef generate_reasoning(query: str):\n    return client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                <user query>\n                {query}\n                </user query>\n\n                Let's first understand the problem,\n                extract relevant variables and their\n                corresponding numerals, and make a\n                complete plan. Then, let's carry out\n                the plan, calculate intermediate\n                variables (pay attention to correct\n                numerical calculation and commonsense),\n                solve the problem step by step, and\n                show the answer.\n                \"\"\",\n            },\n        ],\n        response_model=Reasoning,\n        model=\"gpt-4o\",\n    )\n\n\ndef extract_answer(query: str, reasoning: Reasoning):\n    return client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                <user query>\n                    {query}\n                </user query>\n\n                Let's first understand the problem,\n                extract relevant variables and their\n                corresponding numerals, and make a\n                complete plan. Then, let's carry out\n                the plan, calculate intermediate\n                variables (pay attention to correct\n                numerical calculation and commonsense),\n                solve the problem step by step, and\n                show the answer.\n\n                <reasoning>\n                {reasoning.chain_of_thought}\n                </reasoning>\n\n                Therefore the answer (arabic numerals) is\n                \"\"\",\n            }\n        ],\n        model=\"gpt-4o\",\n        response_model=Response,\n    )\n\n\nif __name__ == \"__main__\":\n    query = (\n        \"In a dance class of 20 students, 20% enrolled \"\n        \"in contemporary dance, 25% of the remaining \"\n        \"enrolled in jazz dance and the rest enrolled \"\n        \"in hip-hop dance. What percentage of the entire \"\n        \"students enrolled in hip-hop dance?\"\n    )\n\n    reasoning = generate_reasoning(query)\n    print(reasoning.model_dump_json(indent=2))\n    \"\"\"\n    {\n    \"chain_of_thought\": \"Let's first break down the\n    problem:\\n\\n1. Total number of students = 20\\n2.\n    Percentage enrolled in contemporary dance = 20%\\n\\n\n    Step-by-Step Plan:\\n1. Calculate the number of\n    students enrolled in contemporary dance.\\n2.\n    Calculate the remaining students after contemporary\n    dance enrollment.\\n3. Calculate the percentage and\n    number of students from the remaining who enrolled in\n    jazz dance.\\n4. Determine the remaining students who\n    enrolled in hip-hop dance.\\n5. Finally, calculate the\n    percentage of the entire students who enrolled in\n    hip-hop dance.\\n\\nLet's carry out the plan:\\n\\n1.\n    Number of students enrolled in contemporary dance =\n    20% of 20 = (20/100) * 20 = 4\\n2. Remaining students\n    after contemporary = 20 - 4 = 16\\n3. Percentage of\n    remaining students enrolled in jazz dance = 25%\\n\n    Number of students enrolled in jazz dance = 25% of 16\n    = (25/100) * 16 = 4\\n4. Remaining students after\n    contemporary and jazz = 16 - 4 = 12\\n5. The number of\n    students enrolled in hip-hop dance = 12\\n6.\n    Percentage of entire students enrolled in hip-hop =\n    (Number of hip-hop students / Total students) *\n    100\\n   Percentage = (12 / 20) * 100 = 60%\\n\\nThus,\n    60% of the entire students enrolled in hip-hop dance.\"\n    }\n    \"\"\"\n\n    response = extract_answer(query, reasoning)\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"correct_answer\": \"60\"\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models](https://arxiv.org/pdf/2305.04091)\n"
  },
  {
    "path": "docs/prompting/decomposition/program_of_thought.md",
    "content": "---\ndescription: \"Program Of Thought\"\n---\n\nProgram of Thought aims to leverage an external python interpreter in order to generate intermediate reasoning steps. This helps us to achieve a greater degree of performance in mathematical and programming-related tasks by grounding our final response in deterministic code.\n\n![](../../img/pot.jpeg)\n\nWe can implement it in `instructor` as seen below\n\n```python hl_lines=\"120-125\"\nfrom pydantic import BaseModel, Field, field_validator\nimport instructor\nfrom textwrap import dedent\nfrom typing import Literal\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nprefix = \"\"\"\n# Answer this question by implementing a solver()\n# function, use for loop if necessary.\ndef solver():\n    # Let's write a Python program step by step,\n    # and then return the answer\n    # Firstly, we need to define the following\n    # variable:\n\"\"\".strip()\n\n\ndef execute_program(code: str):\n    code = code.strip() + \"\\nans = solver()\"\n    print(code)\n    \"\"\"\n    # Answer this question by implementing a\n    # solver() function, use for loop if necessary.\n    def solver():\n        # Let's write a Python program step by step,\n        # and then return the answer\n        # Firstly, we need to define the following\n        # variable:\n        selling_price = 360\n        profit_percentage = 20\n\n        # To find the cost price, use the formula:\n        # cost_price = selling_price / (1 + profit_percentage / 100)\n        cost_price = selling_price / (1 + profit_percentage / 100)\n\n        return cost_price\n\n    # Running the solver function to get the cost price\n    result = solver()\n    print(result)\n    ans = solver()\n    \"\"\"\n    exec(code)\n    locals_ = locals()\n    return locals_.get(\"ans\")\n\n\nclass Prediction(BaseModel):\n    choice: Literal[\"A\", \"B\", \"C\", \"D\", \"E\"]\n\n\nclass ProgramExecution(BaseModel):\n    program_code: str = Field(\n        description=\"\"\"Program Code that\n    once executed contains the final answer\"\"\"\n    )\n\n    @field_validator(\"program_code\")\n    @classmethod\n    def ensure_valid_code(cls, v: str) -> str:\n        if not v.startswith(prefix):\n            raise ValueError(\n                f\"\"\"Program Code must begin with the desired\n                prefix of {prefix}\"\"\"\n            )\n\n        answer = execute_program(v)\n        if not answer:\n            raise ValueError(\n                f\"\"\"Make sure to return the answer to the\n                question within the solver function\"\"\"\n            )\n\n        return str(answer)\n\n\ndef generate_intermediate_reasoning(query: str):\n    return client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                You are a world class AI system that excels\n                at answering user queries in a systematic\n                and detailed manner. You are about to be\n                passed a user query to respond to. Make sure\n                to generate a valid program that can be\n                executed to answer the user query.\n\n                Make sure to begin your generated program\n                with the following prefix\n\n                {prefix}\n                \"\"\"\n                ),\n            },\n            {\n                \"role\": \"user\",\n                \"content\": query,\n            },\n        ],\n        response_model=ProgramExecution,\n    )\n\n\ndef generate_prediction(\n    predicted_answer: str, options: list[str], query: str\n) -> Prediction:\n    formatted_options = \",\".join(options)\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Prediction,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                Find the closest options based on the\n                question and prediction.\n\n                Question: {query}\n                Prediction: {predicted_answer}\n                Options: [{formatted_options}]\n                \"\"\"\n                ),\n            }\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"A trader sold an article at a profit of 20%\n    for Rs.360. What is the cost price of the article?\"\"\"\n    reasoning = generate_intermediate_reasoning(query)\n    options = [\"A)270\", \"B)300\", \"C)280\", \"D)320\", \"E)315\"]\n    print(reasoning.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"program_code\": \"300.0\"\n    }\n    \"\"\"\n\n    prediction = generate_prediction(reasoning.program_code, options, query)\n    print(prediction.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"choice\": \"B\"\n    }\n    \"\"\"\n```\n"
  },
  {
    "path": "docs/prompting/decomposition/recurs_of_thought.md",
    "content": "---\ntitle: \"\"\ndescription: \"\"\nkeywords: \"\"\n---\n\n[wip]\n"
  },
  {
    "path": "docs/prompting/decomposition/skeleton_of_thought.md",
    "content": "---\ntitle: \"Generate in Parallel\"\ndescription: \"Skelelton-of-Thought is a technique which prompts an LLM to generate a skeleton outline of the response, then completes each point in the skeleton in parallel.\"\n---\n\nHow do we decrease the latency of an LLM pipeline?\n\nSkelelton-of-Thought is a technique which prompts an LLM to generate a skeleton outline of the response, then completes each point in the skeleton in parallel. The parallelism can be achieved by parallel API calls or batched decoding.\n\nBelow is an example of an implementation using parallel API calls with `instructor`:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass Point(BaseModel):\n    index: int\n    description: str\n\n\nclass Skeleton(BaseModel):\n    points: list[Point]\n\n\nclass Response(BaseModel):\n    response: str\n\n\nasync def get_skeleton(question):\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=Skeleton,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                You’re an organizer responsible for only giving the skeleton (not the full content) for answering the question.\n                Provide the skeleton in a list of points (numbered 1., 2., 3., etc.) to answer the question.\n                Instead of writing a full sentence, each skeleton point should be very short with only 3∼5 words.\n                Generally, the skeleton should have 3∼10 points.\n\n                Now, please provide the skeleton for the following question.\n\n                <question>\n                {question}\n                </question>\n\n                Skeleton:\n                \"\"\",\n            }\n        ],\n    )\n\n\nasync def expand_point(question, skeleton, point_index):\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                You’re responsible for continuing the writing of one and only one point in the overall answer to the following question.\n\n                <question>\n                {question}\n                </question>\n\n                The skeleton of the answer is:\n\n                <skeleton>\n                {skeleton}\n                </skeleton>\n\n                Continue and only continue the writing of point {point_index}.\n                Write it **very shortly** in 1∼2 sentence and do not continue with other points!\n                \"\"\",\n            }\n        ],\n    )\n\n\nasync def main():\n    query = \"Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions.\"\n\n    # Step 1: Get the skeleton\n    skeleton = await get_skeleton(query)\n\n    for point in skeleton.points:\n        print(point)\n        #> index=1 description='Introduction to Hawaii trip'\n        #> index=2 description='Arrival and first impressions'\n        #> index=3 description='Traditional Hawaiian cuisine'\n        #> index=4 description='Exploring local markets'\n        #> index=5 description='Visit to historic sites'\n        #> index=6 description='Experience a Hawaiian luau'\n        #> index=7 description='Day at the beach'\n        #> index=8 description='Hiking adventures'\n        #> index=9 description='Scenic viewpoints'\n        #> index=10 description='Closing remarks and tips'\n\n    # Step 2: Expand on each point in parallel\n    tasks = [expand_point(query, skeleton, point.index) for point in skeleton.points]\n    responses = await asyncio.gather(*tasks)\n\n    for response in responses:\n        print(response.response)\n        \"\"\"\n        Hawaii-a paradise of golden beaches, lush landscapes, and vibrant culture-beckoned us with the promise of adventure and unforgettable experiences. Our journey began the moment we landed on this magical archipelago, ready to explore its unique blend of natural beauty and rich traditions.\n        \"\"\"\n        \"\"\"\n        The moment we landed in Hawaii, we were greeted with warm aloha spirit, lush tropical landscapes, and the gentle aroma of hibiscus flowers in the air.\n        \"\"\"\n        \"\"\"\n        The traditional Hawaiian cuisine was an exotic delight; from savoring the rich flavors of poke bowls to indulging in the sweet taste of haupia, every bite was a unique cultural experience.\n        \"\"\"\n        \"\"\"\n        Exploring local markets was a vibrant and delightful experience, where the air was filled with the scent of exotic fruits, freshly-made poke, and sounds of local musicians. We discovered unique handicrafts and interacted with friendly vendors eager to share their stories and traditions.\n        \"\"\"\n        \"\"\"\n        A visit to Pearl Harbor is a poignant reminder of the past, offering a chance to pay respects and learn about the events that shaped history. Walking through the USS Arizona Memorial and exploring the interactive exhibits was both humbling and enlightening.\n        \"\"\"\n        \"\"\"\n        Point 6: Experience a Hawaiian luau - Attending a traditional Hawaiian luau was unforgettable, filled with vibrant dances, soulful music, and a feast of mouthwatering dishes cooked in an imu (underground oven). It was a magical evening that immersed us in the heart of Hawaiian culture.\n        \"\"\"\n        \"\"\"\n        A day at the beach in Hawaii was pure bliss. The crystal-clear waters and soft sands were the perfect backdrop for both relaxation and adventure, from sunbathing to snorkeling.\n        \"\"\"\n        \"\"\"\n        Hiking adventures in Hawaii offer a unique chance to connect with nature, with trails leading to stunning waterfalls and lush rainforests. Don’t miss out on the Na Pali Coast's breathtaking hikes!\n        \"\"\"\n        \"\"\"\n        One of the highlights of my trip was visiting the scenic viewpoints such as the Na Pali Coast and Haleakalā National Park, offering breathtaking panoramic views that are perfect for photography aficionados and nature lovers alike.\n        \"\"\"\n        \"\"\"\n        As you plan your trip, don't forget to pack plenty of sunscreen and a camera to capture every magical moment. Hawaii offers a unique blend of relaxation and adventure that's sure to leave you with unforgettable memories.\n        \"\"\"\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n```\n\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation](https://arxiv.org/abs/2307.15337)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/decomposition/tree-of-thought.md",
    "content": "---\ntitle: \"\"\ndescription: \"\"\nkeywords: \"\"\n---\n\n[wip]\n"
  },
  {
    "path": "docs/prompting/ensembling/cosp.md",
    "content": "---\ndescription: \"Consistency Based Self Adaptive Prompting (COSP) is a ensembling technique that aims to combine multiple Chain Of Thought reasoning calls\"\n---\n\nConsistency Based Self Adaptive Prompting (COSP)<sup><a href=\"https://arxiv.org/pdf/2305.14106\">1</a></sup> aims to improve LLM output quality by generating high quality few shot examples to be included in the final prompt. These are examples without labelled ground truth so they use self-consistency and a metric known as normalized entropy to select the best examples.\n\nOnce they've selected the examples, they then append them to the prompt and generate multiple reasoning chains before selecting the final result using [Self-Consistency](self_consistency.md).\n\n## COSP process\n\n![](../../img/cosp.png)\n\nHow does this look in practice? Let's dive into greater detail.\n\n### Step 1 - Selecting Examples\n\nIn the first step, we try to generate high quality examples from questions that don't have ground truth labels. This is challenging because we want to find a way to automatically determine answer quality when sampling our model multiple times.\n\nIn this case, we have `n` questions which we want to generate `m` possible reasoning chains for each question. This gives a total of `nm` examples. We then want to filter out `k` final few shot examples from these `nm` examples to be included inside our final prompt.\n\n1. Using chain of thought, we first generate `m` responses for each question. These responses contain a final answer and a rationale behind that answer.\n2. We compute a score for each response using a weighted sum of two values - normalized entropy and repetitiveness ( How many times this rationale appears for this amswer )\n3. We rank all of our `nm` responses using this score and choose the `k` examples with the lowest scores as our final few shot examples.\n\n#### Normalized Entropy\n\n> In the paper, the authors write that normalized entropy is a good proxy over a number of different tasks where low entropy is positively correlated with correctness. Entropy is also supposed to range from 0 to 1.\n>\n> Therefore in order to do so, we introduce a `-` term in our implementation so that the calculated values range from 0 to 1.\n\n![](../../img/cosp_entropy.png)\n\nAssuming that for a specific question $x^{(i)}$, we have generated $m$ final answers of which $u$ are unique. ( Note that this only cares about the answer itself and not the rationale )\n\n$$\n\\mathcal{H}\\left(x^{(i)} \\mid \\left\\{\\hat{y}_j^{(i)}\\right\\}_{j=1}^m\\right) = \\frac{\\sum_{\\alpha=1}^u \\hat{p}\\left(\\hat{y}_{\\alpha}^{(i)}\\right) \\log \\hat{p}\\left(\\hat{y}_{\\alpha}^{(i)}\\right)}{\\log m},\n$$\n\nWe can measure the entropy of the generated responses using the formula above where\n\n- $x_i$ is the original question that we prompted the model with\n- $y_j^{i}$ represents the $i$-th sampled response from the $m$ that we generated\n- $\\hat{p}\\left(\\hat{y}_{\\alpha}^{(i)}\\right)$ is the frequency of the unique answer in all the $m$ generated answers. (Eg. if we generate 8 responses and 4 of them return the value 10, then $\\hat{p}\\left(\\hat{y}_{\\alpha}^{(i)}\\right)$ is just going to be 0.5)\n\n#### Repetitiveness\n\n$$\nR_r(r_j^{(i)}) = \\frac{2}{Q(Q-1)} \\sum_{a=1}^{Q} \\sum_{b=a+1}^{Q} W_{ab}\n$$\n\nIn the formula above, $Q$ refers to the number of phrases in the sentence and $W_{ab}$ refers to the cosine similarity of two phrases $a$ and $b$.\n\nRepetitiveness aims to measure how often the language model repeats itself. To do so, the paper sums up the cosine similarity between each sentence inside the generated chain of thought rationale before normalizing it.\n\nThe intuition behind this is that high repetitiveness indicates redundancy, which can lead to poorer performance. Therefore responses with a high number of similar sentences will have a larger score for repetitiveness ( since cosine similarity will be larger for each sentence ).\n\n### Step 2 - Self Consistency\n\nWe now take our `k` responses and append them to our prompt. We then sample our model multiple times using this new prompt and take the majority vote as the answer.\n\n## Implementation\n\nNow that we understand what COSP is, let's see how we can implement it in instructor. Note that here we'll measure repetitiveness using cosine similarity between sentence embeddings.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom openai import AsyncOpenAI, OpenAI\nfrom collections import defaultdict, Counter\nimport asyncio\nfrom textwrap import dedent\nimport math\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass Response(BaseModel):\n    chain_of_thought: list[str]\n    answer: int\n\n\nclass ResponseScore(BaseModel):\n    query: str\n    response: Response\n    score: float\n\n    def format_response(self):\n        return dedent(\n            f\"\"\"\n            Q: {self.query}\n            A: {''.join(self.response.chain_of_thought)}. Therefore the answer is {self.response.answer}.\n            \"\"\"\n        )\n\n\ndef cosine_similarity(vec1: list[float], vec2: list[float]):\n    dot_product = sum(a * b for a, b in zip(vec1, vec2))\n    magnitude1 = math.sqrt(sum(a * a for a in vec1))\n    magnitude2 = math.sqrt(sum(b * b for b in vec2))\n\n    if magnitude1 * magnitude2 == 0:\n        return 0  # Handle the case of zero vectors\n\n    return dot_product / (magnitude1 * magnitude2)\n\n\ndef score_repetitiveness(prediction: Response):\n    if len(prediction.chain_of_thought) == 1:\n        return 0\n\n    embedding = OpenAI().embeddings.create(\n        input=prediction.chain_of_thought, model=\"text-embedding-3-small\"\n    )\n    embedding = [item.embedding for item in embedding.data]\n\n    ttl = 0\n    num_comparisons = 0\n    for idx in range(len(embedding)):\n        for idx2 in range(idx + 1, len(embedding)):\n            ttl += cosine_similarity(embedding[idx], embedding[idx2])\n            num_comparisons += 1\n\n    return ttl / num_comparisons if num_comparisons > 0 else 0\n\n\nasync def generate_cot_response(query: str) -> tuple[Response, str]:\n    return (\n        await client.create(\n            model=\"gpt-4o\",\n            messages=[{\"role\": \"user\", \"content\": query}],\n            response_model=Response,\n            temperature=0.4,\n        ),\n        query,\n    )\n\n\nasync def generate_batch_cot_responses(\n    queries: list[str], m: int\n) -> list[tuple[Response, str]]:\n    coros = [generate_cot_response(query) for query in queries for _ in range(m)]\n    return await asyncio.gather(*coros)\n\n\ndef score_entropy(predictions: list[Response]):\n    counter = Counter([prediction.answer for prediction in predictions])\n\n    prob = [counter[i] / len(predictions) for i in counter]\n\n    numer = -sum([p * math.log(p) for p in prob])\n    denom = math.log(len(predictions))\n\n    return numer / denom\n\n\ndef score_responses(\n    predictions: list[tuple[Response, str]], trade_off_param: float\n) -> list[ResponseScore]:\n    query_to_responses: dict[str, list[Response]] = defaultdict(list)\n    for prediction, query in predictions:\n        query_to_responses[query].append(prediction)\n\n    query_to_entropy = {\n        query: score_entropy(predictions)\n        for query, predictions in query_to_responses.items()\n    }\n\n    return [\n        ResponseScore(\n            query=query,\n            response=prediction,\n            score=query_to_entropy[query]\n            + trade_off_param * score_repetitiveness(prediction),\n        )\n        for prediction, query in predictions\n    ]\n\n\ndef get_top_k_examples(queries: list[ResponseScore], k: int):\n    \"\"\"\n    This gets the top k responses that have the minimum possible score\n    \"\"\"\n    sorted_responses = sorted(queries, key=lambda x: x.score)\n    return sorted_responses[:k]\n\n\nasync def generate_answer_with_examples(query: str, examples: list[ResponseScore]):\n    formatted_examples = \"\\n\".join([example.format_response() for example in examples])\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                You are a world class AI system that excels at answering user queries\n\n                <query>\n                {query}\n                </query>\n\n                <examples>\n                {formatted_examples}\n                </examples>\n                \"\"\"\n                ),\n            }\n        ],\n        response_model=Response,\n    )\n\n\nasync def generate_final_answers(\n    query: str, examples: list[ResponseScore], number_samples: int\n):\n    coros = [\n        generate_answer_with_examples(query, examples) for _ in range(number_samples)\n    ]\n\n    return await asyncio.gather(*coros)\n\n\nif __name__ == \"__main__\":\n    query = (\n        \"The schools debate team had 5 boys and 40 girls on it. \"\n        \"If they were split into groups of 9 how many groups \"\n        \"could they make?\"\n    )\n\n    example_questions = [\n        (\n            \"Debby's class is going on a field trip to the zoo. \"\n            \"If each van can hold 4 people and there are 2 students \"\n            \"and 6 adults going, how many vans will they need?\"\n        ),\n        (\n            \"Nancy had 80 files on her computer. She deleted 31 of \"\n            \"them and put the rest into folders with 7 files in each \"\n            \"one. How many folders did Nancy end up with?\"\n        ),\n        (\n            \"At the arcade, Tom won 32 tickets playing 'whack a mole' \"\n            \"and 25 tickets playing 'skee ball'. If he spent 7 of his \"\n            \"tickets on a hat, how many tickets does Tom have left?\"\n        ),\n    ]\n\n    m = 2  # Number of Reasoning Chains per example ( Step 1 )\n    k = 3  # Number of Examples to include in final prompt (Step 2)\n    n = 2  # Number of Reasoning Chains For Self-Consistency ( Step 2 )\n\n    # Step 1 : Generate the examples\n    responses = asyncio.run(generate_batch_cot_responses(example_questions, m))\n    scored_responses = score_responses(responses, 0.2)\n\n    chosen_examples = get_top_k_examples(scored_responses, k)\n\n    # Step 2 : Run Self-Consistency\n    final_responses = asyncio.run(generate_final_answers(query, chosen_examples, n))\n\n    c = Counter([response.answer for response in final_responses])\n    answer = c.most_common(1)[0][0]\n\n    print(answer)\n    #> 5\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Better Zero-Shot Reasoning with Self-Adaptive Prompting](https://arxiv.org/pdf/2305.14106)\n"
  },
  {
    "path": "docs/prompting/ensembling/dense.md",
    "content": "---\ndescription: \"Demonstration Ensembling(DENSE) creates multiple few-shot prompts, each containing a distinct subset of examples from the training set. We then use that to generate a final response\"\n---\n\nWe can maximise the use of our examples by prompting our model multiple times, each time using a different subset of examples. We can then take these multiple outputs and aggregate over them to generate a final response. This is known as Demonstration Ensembling ( DENSE ) <sup><a href=\"https://arxiv.org/pdf/2308.08780\">1</a></sup>.\n\n> For simplicity in this example, we simply iterate over the examples and partition them equally to get equally sized clusters. However, depending on your use-case you might also want to consider sampling these using some form of embedding clusering.\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"26-41\"\nimport instructor\nfrom pydantic import BaseModel\nimport asyncio\nfrom collections import Counter\nfrom typing import Literal\nfrom textwrap import dedent\n\nclass DemonstrationResponse(BaseModel):\n    correct_answer: Literal[\"Positive\", \"Negative\", \"Neutral\"]\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nasync def generate_self_consistent_response(prompt: str, examples: list[str]):\n    concetenated_examples = \"\\n\".join(examples)\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                You are an intelligent AI System that excels\n                at classifying user queries into three\n                possible labels:\n                - Positive\n                - Negative\n                - Neutral\n\n                You are about to be given a user query and\n                asked to classify it into one of the three\n                categories. Make sure to refer closely to\n                the examples provided to you, examining each\n                individual example before coming up with the\n                final answer.\n\n                Here are the examples:\n                {concetenated_examples}\n                \"\"\"\n                ),\n            },\n            {\"role\": \"user\", \"content\": prompt},\n        ],\n        response_model=DemonstrationResponse,\n        temperature=0,\n    )\n\n\nasync def generate_self_consistent_responses(\n    prompt: str, num_responses: int, examples: list[str]\n):\n    assert (\n        len(examples) % num_responses == 0\n    ), \"The number of examples must be evenly divisible by num_responses\"\n\n    # Batch the examples into num_responses batches\n    batch_size = len(examples) // num_responses\n\n    coros = [\n        generate_self_consistent_response(prompt, examples[i : i + batch_size])\n        for i in range(0, len(examples), batch_size)\n    ]\n\n    responses = await asyncio.gather(*coros)\n    return responses\n\n\nif __name__ == \"__main__\":\n    user_query = \"What is the weather like today?\"\n    examples = [\n        \"I love this product! [Positive]\",\n        \"This is the worst service ever. [Negative]\",\n        \"The movie was okay, not great but not terrible. [Neutral]\",\n        \"I'm so happy with my new phone! [Positive]\",\n        \"The food was terrible and the service was slow. [Negative]\",\n        \"It's an average day, nothing special. [Neutral]\",\n        \"Fantastic experience, will come again! [Positive]\",\n        \"I wouldn't recommend this to anyone. [Negative]\",\n        \"The book was neither good nor bad. [Neutral]\",\n        \"Absolutely thrilled with the results! [Positive]\",\n    ]\n    responses = asyncio.run(generate_self_consistent_responses(user_query, 5, examples))\n    answer_counts = Counter([response.correct_answer for response in responses])\n    most_common_answer, _ = answer_counts.most_common(1)[0]\n    print(most_common_answer)\n    #> Neutral\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Exploring Demonstration Ensembling for In Context Learning](https://arxiv.org/pdf/2308.08780)\n"
  },
  {
    "path": "docs/prompting/ensembling/diverse.md",
    "content": "---\ndescription: \"Diverse creates multiple prompts for a given problem before performing self-consistency for each. It then generates multiple reaosning paths before choosing the best final response\"\n---\n\nDiverse Verifier On Reasoning Step (DiVeRSe)<sup><a href=\"https://aclanthology.org/2023.acl-long.291/\">1</a></sup> is a prompting technique which provides two main improvements\n\n1. **Diverse Prompts** : They generate multiple variations of the same prompt by varying the examples used in each prompt\n2. **Verification** : They use a finetuned `Deberta-V3-Large` to determine the quality of a generated response. Instead of using majority voting, they use their model to score each generated response from 0 to 1. They then aggregate these scores for each unique answer to determine the best generated solution.\n\nIn the paper itself, they also train a step-wise verifier that is able to score each individual reasoning step. This enables much more fine-grained predictions but is challenging to obtain training data for.\n\nWe can implement this in `instructor`. However, instead of using a `deberta-v3-large` model, we'll be using gpt-4o to score its own outputs and generate a quality score.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Literal\nfrom textwrap import dedent\nimport asyncio\nfrom collections import defaultdict\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass Response(BaseModel):\n    chain_of_thought: str\n    answer: int\n\n\nclass Grading(BaseModel):\n    grade: Literal[\"Poor\", \"Average\", \"Good\", \"Excellent\"]\n\n    def get_score(self):\n        mapping = {\n            \"Poor\": 0.25,\n            \"Average\": 0.5,\n            \"Good\": 0.75,\n            \"Excellent\": 1,\n        }\n        return mapping[self.grade]\n\n\nasync def generate_response(query: str, examples: list[str]):\n    formatted_examples = \"\\n\".join(examples)\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": dedent(\n                    f\"\"\"\n                You are a world class AI that excels at answering\n                user queries in a succint and accurate manner.\n\n                <query>\n                {query}\n                </query>\n\n                <examples>\n                {formatted_examples}\n                </examples>\n                \"\"\"\n                ),\n            }\n        ],\n        response_model=Response,\n    )\n\n\nasync def score_response(query: str, response: Response) -> tuple[Response, Grading]:\n    return (\n        response,\n        await client.create(\n            model=\"gpt-4o\",\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": dedent(\n                        f\"\"\"\n                You are a world class AI that excels at grading\n                responses to a user query in a succint and clear\n                manner.\n\n                <query>\n                {query}\n                </query>\n\n                <response>\n                {response}\n                </response>\n                \"\"\"\n                    ),\n                }\n            ],\n            response_model=Grading,\n        ),\n    )\n\n\nasync def generate_response_batch(\n    query: str, examples: list[str], n_examples_per_batch: int\n):\n    batches: list[list[str]] = []\n    for i in range(0, len(examples), n_examples_per_batch):\n        batches.append(examples[i : i + n_examples_per_batch])\n\n    coros = [generate_response(query, example_batch) for example_batch in batches]\n    return await asyncio.gather(*coros)\n\n\nasync def score_responses(\n    query: str, responses: list[Response]\n) -> list[tuple[Response, Grading]]:\n    coros = [score_response(query, response) for response in responses]\n    return await asyncio.gather(*coros)\n\n\nif __name__ == \"__main__\":\n    examples = [\n        \"\"\"\n        Q: James decides to run 3 sprints 3 times a week.\n        He runs 60 meters each sprint. How many total\n        meters does he run a week?\n        A: James decides to run 3 sprints 3 times a week.\n        He runs 60 meters each sprint. So he runs 60 meters\n        x 3 sprints x 3 times a week. That is 60 meters x 9.\n        The answer is 540.\n        \"\"\",\n        \"\"\"\n        Q: Brandon's iPhone is four times as old as Ben's\n        iPhone. Ben's iPhone is two times older than Suzy's\n        iPhone. If Suzy's iPhone is 1 year old, how old is\n        Brandon's iPhone?\n        A: Brandon's iPhone is 4 times as old as Ben's\n        iPhone. Ben's iPhone is 2 times older than Suzy's\n        iPhone. So Brandon's iPhone is 4 x 2 = 8 times older\n        than Suzy's iPhone. Suzy's iPhone is 1 year old. So\n        Brandon's iPhone is 8 x 1 = 8 years old. The answer\n        is 8.\n        \"\"\",\n        \"\"\"\n        Q: Jean has 30 lollipops. Jean eats 2 of the\n        lollipops. With the remaining lollipops, Jean wants\n        to package 2 lollipops in one bag. How many bags can\n        Jean fill?\n        A: Jean started with 30 lollipops. She ate 2 of\n        them. So she has 28 lollipops left. She wants to\n        package 2 lollipops in one bag. So she can package\n        28 / 2 = 14 bags. The answer is 14.\n        \"\"\",\n        \"\"\"\n        Q: Weng earns $12 an hour for babysitting.\n        Yesterday, she just did 50 minutes of babysitting.\n        How much did she earn?\n        A: Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.\n        Working 50 minutes, she earned 0.2 x 50 =\n        $<<0.2*50=10>>10. The answer is 10\n        \"\"\",\n    ]\n\n    query = \"\"\"Betty is saving money for a new wallet which\n    costs $100. Betty has only half of the money she needs.\n    Her parents decided to give her $15 for that purpose,\n    and her grandparents twice as much as her parents. How\n    much more money does Betty need to buy the wallet?\"\"\"\n\n    generated_responses = asyncio.run(generate_response_batch(query, examples, 1))\n\n    scored_responses = asyncio.run(score_responses(query, generated_responses))\n\n    scores: dict[int, float] = defaultdict(int)\n\n    for response, grade in scored_responses:\n        scores[response.answer] += grade.get_score()\n\n    print(scores)\n    #> defaultdict(<class 'int'>, {5: 3.5})\n\n    answer = max(scores, key=scores.get)\n    print(answer)\n    #> 5\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Making Language Models Better Reasoners with Step-Aware Verifier](https://aclanthology.org/2023.acl-long.291/)\n"
  },
  {
    "path": "docs/prompting/ensembling/max_mutual_information.md",
    "content": "---\ndescription: \"Max Mutual Information creates multiple prompt templates and then selects the optimal template as the one which maximises mutual information between the prompt and the LLM's outputs\"\n---\n\n## What's Max Mutual Information?\n\nMax Mutual Information Method is a method of prompting that aims to find the best prompt to elicit the desired response from a LLM. We do so by maximising a metric called Mutual Information - which indicates the reduction in a model's uncertainty as a result of the prompt.\n\n### Entropy\n\nWhen a language model recieves a prompt as input, it outputs a series of token probabilities sequentially until it reaches the `<EOS>` token. In the paper, they take the final probability distribution as $P(Y|X)$ where $Y$ is the final prediction of the model and $X$ the prompt.\n\nWhen we have a probability distribution, we can calculate a probability known as entropy. The lower this value is, the better. This is because a lower entropy value means that the model is more confident in its prediction.\n\nWe can calculate entropy with the following formula where $P(T_i)$ represents the probability of the $i$-th token in the final output distribution.\n\n$$\nH(P(Y|X)) = \\sum_{i=0}^n P(T_i) log (P(T_i))\n$$\n\n### Mutual Information\n\n![](../../img/mutual_information.png)\n\nWe can apply this to the calculation of Mutual Information as seen above.\n\nWe'll indicate the calculate of entropy of a probability distribution as $H(X)$ where $X$ here represents a final probability distribution. We also assume you have a train dataset of $n$ examples to use.\n\n1. First, we choose a set of tokens that are likely to be part of the final answer. This could be words that appear inside the choices we have provided.\n\n2. Once we've chosen these tokens, we extract out the log probs for each token from our final distribution. We then normalise it so that these new log probs now sum up to 1.\n\n3. We do this for the $n$ example inside our train set, this gives us a new distribution $P(Y_i|X_i)$ for each $i$-th example.\n\n4. We then take the average of these $n$ distributions to get $H_{marginal}$\n\n5. We then calculate the average of the entropy of each distribution to get $H_{conditional}$\n\n6. We then derive the Mutual Information by taking $H_{marginal} - H_{conditional}$, the higher this metric the better.\n\n??? info \"Unsure how to calculate $H_{marginal}$ and $H\\_{conditional}$\"\n\n    $$\n        H_{marginal} = H(\\frac{1}{n} \\sum_{i=0}^n P(Y_i | X_i) )\n    $$\n\n    $$\n        H_{conditional} = \\frac{1}{n} \\sum_{i=0}^n H(P(Y_i|X_i))\n    $$\n\nWe can then use this new mutual information metric to compare the effectiveness of different prompts at eliciting a desired response from our train dataset.\n\n## Implementation\n\nSince we don't have access to the raw log probabilites of specific tokens we want in the OpenAI API, we'll instead get the language model to generate a final score from 1 - 10 of its confidence in it's prediction.\n\nWe'll then convert this to a probability distribution with two outcomes and calculate a value for the entropy off of that.\n\nNext we'll compare the Mutual Information value for different prompts before choosing what the best prompt is. For this example, we'll be using values from the Story Cloze set.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Callable, Literal\nfrom textwrap import dedent\nimport math\nimport asyncio\n\n\nclass Response(BaseModel):\n    chain_of_thought: str\n    response: Literal[\"A\", \"B\"]\n    confidence: Literal[\n        \"Very High Confidence\",\n        \"High Confidence\",\n        \"Moderate Confidence\",\n        \"Low Confidence\",\n        \"Very Low Confidence\",\n    ]\n\n    def generate_score(self) -> float:\n        confidence_scores = {\n            \"Very High Confidence\": 1,\n            \"High Confidence\": 0.8,\n            \"Moderate Confidence\": 0.6,\n            \"Low Confidence\": 0.4,\n            \"Very Low Confidence\": 0.2,\n        }\n        return confidence_scores[self.confidence]\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", async_client=True)\n\n\ndef prompt_template_1(question: str, options: list[str]):\n    assert len(options) == 2\n    a, b = options\n\n    return dedent(\n        f\"\"\"\n    You are a world class AI System which excels at understanding complex user stories and generating responses. Output your prediction and also quantify your confidence in your prediction with the following scale.\n\n    - Very High Confidence: The model is highly confident in its prediction, displaying deep understanding, flawless execution, and no noticeable errors.\n    - High Confidence: The model is confident in its prediction, with strong relevance and minor errors that do not detract from overall quality.\n    - Moderate Confidence: The model has moderate confidence in its prediction, which is generally relevant with some inaccuracies, and meets minimum requirements.\n    - Low Confidence: The model has low confidence in its prediction, with limited relevance and several inaccuracies.\n    - Very Low Confidence: The model has very low confidence in its prediction, which is largely irrelevant, inaccurate, or incomplete, needing significant improvement\n\n\n    Context\n    {question}\n\n    Options\n    A. {a}\n    B. {b}\n    \"\"\"\n    )\n\n\ndef prompt_template_2(question: str, options: list[str]):\n    assert len(options) == 2\n    a, b = options\n\n    return dedent(\n        f\"\"\"\n    <prompt>\n        <Task>\n        You are about to be passed a story. You are to select the correct response from the options provided.\n\n         <confidence-levels>\n             <level>\n                 <name>Very High Confidence</name>\n                 <description>The model is highly confident in its prediction, displaying deep understanding, flawless execution, and no noticeable errors.</description>\n             </level>\n             <level>\n                 <name>High Confidence</name>\n                 <description>The model is confident in its prediction, with strong relevance and minor errors that do not detract from overall quality.</description>\n             </level>\n             <level>\n                 <name>Moderate Confidence</name>\n                 <description>The model has moderate confidence in its prediction, which is generally relevant with some inaccuracies, and meets minimum requirements.</description>\n             </level>\n             <level>\n                 <name>Low Confidence</name>\n                 <description>The model has low confidence in its prediction, with limited relevance and several inaccuracies.</description>\n             </level>\n             <level>\n                 <name>Very Low Confidence</name>\n                 <description>The model has very low confidence in its prediction, which is largely irrelevant, inaccurate, or incomplete, needing significant improvement</description>\n             </level>\n         </confidence-levels>\n        </Task>\n\n        <Question>\n        {question}\n        </Question>\n\n        <Options>\n        <option>A: {a}</option>\n        <option>B: {b}</option>\n        </Options>\n    </prompt>\n    \"\"\"\n    )\n\n\nasync def generate_response(\n    question: str, options: list[str], prompt_template: Callable[[str, list[str]], str]\n):\n    return await client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": prompt_template(question, options),\n            }\n        ],\n        response_model=Response,\n    )\n\n\nasync def generate_responses(\n    questions: list[str], prompt_template: Callable[[str, list[str]], str]\n):\n    return await asyncio.gather(\n        *[\n            generate_response(\n                question=question[\"question\"],\n                options=question[\"options\"],\n                prompt_template=prompt_template,\n            )\n            for question in questions\n        ]\n    )\n\n\ndef calculate_entropy(probs: list[float]) -> float:\n    return sum([p * math.log(p) if p != 0 else 0 for p in probs])\n\n\ndef calculate_mutual_information(predictions: list[Response]) -> float:\n    probs = [\n        [prediction.generate_score(), 1 - prediction.generate_score()]\n        for prediction in predictions\n    ]\n\n    avg_probs = [0, 0]\n\n    for p1, p2 in probs:\n        avg_probs[0] += p1\n        avg_probs[1] += p2\n\n    h_marginal = calculate_entropy([i / len(probs) for i in avg_probs])\n    h_conditional = sum([calculate_entropy(prob) for prob in probs]) / len(probs)\n\n    return h_marginal - h_conditional\n\n\nif __name__ == \"__main__\":\n    queries = [\n        {\n            \"question\": \"Karen was assigned a roommate her first year of college. Her roommate asked her to go to a nearby city for a concert. Karen agreed happily. The show was absolutely exhilarating.\",\n            \"options\": [\n                \"Karen became good friends with her roommate.\",\n                \"Karen hated her roommate.\",\n            ],\n        },\n        {\n            \"question\": \"Jim got his first credit card in college. He didn’t have a job so he bought everything on his card. After he graduated he amounted a $10,000 debt. Jim realized that he was foolish to spend so much money.\t\",\n            \"options\": [\n                \"Jim decided to devise a plan for repayment.\",\n                \"Jim decided to open another credit card.\",\n            ],\n        },\n        {\n            \"question\": \"Gina misplaced her phone at her grandparents. It wasn’t anywhere in the living room. She realized she was in the car before. She grabbed her dad’s keys and ran outside.\",\n            \"options\": [\n                \"She found her phone in the car.\",\n                \"She didn’t want her phone anymore.\",\n            ],\n        },\n    ]\n\n    best_mi_score = float(\"-inf\")\n    best_template = None\n\n    for prompt_template in [prompt_template_1, prompt_template_2]:\n        responses = asyncio.run(generate_responses(queries, prompt_template))\n        mi_score = calculate_mutual_information(responses)\n        print(f\"{prompt_template.__name__}: {mi_score}\")\n        #> prompt_template_1: -0.0781292189485728\n        #> prompt_template_2: -0.05907285153542691\n        if mi_score > best_mi_score:\n            best_mi_score = mi_score\n            best_template = prompt_template.__name__\n\n    print(best_template, best_mi_score)\n    #> prompt_template_2 -0.05907285153542691\n```\n"
  },
  {
    "path": "docs/prompting/ensembling/meta_cot.md",
    "content": "---\ndescription: \"Meta Chain Of Thought involves decomposing an initial query into multiple sub questions. We then aggregate the response from each of these chains as context before prompting another LLM to generate a response\"\n---\n\nMeta Chain Of Thought (Meta COT) <sup><a href=\"https://arxiv.org/pdf/2304.13007\">1</a></sup>. involves the use of multiple reasoning chains to generate a response to a given query. This helps our model evaluate multiple potential reasoning paths and from there, determine a more accurate answer.\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"41-42 57-61 96-99\"\nimport instructor\nfrom pydantic import BaseModel, Field\nimport asyncio\nfrom typing import Optional\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass ReasoningAndResponse(BaseModel):\n    intermediate_reasoning: str = Field(\n        description=\"\"\"\n    Intermediate reasoning steps\"\"\"\n    )\n    correct_answer: str\n\n\nclass MaybeResponse(BaseModel):\n    result: Optional[ReasoningAndResponse]\n    error: Optional[bool]\n    error_message: Optional[str] = Field(\n        description=\"\"\"Informative explanation of why\n        the reasoning chain was unable to generate\n        a result\"\"\"\n    )\n\n\nclass QueryDecomposition(BaseModel):\n    queries: list[str] = Field(\n        description=\"\"\"A list of queries that need to be\n        answered in order to derive the final answer\"\"\"\n    )\n\n\nasync def generate_queries(query: str):\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are a helpful assistant that\n                decomposes a query into multiple sub-queries.\"\"\",\n            },\n            {\"role\": \"user\", \"content\": query},\n        ],\n        response_model=QueryDecomposition,\n    )\n\n\nasync def generate_reasoning_chain(query: str) -> MaybeResponse:\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                Given a question and a context,\n                answer the question step-by-step.\n\n                Indicate the intermediate reasoning\n                steps.\n                \"\"\",\n            },\n            {\"role\": \"user\", \"content\": query},\n        ],\n        response_model=MaybeResponse,\n    )\n\n\nasync def batch_reasoning_chains(\n    queries: list[str],\n) -> list[MaybeResponse]:\n    coros = [generate_reasoning_chain(query) for query in queries]\n    results = await asyncio.gather(*coros)\n    return results\n\n\nasync def generate_response(query: str, context: list[MaybeResponse]):\n    formatted_context = \"\\n\".join(\n        [\n            f\"\"\"\n            {item.result.intermediate_reasoning}\n            {item.result.correct_answer}\n            \"\"\"\n            for item in context\n            if not item.error and item.result\n        ]\n    )\n\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                Given a question and a context,\n                answer the question step-by-step.\n\n                If you are unsure, answer Unknown.\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                    <question>\n                    {query}\n                    </question>\n                    <context>\n                    {formatted_context}\n                    </context>\n                    \"\"\",\n            },\n        ],\n        response_model=ReasoningAndResponse,\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"Would Arnold Schwarzenegger have been\n    able to deadlift an adult Black rhinoceros at his\n    peak strength?\"\"\"\n    decomposed_queries = asyncio.run(generate_queries(query))\n\n    for generated_query in decomposed_queries.queries:\n        print(generated_query)\n        #> How much weight could Arnold Schwarzenegger\n        #> deadlift at his peak strength?\n        #> What is the average weight of an adult Black\n        #> rhinoceros?\n\n    chains = asyncio.run(batch_reasoning_chains(decomposed_queries.queries))\n\n    for chain in chains:\n        print(chain.model_dump_json(indent=2))\n        \"\"\"\n        {\n          \"result\": {\n            \"intermediate_reasoning\": \"Determining Arnold\n            Schwarzenegger's peak deadlift involves\n            researching historical records, interviews,\n            and Arnold’s competitive powerlifting\n            results.\",\n            \"correct_answer\": \"Arnold Schwarzenegger's\n            peak deadlift was reportedly 710 lbs (322\n            kg).\"\n          },\n          \"error\": false,\n          \"error_message\": null\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"result\": {\n            \"intermediate_reasoning\": \"To determine the\n            average weight of an adult Black rhinoceros,\n            I need to consult reliable sources such as\n            wildlife encyclopedias, zoological databases,\n            or scientific articles. Commonly, the average\n            weight of adult Black rhinoceros ranges\n            between 800 to 1,400 kg.\",\n            \"correct_answer\": \"The average weight of an\n            adult Black rhinoceros ranges between 800 to\n            1,400 kg.\"\n          },\n          \"error\": false,\n          \"error_message\": null\n        }\n        \"\"\"\n\n    response = asyncio.run(generate_response(query, chains))\n\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"intermediate_reasoning\": \"Arnold Schwarzenegger's\n      peak deadlift was 710 lbs (322 kg). The average\n      weight of an adult Black rhinoceros ranges between\n      800 to 1,400 kg (1764 to 3086 lbs). Even at the\n      lower end of the rhinoceros weight range (800 kg\n      or 1764 lbs), it exceeds Arnold Schwarzenegger's\n      peak deadlift capacity of 710 lbs (322 kg).\n      Therefore, Arnold Schwarzenegger would not have\n      been able to deadlift an adult Black rhinoceros at\n      his peak strength.\",\n      \"correct_answer\": \"No\"\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Answering Questions by Meta-Reasoning over Multiple Chains of Thought](https://arxiv.org/pdf/2304.13007)\n"
  },
  {
    "path": "docs/prompting/ensembling/more.md",
    "content": "---\ndescription: \"MoRE creates a set of diverse reasoning experts by using different specialized prompts for different reasoning types. THe best answer from all experts is then selected using an agreement score\"\n---\n\nLanguage Models struggle to generalize across question types that require distinct reasoning abilities. By combining a variety of different specialized language models, we can improve the quality of our responses. This is done through a technique called Mixture Of Reasoning Experts (MoRE).\n\nIn the original paper, they utilise four different experts\n\n1. Factual Expert : This is a model that is augmented by a RAG prompting pipeline. WHen it recieves a query, it retrieves the top 10 most relevant passages from Wikipedia and appends them to the prompt right before the question.\n\n2. Multihop Expert : This is an expert that has manually written rationales after each demo to elicit multi-step reasoning processes for the questions\n\n3. Math Expert : This is an expert that has manually written explanations for the GSM8k Dataset to bias the model towards different reasoning steps\n\n4. Commonsense expert: This is an expert that is provided with 10 different facts that are generated by a Codex model which are appended to the prompt right before the question\n\n![](../../img/more.png)\n\nOnce each expert has genearted a response, they then use a random forest classifier to score it from 0 to 1. This is then used for selecting the final answer and determining if we've generated a sufficiently good answer ( Since we have the option to abstain at each point )\n\nWe can implement a simplified version of MoRE with `instructor` with a few modifications.\n\n```python\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom textwrap import dedent\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass MultihopExpert(BaseModel):\n    chain_of_thought: str\n    answer: str\n\n\nclass FactualExpert(BaseModel):\n    answer: str\n\n\nclass ModelScore(BaseModel):\n    score: float = Field(ge=0, lt=1)\n\n\ndef query_factual_expert(query: str, evidence: list[str]):\n    formatted_evidence = \"\\n-\".join(evidence)\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=FactualExpert,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                <query>\n                {query}\n                </query>\n\n                <evidences>\n                {formatted_evidence}\n                </evidences>\n                \"\"\"\n                ),\n            }\n        ],\n    )\n\n\ndef query_multihop_expert(query: str):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=MultihopExpert,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                <query>\n                {query}\n                </query>\n                \"\"\"\n                ),\n            }\n        ],\n    )\n\n\ndef score_answer(query: str, answer: str):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=ModelScore,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are a helpful assistant that scores\n                answers based on well they are able to answer a\n                specific user query\"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                <user query>\n                {query}\n                </user query>\n\n                <response>\n                {answer}\n                </response>\n                \"\"\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"Who's the original singer of Help Me Make It\n    Through The Night?\"\"\"\n    evidences = [\n        \"\"\"Help Me Make It Through The Night is a country\n        music ballad written and composed by Kris Kristofferson\n        and released on his 1970 album 'Kristofferson'\"\"\"\n    ]\n\n    threshold = 0.8\n\n    factual_expert_output = query_factual_expert(query, evidences)\n    print(factual_expert_output.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"answer\": \"The original singer of 'Help Me Make It Through the\n      Night' is Kris Kristofferson, who released it on his 1970 album\n      'Kristofferson'.\"\n    }\n    \"\"\"\n\n    multihop_expert_output = query_multihop_expert(query)\n    print(multihop_expert_output.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"chain_of_thought\": \"To identify the original singer of 'Help Me\n      Make It Through The Night,' I need to look for the person who\n      first recorded and released the song.\",\n      \"answer\": \"The original singer of 'Help Me Make It Through\n      The Night' is Kris Kristofferson.\"\n    }\n    \"\"\"\n\n    factual_expert_score = score_answer(query, factual_expert_output.answer)\n    multihop_expert_score = score_answer(query, multihop_expert_output.answer)\n\n    if max(factual_expert_score.score, multihop_expert_score.score) < threshold:\n        answer = \"Abstaining from responding\"\n    elif factual_expert_score.score > multihop_expert_score.score:\n        answer = factual_expert_output.answer\n    else:\n        answer = multihop_expert_output.answer\n\n    print(answer)\n    \"\"\"\n    The original singer of 'Help Me Make It Through the Night' is Kris\n    Kristofferson, who released it on his 1970 album 'Kristofferson'.\n    \"\"\"\n```\n"
  },
  {
    "path": "docs/prompting/ensembling/prompt_paraphrasing.md",
    "content": "---\ndescription: \"Use Large Language Models to perform back translation in order to improve prompt performance\"\n---\n\nLarge Language Models are sensitive to the way that they are prompted. When prompted incorrectly, they might perform much worse despite having the information or capability to respond to the prompt. We can help find semantically similar prompts by performing back translation - where we translate our prompts to another language and back to encourage more diversity in the rephrased prompts.\n\nPrompt paraphrasing <sup><a href=\"https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00324/96460/How-Can-We-Know-What-Language-Models-Know\">1</a></sup>. provides some ways for us to improve on the phrasing of our prompts to do so.\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"20-25\"\nimport instructor\nfrom pydantic import BaseModel\nimport random\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass TranslatedPrompt(BaseModel):\n    translation: str\n\n\nasync def translate_prompt(prompt: str, from_language: str, to_language: str):\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                You are an expert translation assistant.\n                You are going to be given a prompt and\n                asked to translate it from {from_language}\n                to {to_language}. Paraphrase and use\n                synonyms where possible, especially for\n                the examples.\n                \"\"\",\n            },\n            {\"role\": \"user\", \"content\": f\"Prompt: {prompt}\"},\n        ],\n        response_model=TranslatedPrompt,\n    )\n\n\nasync def generate_permutation(prompt: str, language: str) -> str:\n    tranlated_prompt = await translate_prompt(prompt, \"english\", language)\n    backtranslated_prompt = await translate_prompt(\n        tranlated_prompt.translation, language, \"english\"\n    )\n    return backtranslated_prompt.translation\n\n\nasync def generate_prompts(\n    prompt: str, languages: list[str], permutations: int\n) -> list[str]:\n    coros = [\n        generate_permutation(prompt, random.choice(languages))\n        for _ in range(permutations)\n    ]\n    return await asyncio.gather(*coros)\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    prompt = \"\"\"\n    You are an expert system that excels at Sentiment\n    Analysis of User Reviews.\n\n    Here are a few examples to refer to:\n\n    1. That was a fantastic experience I had! I'm\n    definitely recommending this to all my friends\n    // Positive\n    2. I think it was a passable evening. I don't think\n    there was anything remarkable or off-putting for me.\n    // Negative\n    3. I'm horrified at the state of affairs in this new\n    restaurant // Negative\n\n    Sentence: This was a fantastic experience!\n    \"\"\"\n    languages = [\"french\", \"spanish\", \"chinese\"]\n    permutations = 2\n\n    generated_prompts = asyncio.run(generate_prompts(prompt, languages, permutations))\n    for prompt in generated_prompts:\n        print(prompt)\n        \"\"\"\n        You are an expert system specializing in user review sentiment analysis. Here are a few examples to guide you: 1. It was an exceptional experience! I will definitely recommend it to all my friends // Positive 2. I think it was a mediocre evening. There wasn't anything outstanding or particularly bad for me // Negative 3. I am horrified by the condition of things in this new restaurant // Negative Sentence: It was an amazing experience!\n        \"\"\"\n        \"\"\"\n        You are an expert system that excels in User Review Sentiment Analysis.\n\n        Here are some reference examples:\n\n        1. I had an amazing experience! I will definitely recommend it to all my friends.\n        // Positive\n        2. I think it was an average evening. I don’t believe there was anything remarkable or unpleasant about it for me.\n        // Negative\n        3. I am horrified by the situation at this new restaurant.\n        // Negative\n\n        Sentence: This was a fantastic experience!\n        \"\"\"\n        \"\"\"\n        You are an expert system skilled in conducting user\n        review sentiment analysis.\n\n        Here are some examples for reference:\n\n        1. That was an awesome experience! I'll definitely\n        recommend it to all my friends // Positive\n        2. I think it was an okay evening. I don't find\n        anything particularly outstanding or unpleasant.\n        // Neutral\n        3. I am very shocked by the condition of this new\n        restaurant // Negative\n\n        Sentence: This was a wonderful experience!\n        \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [How Can We Know What Language Models Know? ](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00324/96460/How-Can-We-Know-What-Language-Models-Know)\n"
  },
  {
    "path": "docs/prompting/ensembling/self_consistency.md",
    "content": "---\ndescription: \"Self Consistency aims to help maximise llm performance by sampling multiple potential calls. We then take a majority vote on the final response to derive the answer\"\n---\n\nBy generating multiple candidate responses in parallel and choosing the most common answer among them, we can get a more accurate answer. This is known as Self-Consistency <sup><a href=\"https://arxiv.org/pdf/2203.11171\">1</a></sup>\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"25-29\"\nimport instructor\nfrom pydantic import BaseModel, Field\nimport asyncio\nfrom collections import Counter\nfrom textwrap import dedent\n\nclass SelfConsistencyResponse(BaseModel):\n    chain_of_thought: str = Field(\n        description=\"reasoning behind the final correct answer\"\n    )\n    correct_answer: int\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nasync def generate_self_consistent_response(prompt: str):\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are an intelligent question\n                answering AI system that excels at answering\n                user queries. Make sure to generate a\n                comprehensive explanation of your thought\n                process before providing the final answer\"\"\",\n            },\n            {\"role\": \"user\", \"content\": prompt},\n        ],\n        response_model=SelfConsistencyResponse,\n        temperature=0.5,\n    )\n\n\nasync def generate_self_consistent_responses(prompt: str, num_responses: int):\n    coros = [generate_self_consistent_response(prompt) for _ in range(num_responses)]\n    responses = await asyncio.gather(*coros)\n    return responses\n\n\nif __name__ == \"__main__\":\n    prompt = dedent(\n        \"\"\"\n        Janet's ducks lay 16 eggs per day.\n        She eats three for breakfast every\n        morning and bakes muffins for her\n        friends every day with four. She sells\n        the remainder for $2 per egg. How\n        much does she make every day?\n        \"\"\"\n    )\n    responses = asyncio.run(generate_self_consistent_responses(prompt, 5))\n    answer_counts = Counter([response.correct_answer for response in responses])\n    most_common_answer, _ = answer_counts.most_common(1)[0]\n\n    print(most_common_answer)\n    #> 18\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Self-Consistency Improves Chain Of Thought\nReasoning In Language Models](https://arxiv.org/pdf/2210.03350)\n"
  },
  {
    "path": "docs/prompting/ensembling/universal_self_consistency.md",
    "content": "---\ndescription: \"Universal Self Consistency aims to extend Self-Consistency by using Large Language Models themselves to select the most consistent answer among multiple candidates\"\n---\n\nUniversal Self Consistency<sup><a href=\"https://arxiv.org/pdf/2311.17311\">1</a></sup> aims to extend self-consistency by using a second LLM model to judge the quality of individual responses. Therefore instead of choosing the final answer based on the most frequently occuring value among each reasoning chain, we instead prompt the model to choose the most consistent answer for us relative to the prompt.\n\n![](../../img/universal_self_consistency.png)\n\nThis enables us to support a greater variety of different response formats and answer, leading to greater diversity of outputs and hence higher accuracy.\n\nWe can implement this in `instructor` as seen below.\n\n```python hl_lines=\"71-73\"\nfrom pydantic import BaseModel, Field, ValidationInfo, field_validator\nimport instructor\nfrom textwrap import dedent\nimport asyncio\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass Response(BaseModel):\n    chain_of_thought: str\n    answer: str\n\n\nclass SelectedResponse(BaseModel):\n    most_consistent_response_id: int = Field(\n        description=\"\"\"The ID of the most consistent response that\n        was provided\"\"\"\n    )\n\n    @field_validator(\"most_consistent_response_id\")\n    @classmethod\n    def validate_id(cls, v: int, info: ValidationInfo):\n        context = info.context\n        number_responses = context.get(\"number_responses\", float(\"inf\"))\n\n        if v > number_responses:\n            raise ValueError(\n                f\"\"\"Most consistent response ID {v} is greater than the\n                number of responses {number_responses}. Please return a\n                valid id between 0 and {number_responses-1}\"\"\"\n            )\n        return v\n\n\nasync def generate_response(query: str) -> Response:\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[{\"role\": \"user\", \"content\": query}],\n    )\n\n\nasync def generate_batch_responses(query: str, no_responses: int):\n    coros = [generate_response(query) for _ in range(no_responses)]\n    return await asyncio.gather(*coros)\n\n\nasync def select_consistent_response(responses: list[Response], query: str):\n    formatted_responses = \"\\n\".join(\n        [\n            f\"Response {idx}: {response.chain_of_thought}. {response.answer}\"\n            for idx, response in enumerate(responses)\n        ]\n    )\n\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=SelectedResponse,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": dedent(\n                    f\"\"\"\n                <user query>\n                {query}\n                </user query>\n\n                {formatted_responses}\n\n                Evaluate these responses.\n                Select the most consistent response based on majority\n                consensus\n                \"\"\"\n                ),\n            }\n        ],\n        context={\"number_responses\": len(responses)},\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"The three-digit number 'ab5' is divisible by 3. How many different\n     three-digit numbers can 'ab5' represent?\"\"\"\n    responses = asyncio.run(generate_batch_responses(query, 3))\n\n    for response in responses:\n        print(response.model_dump_json(indent=2))\n        \"\"\"\n        {\n          \"chain_of_thought\": \"A number is divisible by 3 if\n          the sum of its digits is divisible by 3. Given the\n          number 'ab5', we need to check how many different\n          values of 'a' and 'b', where both are digits (0-9)\n          can make the sum divisible by 3.\\n\\nThe sum of the\n          digits is a + b + 5.\\n\\nWe need to find pairs (a, b)\n          such that (a + b + 5) % 3 == 0.\",\n          \"answer\": \"30\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"chain_of_thought\": \"A number is divisible by 3 if\n          the sum of its digits is divisible by 3. Let's\n          denote the digits a and b. The number 'ab5' has\n          digits a, b, and 5. Therefore, the sum of the\n          digits is a + b + 5. Since the number is divisible\n          by 3, a + b + 5 must be divisible by 3.\\n\\nNow,\n          since a and b are single digits (0-9), we need to\n          find pairs (a, b) such that a + b + 5 is divisible\n          by 3. We will evaluate all possible combinations of\n          values for a and b to count how many valid pairs\n          (a, b) exist.\\n\\nLet's start by considering b's\n          values:\\n1. If b = 0, then a + 5 must be divisible\n          by 3.\\n2. If b = 1, then a + 6 must be divisible by\n          3.\\n3. If b = 2, then a + 7 must be divisible by\n          3.\\n4. If b = 3, then a + 8 must be divisible by\n          3.\\n5. If b = 4, then a + 9 must be divisible by\n          3.\\n6. If b = 5, then a + 10 must be divisible by\n          3.\\n7. If b = 6, then a + 11 must be divisible by\n          3.\\n8. If b = 7, then a + 12 must be divisible by\n          3.\\n9. If b = 8, then a + 13 must be divisible by\n          3.\\n10. If b = 9, then a + 14 must be divisible by\n          3.\\n\\nWe will find all corresponding a values for\n          each b and count the valid combinations.\\n\",\n          \"answer\": \"There are 30 different three-digit\n          numbers that 'ab5' can represent.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"chain_of_thought\": \"A number is divisible by 3 if\n          the sum of its digits is divisible by 3. The given\n          number is in the form 'ab5', where 'a' and 'b' are\n          digits from 0 to 9. To find the total number of\n          different three-digit numbers that 'ab5' can\n          represent, we need to determine all possible digit\n          combinations for 'a' and 'b' such that 'a + b + 5'\n          is divisible by 3.\",\n          \"answer\": \"30\"\n        }\n        \"\"\"\n\n    selected_response = asyncio.run(select_consistent_response(responses, query))\n    print(selected_response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"most_consistent_response_id\": 0\n    }\n    \"\"\"\n\n    print(\n        responses[selected_response.most_consistent_response_id].model_dump_json(\n            indent=2\n        )\n    )\n    \"\"\"\n    {\n      \"chain_of_thought\": \"A number is divisible by 3 if the sum of its digits is divisible by 3. Given the number 'ab5', we need to\n      check how many different values of 'a' and 'b', where both are digits (0-9) can make the sum divisible by 3.\\n\\nThe sum of the\n      digits is a + b + 5.\\n\\nWe need to find pairs (a, b) such that (a + b + 5) % 3 == 0.\",\n      \"answer\": \"30\"\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Universal Self-Consistency For Large Language Model Generation](https://arxiv.org/pdf/2311.17311)\n"
  },
  {
    "path": "docs/prompting/ensembling/usp.md",
    "content": "---\ndescription: \"Universal Self Prompting is a technique that aims to use unlabeled data to generate exemplars and a more complicated scoring function to select them.\"\n---\n\nUniversal Self Prompting is a two stage process similar to [Consistency Based Self Adaptive Prompting (COSP)](../few_shot/cosp.md). Here is a breakdown of the two stages.\n\n1. **Generate Examples** : LLMs are prompted to generate a collection of candidate responses using a test dataset\n2. **Answer Query** : We then select a few of these model-generated responses as examples to prompt the LLM to obtain a final prediction.\n\nNote here that the final answer is obtained using a single forward pass with greedy decoding.\n\n## USP Process\n\n![](../../img/universal_self_adaptive_prompting.png)\n\nLet's see how this works in greater detail.\n\n### Generate Few Shot Examples\n\nWe first prompt our model to generate responses for a given set of prompts. Instead of measuring the entropy and repetitiveness as in COSP, we use one of three possible methods to measure the quality of the generated responses. These methods are decided based on the three categories supported.\n\nThis category has to be specified by a user ahead of time.\n\nNote that for Short Form and Long Form generation, we generate $m$ different samples. This is not the case for classification tasks.\n\n- **Classification** : Classification Tasks are evaluated using the normalized probability of each label using the raw logits from the LLM.\n\n$$\nF_{CLS}(p^{(j)}|d^{(j)}) := -\\sum_{c \\in C} P(c|d^{(j)}) \\log P(c|d^{(j)})\n$$\n\nIn short, we take the raw logit for each token corresponding to the label, use a softmax to normalize each of them and then sum across the individual probabilities and their log probs. We also try to sample enough queries such that we have a balanced number of predictions across each class ( so that our model doesn't have a bias towards specific classes )\n\n- **Short Form Generation**: This is done by using a similar formula to COSP but without the normalizing term\n\n$$\n\\mathcal{H}\\left(x^{(i)} \\mid \\left\\{\\hat{y}_j^{(i)}\\right\\}_{j=1}^m\\right) = \\frac{\\sum_{\\alpha=1}^u \\hat{p}\\left(\\hat{y}_{\\alpha}^{(i)}\\right) \\log \\hat{p}\\left(\\hat{y}_{\\alpha}^{(i)}\\right)}{\\log m},\n$$\n\n- **Long Form Generation**: This is done by using the average pairwise ROUGE score between all pairs of the $m$ responses.\n\nWhat is key here is that depending on the task specified by the user, we have a task-specific form of evaluation. This eventually allows us to better evaluate our individual generated examples. Samples of tasks for each category include\n\n1. **Classification**: Natural Language Inference, Topic Classification and Sentiment Analysis\n2. **Short Form Generation** : Question Answering and Sentence Completion\n3. **Long Form Generation** : Text Summarization and Machine Translation\n\nThis helps to ultimately improve the performance of these large language models across different types of tasks.\n\n### Generate Single Response\n\nOnce we've selected our examples, the second step is relatively simple. We just need to append a few of our chosen examples that score best on our chosen metric to append to our solution.\n\n## Implementation\n\nWe've implemented a classification example below that tries to sample across different classes in a balanced manner before generating a response using a single inference call.\n\nWe bias this sampling towards samples that the model is more confident towards by using a confidence label.\n\n```python\nfrom pydantic import BaseModel\nfrom typing import Literal\nimport instructor\nimport asyncio\nfrom collections import defaultdict\n\n\nclass Classification(BaseModel):\n    chain_of_thought: str\n    label: Literal[\"Happy\", \"Angry\", \"Sadness\"]\n    confidence: Literal[\n        \"Uncertain\", \"Somewhat Confident\", \"Confident\", \"Highly Confident\"\n    ]\n\n    def confidence_score(self) -> int:\n        confidence_order = {\n            \"Highly Confident\": 4,\n            \"Confident\": 3,\n            \"Somewhat Confident\": 2,\n            \"Uncertain\": 1,\n        }\n        return confidence_order[self.confidence]\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\", async_client=True)\n\n\nasync def generate_prediction(query: str):\n    return (\n        await client.create(\n            model=\"gpt-3.5-turbo\",\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Classify the following query {query} into\n                    one of the following categories: Happy, Angry, Sadness\"\"\",\n                }\n            ],\n            response_model=Classification,\n        ),\n        query,\n    )\n\n\nasync def generate_predictions(queries: list[str]) -> list[tuple[Classification, str]]:\n    return await asyncio.gather(*[generate_prediction(query) for query in queries])\n\n\ndef get_balanced_sample(predictions: list[tuple[Classification, str]], k: int):\n    label_to_queries: dict[str, list[tuple[Classification, str]]] = defaultdict(list)\n\n    for prediction in predictions:\n        label_to_queries[prediction[0].label].append(prediction)\n\n    num_classes = len(label_to_queries)\n    num_samples_per_class = k // num_classes\n\n    res: list[str] = []\n    for label, label_queries in label_to_queries.items():\n        label_queries = sorted(\n            label_queries, key=lambda x: x[0].confidence_score(), reverse=True\n        )\n        label_queries = [\n            label_queries[1] for label_queries in label_queries[:num_samples_per_class]\n        ]\n        res.extend([f\"{query} ({label})\" for query in label_queries])\n\n    return res\n\n\nasync def generate_response_with_examples(query: str, examples: list[str]):\n    formatted_examples = \"\\n\".join(examples)\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=Classification,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                You are a helpful assistant that classifies queries into one of the following categories: Happy, Angry, Sadness.\n\n                Here are some samples of queries and their categories:\n\n                <examples>\n                {formatted_examples}\n                </examples>\n\n                Here is a user query to classify\n\n                <query>\n                {query}\n                </query>\n                \"\"\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    examples = [\n        \"\"\"\n        i do feel that running is a divine experience and\n        that i can expect to have some type of spiritual\n        encounter\n        \"\"\",\n        \"\"\"\n        i get giddy over feeling elegant in a perfectly\n        fitted pencil skirt\n        \"\"\",\n        \"\"\"\n        i plan to share my everyday life stories traveling\n        adventures inspirations and handmade creations with\n        you and hope you will also feel inspired\n        \"\"\",\n        \"\"\"\n        i need to feel the dough to make sure its just\n        perfect\n        \"\"\",\n        \"\"\"\n        i found myself feeling a little discouraged that\n        morning\n        \"\"\",\n        \"i didnt really feel that embarrassed\",\n        \"i feel like a miserable piece of garbage\",\n        \"\"\"\n        i feel like throwing away the shitty piece of shit\n        paper\n        \"\"\",\n        \"\"\"\n        i feel irritated and rejected without anyone doing\n        anything or saying anything\n        \"\"\",\n        \"i feel angered and firey\",\n        \"\"\"\n        im feeling bitter today my mood has been strange the\n        entire day so i guess its that\n        \"\"\",\n        \"i just feel really violent right now\",\n        \"i know there are days in which you feel distracted\",\n    ]\n\n    labels = asyncio.run(generate_predictions(examples))\n    balanced_sample = get_balanced_sample(labels, 3)\n    for sample in balanced_sample:\n        print(sample)\n        \"\"\"\n        i do feel that running is a divine experience and that i can\n        expect to have some type of spiritual encounter (Happy)\n        \"\"\"\n        #> i feel like a miserable piece of garbage (Sadness)\n        #> i feel like throwing away the shitty piece of shit paper (Angry)\n\n    response = asyncio.run(\n        generate_response_with_examples(\n            \"\"\"\n            i feel furious that right to life advocates can\n            and do tell me how to live and die through\n            lobbying and supporting those politicians\n            sympathic to their views\n            \"\"\",\n            balanced_sample,\n        )\n    )\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"chain_of_thought\": \"The user expresses feelings of\n      anger and frustration specifically directed at right\n      to life advocates. The language used, such as\n      'furious,' indicates a high level of emotion\n      associated with anger.\",\n      \"label\": \"Angry\",\n      \"confidence\": \"Highly Confident\"\n    }\n    \"\"\"\n```\n"
  },
  {
    "path": "docs/prompting/few_shot/cosp.md",
    "content": "---\ndescription: \"Consistency Based Self Adaptive Prompting (COSP) is a technique that uses entropy and repetitiveness to select high-quality examples for few-shot learning.\"\n---\n\n# Consistency Based Self Adaptive Prompting (COSP)\n\nCOSP is a technique that aims to improve few-shot learning by selecting high-quality examples based on the consistency and confidence of model responses. This approach helps create more effective prompts by identifying examples that the model can process reliably.\n\n## Overview\n\nThe COSP process involves two main stages:\n\n1. **Example Generation**: Generate multiple responses for potential examples\n\n   - Run each example through the model multiple times\n   - Collect responses and confidence scores\n\n2. **Example Selection**: Select the best examples based on entropy and repetitiveness\n   - Calculate entropy of responses to measure consistency\n   - Evaluate repetitiveness to ensure reliability\n\n## How COSP Works\n\n### Stage 1: Example Generation\n\nFor each potential example in your dataset:\n\n1. Generate multiple responses (typically 3-5)\n2. Calculate the entropy of these responses\n3. Measure the repetitiveness across responses\n\n```python\nfrom typing import List\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\nclass Response(BaseModel):\n    content: str = Field(description=\"The model's response to the prompt\")\n    confidence: float = Field(description=\"Confidence score between 0 and 1\")\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\ndef generate_responses(prompt: str, n: int = 3) -> List[Response]:\n    responses = []\n    for _ in range(n):\n        response = client.create(\n            model=\"gpt-4\",\n            messages=[{\"role\": \"user\", \"content\": prompt}],\n            response_model=Response\n        )\n        responses.append(response)\n    return responses\n```\n\n### Stage 2: Example Selection\n\nCalculate metrics for each example:\n\n1. **Entropy**: Measure response variability\n2. **Repetitiveness**: Check response consistency\n\n```python\nimport numpy as np\nfrom scipy.stats import entropy\n\ndef calculate_metrics(responses: List[Response]) -> tuple[float, float]:\n    # Calculate entropy\n    confidences = [r.confidence for r in responses]\n    entropy_score = entropy(confidences)\n\n    # Calculate repetitiveness\n    unique_responses = len(set(r.content for r in responses))\n    repetitiveness = 1 - (unique_responses / len(responses))\n\n    return entropy_score, repetitiveness\n```\n\n## Implementation Example\n\nHere's a complete example of COSP implementation:\n\n```python\nfrom typing import List, Tuple\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\nimport numpy as np\nfrom scipy.stats import entropy\n\nclass Example(BaseModel):\n    text: str\n    score: float = Field(description=\"Combined quality score\")\n    entropy: float = Field(description=\"Entropy of responses\")\n    repetitiveness: float = Field(description=\"Repetitiveness of responses\")\n\nclass COSPSelector:\n    def __init__(self, client: OpenAI, n_samples: int = 3):\n        self.client = instructor.from_provider(\"openai/gpt-4o\")\n        self.n_samples = n_samples\n\n    def generate_responses(self, prompt: str) -> List[Response]:\n        return [\n            self.client.create(\n                model=\"gpt-4\",\n                messages=[{\"role\": \"user\", \"content\": prompt}],\n                response_model=Response\n            )\n            for _ in range(self.n_samples)\n        ]\n\n    def calculate_metrics(self, responses: List[Response]) -> Tuple[float, float]:\n        confidences = [r.confidence for r in responses]\n        entropy_score = entropy(confidences)\n\n        unique_responses = len(set(r.content for r in responses))\n        repetitiveness = 1 - (unique_responses / len(responses))\n\n        return entropy_score, repetitiveness\n\n    def select_examples(self, candidates: List[str], k: int) -> List[Example]:\n        examples = []\n\n        for text in candidates:\n            responses = self.generate_responses(text)\n            entropy_score, repetitiveness = self.calculate_metrics(responses)\n\n            # Combined score (lower is better)\n            score = entropy_score - repetitiveness\n\n            examples.append(Example(\n                text=text,\n                score=score,\n                entropy=entropy_score,\n                repetitiveness=repetitiveness\n            ))\n\n        # Sort by score (lower is better) and select top k\n        return sorted(examples, key=lambda x: x.score)[:k]\n```\n\n## Usage Example\n\n```python\n# Initialize COSP selector\nclient = OpenAI()\nselector = COSPSelector(client)\n\n# Candidate examples\ncandidates = [\n    \"The quick brown fox jumps over the lazy dog\",\n    \"Machine learning is a subset of artificial intelligence\",\n    \"Python is a high-level programming language\",\n    # ... more examples\n]\n\n# Select best examples\nbest_examples = selector.select_examples(candidates, k=3)\n\n# Use selected examples in your prompt\nselected_texts = [ex.text for ex in best_examples]\nprompt = f\"\"\"Use these examples to guide your response:\n\nExamples:\n{chr(10).join(f'- {text}' for text in selected_texts)}\n\nNow, please respond to: [your query here]\n\"\"\"\n```\n\n## Benefits of COSP\n\n1. **Improved Consistency**: By selecting examples with low entropy and high repetitiveness\n2. **Better Performance**: More reliable few-shot learning\n3. **Automated Selection**: No manual example curation needed\n4. **Quality Metrics**: Quantifiable measure of example quality\n\n## Limitations\n\n1. **Computational Cost**: Requires multiple API calls per example\n2. **Time Overhead**: Selection process can be slow for large candidate sets\n3. **Model Dependency**: Performance may vary across different models\n\n## Related Techniques\n\n- [Universal Self Prompting (USP)](../ensembling/usp.md)\n- Chain of Thought Prompting\n- Self-Consistency\n\n## References\n\n1. Original COSP Paper: [arXiv:2305.14121](https://arxiv.org/abs/2305.14121)\n2. Related Work: [Self-Consistency Improves Chain of Thought Reasoning in Language Models](https://arxiv.org/abs/2203.11171)\n"
  },
  {
    "path": "docs/prompting/few_shot/example_generation/sg_icl.md",
    "content": "---\ntitle: \"Generate In-Context Examples\"\ndescription: \"\"\n---\n\nHow can we generate examples for our prompt?\n\nSelf-Generated In-Context Learning (SG-ICL) is a technique which uses an LLM to generate examples to be used during the task. This allows for in-context learning, where examples of the task are provided in the prompt.\n\nWe can implement SG-ICL using `instructor` as seen below.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Literal\nn = 4  # num examples to generate per class\n\n\nclass GeneratedReview(BaseModel):\n    review: str\n    sentiment: Literal[\"positive\", \"negative\"]\n\n\nclass SentimentPrediction(BaseModel):\n    sentiment: Literal[\"positive\", \"negative\"]\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_sample(input_review, sentiment):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=GeneratedReview,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                           Generate a '{sentiment}' review similar to: {input_review}\n                           Generated review:\n                           \"\"\",\n            }\n        ],\n    )\n\n\ndef predict_sentiment(input_review, in_context_samples):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=SentimentPrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"\".join(\n                    [\n                        f\"Review: {sample.review}\\nSentiment: {sample.sentiment}\\n\\n\"\n                        for sample in in_context_samples\n                    ]\n                )\n                + f\"Review: {input_review}\\nSentiment:\",\n            }\n        ],\n    ).sentiment\n\n\nif __name__ == \"__main__\":\n    input_review = (\n        \"This movie was a rollercoaster of emotions, keeping me engaged throughout.\"\n    )\n\n    # Generate in-context samples\n    samples = [\n        generate_sample(input_review, sentiment)\n        for sentiment in ('positive', 'negative')\n        for _ in range(n)\n    ]\n    for sample in samples:\n        print(sample)\n        \"\"\"\n        review='This film was an enthralling experience from start to finish, leaving me captivated every moment.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='This film was an emotional journey that captivated me from start to finish.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='The film took me on an unforgettable journey, capturing my attention at every moment.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='This book was a riveting journey, capturing my attention from start to finish.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='The movie was a total letdown, failing to hold my interest from start to finish.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='This movie was a disjointed mess of emotions, leaving me confused throughout.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='The movie was an emotional rollercoaster, but it left me feeling more confused than engaged.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='This movie was a monotonous ride, failing to engage me at any point.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='This film was an emotional journey, captivating me from start to finish.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='This film captivated me from start to finish with its thrilling plot and emotional depth.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='This movie was a breathtaking journey, capturing my attention from start to finish.' sentiment='positive'\n        \"\"\"\n        \"\"\"\n        review='This movie was a chaotic mess of emotions, losing me at every turn.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='This movie was a confusing mess, leaving me disengaged throughout.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='This movie was a chore to sit through, leaving me bored most of the time.' sentiment='negative'\n        \"\"\"\n        \"\"\"\n        review='This movie was a mishmash of confusing scenes, leaving me frustrated throughout.' sentiment='negative'\n        \"\"\"\n\n    # Predict sentiment\n    print(predict_sentiment(input_review, samples))\n    #> positive\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator](https://arxiv.org/abs/2206.08082)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/few_shot/example_ordering.md",
    "content": "---\ntitle: \"Example Ordering\"\ndescription: \"LLM outputs are heavily impacted by ordering of few shot examples\"\n---\n\n# Example Ordering\n\nThe order of few-shot examples in the prompt can affect LLM outputs <sup><a href=\"https://arxiv.org/abs/2104.08786\">1</a><a href=\"https://arxiv.org/abs/2106.01751\">2</a><a href=\"https://arxiv.org/abs/2101.06804\">3</a><a href=\"https://aclanthology.org/2022.naacl-main.191/\">4</a></sup><sup><a href=\"https://arxiv.org/abs/2406.06608\">\\*</a></sup>. Consider permutating the order of these examples in your prompt to achieve better results.\n\n## Choosing Your Examples\n\nDepending on your use-case, here are a few different methods that you can consider using to improve the quality of your examples.\n\n### Combinatorics\n\nOne of the easiest methods is for us to manually iterate over each of the examples that we have and try all possible combinations we could create. This will in turn allow us to find the best combination that we can find.\n\n### KATE\n\nKATE (k-Nearest Example Tuning) is a method designed to enhance GPT-3's performance by selecting the most relevant in-context examples. The method involves:\n\nFor each example in the test set, K nearest neighbors (examples) are retrieved based on semantic similarity.\nAmong these K examples, those that appear most frequently across different queries are selected as the best in-context examples.\n\n### Using a Unsupervised Retriever\n\n![Retriever Image](../../img/retriever.png)\n\nWe can use a large LLM to compute a single score for each example with respect to a given prompt. This allows us to create a training set that scores an example's relevance when compared against a prompt. Using this training set, we can train a model that mimics this functionality. This allows us to determine the top `k` most relevant and most irrelevant examples when a user makes a query so that we can include this in our final prompt.\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity](https://arxiv.org/abs/2104.08786)\n\n<sup id=\"ref-2\">2</sup>: [Reordering Examples Helps during Priming-based Few-Shot Learning](https://arxiv.org/abs/2106.01751)\n\n<sup id=\"ref-2\">3</sup>: [What Makes Good In-Context Examples for GPT-3?](https://arxiv.org/abs/2101.06804)\n\n<sup id=\"ref-3\">4</sup>: [Learning To Retrieve Prompts for In-Context Learning](https://aclanthology.org/2022.naacl-main.191/)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/few_shot/exemplar_selection/knn.md",
    "content": "---\ntitle: \"Select Effective Examples\"\ndescription: \"KNN can be leveraged to choose the most effective examples to use for a given query.\"\n---\n\nWe can select effective in-context examples by choosing those that are semantically closer to the query using `KNN`.\n\nIn the below implementation using `instructor`, we follow these steps:\n\n1. Embed the query examples\n2. Embed the query that we want to answer\n3. Find the _k_ query examples closest to the query\n4. Use the chosen examples and their as the context for the LLM\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom openai import OpenAI\nimport math\nfrom textwrap import dedent\n\n\nclass Example(BaseModel):\n    question: str\n    answer: str\n\n\nclass Response(BaseModel):\n    answer: str\n\n\noai = OpenAI()\nclient = instructor.from_provider(\"openai/gpt-4o\")\n\n\ndef distance(a: list[float], b: list[float]):\n    return 1 - sum(ai * bi for ai, bi in zip(a, b)) / (\n        math.sqrt(sum(ai**2 for ai in a)) * math.sqrt(sum(bi**2 for bi in b))\n    )\n\n\ndef embed_queries(queries: list[str]) -> list[tuple[list[float], str]]:\n    return [\n        (embedding_item.embedding, query)\n        for embedding_item, query in zip(\n            oai.embeddings.create(input=queries, model=\"text-embedding-3-large\").data,\n            queries,\n        )\n    ]\n\n\ndef knn(\n    embedded_examples: list[tuple[list[float], str]],\n    query_embedding: list[float],\n    k: int,\n):\n    distances = [\n        (distance(embedding, query_embedding), example)\n        for embedding, example in embedded_examples\n    ]\n    distances.sort(key=lambda x: x[0])\n    return distances[:k]\n\n\ndef generate_response(examples: list[str], query: str):\n    formatted_examples = \"\\n\".join(examples)\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": dedent(\n                    f\"\"\"\n                    Respond to the following query with the most accurate\n                    and concise answer possible.\n                    <examples>\n                    {formatted_examples}\n                    </examples>\n                    <query>\n                    {query}\n                    </query>\n                \"\"\"\n                ),\n            }\n        ],\n    )\n\n\ndef generate_question_and_answer_pair(\n    questions: list[str], question_and_answers: list[dict[str, str]]\n) -> list[str]:\n    question_to_answer = {}\n\n    for question in question_and_answers:\n        question_to_answer[question[\"question\"]] = question[\"answer\"]\n\n    return [\n        dedent(\n            f\"\"\"\n        <example>\n        <question>{question}</question>\n        <answer>{question_to_answer[question]}</answer>\n        </example>\n        \"\"\"\n        )\n        for question in questions\n    ]\n\n\nif __name__ == \"__main__\":\n    examples = [\n        {\"question\": \"What is the capital of France?\", \"answer\": \"Paris\"},\n        {\"question\": \"Who wrote Romeo and Juliet\", \"answer\": \"Shakespeare\"},\n        {\"question\": \"What is the capital of Germany?\", \"answer\": \"Berlin\"},\n    ]\n\n    query = \"What is the capital of Italy?\"\n\n    # Step 1 : Embed the Examples\n    embeddings = embed_queries([example[\"question\"] for example in examples] + [query])\n\n    embedded_examples = embeddings[:-1]\n    embedded_query = embeddings[-1]\n\n    # # Step 3: Find the k closest examples to the query\n    k_closest_examples = knn(embedded_examples, embedded_query[0], 2)\n\n    for example in k_closest_examples:\n        print(example)\n        #> (0.4013468481736857, 'What is the capital of France?')\n        #> (0.4471368596136872, 'What is the capital of Germany?')\n\n    # Step 4: Use these examples as in-context examples\n    formatted_examples = generate_question_and_answer_pair(\n        [example[1] for example in k_closest_examples], examples\n    )\n    response = generate_response(formatted_examples, query)\n    print(response.answer)\n    #> Rome\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [What Makes Good In-Context Examples for GPT-3?](https://arxiv.org/abs/2101.06804)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/few_shot/exemplar_selection/vote_k.md",
    "content": "---\ntitle: \"\"\ndescription: \"\"\nkeywords: \"\"\n---\n\n[wip]\n"
  },
  {
    "path": "docs/prompting/index.md",
    "content": "---\ntitle: Advanced Prompting Techniques Guide\ndescription: Research-backed prompting techniques to improve LLM performance with Instructor\n---\n\n# Advanced Prompting Techniques\n\n<div class=\"grid cards\" markdown>\n\n- :material-lightbulb: **Basic Approaches**\n\n    Zero-shot and few-shot techniques for immediate improvements\n\n    [:octicons-arrow-right-16: Zero-Shot](#zero-shot) · [:octicons-arrow-right-16: Few-Shot](#few-shot)\n\n- :material-brain: **Reasoning Methods**\n\n    Techniques to improve model reasoning and problem-solving\n\n    [:octicons-arrow-right-16: Thought Generation](#thought-generation) · [:octicons-arrow-right-16: Decomposition](#decomposition)\n\n- :material-check-all: **Verification**\n\n    Methods for self-assessment and correction\n\n    [:octicons-arrow-right-16: Self-Criticism](#self-criticism)\n\n- :material-group: **Collaboration**\n\n    Ensemble techniques for aggregating multiple model outputs\n\n    [:octicons-arrow-right-16: Ensembling](#ensembling)\n\n</div>\n\nThis guide presents 58 research-backed prompting techniques mapped to Instructor implementations. Based on [The Prompt Report](https://trigaten.github.io/Prompt_Survey_Site) by [Learn Prompting](https://learnprompting.org) which analyzed over 1,500 academic papers on prompting.\n\n## Prompting Technique Map\n\nThe following diagram shows how different prompting techniques relate to each other and when to use them:\n\n```mermaid\nflowchart TD\n    A[Choose Prompting Technique] --> B{Have Examples?}\n\n    B -->|No| C[Zero-Shot Techniques]\n    B -->|Yes| D[Few-Shot Techniques]\n\n    C --> C1[Role Prompting]\n    C --> C2[Emotional Language]\n    C --> C3[Style Definition]\n    C --> C4[Follow-Up Generation]\n\n    D --> D1[Example Ordering]\n    D --> D2[Example Selection]\n    D --> D3[Example Generation]\n\n    A --> E{Need Reasoning?}\n\n    E -->|Yes| F[Thought Generation]\n    F --> F1[Chain of Thought]\n    F --> F2[Step-Back Prompting]\n    F --> F3[Thread of Thought]\n\n    A --> G{Complex Problem?}\n\n    G -->|Yes| H[Decomposition]\n    H --> H1[Least-to-Most]\n    H --> H2[Tree of Thought]\n    H --> H3[Plan and Solve]\n\n    A --> I{Need Verification?}\n\n    I -->|Yes| J[Self-Criticism]\n    J --> J1[Self-Verification]\n    J --> J2[Chain of Verification]\n    J --> J3[Self-Refinement]\n\n    A --> K{Want Multiple Perspectives?}\n\n    K -->|Yes| L[Ensembling]\n    L --> L1[Self-Consistency]\n    L --> L2[Meta-CoT]\n    L --> L3[Specialized Experts]\n\n    classDef category fill:#e2f0fb,stroke:#b8daff,color:#004085;\n    classDef technique fill:#d4edda,stroke:#c3e6cb,color:#155724;\n    classDef decision fill:#fff3cd,stroke:#ffeeba,color:#856404;\n\n    class A,C,D,F,H,J,L category\n    class C1,C2,C3,C4,D1,D2,D3,F1,F2,F3,H1,H2,H3,J1,J2,J3,L1,L2,L3 technique\n    class B,E,G,I,K decision\n```\n\n## When to Use Each Technique\n\n| Goal | Recommended Techniques |\n|------|------------------------|\n| Improve accuracy | Chain of Thought, Self-Verification, Self-Consistency |\n| Handle complex problems | Decomposition, Tree of Thought, Least-to-Most |\n| Generate creative content | Role Prompting, Emotional Language, Style Definition |\n| Verify factual correctness | Chain of Verification, Self-Calibration |\n| Optimize with few examples | KNN Example Selection, Active Prompting |\n| Handle uncertainty | Uncertainty-Routed CoT, Self-Consistency |\n\n## Zero-Shot {#zero-shot}\n\nThese techniques improve model performance without examples:\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Emotional Language](zero_shot/emotion_prompting.md) | Add emotional tone to prompts | Creative writing, empathetic responses |\n| [Role Assignment](zero_shot/role_prompting.md) | Give the model a specific role | Expert knowledge, specialized perspectives |\n| [Style Definition](zero_shot/style_prompting.md) | Specify writing style | Content with particular tone or format |\n| [Prompt Refinement](zero_shot/s2a.md) | Automatic prompt optimization | Iterative improvement of results |\n| [Perspective Simulation](zero_shot/simtom.md) | Have the model adopt viewpoints | Multiple stakeholder analysis |\n| [Ambiguity Clarification](zero_shot/rar.md) | Identify and resolve unclear aspects | Improving precision of responses |\n| [Query Repetition](zero_shot/re2.md) | Ask model to restate the task | Better task understanding |\n| [Follow-Up Generation](zero_shot/self_ask.md) | Generate clarifying questions | Deep exploration of topics |\n\n## Few-Shot {#few-shot}\n\nTechniques for effectively using examples in prompts:\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Example Generation](few_shot/example_generation/sg_icl.md) | Automatically create examples | Domains with limited example data |\n| [Example Ordering](few_shot/example_ordering.md) | Optimal sequencing of examples | Improved pattern recognition |\n| [KNN Example Selection](few_shot/exemplar_selection/knn.md) | Choose examples similar to query | Domain-specific accuracy |\n| [Vote-K Selection](few_shot/exemplar_selection/vote_k.md) | Advanced similarity-based selection | Complex pattern matching |\n\n## Thought Generation {#thought-generation}\n\nMethods to encourage human-like reasoning in models:\n\n### Zero-Shot Reasoning\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Analogical CoT](thought_generation/chain_of_thought_zero_shot/analogical_prompting.md) | Generate reasoning using analogies | Complex problem-solving |\n| [Step-Back Prompting](thought_generation/chain_of_thought_zero_shot/step_back_prompting.md) | Consider higher-level questions first | Scientific and abstract reasoning |\n| [Thread of Thought](thought_generation/chain_of_thought_zero_shot/thread_of_thought.md) | Encourage step-by-step analysis | Detailed explanation generation |\n| [Tabular CoT](thought_generation/chain_of_thought_zero_shot/tab_cot.md) | Structure reasoning in table format | Multi-factor analysis |\n\n### Few-Shot Reasoning\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Active Prompting](thought_generation/chain_of_thought_few_shot/active_prompt.md) | Annotate uncertain examples | Improved accuracy on edge cases |\n| [Auto-CoT](thought_generation/chain_of_thought_few_shot/auto_cot.md) | Choose diverse examples | Broad domain coverage |\n| [Complexity-Based CoT](thought_generation/chain_of_thought_few_shot/complexity_based.md) | Use complex examples | Challenging problem types |\n| [Contrastive CoT](thought_generation/chain_of_thought_few_shot/contrastive.md) | Include correct and incorrect cases | Error detection and avoidance |\n| [Memory of Thought](thought_generation/chain_of_thought_few_shot/memory_of_thought.md) | Use high-certainty examples | Reliability in critical applications |\n| [Uncertainty-Routed CoT](thought_generation/chain_of_thought_few_shot/uncertainty_routed_cot.md) | Select the most certain reasoning path | Decision-making under uncertainty |\n| [Prompt Mining](thought_generation/chain_of_thought_few_shot/prompt_mining.md) | Generate templated prompts | Efficient prompt engineering |\n\n## Ensembling {#ensembling}\n\nTechniques for combining multiple prompts or responses:\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Consistent, Diverse Sets](ensembling/cosp.md) | Build consistent example sets | Stable performance |\n| [Batched In-Context Examples](ensembling/dense.md) | Efficient example batching | Performance optimization |\n| [Step Verification](ensembling/diverse.md) | Validate individual steps | Complex workflows |\n| [Maximizing Mutual Information](ensembling/max_mutual_information.md) | Information theory optimization | Information-dense outputs |\n| [Meta-CoT](ensembling/meta_cot.md) | Merge multiple reasoning chains | Complex problem-solving |\n| [Specialized Experts](ensembling/more.md) | Use different \"expert\" prompts | Multi-domain tasks |\n| [Self-Consistency](ensembling/self_consistency.md) | Choose most consistent reasoning | Logical accuracy |\n| [Universal Self-Consistency](ensembling/universal_self_consistency.md) | Domain-agnostic consistency | General knowledge tasks |\n| [Task-Specific Selection](ensembling/usp.md) | Choose examples per task | Specialized domain tasks |\n| [Prompt Paraphrasing](ensembling/prompt_paraphrasing.md) | Use variations of the same prompt | Robust outputs |\n\n## Self-Criticism {#self-criticism}\n\nMethods for models to verify or improve their own responses:\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Chain of Verification](self_criticism/chain_of_verification.md) | Generate verification questions | Fact-checking, accuracy |\n| [Self-Calibration](self_criticism/self_calibration.md) | Ask if answer is correct | Confidence estimation |\n| [Self-Refinement](self_criticism/self_refine.md) | Auto-generate feedback and improve | Iterative improvement |\n| [Self-Verification](self_criticism/self_verification.md) | Score multiple solutions | Quality assessment |\n| [Reverse CoT](self_criticism/reversecot.md) | Reconstruct the problem | Complex reasoning verification |\n| [Cumulative Reasoning](self_criticism/cumulative_reason.md) | Generate possible steps | Thorough analysis |\n\n## Decomposition {#decomposition}\n\nTechniques for breaking down complex problems:\n\n| Technique | Description | Use Case |\n|-----------|-------------|----------|\n| [Functional Decomposition](decomposition/decomp.md) | Implement subproblems as functions | Modular problem-solving |\n| [Faithful CoT](decomposition/faithful_cot.md) | Use natural and symbolic language | Mathematical reasoning |\n| [Least-to-Most](decomposition/least_to_most.md) | Solve increasingly complex subproblems | Educational applications |\n| [Plan and Solve](decomposition/plan_and_solve.md) | Generate a structured plan | Project planning |\n| [Program of Thought](decomposition/program_of_thought.md) | Use code for reasoning | Algorithmic problems |\n| [Recursive Thought](decomposition/recurs_of_thought.md) | Recursively solve subproblems | Hierarchical problems |\n| [Skeleton of Thought](decomposition/skeleton_of_thought.md) | Generate outline structure | Writing, planning |\n| [Tree of Thought](decomposition/tree-of-thought.md) | Search through possible paths | Decision trees, exploration |\n\n## Implementation with Instructor\n\nAll these prompting techniques can be implemented with Instructor by:\n\n1. Defining appropriate Pydantic models that capture the expected structure\n2. Incorporating the prompting technique in your model docstrings or field descriptions\n3. Using the patched LLM client with your response model\n\n```python\nimport instructor\nfrom pydantic import BaseModel, Field\n# Example implementing Chain of Thought with a field\nclass ReasonedAnswer(BaseModel):\n    \"\"\"Answer the following question with detailed reasoning.\"\"\"\n\n    chain_of_thought: str = Field(\n        description=\"Step-by-step reasoning process to solve the problem\"\n    )\n    final_answer: str = Field(\n        description=\"The final conclusion after reasoning\"\n    )\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nresponse = client.create(\n    model=\"gpt-4\",\n    response_model=ReasonedAnswer,\n    messages=[\n        {\"role\": \"user\", \"content\": \"What is the cube root of 27?\"}\n    ]\n)\n\nprint(f\"Reasoning: {response.chain_of_thought}\")\nprint(f\"Answer: {response.final_answer}\")\n```\n\n## References\n\n<sup>\\*</sup> Based on [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/self_criticism/chain_of_verification.md",
    "content": "---\ndescription: \"We get a model to output a baseline response. Next, we independently verify the response by using a model to generate questions and to verify these questions. Lastly, we use a final API call to verify the baseline response with the generated data\"\n---\n\nChain Of Verification ( CoVe )<sup><a href=\"https://arxiv.org/pdf/2309.11495\">1</a></sup> is a method that allows us to be able to verify our LLM's generated responses. We can do so using the following steps\n\n1. First we get our LLM to generate a response to a query\n2. Then we generate a set of follow up questions that need to be answered to validate the response\n3. We then independently generate a set of responses to these questions\n4. Lastly, we use a final LLM call to verify the response in light of these new question and answer pairs that we've generated\n\n```python hl_lines=\"49-52 95-100\"\nimport instructor\nfrom pydantic import BaseModel, Field\nimport asyncio\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass QueryResponse(BaseModel):\n    correct_answer: str\n\n\nclass ValidationQuestions(BaseModel):\n    question: list[str] = Field(\n        description=\"\"\"A list of questions that need to be\n        answered to validate the response\"\"\"\n    )\n\n\nclass ValidationAnswer(BaseModel):\n    answer: str\n\n\nclass FinalResponse(BaseModel):\n    correct_answer: str\n\n\nasync def generate_initial_response(query: str):\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=QueryResponse,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are an expert question answering system\",\n            },\n            {\"role\": \"user\", \"content\": query},\n        ],\n    )\n\n\nasync def generate_verification_questions(llm_response: str):\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=ValidationQuestions,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are an expert AI system that excels at\n                generating follow up questions to validate a response.\n                These questions should validate key assumptions, facts\n                and other important portions of the generated response\"\"\",\n            },\n            {\"role\": \"user\", \"content\": llm_response},\n        ],\n    )\n\n\nasync def generate_verification_response(questions: list[str]):\n    async def verify_question(question: str) -> tuple[ValidationAnswer, str]:\n        return (\n            await client.create(\n                model=\"gpt-4o\",\n                response_model=ValidationAnswer,\n                messages=[\n                    {\n                        \"role\": \"system\",\n                        \"content\": \"\"\"You are an expert AI system that\n                        excels at answering validation questions.\"\"\",\n                    },\n                    {\"role\": \"user\", \"content\": question},\n                ],\n            ),\n            question,\n        )\n\n    coros = [verify_question(question) for question in questions]\n    return await asyncio.gather(*coros)\n\n\nasync def generate_final_response(\n    answers: list[tuple[ValidationAnswer, str]],\n    initial_response: QueryResponse,\n    original_query: str,\n):\n    formatted_answers = \"\\n\".join(\n        [f\"Q: {question}\\nA: {answer.answer}\" for answer, question in answers]\n    )\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=FinalResponse,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are an expert AI system that excels at\n                validating and verifying if an initial answer answers an\n                initial query based off some Verification Questions and\n                Answers provided. Return the original answer if it is\n                valid else generate a new response off the verification\n                questions and answers provided.\"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                Initial query: {original_query}\n                Initial Answer : {initial_response.correct_answer}\n                Verification Questions and Answers:\n                {formatted_answers}\n            \"\"\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"What was the primary cause of the Mexican-American war and how long did it last?\"\n    initial_response = asyncio.run(generate_initial_response(query))\n    print(initial_response.model_dump_json())\n    \"\"\"\n    {\"correct_answer\":\"The primary cause of the Mexican-American War was\n    the annexation of Texas by the United States and the dispute over\n    whether Texas ended at the Nueces River (as the Mexicans claimed) or\n    the Rio Grande (as the U.S. claimed). The war lasted from April 25,\n    1846, to February 2, 1848, totaling nearly two years.\"}\n    \"\"\"\n\n    verification_questions = asyncio.run(\n        generate_verification_questions(initial_response.correct_answer)\n    )\n    print(verification_questions.model_dump_json())\n    \"\"\"\n    {\"question\":[\"Is it accurate that the primary cause of the\n    Mexican-American War was the annexation of Texas by the United\n    States?\",\"Was there a dispute over whether Texas ended at the Nueces\n    River or the Rio Grande?\",\"Did the Mexican-American War last from\n    April 25, 1846, to February 2, 1848?\",\"Is it correct to state that\n    the disagreement over the Texas border was between the Nueces River\n    and the Rio Grande?\",\"Was the Mexican claim that Texas ended at the\n    Nueces River while the U.S. claimed it was at the Rio Grande?\"]}\n    \"\"\"\n\n    responses = asyncio.run(\n        generate_verification_response(verification_questions.question)\n    )\n\n    final_answer = asyncio.run(\n        generate_final_response(responses, initial_response, query)\n    )\n    print(final_answer.model_dump_json())\n    \"\"\"\n    {\"correct_answer\":\"The primary cause of the Mexican-American War was\n    the annexation of Texas by the United States and the dispute over\n    whether Texas ended at the Nueces River (as the Mexicans claimed) or\n    the Rio Grande (as the U.S. claimed). The war lasted from April 25,\n    1846, to February 2, 1848, totaling nearly two years.\"}\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Chain-Of-Verification Reduces Hallucination In Large Language Models](https://arxiv.org/pdf/2309.11495)\n"
  },
  {
    "path": "docs/prompting/self_criticism/cumulative_reason.md",
    "content": "---\ndescription: \"Cumulative Reasoning breaks the reasoning process into three separate steps so that our model has enough room to reason and filter out the reasoning steps at each point, thus improving model performance\"\n---\n\nCumulative Reasoning<sup><a href=\"https://arxiv.org/pdf/2308.04371\">1</a></sup> aims to generate better outputs by dividing the reasoning process into three separate steps\n\n1. **Propose** : A LLM first suggests potential steps based on the current context, initiating the reasoning cycle\n2. **Verify** : We then assess the proposer's suggestions for accuracy, incorporating valid steps into the ongoing context\n3. **Report** : We then determine the appropriate moment to conclude the reasoning process\n\nBy first generating potential steps and separating out each portions of the reasoning process, we are able to obtain significant improvements in logical inference tasks and mathematical problems.\n\nWe can implement this using `instructor` as seen below\n\n```python hl_lines=\"46-61 94-100 138-148\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom textwrap import dedent\nfrom typing import Literal\nimport asyncio\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass Proposition(BaseModel):\n    premise1: str\n    premise2: str\n    reasoning: str\n    proposition: str\n\n\nclass ProposerOutput(BaseModel):\n    reasoning: str\n    valid_propositions: list[Proposition] = Field(\n        description=\"Concise list of Propositions that are derived from the premises that are relevant to the hypothesis. Note that each Proposition is derived from two given premises at most\",\n        min_length=4,\n    )\n    prediction: Literal[\"False\", \"True\", \"Unknown\"]\n\n\nclass VerifiedProposition(BaseModel):\n    proposition: str\n    reasoning: str\n    is_valid: bool\n\n\nclass ReporterOutput(BaseModel):\n    reasoning: str\n    is_valid_hypothesis: bool\n\n\nasync def generate_propositions(premises: list[str], hypothesis: str) -> ProposerOutput:\n    formatted_premises = \"\\n- \".join(premises)\n    return await client.create(\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    \"\"\"\n                Suppose you are one of the greatest AI\n                scientists, logicians, and mathematicians.\n\n                Let us think step by step. Please use\n                First-Order Logic (FOL) to deduce a list\n                of Propositions. Each Proposition is\n                derived from two given Premises and\n                should be logically correct. Most\n                importantly, each Proposition should\n                not duplicate the two premises that it\n                is derived from. Please make sure your\n                reasoning is directly deduced from the\n                Premises and Propositions rather than\n                introducing unsourced common knowledge\n                and unsourced information by common\n                sense reasoning.\n                \"\"\"\n                ),\n            },\n            {\n                \"role\": \"user\",\n                \"content\": dedent(\n                    f\"\"\"\n                Premises:\n                {formatted_premises}\n\n                We want to deduce more Propositions to\n                determine the correctness of the following\n                Hypothesis:\n                Hypothesis: {hypothesis}\n                \"\"\"\n                ),\n            },\n        ],\n        response_model=ProposerOutput,\n        model=\"gpt-4o\",\n    )\n\n\nasync def verify_propositions(\n    premise_evaluation: ProposerOutput,\n) -> list[VerifiedProposition]:\n    async def create_verification_task(proposition: Proposition) -> VerifiedProposition:\n        return await client.create(\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"\n                    Suppose you are one of the greatest AI\n                    scientists, logicians, and mathematicians.\n                    Let us think step by step. Please use\n                    First-Order Logic (FOL) to determine\n                    whether the deduction of two given\n                    Premises to a Proposition is valid or not,\n                    and reply with True or False.\n                    \"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"\n                    Premises:\n                    {proposition.premise1}\n                    {proposition.premise2}\n\n                    Proposition:\n                    {proposition.proposition}\n                    \"\"\",\n                },\n            ],\n            response_model=VerifiedProposition,\n            model=\"gpt-4o\",\n        )\n\n    tasks = [\n        create_verification_task(proposition)\n        for proposition in premise_evaluation.valid_propositions\n    ]\n\n    return await asyncio.gather(*tasks)\n\n\nasync def final_evaluation(\n    verification_result: list[str], hypothesis: str, premises: list[str]\n) -> ReporterOutput:\n    formatted_premises = \"\\n- \".join(premises)\n    formatted_propositions = \"\\n- \".join(verification_result)\n    return await client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                Suppose you are one of the greatest AI\n                scientists, logicians, and mathematicians.\n                Let us think step by step. Read and analyze\n                the “Premises” first, then use First-Order\n                Logic (FOL) to judge whether the “Hypothesis”\n                is True, False, or Unknown. Please make sure\n                your reasoning is directly deduced from the\n                \"Premises\" and \"Propositions\" rather than\n                introducing unsourced common knowledge and\n                unsourced information by common sense\n                reasoning.\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                Premises:\n                {formatted_premises}\n\n                Hypothesis: {hypothesis}\n                \"\"\",\n            },\n            {\n                \"role\": \"assistant\",\n                \"content\": f\"\"\"\n                Let's think step by step. From the premises,\n                we can deduce the following propositions:\n                {formatted_propositions}\n\n                Recall the Hypothesis: {hypothesis}\n                \"\"\",\n            },\n        ],\n        response_model=ReporterOutput,\n    )\n\n\nif __name__ == \"__main__\":\n    hypothesis = \"Hyraxes lay eggs\"\n    premises = [\n        \"The only types of mammals that lay eggs are platypuses and echidnas\",\n        \"Platypuses are not hyrax\",\n        \"Echidnas are not hyrax\",\n        \"No mammals are invertebrates\",\n        \"All animals are either vertebrates or invertebrates\",\n        \"Mammals are animals\",\n        \"Hyraxes are mammals\",\n        \"Grebes lay eggs\",\n        \"Grebes are not platypuses and also not echidnas\",\n    ]\n    premise_evaluation = asyncio.run(generate_propositions(premises, hypothesis))\n\n    verification_result = asyncio.run(verify_propositions(premise_evaluation))\n\n    filtered_propositions = [\n        proposition.proposition\n        for proposition in verification_result\n        if proposition.is_valid\n    ]\n\n    reporter_output = asyncio.run(\n        final_evaluation(filtered_propositions, hypothesis, premises)\n    )\n    print(reporter_output.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"reasoning\": \"Based on the premises provided, the\n      only mammals that lay eggs are platypuses and\n      echidnas. Hyraxes are mammals but are explicitly\n      stated as not being platypuses or echidnas. Hence,\n      there is no basis in the premises to conclude that\n      hyraxes lay eggs. \\n\\nTherefore, the hypothesis that\n      hyraxes lay eggs is False.\",\n      \"is_valid_hypothesis\": false\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Cumulative Reasoning with Large Language Models](https://arxiv.org/pdf/2308.04371)\n"
  },
  {
    "path": "docs/prompting/self_criticism/reversecot.md",
    "content": "---\ndescription: \"Reverse Chain Of Thought is a method to help identify logical inconsistencies in the reasoning steps of a large language model's response\"\n---\n\nWe can use a method called Reverse Chain Of Thought<sup><a href=\"https://arxiv.org/pdf/2305.11499\">1</a></sup> to reverse engineer a problem given a solution. This helps us to find specific inconsistencies in the reasoning steps taken by our model and to give targetted feedback which can improve the quality of the solution.\n\nThis is done through a 3 step process\n\n1. **Reconstruct The Question** : We first attempt to reconstruct the original problem given the solution and reasoning steps generated\n2. **Identify Inconsistencies** : Identify the inconsistencies between the original problem and the reconstructed problem\n3. **Generate Feedback** : Give fine-grained fedback to guide the LLM in revising its solution\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"54-59 76-83 98-107 127-140 155-167\"\nimport instructor\nfrom pydantic import BaseModel, Field\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ReconstructedPrompt(BaseModel):\n    chain_of_thought: str\n    reconstructed_prompt: str = Field(\n        description=\"\"\"Reconstruction of a potential prompt\n        that could have been used to generate the reasoning\n        and final solution provided by the user\"\"\"\n    )\n\n\nclass ConditionList(BaseModel):\n    conditions: list[str] = Field(\n        description=\"\"\"Key information and conditions present\n        in the reasoning steps which are relevant to answering\n        the question\"\"\"\n    )\n\n\nclass ModelFeedback(BaseModel):\n    detected_inconsistencies: list[str] = Field(\n        description=\"\"\"Inconsistencies that were detected between\n        the original condition list and the reconstructed condition\n        list\"\"\"\n    )\n    feedback: str = Field(\n        description=\"\"\"Feedback on how to fix the inconsistencies\n        detected in the original condition list and the reconstructed\n        condition list\"\"\"\n    )\n    is_equal: bool\n\n\nclass ModelResponse(BaseModel):\n    chain_of_thought: str = Field(\n        description=\"\"\"Logical Steps that were taken to derive\n        the final concluding statement\"\"\"\n    )\n    correct_answer: str\n\n\ndef generate_response(query: str):\n    return client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                You are a helpful AI Question Answerer. You are\n                about to be passed a query by a User.\n\n                Make sure to generate a series of logical steps\n                and reason about the problem before generating\n                a solution.\n                \"\"\",\n            },\n            {\"role\": \"user\", \"content\": query},\n        ],\n        response_model=ModelResponse,\n    )\n\n\ndef reconstruct_prompt(model_response: ModelResponse):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=ReconstructedPrompt,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                    Give the concrete prompt (problem) that can\n                    generate this answer. The problem should\n                    contain all basic and necessary information\n                    and correspond to the answer. The problem\n                    can only ask for one result\n\n                    Reasoning: {model_response.chain_of_thought}\n                    Response: {model_response.correct_answer}\n                    \"\"\",\n            }\n        ],\n    )\n\n\ndef deconstruct_prompt_into_condition_list(prompt: str):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=ConditionList,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                You are an expert AI system that excels at\n                analyzing and decomposing questions into their\n                constituent parts.\n\n                Please list the conditions of the problem given\n                below. There might be multiple conditions in the\n                problem so make sure to navigate through the\n                prompt incrementally, indentifying and extracting\n                the conditions necessary to answer the question\n                in your final response.\n                \"\"\",\n            },\n            {\"role\": \"user\", \"content\": prompt},\n        ],\n    )\n\n\ndef generate_feedback(\n    original_condition_list: list[str], final_condition_list: list[str]\n):\n    formatted_original_conditions = \"\\n- \".join(original_condition_list)\n    formatted_final_conditions = \"\\n- \".join(final_condition_list)\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=ModelFeedback,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                You are an expert AI system that excels at\n                analyzing and comparing two lists of conditions.\n\n                Original Condition List:\n                {formatted_original_conditions}\n\n                Reconstructed Condition List:\n                {formatted_final_conditions}\n\n                Determine if the two condition lists are roughly\n                equivalent. If they are not, give targetted\n                feedback on what is missing from the reconstructed\n                condition list as compared to the original condition\n                list and how it can be fixed.\n                \"\"\",\n            }\n        ],\n    )\n\n\ndef revise_response(response: ModelResponse, feedback: ModelFeedback):\n    formatted_inconsistencies = \"\\n- \".join(feedback.detected_inconsistencies)\n    return client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                Here are the mistakes and reasons in your answer\n                to the prompt\n\n                Original Response: {response.correct_answer}\n                You have overlooked some real conditions:\n                {formatted_inconsistencies}\n\n                Here are detailed reasons:\n                {feedback.feedback}\n\n                Generate a revised response that takes into account\n                the detailed feedback and includes the ignored\n                conditions\n                \"\"\",\n            }\n        ],\n        response_model=ModelResponse,\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"\n    Mary is an avid gardener. Yesterday, she received 18 new\n    potted plants from her favorite plant nursery. She already\n    has 2 potted plants on each of the 40 window ledges of her\n    large backyard. How many potted plants will Mary remain\n    with?\n    \"\"\"\n    response = generate_response(query)\n    reconstructed_prompt = reconstruct_prompt(response)\n    print(reconstructed_prompt.reconstructed_prompt)\n    \"\"\"\n    Mary received 18 new potted plants. She already has 2 potted plants on each\n    of the 40 window ledges in her backyard. How many potted plants does she have now?\n    \"\"\"\n\n    original_condition_list = deconstruct_prompt_into_condition_list(query)\n    new_condition_list = deconstruct_prompt_into_condition_list(\n        reconstructed_prompt.reconstructed_prompt\n    )\n    print(original_condition_list.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"conditions\": [\n        \"Mary received 18 new potted plants.\",\n        \"Mary has 2 potted plants on each of the 40 window ledges in her backyard.\",\n        \"We are required to find the total number of potted plants Mary will have.\"\n      ]\n    }\n    \"\"\"\n    print(new_condition_list.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"conditions\": [\n        \"Mary received 18 new potted plants.\",\n        \"She already has 2 potted plants on each of the 40 window ledges in her backyard.\"\n      ]\n    }\n    \"\"\"\n\n    feedback = generate_feedback(\n        original_condition_list.conditions, new_condition_list.conditions\n    )\n    print(feedback.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"detected_inconsistencies\": [\n        \"The reconstructed list is missing the requirement\n        to find the total number of potted plants Mary will\n        have.\"\n      ],\n      \"feedback\": \"Add the requirement of finding the total\n      number of potted plants Mary will have to the\n      reconstructed condition list to match the original\n      condition list.\",\n      \"is_equal\": false\n    }\n    \"\"\"\n\n    if not feedback.is_equal:\n        response = revise_response(response, feedback)\n\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"chain_of_thought\": \"First, we note that Mary starts\n      with 18 potted plants. According to the problem, she\n      bought 2 packs of 40 new potted plants. So, to find\n      the total number of plants she will have, we add the\n      number of plants she initially has to the number she\n      bought. This gives us 18 (initial) + 2 * 40 (new) =\n      18 + 80 = 98 potted plants.\",\n      \"correct_answer\": \"98 potted plants\"\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [RCoT: Detecting And Rectifying Factual Inconsistency In Reasoning By Reversing Chain-Ofthought](https://arxiv.org/pdf/2305.11499)\n"
  },
  {
    "path": "docs/prompting/self_criticism/self_calibration.md",
    "content": "---\ndescription: \"Self Calibration aims to get language models to determine what they know and do not know\"\n---\n\nWe want our language models to be able to output the extent of their confidence in predictions. To do so, we can get language models to evaluate their responses to a given prompt using a technique called Self Calibration <sup><a href=\"https://arxiv.org/pdf/2207.05221\">1</a></sup>\n\n> The original paper used a fine-tuned regression head over the language model's final output. However, since we don't have access to the model's final hidden states, we can substitute it for a function call instead to achieve a similar result.\n\nWe can ask language models to evaluate their outputs by using the following template\n\nWe can implement this using `instructor` as seen below\n\n```python hl_lines=\"23-27\"\nimport instructor\nfrom pydantic import BaseModel, Field\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass SelfCalibration(BaseModel):\n    chain_of_thought: str\n    is_valid_answer: bool = Field(description=\"Whether the answer is correct or not\")\n\n\ndef evaluate_model_output(original_prompt: str, model_response: str):\n    return client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                Question: {original_prompt}\n\n                {model_response}\n\n                Is this a valid answer to the question?\n                Make sure to examine the question\n                thoroughly and generate a complete\n                reasoning for why the answer is correct\n                or not before responding.\n                \"\"\",\n            }\n        ],\n        response_model=SelfCalibration,\n        model=\"gpt-4o\",\n    )\n\n\nif __name__ == \"__main__\":\n    original_prompt = \"\"\"\n    Question: Who was the third president of the\n    United States?\n    \"\"\"\n    model_response = \"\"\"\n    Here are some brainstormed ideas: James Monroe\n    Thomas Jefferson\n    Jefferson\n    Thomas Jefferson\n    George Washington\n    \"\"\"\n    response = evaluate_model_output(original_prompt, model_response)\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"chain_of_thought\": \"Let's examine the question\n      carefully: 'Who was the third president of the\n      United States?'\\n\\nThe brainstormed ideas are:\n      \\n1. James Monroe\\n2. Thomas Jefferson\\n3.\n      Jefferson\\n4. Thomas Jefferson\\n5. George\n      Washington.\\n\\nTo determine the validity of these\n      answers, I'll cross-check with historical\n      records.\\n\\n1. James Monroe was not the third\n      president; he was the fifth president.\\n2. Thomas\n      Jefferson was indeed the third president of the\n      United States.\\n3. 'Jefferson' is a correct but\n      incomplete answer; it lacks the first name, though\n      it is commonly understood.\\n4. 'Thomas Jefferson'\n      is the full name and correct answer.\\n5. George\n      Washington was the first president, not the\n      third.\\n\\nTherefore, the correct, valid answer to\n      the question 'Who was the third president of the\n      United States?' is 'Thomas Jefferson,' and this\n      answer is correct.\",\n      \"is_valid_answer\": true\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Language Models (Mostly) Know What They Know](https://arxiv.org/pdf/2207.05221)\n"
  },
  {
    "path": "docs/prompting/self_criticism/self_refine.md",
    "content": "---\ntitle: \"Improve With Feedback\"\ndescription: \"Self-refine is an approach that uses an LLM to generate an output, provide feedback on the output, and improve the output based on the provided feedback.\"\n---\n\nHow can we provide feedback for an LLM to improve its responses?\n\nSelf-refine is an approach that uses an LLM to generate an output, provide feedback on the output, and improve the output based on the provided feedback. This processes repeats until a stopping condition is achieved. The same LLM is used for all three steps.\n\n```mermaid\ngraph TD\n    A[Generate initial response]:::blue --> B[Generate feedback]:::orange\n    B --> C{Stopping<br>condition<br>met?}:::orange\n    C -->|No| D[Refine response]:::orange\n    C -->|Yes| E[Final output]:::green\n    D --> B\n\n    classDef blue fill:#E3F2FD,stroke:#90CAF9,color:#1565C0\n    classDef orange fill:#FFF3E0,stroke:#FFE0B2,color:#E65100\n    classDef green fill:#E8F5E9,stroke:#A5D6A7,color:#2E7D32\n    linkStyle default stroke:#90A4AE,stroke-width:2px;\n    linkStyle 1,2,4 stroke:#FFB74D,stroke-width:2px;\n```\n\n```python hl_lines=\"102-106\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\nclass Response(BaseModel):\n    code: str\n\n\nclass Feedback(BaseModel):\n    feedback: list[str] = Field(\n        description=\"A list of actions to take to improve the code.\"\n    )\n    done: bool\n\n\nclass Timestep(BaseModel):\n    response: str\n    feedback: Optional[list[str]] = Field(default_factory=list)\n    refined_response: Optional[str] = Field(default=\"\")\n\n\nclass History(BaseModel):\n    history: list[Timestep] = Field(default_factory=list)\n\n    def add(self, code, feedback, refined_code):\n        self.history.append(\n            Timestep(response=code, feedback=feedback, refined_response=refined_code)\n        )\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_feedback(response):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Feedback,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                        You are an expert Python coder.\n                        Provide feedback on this code.\n                        How can we make it (1) faster and (2) more readable?\n\n                        <code>\n                        {response.code}\n                        </code>\n\n                        If the code does not need to be improved, then indicate by setting \"done\" to True.\n                        \"\"\",\n            }\n        ],\n    )\n\n\ndef refine(response, feedback):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                        You are an expert Python coder.\n\n                        <response>\n                        {response.code}\n                        </response>\n\n                        <feedback>\n                        {feedback.feedback}\n                        </feedback>\n\n                        Refine your response.\n                        \"\"\",\n            }\n        ],\n    )\n\n\ndef stop_condition(feedback, history):\n    return feedback.done or len(history.history) >= 3\n\n\nif __name__ == \"__main__\":\n    response = client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Write Python code to calculate the fibonacci sequence.\",\n            }\n        ],\n    )\n\n    history = History()\n\n    while True:\n        feedback = generate_feedback(response)\n        if stop_condition(feedback, history):\n            break\n        refined_response = refine(response, feedback)\n\n        # Save to history\n        history.add(response.code, feedback.feedback, refined_response.code)\n        response = refined_response\n\n    print(history.history[0].response)\n    \"\"\"\n    def fibonacci(n):\n        sequence = [0, 1]\n        while len(sequence) < n:\n            sequence.append(sequence[-1] + sequence[-2])\n        return sequence[:n]\n\n    # Example usage:\n    n = 10\n    print(fibonacci(n))\n    \"\"\"\n    print(history.history[0].feedback)\n    \"\"\"\n    [\n        'Use a generator to reduce memory consumption for large `n` values and improve speed.',\n        'Enhance readability by adding type hints for input and output.',\n        \"Add docstrings to explain the function's purpose and parameters.\",\n        \"Avoid slicing the list at the end if it's not necessary; instead, ensure the loop condition is precise.\",\n    ]\n    \"\"\"\n    print(history.history[0].refined_response)\n    \"\"\"\n    def fibonacci(n: int) -> list[int]:\n        \"\"\"Generate a Fibonacci sequence of length n.\n\n        Args:\n            n (int): The length of the Fibonacci sequence to generate.\n\n        Returns:\n            list[int]: A list containing the Fibonacci sequence of length n.\n        \"\"\"\n        def fibonacci_generator():\n            a, b = 0, 1\n            for _ in range(n):\n                yield a\n                a, b = b, a + b\n        return list(fibonacci_generator())\n\n    # Example usage:\n    n = 10\n    print(fibonacci(n))\n    \"\"\"\n    print(f\"...process repeated {len(history.history)} times...\")\n    #> ...process repeated 3 times...\n    print(response.code)\n    \"\"\"\n    def fibonacci(n: int) -> list[int]:\n        \"\"\"Generate a Fibonacci sequence of length n.\n\n        Args:\n            n (int): The length of the Fibonacci sequence to generate.\n\n        Returns:\n            list[int]: A list containing the Fibonacci sequence of length n.\n        \"\"\"\n        if n <= 0:\n            return []\n        sequence = [0] * n\n        if n > 1:\n            sequence[1] = 1\n        for i in range(2, n):\n            sequence[i] = sequence[i-1] + sequence[i-2]\n        return sequence\n\n    # Example usage:\n    n = 10\n    print(fibonacci(n))\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Self-Refine: Iterative Refinement with Self-Feedback](https://arxiv.org/abs/2303.17651)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)"
  },
  {
    "path": "docs/prompting/self_criticism/self_verification.md",
    "content": "---\ntitle: \"Self-Verify LLM Responses\"\ndescription: \"The self-verification framework generates multiple response candidates, then uses an LLM to verify these candidates.\"\n---\n\nWe want to verify that an LLM response is correct. How can we automate this?\n\nThe self-verification framework generates multiple response candidates, then uses an LLM to verify these candidates. The process follows two stages:\n\n1. Forward Reasoning\n2. Backward Verification\n\n## Forward Reasoning\nIn forward reasoning, we leaverage CoT to generate multiple candidate solutions.\n\n## Backward Verification\nBackward verification involves three steps.\n\n### Rewrite As Declarative\n\nRewrite the original question and its solution as a declarative.\n\n!!! example \"Rewritten Declaritive Example\"\n    **original question**: Jackie has 10 apples. Adam has 8 apples. How many more apples does Jackie have than Adam?\n    **response candidate**: Jackie has 10 apples. so Jackie has 10-8=2 more apples than Adam, and the answer is 2.\n    **rewritten declarative**: Jackie has 10 apples. Adam has 8 apples. Jackie has 2 more apples than Adam.\n\n### Construct New Question\n\nConstruct a new question and prompt the LLM to verify it. Two possible methods are:\n\n1. True-False Item Verification (TFV)\n2. Condition Mask Verification (CMV)\n\nTFV asks the LLM if the rewritten declarative is correct. CMV filters out conditions provided in the original question and asks an LLM to predict the filtered condition.\n\n!!! example \"TFV Example Prompt\"\n    Jackie has 10 apples. Adam has 8 apples. Jackie has 2 more apples than Adam. Is this correct?\n\n!!! example \"CMV Example Prompt\"\n    Jackie has X apples. Adam has 8 apples. Jackie has 2 more apples than Adam. What is X?\n\n### Compute Verification Score\nThe LLM is then queried with the new question for each candidate *k* times. If TFV is used, the verification score is simply the number of times the LLM outputs \"True\". If CMV is used, the verification score is the number of times the masked value and the real value match.\n\nThe candidate with the highest verification score is then chosen as the final answer.\n\n## Implementation\n\nThe full pipeline with forward reasoning and backward verification can be implemented using `instructor` as seen below:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Literal\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\nn = 3  # Number of candidates to generate\nk = 5  # Number of times to verify\n\n\nclass Date(BaseModel):\n    month: int\n    day: int\n\n\nclass Candidate(BaseModel):\n    reasoning_steps: list[str]\n    month: str\n\n\nclass Rewritten(BaseModel):\n    declarative: str\n\n\nclass Verification(BaseModel):\n    correct: Literal[\"True\", \"False\"]\n\n\ndef query_llm(query, model):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Think step by step: {query}\",\n            }\n        ],\n    )\n\n\ndef rewrite(query, candidate):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Rewritten,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                    Please change the questions and answers into complete declarative sentences\n                    {query}\n                    The answer is {candidate.month}.\n                \"\"\",\n            }\n        ],\n    )\n\n\ndef verify(question):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Verification,\n        messages=[{\"role\": \"user\", \"content\": question}],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"What month is it now if it has been 3 weeks, 10 days, and 2 hours since May 1, 2024 6pm?\"\n\n    # Step 1: Forward Reasoning\n    candidates = [query_llm(query, Candidate) for _ in range(n)]\n\n    # Step 2: Backwards Verification\n    for candidate in candidates:\n        # 2.a Rewrite\n        rewritten = rewrite(query, candidate)\n        # 2.b Construct new questions\n        question = f\"{rewritten.declarative} Do it is correct (True or False)?\"\n        # 2.c Compute verification score\n        scores = [verify(question).correct for _ in range(k)]\n        verification_score = sum(1 for s in scores if s == \"True\")\n\n        print(f\"Candidate: {candidate.month}, Verification Score: {verification_score}\")\n        #> Candidate: May, Verification Score: 0\n        #> Candidate: June, Verification Score: 2\n        #> Candidate: May, Verification Score: 1\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Large Language Models are Better Reasoners with Self-Verification](https://arxiv.org/abs/2212.09561)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/active_prompt.md",
    "content": "---\ndescription: \"Active prompting is a method used to identify the most effective examples for human annotation. \"\n---\n\nWhen we have a large pool of unlabeled examples that could be used in a prompt, how should we decide which examples to manually label?\n\nActive prompting is a method used to identify the most effective examples for human annotation. The process involves four key steps:\n\n1. **Uncertainty Estimation**: Assess the uncertainty of the LLM's predictions on each possible example\n2. **Selection**: Choose the most uncertain examples for human annotation\n3. **Annotation**: Have humans label the selected examples\n4. **Inference**: Use the newly labeled data to improve the LLM's performance\n\n## Uncertainty Estimation\n\nIn this step, we define an unsupervised method to measure the uncertainty of an LLM in answering a given example.\n\n!!! example \"Uncertainty Estimation Example\"\n\n    Let's say we ask an LLM the following query:\n    >query = \"Classify the sentiment of this sentence as positive or negative: I am very excited today.\"\n\n    and the LLM returns:\n    >response = \"positive\"\n\n    The goal of uncertainty estimation is to answer: **How sure is the LLM in this response?**\n\nIn order to do this, we query the LLM with the same example _k_ times. Then, we use the _k_ responses to determine how dissimmilar these responses are. Three possible metrics<sup><a href=\"https://arxiv.org/abs/2302.12246\">1</a></sup> are:\n\n1. **Disagreement**: Ratio of unique responses to total responses.\n2. **Entropy**: Measurement based on frequency of each response.\n3. **Variance**: Calculation of the spread of numerical responses.\n\nBelow is an example of uncertainty estimation for a single input example using the disagreement uncertainty metric.\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclass Response(BaseModel):\n    height: int\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef query_llm():\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"How tall is the Empire State Building in meters?\",\n            }\n        ],\n    )\n\n\ndef calculate_disagreement(responses):\n    unique_responses = set(responses)\n    h = len(unique_responses)\n    return h / k\n\n\nif __name__ == \"__main__\":\n    k = 5  # (1)!\n    responses = [query_llm() for _ in range(k)]  # Query the LLM k times\n    for response in responses:\n        print(response)\n        #> height=443\n        #> height=443\n        #> height=443\n        #> height=443\n        #> height=381\n\n    print(\n        calculate_disagreement([response.height for response in responses])\n    )  # Calculate the uncertainty metric\n    #> 0.4\n```\n\n1. _k_ is the number of times to query the LLM with a single unlabeled example\n\nThis process will then be repeated for all unlabeled examples.\n\n## Selection & Annotation\n\nOnce we have a set of examples and their uncertainties, we can select _n_ of them to be annotated by humans. Here, we choose the examples with the highest uncertainties.\n\n## Inference\n\nNow, each time the LLM is prompted, we can include the newly-annotated examples.\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Active Prompting with Chain-of-Thought for Large Language Models](https://arxiv.org/abs/2302.12246)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/auto_cot.md",
    "content": "---\ndescription: \"Automate few-shot chain of thought to choose diverse examples\"\n---\n\nHow can we improve the performance of few-shot CoT?\n\nWhile few-shot CoT reasoning is effective, its effectiveness relies on manually crafted examples. Further, choosing diverse examples has shown effective in reducing reasoning errors from CoT.\n\nHere, we automate CoT to choose diverse examples. Given a list of potential examples:\n\n1. **Cluster**: Cluster potential examples\n2. **Sample**: For each cluster,\n   1. Sort examples by distance from cluster center\n   2. Select the first example that meets a predefined selection criteria\n3. **Prompt**: Incorporate the chosen questions from each cluster as examples in the LLM prompt\n\n!!! info\n\n    A sample selection criteria could be limiting the number of reasoning steps to a maximum of 5 steps to encourage sampling examples with simpler rationales.\n\n```python hl_lines=\"72 75 106\"\nimport instructor\nimport numpy as np\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nfrom sklearn.cluster import KMeans\nfrom sentence_transformers import SentenceTransformer\n\nclient = instructor.from_provider(\"openai/gpt-4o\")\nNUM_CLUSTERS = 2\n\n\nclass Example(BaseModel):\n    question: str\n    reasoning_steps: list[str]\n\n\nclass FinalAnswer(BaseModel):\n    reasoning_steps: list[str]\n    answer: int\n\n\ndef cluster_and_sort(questions, n_clusters=NUM_CLUSTERS):\n    # Cluster\n    embeddings = SentenceTransformer('all-MiniLM-L6-v2').encode(questions)\n    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10).fit(embeddings)\n\n    # Sort\n    sorted_clusters = [[] for _ in range(kmeans.n_clusters)]\n    for question, embedding, label in zip(questions, embeddings, kmeans.labels_):\n        center = kmeans.cluster_centers_[label]\n        distance = np.linalg.norm(embedding - center)\n        sorted_clusters[label].append((distance, question))\n    for cluster in sorted_clusters:\n        cluster.sort()  # Sort by distance\n\n    return sorted_clusters\n\n\ndef sample(cluster):\n    for question in cluster:\n        response = client.create(\n            model=\"gpt-4o\",\n            response_model=Example,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are an AI assistant that generates step-by-step reasoning for mathematical questions.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Q: {question}\\nA: Let's think step by step.\",\n                },\n            ],\n        )\n        if (\n            len(response.reasoning_steps) <= 5\n        ):  # If we satisfy the selection criteria, we've found our question for this cluster\n            return response\n\n\nif __name__ == \"__main__\":\n    questions = [\n        \"How many apples are left if you have 10 apples and eat 3?\",\n        \"What's the sum of 5 and 7?\",\n        \"If you have 15 candies and give 6 to your friend, how many do you have left?\",\n        \"What's 8 plus 4?\",\n        \"You start with 20 stickers and use 8. How many stickers remain?\",\n        \"Calculate 6 added to 9.\",\n    ]\n\n    # Cluster and sort the questions\n    sorted_clusters = cluster_and_sort(questions)\n\n    # Sample questions that match selection criteria for each cluster\n    selected_examples = [sample(cluster) for cluster in sorted_clusters]\n    print(selected_examples)\n    \"\"\"\n    [\n        Example(\n            question='If you have 15 candies and give 6 to your friend, how many do you have left?',\n            reasoning_steps=[\n                'Start with the total number of candies you have, which is 15.',\n                'Subtract the number of candies you give to your friend, which is 6, from the total candies.',\n                '15 - 6 = 9, so you are left with 9 candies.',\n            ],\n        ),\n        Example(\n            question=\"What's the sum of 5 and 7?\",\n            reasoning_steps=[\n                'Identify the numbers to be added: 5 and 7.',\n                'Perform the addition: 5 + 7.',\n                'The sum is 12.',\n            ],\n        ),\n    ]\n    \"\"\"\n\n    # Use selected questions as examples for the LLM\n    response = client.create(\n        model=\"gpt-4o\",\n        response_model=FinalAnswer,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                {selected_examples}\n                If there are 10 books in my bad and I read 8 of them, how many books do I have left? Let's think step by step.\n                \"\"\",\n            }\n        ],\n    )\n\n    print(response.reasoning_steps)\n    \"\"\"\n    [\n        'Start with the total number of books in the bag, which is 10.',\n        \"Subtract the number of books you've read, which is 8, from the total books.\",\n        '10 - 8 = 2, so you have 2 books left.',\n    ]\n    \"\"\"\n    print(response.answer)\n    #> 2\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Automatic Chain of Thought Prompting in Large Language Models](https://arxiv.org/abs/2210.03493)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/complexity_based.md",
    "content": "---\ndescription: \"Complexity Based Prompting involves choosing examples based on their reasoning steps. If reasoning length isn't available, then we can use proxies such as response length\"\n---\n\nWe can improve the performance of our language models by choosing more complex examples. This refers to examples that have either more reasoning steps or a longer response ( when reasoning steps are not available ).\n\nIn the event that no examples are available, we can sample multiple responses and generate an answer based off the top few most complex examples. We can determine the complexity based on the length of their reasoning step in a process known as Complexity Based Consistency\n<sup><a href=\"https://arxiv.org/pdf/2210.00720\">1</a></sup> .\n\nWe can implement Complexity Based Consistency using `instructor` as seen below.\n\n```python hl_lines=\"40-42\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom textwrap import dedent\nimport asyncio\nfrom collections import Counter\nimport random\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass ReasoningStep(BaseModel):\n    step: int = Field(..., description=\"The step number\")\n    subquestion: str = Field(..., description=\"Subquestion to solve\")\n    procedure: str = Field(\n        description=\"\"\"Any intermediate computation\n        that was done in the reasoning process. Leave\n        empty if no computation is needed\"\"\",\n    )\n    result: str\n\n\nclass Response(BaseModel):\n    reasoning: list[ReasoningStep] = Field(\n        description=\"reasoning steps to derive answer\",\n    )\n    correct_answer: int\n\n\nasync def generate_single_response(query: str, context: str) -> Response:\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                You are an expert Question Answering system. Make sure\n                to output your reasoning in structured reasoning steps\n                before generating a response to the user's query.\n\n\n                Context:\n                {context}\n\n                Query:\n                {query}\n                \"\"\"\n                ),\n            },\n        ],\n    )\n\n\nasync def complexity_based_consistency(\n    query: str, context: str, samples: int, top_k: int\n):\n    generated_responses = [\n        generate_single_response(query, context) for _ in range(samples)\n    ]\n    responses = await asyncio.gather(*generated_responses)\n    sorted_responses = sorted(responses, key=lambda x: len(x.reasoning), reverse=True)\n    top_responses = sorted_responses[:top_k]\n    return top_responses\n\n\nif __name__ == \"__main__\":\n    query = \"How many loaves of bread did they have left?\"\n    context = \"\"\"\n    The bakers at the Beverly Hills Bakery baked\n    200 loaves of bread on Monday morning. They\n    sold 93 loaves in the morning and 39 loaves\n    in the afternoon. A grocery store returned 6\n    unsold loaves.\n    \"\"\"\n\n    number_of_reasoning_chains = 5\n    top_k_to_sample = 3\n    response = asyncio.run(\n        complexity_based_consistency(\n            query, context, number_of_reasoning_chains, top_k_to_sample\n        )\n    )\n\n    answer_counts = Counter([res.correct_answer for res in response])\n\n    most_common_count = answer_counts.most_common(len(answer_counts))[0][1]\n    max_answers = [\n        answer for answer, count in answer_counts.items() if count == most_common_count\n    ]\n\n    final_answer = random.choice(max_answers)\n    print(final_answer)\n    #> 74\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Complexity-based prompting for multi-step reasoning](https://arxiv.org/pdf/2210.00720)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/contrastive.md",
    "content": "---\ndescription: \"We can improve model performance by deliberating including incorrect examples of reasoning for our model to see\"\n---\n\nWe can get better performance from our model when using chain-of-thought by including examples of incorrect reasoning. This helps our language model to learn what mistakes to avoid when generating a response. This is known as Contrastive Chain Of Thought<sup><a href=\"https://arxiv.org/pdf/2311.09277\">1</a></sup> and can be done using the following template.\n\n!!! example \"Contrastive Chain Of Thought template\"\n\n    <context>sample question</context>\n    <question>sample question</question>\n\n    <Explanations>\n        <Explanation>correct reasoning</Explanation>\n        <WrongExplanation>incorrect reasoning example</WrongExplanation>\n    <Explanations>\n\n    <context>sample question</context>\n    <question>sample question</question>\n\nWe can implement Contrastive Chain Of Thought using `instructor` as seen below.\n\n```python hl_lines=\"35-40\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom textwrap import dedent\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ChainOfThought(BaseModel):\n    chain_of_thought: str = Field(description=\"Incorrect reasoning for the answer\")\n    correct_answer: str\n\n\ndef contrastive_chain_of_thought(\n    query: str,\n    context: str,\n    example_prompt: str,\n    correct_examples: list[str],\n    incorrect_examples: list[str],\n):\n    correct_example_prompt = \"\\n\".join(\n        [f\"<Explanation>{example}</Explanation>\" for example in correct_examples]\n    )\n    incorrect_example_prompt = \"\\n\".join(\n        [\n            f\"<WrongExplanation>{example}</WrongExplanation>\"\n            for example in incorrect_examples\n        ]\n    )\n    \"\"\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=ChainOfThought,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n            <prompt>\n                <role>system</role>\n                <context>\n                You are an expert question answering AI System.\n\n                You are about to be given some examples of incorrect\n                and correct reasoning for a question. You will then\n                be asked to correctly reason through another question\n                to generate a valid response.\n                </context>\n\n                <question>{example_prompt}</question>\n\n                <Explanations>\n                    {correct_example_prompt}\n                    {incorrect_example_prompt}\n                </Explanations>\n                <context>{context}</context>\n                <question>{query}</question>\n\n            </prompt>\n            \"\"\"\n                ),\n            }\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    context = \"\"\"\n    James writes a 3-page letter to 2\n    different friends twice a week.\n    \"\"\"\n    query = \"How many pages does James write in a year?\"\n\n    sample_question = \"\"\"\n    James has 30 teeth. His dentist drills 4\n    of them and caps 7 more teeth than he drills.\n\n    What percentage of James' teeth does the dentist fix?\n    \"\"\"\n\n    incorrect_examples = [\n        \"\"\"James has 30 teeth. The dentist drills and caps some\n        teeth. Since drills are normally used on cars and not\n        teeth, it's safe to say none of the teeth were actually\n        fixed.\"\"\",\n        \"\"\"The dentist drills 4 teeth and caps 11 of them, which\n        means that he fixes 15 teeth. So we take 15 and multiply\n        it by the number of petals on a daisy, and the result is\n        30%, which is the percentage of teeth he fixes.\"\"\",\n    ]\n\n    correct_examples = [\n        \"\"\"The dentist drills 4 teeth, so there are 30 - 4 = 26\n        teeth left. The dentist caps 7 more teeth than he drills,\n        so he caps 4 + 7 = 11 teeth. Therefore, the dentist fixes\n        a total of 4 + 11 = 15 teeth. To find the percentage of\n        teeth the dentist fixes, we divide the number of teeth\n        fixed by the total number of teeth and multiply by 100:\n        15/30 x 100 = 50%\"\"\"\n    ]\n\n    response = contrastive_chain_of_thought(\n        query=query,\n        context=context,\n        example_prompt=sample_question,\n        correct_examples=correct_examples,\n        incorrect_examples=incorrect_examples,\n    )\n\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"chain_of_thought\": \"First, let's determine how many pages James writes per week.\n      He writes a 3-page letter to 2 different friends, so for one writing session, he\n      writes 3 pages x 2 friends = 6 pages. He does this twice a week, so the total number\n       of pages written per week is 6 pages/session x 2 sessions/week = 12 pages/week. \\n\\n\n       Next, we need to find out how many weeks are in a year. There are 52 weeks in a year,\n       so we multiply the number of pages James writes per week by the number of weeks in a year:\n       12 pages/week x 52 weeks/year = 624 pages/year.\\n\\nTherefore, James writes 624 pages in a year.\",\n      \"correct_answer\": \"624\"\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Contrastive Chain-of-Thought Prompting](https://arxiv.org/pdf/2311.09277)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/memory_of_thought.md",
    "content": "---\ntitle: \"\"\ndescription: \"\"\nkeywords: \"\"\n---\n\n[wip]\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/prompt_mining.md",
    "content": "---\ndescription: \"We get a LLM to generate prompts\"\n---\n\nLarge Language Models are sensitive to the way that they are prompted. When prompted incorrectly, they might perform much worse despite having the information or capability to respond to the prompt. Prompt Mining aims to help us discover better formats that occur more frequently in the corpus.\n\nHere are some examples of mined completions that were provided in the paper.\n\n| Manual Prompts                      | Mined Prompts           |\n| ----------------------------------- | ----------------------- |\n| x is affiliated with the y religion | x who converted to y    |\n| The headquarter of x is in y        | x is based in y         |\n| x died in y                         | x died at his home in y |\n| x is represented by music label y   | x recorded for y        |\n| x is a subclass of y                | x is a type of y        |\n\n> The original paper uses a large wikipedia corpus to automatically extract prompt templates by looking at middle words of the prompts and parsing the dependencies within the sentence. We present a more lightweight approach to help achieve a similar result with `instructor`.\n\nWe can implement Prompt Mining using `instructor` as seen below.\n\n```python hl_lines=\"29-33\"\nfrom pydantic import BaseModel, Field\nimport instructor\n\nclass PromptTemplate(BaseModel):\n    prompt_template: str = Field(\n        description=(\n            \"\"\"\n            A template that has the subject and object that we\n            want to extract from the prompt replaced with a\n            single placeholder of {subject} and {object}.\n            Rephrase the prompt if necessary to make it more\n            concise and easier to understand\n            \"\"\"\n        ),\n    )\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_prompt_templates(prompt: str):\n    return client.create(\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": (\n                    \"You are an expert prompt miner that excels at \"\n                    \"generating prompt templates which are more \"\n                    \"concise and easier to understand\\n\\nYou are \"\n                    \"about to be passed a prompt to extract 3 new \"\n                    \"prompt templates for\"\n                ),\n            },\n            {\"role\": \"system\", \"content\": prompt},\n        ],\n        response_model=list[PromptTemplate],\n        temperature=0,\n        max_retries=3,\n        model=\"gpt-4o\",\n    )\n\n\nif __name__ == \"__main__\":\n    prompt = \"France is the capital of Paris\"\n    prompt_template = generate_prompt_templates(prompt)\n    for prompt in prompt_template:\n        print(prompt)\n        #> prompt_template='{subject} is the capital of {object}'\n        #> prompt_template='The capital of {object} is {subject}'\n        #> prompt_template=\"{object}'s capital is {subject}\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [How Can We Know What Language Models Know? ](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00324/96460/How-Can-We-Know-What-Language-Models-Know)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_few_shot/uncertainty_routed_cot.md",
    "content": "---\ndescription: \"Uncertainty Routed Chain Of Thought is a technique used in the Gemini Paper to improve upon the conventional Chain Of Thought approach\"\n---\n\nUncertainty-Routed Chain Of Thought<sup><a href=\"https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf\">1</a></sup> prompting generates multiple chain of thought reasoning chains ( This is either 8 or 32 in the original paper ).\n\nIt then takes the majority answer out of these chains as the final solution only if the proportion of chains that agreed on this answer are higher than a specific threshold.\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"74-87\"\nfrom pydantic import BaseModel\nimport instructor\nfrom textwrap import dedent\nfrom typing import Literal\nimport asyncio\nfrom collections import Counter\nclient = instructor.from_provider(\"openai/gpt-5-nano\", async_client=True)\n\n\nclass ChainOfThoughtResponse(BaseModel):\n    chain_of_thought: str\n    correct_answer: Literal[\"A\", \"B\", \"C\", \"D\"]\n\n\nasync def generate_response(query: str, options: dict[str, str]):\n    formatted_options = \"\\n\".join(\n        [f\"{key}:{answer}\" for key, answer in options.items()]\n    )\n    return await client.create(\n        model=\"gpt-4o\",\n        response_model=ChainOfThoughtResponse,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                You are a a world class AI who excels at answering\n                complex questions. Choose one of the options below\n                that best answers the question you are about to be\n                asked\n                <question>\n                {query}\n                </question>\n\n                <options>\n                {formatted_options}\n                </options>\n                \"\"\"\n                ),\n            }\n        ],\n    )\n\n\nasync def generate_batch_responses(\n    query: str, options: dict[str, str], num_chains: int\n) -> list[ChainOfThoughtResponse]:\n    coros = [generate_response(query, options) for _ in range(num_chains)]\n    return await asyncio.gather(*coros)\n\n\nif __name__ == \"__main__\":\n    question = \"\"\"In a population of giraffes, an environmental\n    change occurs that favors individuals that are tallest. As a\n    result, more of the taller individuals are able to obtain\n    nutrients and survive to pass along their genetic information.\n    This is an example of\"\"\"\n\n    options = {\n        \"A\": \"directional selection\",\n        \"B\": \"stabilizing selection\",\n        \"C\": \"sexual selection\",\n        \"D\": \"disruptive selection\",\n    }\n\n    correct_answer = \"A\"\n    k = 8\n    threshold = 0.6\n\n    responses = asyncio.run(generate_batch_responses(question, options, k))\n    votes = Counter([response.correct_answer for response in responses])\n    print(votes)\n    #> Counter({'A': 8})\n\n    majority_vote_element, majority_vote_count = votes.most_common(1)[0]\n    print(majority_vote_element, majority_vote_count)\n    #> A 8\n    majority_threshold = majority_vote_count / k\n\n    if majority_threshold < threshold:\n        response = asyncio.run(generate_response(question, options))\n        response = response.correct_answer\n    else:\n        response = majority_vote_element\n\n    print(response)\n    #> A\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Gemini: A Family of Highly Capable Multimodal Models](https://storage.googleapis.com/deepmind-media/gemini/gemini_1_report.pdf)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md",
    "content": "---\ndescription: \"Analogical Prompting aims to help improve model accuracy by getting a model to generate relevant exemplars before solving the problem\"\n---\n\nAnalogical Prompting<sup><a href=\"https://arxiv.org/pdf/2310.01714\">1</a></sup> is a method that aims to get LLMs to generate examples that are relevant to the problem before starting to address the user's query.\n\nThis takes advantage of the various forms of knowledge that the LLM has acquired during training and explicitly prompts them to recall the relevant problems and solutions. We can use Analogical Prompting using the following template\n\n![](../../../img/analogical_prompting.png)\n\n!!! example \"Analogical Prompting Prompt Template\"\n\n    Problem: [User Prompt]\n\n    Relevant Problems: Recall three relevant and distinct problems. For each problem, describe it and explain the solution\n\n    Solve the problem\n\nWe can implement this using `instructor` as seen below with some slight modifications.\n\n```python hl_lines=\"33-36\"\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom textwrap import dedent\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass RelevantProblem(BaseModel):\n    problem_explanation: str\n    solution: str\n\n\nclass Response(BaseModel):\n    relevant_problems: list[RelevantProblem] = Field(\n        max_length=3,\n        min_length=3,\n    )\n    answer: RelevantProblem\n\n\ndef analogical_prompting(query: str):\n    return client.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": dedent(\n                    f\"\"\"\n                <problem>\n                {query}\n                </problem>\n\n                Relevant Problems: Recall three relevant and\n                distinct problems. For each problem, describe\n                it and explain the solution before solving\n                the problem\n                \"\"\"\n                ),\n            }\n        ],\n        model=\"gpt-4o\",\n        response_model=Response,\n    )\n\n\nif __name__ == \"__main__\":\n    query = (\n        \"What is the area of the square with the four \"\n        \"vertices at (-2, 2), (2, -2), (-2, -6), and \"\n        \"(-6, -2)?\"\n    )\n    response = analogical_prompting(query)\n    for problem in response.relevant_problems:\n        print(problem.model_dump_json(indent=2))\n        \"\"\"\n        {\n          \"problem_explanation\": \"Determine the distance\n          between two points in a coordinate plane.\",\n          \"solution\": \"To find the distance between two\n          points, use the distance formula: \\\\(d =\n          \\\\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}\\\\). This\n          formula calculates the Euclidean distance between\n          points (x_1, y_1) and (x_2, y_2).\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"problem_explanation\": \"Calculate the area of a\n          square given its side length.\",\n          \"solution\": \"The area of a square can be found\n          using the formula: \\\\(A = s^2\\\\), where \\\\(s\\\\) is\n          the length of one side of the square.\"\n        }\n        \"\"\"\n        \"\"\"\n        {\n          \"problem_explanation\": \"Identify vertices and\n          properties of a geometry shape such as\n          parallelogram.\",\n          \"solution\": \"For any quadrilateral, verify that\n          all sides are equal and angles are right angles to\n          confirm it is a square. Use properties of\n          quadrilaterals and distance formula.\"\n        }\n        \"\"\"\n\n    print(response.answer.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"problem_explanation\": \"Calculate the area of a\n      square given its vertices.\",\n      \"solution\": \"First, confirm the shape is a square by\n      checking the distance between consecutive vertices\n      and ensuring all sides are of equal length using the\n      distance formula. For vertices (-2,2), (2,-2),\n      (-2,-6), and (-6,-2), calculate distances between\n      consecutive points. If distances are equal, use the\n      side length to compute area using \\\\(A = s^2\\\\).\"\n    }\n    \"\"\"\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Large Language Models As Analogical Reasoners](https://arxiv.org/pdf/2310.01714)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md",
    "content": "---\ndescription: \"Step-back prompting is a two-step prompting technique that asks the LLM a step-back question to gather context for the query\"\n---\n\nHow can we encourage an LLM to think through any high-level context required to answer a query? Step-back prompting encourages this in two steps:\n\n1. **Abstraction**: Ask the LLM a generic, higher-level concept. This is generally topic-specific. This is known as the _step-back question_.\n2. **Reasoning**: Ask the LLM the original question, given its answer to the abstract question. This is known as _abstracted-grounded reasoning_.\n\n!!! example \"Step-Back Prompting Example\"\n\n    **Original Question**: What happens to the pressure of an ideal gas when temperature and volume are increased?\n\n    **Step-Back Question**: What are the physics concepts associated with this question?\n\n    **Reasoning Prompt**: {step-back response} {original question}\n\nNote that the step-back question is also generated using an LLM query.\n\nStep-back prompting has been shown to improve scores on reasoning benchmarks for PaLM-2L and GPT-4.<sup><a href=\"https://arxiv.org/abs/2406.06608\">\\*</a></sup>\n\n```python\nimport openai\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Iterable, Literal\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Stepback(BaseModel):\n    original_question: str\n    abstract_question: str\n\n\nclass Education(BaseModel):\n    degree: Literal[\"Bachelors\", \"Masters\", \"PhD\"]\n    school: str\n    topic: str\n    year: int\n\n\nclass Response(BaseModel):\n    school: str\n\n\ndef generate_stepback_question():\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Stepback,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                You are an expert at world knowledge. Your task is to step back\n                and paraphrase a question to a more generic step-back question,\n                which is easier to answer.\n\n                Here are a few examples:\n                Original Question: Which position did Knox Cunningham hold from\n                May 1955 to Apr 1956?\n                Step-back Question: Which positions has Knox Cunningham held in\n                his career?\n                Original Question: Who was the spouse of Anna Karina from 1968\n                to 1974?\n                Step-back Question: Who were the spouses of Anna Karina?\n                Original Question: Which team did Thierry Audel play for from\n                2007 to 2008?\n                Step-back Question: Which teams did Thierry Audel play for in\n                his career?\n\n                Now, generate the step-back question for the following question:\n                Estella Leopold went to which school between Aug 1954 and\n                Nov 1954?\n                \"\"\",\n            },\n        ],\n    )\n\n\ndef ask_stepback_question(stepback):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Iterable[Education],\n        messages=[\n            {\"role\": \"user\", \"content\": stepback.abstract_question},\n        ],\n    )\n\n\ndef get_final_response(stepback, stepback_response):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                Q: {stepback.abstract_question},\n                A: {stepback_response}\n                Q: {stepback.original_question}\n                A:\n                \"\"\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    # Generate the step-back question\n    stepback = generate_stepback_question()\n    print(stepback.original_question)\n    #> Estella Leopold went to which school between Aug 1954 and Nov 1954?\n    print(stepback.abstract_question)\n    #> Which schools did Estella Leopold attend in her life?\n\n    # Ask the step-back question\n    stepback_response = ask_stepback_question(stepback)\n    for item in stepback_response:\n        print(item)\n        \"\"\"\n        degree='Bachelors'\n        school='University of Wisconsin-Madison'\n        topic='Botany'\n        year=1948\n        \"\"\"\n        \"\"\"\n        degree='Masters'\n        school='University of California, Berkeley'\n        topic='Botany and Paleobotany'\n        year=1950\n        \"\"\"\n        \"\"\"\n        degree='PhD'\n        school='Yale University'\n        topic='Botany and Paleobotany'\n        year=1955\n        \"\"\"\n\n    # Ask the original question, appended with context from the stepback response\n    print(get_final_response(stepback, stepback_response))\n    #> school='Yale University'\n```\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models](https://arxiv.org/abs/2310.06117)\n\n<sup id=\"ref-asterisk\">\\*</sup>: [The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_zero_shot/tab_cot.md",
    "content": "---\ndescription: \"Tab-CoT encourages LLMs to output reasoning as a markdown table, improving the structure and reasoning of its output\"\n---\n\nBy getting language models to output their reasoning as a structured markdown table, we can improve their reasoning capabilities and the quality of their outputs. This is known as Tabular Chain Of Thought (Tab-CoT) <sup><a href=\"https://arxiv.org/pdf/2305.17812\">1</a></sup>.\n\nWe can implement this using `instructor` as a response object as seen below to ensure we get exactly the data that we want. Each row in our table is represented here as a `ReasoningStep` object.\n\n```python hl_lines=\"36-38\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom textwrap import dedent\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ReasoningStep(BaseModel):\n    step: int = Field(description=\"The step number\")\n    subquestion: str = Field(description=\"Subquestion to solve\")\n    procedure: str = Field(\n        description=\"\"\"Any intermediate computation\n        that was done in the reasoning process. Leave\n        empty if no computation is needed\"\"\",\n    )\n    result: str\n\n\nclass Response(BaseModel):\n    reasoning: list[ReasoningStep] = Field(\n        description=\"reasoning steps to derive answer\",\n    )\n    correct_answer: int\n\n\ndef generate_structured_reasoning_response(query: str, context: str):\n    response = client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                <system>\n                    <role>expert Question Answering system</role>\n                    <instruction>Make sure to output your reasoning in structured reasoning steps before generating a response to the user's query.</instruction>\n                </system>\n\n                <context>\n                    {context}\n                </context>\n\n                <query>\n                    {query}\n                </query>\n                \"\"\"\n                ),\n            },\n        ],\n    )\n    return response\n\n\nif __name__ == \"__main__\":\n    query = \"How many loaves of bread did they have left?\"\n    context = \"\"\"\n    The bakers at the Beverly Hills Bakery baked\n    200 loaves of bread on Monday morning. They\n    sold 93 loaves in the morning and 39 loaves\n    in the afternoon. A grocery store returned 6\n    unsold loaves.\n    \"\"\"\n\n    response = generate_structured_reasoning_response(query, context)\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"reasoning\": [\n        {\n          \"step\": 1,\n          \"subquestion\": \"How many loaves of bread were sold in the morning\n          and afternoon?\",\n          \"procedure\": \"93 (morning) + 39 (afternoon)\",\n          \"result\": \"132\"\n        },\n        {\n          \"step\": 2,\n          \"subquestion\": \"How many loaves of bread were originally baked?\",\n          \"procedure\": \"\",\n          \"result\": \"200\"\n        },\n        {\n          \"step\": 3,\n          \"subquestion\": \"How many loaves of bread were returned by the\n          grocery store?\",\n          \"procedure\": \"\",\n          \"result\": \"6\"\n        },\n        {\n          \"step\": 4,\n          \"subquestion\": \"How many loaves of bread were left after accounting\n          for sales and returns?\",\n          \"procedure\": \"200 (originally baked) - 132 (sold) + 6 (returned)\",\n          \"result\": \"74\"\n        }\n      ],\n      \"correct_answer\": 74\n    }\n    \"\"\"\n```\n\nThis generates the following reasoning step and the correct response of 74.\n\n| Step | Subquestion                                                                | Procedure                                          | Result |\n| ---- | -------------------------------------------------------------------------- | -------------------------------------------------- | ------ |\n| 1    | How many loaves of bread were sold in the morning and afternoon?           | 93 (morning) + 39 (afternoon)                      | 132    |\n| 2    | How many loaves of bread were originally baked?                            |                                                    | 200    |\n| 3    | How many loaves of bread were returned by the grocery store?               |                                                    | 6      |\n| 4    | How many loaves of bread were left after accounting for sales and returns? | 200 (originally baked) - 132 (sold) + 6 (returned) | 74     |\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Tab-CoT: Zero-shot Tabular Chain of Thought](https://arxiv.org/pdf/2305.17812)\n"
  },
  {
    "path": "docs/prompting/thought_generation/chain_of_thought_zero_shot/thread_of_thought.md",
    "content": "---\ndescription: \"Thread of Thought helps models ignore irrelevant context in their prompt, improving overall response quality and relevance\"\n---\n\nBy encouraging our model to examine each source in the provided context, we can help mitigate the impact of irrelevant context. This improves reasoning performance and the final output. This is known as Thread Of Thought <sup><a href=\"https://arxiv.org/pdf/2311.08734\">1</a></sup>.\n\nWe can implement Thread Of Thought using the following template.\n\n!!! example \"Thread Of Thought template\"\n\n    **[ Input Prompt ]**\n\n    Proceed through the context systematically, zeroing in on areas that could provide the answers we’re seeking\n\nWe can implement this using `instructor` as seen below.\n\n```python hl_lines=\"42-43\"\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom textwrap import dedent\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass ThreadOfThoughtResponse(BaseModel):\n    analysis: list[str] = Field(\n        description=\"\"\"An explanation for each relevant source explaining\n        its relevance and content\"\"\",\n    )\n    correct_answer: int\n\n\ndef analyze_context_and_generate_response(query: str, context: list[str]):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=ThreadOfThoughtResponse,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": dedent(\n                    f\"\"\"\n                    You are an expert Question Answerer.\n\n                    Here are all of the sources that you should refer to\n                    for context:\n                    {'\\n'.join(context)}\n                \"\"\"\n                ),\n            },\n            {\n                \"role\": \"user\",\n                \"content\": query,\n            },\n            {\n                \"role\": \"assistant\",\n                \"content\": dedent(\n                    \"\"\"\n                    Navigate through the context incrementally,\n                    identifying and summarizing relevant portions.\n                    \"\"\"\n                ),\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    context = [\n        \"The price of a house was $100,000 in 2024\",\n        \"\"\"The Great Wall of China is not visible from space\n        with the naked eye\"\"\",\n        \"\"\"Honey never spoils; archaeologists have found pots\n        of honey in ancient Egyptian tombs that are over\n        3,000 years old\"\"\",\n        \"\"\"The world's oldest known living tree is over 5,000\n        years old and is located in California\"\"\",\n        \"The price of a house was $80,000 in 2023\",\n    ]\n    query = \"What was the increase in the price of a house from 2023 to 2024\"\n    response = analyze_context_and_generate_response(query, context)\n    print(response.model_dump_json(indent=2))\n    \"\"\"\n    {\n      \"analysis\": [\n        \"The price of a house was $80,000 in 2023\",\n        \"The price of a house was $100,000 in 2024\"\n      ],\n      \"correct_answer\": 20000\n    }\n    \"\"\"\n```\n\n## Useful Tips\n\nHere are some alternative phrases that you can add to your prompt to generate a thread of thought before your model generates a response.\n\n1. In a step-by-step manner, go through the context, surfacing important information that could be useful.\n2. Walk me through this lengthy document segment by segment, focusing on each part's significance.\n3. Guide me through the context part by part, providing insights along the way.\n4. Divide the document into manageable parts and guide me through each one, providing insights as we move along.\n5. Let's go through this document piece by piece, paying close attention to each section.\n6. Take me through the context bit by bit, making sure we capture all important aspects.\n7. Examine the document in chunks, evaluating each part critically before moving to the next.\n8. Analyze the context by breaking it down into sections, summarizing each as we move forward.\n9. Navigate through the context incrementally, identifying and summarizing relevant portions.\n10. Proceed through the context systematically, zeroing in on areas that could provide the answers we're seeking.\n11. Take me through this long document step-by-step, making sure not to miss any important details.\n12. Analyze this extensive document in sections, summarizing each one and noting any key points.\n13. Navigate through this long document by breaking it into smaller parts and summarizing each, so we don't miss anything.\n14. Let's navigate through the context section by section, identifying key elements in each part.\n15. Let's dissect the context into smaller pieces, reviewing each one for its importance and relevance.\n16. Carefully analyze the context piece by piece, highlighting relevant points for each question.\n17. Read the context in sections, concentrating on gathering insights that answer the question at hand.\n18. Let's read through the document section by section, analyzing each part carefully as we go.\n19. Let's dissect this document bit by bit, making sure to understand the nuances of each section.\n20. Systematically work through this document, summarizing and analyzing each portion as we go.\n21. Let's explore the context step-by-step, carefully examining each segment.\n22. Systematically go through the context, focusing on each part individually.\n23. Methodically examine the context, focusing on key segments that may answer the query.\n24. Progressively sift through the context, ensuring we capture all pertinent details.\n25. Take a modular approach to the context, summarizing each part before drawing any conclusions.\n26. Examine each segment of the context meticulously, and let's discuss the findings.\n27. Approach the context incrementally, taking the time to understand each portion fully.\n28. Let's scrutinize the context in chunks, keeping an eye out for information that answers our queries.\n29. Walk me through this context in manageable parts step by step, summarizing and analyzing as we go.\n30. Let's take a segmented approach to the context, carefully evaluating each part for its relevance to the questions posed.\n\n### References\n\n<sup id=\"ref-1\">1</sup>: [Thread of Thought Unraveling Chaotic Contexts](https://arxiv.org/pdf/2311.08734)\n"
  },
  {
    "path": "docs/prompting/zero_shot/emotion_prompting.md",
    "content": "---\ntitle: \"Emotion Prompting\"\ndescription: \"Adding phrases with emotional significance to humans can help enhance the performance of a language model.\"\n---\n\nDo language models respond to emotional stimuli?\n\nAdding phrases with emotional significance to humans can help enhance the performance of a language model. This includes phrases such as:\n\n- This is very important to my career.\n- Take pride in your work.\n- Are you sure?\n\n!!! info\n    For more examples of emotional stimuli to use in prompts, look into [EmotionPrompt](https://arxiv.org/abs/2307.11760) -- a set of prompts inspired by well-established human psychological phenomena.\n\n## Implementation\n```python hl_lines=\"34\"\nimport openai\nimport instructor\nfrom pydantic import BaseModel\nfrom typing import Iterable\n\n\nclass Album(BaseModel):\n    name: str\n    artist: str\n    year: int\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef emotion_prompting(query, stimuli):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Iterable[Album],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                {query}\n                {stimuli}\n                \"\"\",\n            }\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"Provide me with a list of 3 musical albums from the 2000s.\"\n    stimuli = \"This is very important to my career.\"  # (1)!\n\n    albums = emotion_prompting(query, stimuli)\n\n    for album in albums:\n        print(album)\n        #> name='Kid A' artist='Radiohead' year=2000\n        #> name='The Marshall Mathers LP' artist='Eminem' year=2000\n        #> name='The College Dropout' artist='Kanye West' year=2004\n```\n\n1.  The phrase `This is very important to my career` is used as emotional stimuli in the prompt.\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Large Language Models Understand and Can be Enhanced by Emotional Stimuli](https://arxiv.org/abs/2307.11760)"
  },
  {
    "path": "docs/prompting/zero_shot/rar.md",
    "content": "---\ndescription: \"To help the model better infer human intention from ambigious prompts, we can ask the model to rephrase and respond (RaR).\"\n---\n\nHow can we identify and clarify ambigious information in the prompt?\n\nLet's say we are given the query: *Was Ed Sheeran born on an odd month?*\n\nThere are many ways a model might interpret an *odd month*:\n\n- Februray is *odd* because of an irregular number of days.\n- A month is *odd* if it has an odd number of days.\n- A month is *odd* if its numberical order in the year is odd (i.e. Janurary is the 1st month).\n\n!!! note\n\n    Ambiguities might not always be so obvious!\n\nTo help the model better infer human intention from ambigious prompts, we can ask the model to rephrase and respond (RaR).\n\n## Implementation\n\n```python hl_lines=\"19\"\nfrom pydantic import BaseModel\nimport instructor\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Response(BaseModel):\n    rephrased_question: str\n    answer: str\n\n\ndef rephrase_and_respond(query):\n    return client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"{query}\\nRephrase and expand the question, and respond.\"\"\",  # (1)!\n            }\n        ],\n        response_model=Response,\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"Take the last letters of the words in 'Edgar Bob' and concatinate them.\"\n\n    response = rephrase_and_respond(query)\n\n    print(response.rephrased_question)\n    \"\"\"\n    What are the last letters of each word in the name 'Edgar Bob', and what do you get when you concatenate them?\n    \"\"\"\n    print(response.answer)\n    \"\"\"\n    To find the last letters of each word in the name 'Edgar Bob', we look at 'Edgar' and 'Bob'. The last letter of 'Edgar' is 'r' and the last letter of 'Bob' is 'b'. Concatenating these letters gives us 'rb'.\n    \"\"\"\n```\n\n1. This prompt template comes from [this](https://arxiv.org/abs/2311.04205) paper.\n\nThis can also be implemented as two-step RaR:\n\n1. Ask the model to rephrase the question.\n2. Pass the rephrased question back to the model to generate the final response.\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves](https://arxiv.org/abs/2311.04205)\n"
  },
  {
    "path": "docs/prompting/zero_shot/re2.md",
    "content": "---\ndescription: \"Re2 (Re-Reading) is a technique that asks the model to read the question again.\"\n---\n\nHow can we enhance a model's understanding of a query?\n\nRe2 (**Re** - **R** eading) is a technique that asks the model to read the question again.\n\n!!! example \"Re-Reading Prompting\"\n    **Prompt Template**: Read the question again: <*query*> <*critical thinking prompt*><sup><a href=\"https://arxiv.org/abs/2309.06275\">1</a></sup>\n\n    A common critical thinking prompt is: \"Let's think step by step.\"\n\n## Implementation\n\n```python hl_lines=\"20\"\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Response(BaseModel):\n    answer: int\n\n\ndef re2(query, thinking_prompt):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"Read the question again: {query} {thinking_prompt}\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"Roger has 5 tennis balls.\n        He buys 2 more cans of tennis balls.\n        Each can has 3 tennis balls.\n        How many tennis balls does he have now?\n        \"\"\"\n    thinking_prompt = \"Let's think step by step.\"\n\n    response = re2(query=query, thinking_prompt=thinking_prompt)\n    print(response.answer)\n    #> 11\n```\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Re-Reading Improves Reasoning in Large Language Models](https://arxiv.org/abs/2309.06275)\n"
  },
  {
    "path": "docs/prompting/zero_shot/role_prompting.md",
    "content": "---\ntitle: \"Role Prompting\"\ndescription: \"Role prompting, or persona prompting, assigns a role to the model.\"\n---\n\nHow can we increase a model's performance on open-ended tasks?\n\nRole prompting, or persona prompting, assigns a role to the model. Roles can be:\n\n - **specific to the query**: *You are a talented writer. Write me a poem.*\n - **general/social**: *You are a helpful AI assistant. Write me a poem.*\n\n## Implementation\n\n```python hl_lines=\"27\"\nimport openai\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Response(BaseModel):\n    poem: str\n\n\ndef role_prompting(query, role):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"{role} {query}\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"Write me a short poem about coffee.\"\n    role = \"You are a renowned poet.\"\n\n    response = role_prompting(query, role)\n    print(response.poem)\n    \"\"\"\n    In the morning's gentle light,\n    A brew of warmth, dark and bright.\n    Awakening dreams, so sweet,\n    In every sip, the day we greet.\n\n    Through the steam, stories spin,\n    A liquid muse, caffeine within.\n    Moments pause, thoughts unfold,\n    In coffee's embrace, we find our gold.\n    \"\"\"\n```\n\n!!! info \"More Role Prompting\"\n    To read about a systematic approach to choosing roles, check out [RoleLLM](https://arxiv.org/abs/2310.00746).\n\n    For more examples of social roles, check out [this](https://arxiv.org/abs/2311.10054) evaluation of social roles in system prompts..\n\n    To read about using more than one role, check out [Multi-Persona Self-Collaboration](https://arxiv.org/abs/2307.05300).\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Lanuage Models](https://arxiv.org/abs/2310.00746)\n<sup id=\"ref-2\">2</sup>: [Is \"A Helpful Assistant\" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts ](https://arxiv.org/abs/2311.10054)\n<sup id=\"ref-4\">3</sup>: [Unleashing the Emergent Cognitive Synergy in Large Lanuage Models: A Task-Solving Agent through Multi-Persona Self-Collaboration ](https://arxiv.org/abs/2307.05300)\n"
  },
  {
    "path": "docs/prompting/zero_shot/s2a.md",
    "content": "---\ntitle: \"System 2 Attention (S2A)\"\ndescription: \"The S2A (System 2 Attention) technique auto-refines a prompt by asking the model to rewrite the prompt to include only relevant information.\"\n---\n\nHow do we remove irrelevant information from the prompt?\n\nThe S2A (System 2 Attention) technique auto-refines a prompt by asking the model to rewrite the prompt to include only *relevant* information. We implement this in two steps:\n\n1. Ask the model to rewrite the prompt\n2. Pass the rewritten prompt back to the model\n\n## Implementation\n\n```python hl_lines=\"25-28\"\nimport openai\nimport instructor\nfrom pydantic import BaseModel, Field\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass Step1(BaseModel):\n    relevant_context: str = Field(..., description=\"Relevant context\")\n    user_query: str = Field(..., description=\"The question from the user\")\n\n\nclass Step2(BaseModel):\n    answer: int\n\n\ndef rewrite_prompt(query):\n    rewritten_prompt = client.create(\n        model=\"gpt-4o\",\n        response_model=Step1,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                    Given the following text by a user, extract the part\n                    that is actually relevant to their question. Please\n                    include the actual question or query that the user\n                    is asking.\n\n                    Text by user:\n                    {query}\n                    \"\"\",  # (1)!\n            }\n        ],\n    )\n    return rewritten_prompt\n\n\ndef generate_final_response(rewritten_prompt):\n    final_response = client.create(\n        model=\"gpt-4o\",\n        response_model=Step2,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"{rewritten_prompt.relevant_context}\n                    Question: {rewritten_prompt.user_query}\"\"\",\n            }\n        ],\n    )\n    return final_response\n\n\nif __name__ == \"__main__\":\n    query = \"\"\"Mary has 3 times as much candy as Megan.\n        Mary then adds 10 more pieces of candy to her collection.\n        Max is 5 years older than Mary.\n        If Megan has 5 pieces of candy, how many does Mary have in total?\n        \"\"\"\n\n    # Step 1: Rewrite the prompt\n    rewritten_prompt = rewrite_prompt(query)\n    print(rewritten_prompt.relevant_context)\n    \"\"\"\n    Mary has 3 times as much candy as Megan. Mary then adds 10 more pieces of candy to her collection. If Megan has 5 pieces of candy, how many does Mary have in total?\n    \"\"\"\n    print(rewritten_prompt.user_query)\n    #> how many does Mary have in total?\n\n    # Step 2: Generate the final response\n    final_response = generate_final_response(rewritten_prompt)\n    print(final_response.answer)\n    #> 25\n```\n\n1. This prompt template comes from [this](https://arxiv.org/abs/2311.11829) paper.\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [System 2 Attention (is something you might need too)](https://arxiv.org/abs/2311.11829)"
  },
  {
    "path": "docs/prompting/zero_shot/self_ask.md",
    "content": "---\ntitle: \"Self-Ask\"\ndescription: \"Self-Ask is a technique which use a single prompt to encourage a model to use the answers to sub-problems to correctly generate the overall solution.\"\n---\n\nModels can sometimes correctly answer sub-problems but incorrectly answer the overall query. This is known as the *compositionality gap*<sup><a href=\"https://arxiv.org/abs/2210.03350\">1</a></sup>.\n\nHow can we encourage a model to use the answers to sub-problems to correctly generate the overall solution?\n\nSelf-Ask is a technique which use a single prompt to:\n\n - decide if follow-up questions are required\n - generate the follow-up questions\n - answer the follow-up questions\n - answer the main query\n\n## Implementation\n\n```python hl_lines=\"26-29\"\nimport instructor\nfrom pydantic import BaseModel, Field\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass FollowUp(BaseModel):\n    question: str = Field(description=\"The follow-up question\")\n    answer: str = Field(description=\"The answer to the follow-up question\")\n\n\nclass Response(BaseModel):\n    follow_ups_required: bool\n    follow_ups: list[FollowUp]\n    final_answer: str\n\n\ndef self_ask(query):\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"Query: {query}\n                        Are follow-up questions needed?\n                        If so, generate follow-up questions, their answers, and then the final answer to the query.\n                        \"\"\",  # !(1)\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    query = \"Who was president of the U.S. when superconductivity was discovered?\"\n\n    response = self_ask(query)\n\n    print(response.follow_ups_required)\n    #> True\n    for follow_up in response.follow_ups:\n        print(follow_up)\n        \"\"\"\n        question='When was superconductivity discovered?' answer='Superconductivity was discovered in April 1911.'\n        \"\"\"\n        \"\"\"\n        question='Who was president of the U.S. in April 1911?' answer='William Howard Taft was the President of the United States in April 1911.'\n        \"\"\"\n    print(response.final_answer)\n    \"\"\"\n    William Howard Taft was president of the U.S. when superconductivity was discovered.\n    \"\"\"\n```\n\n1. Without `instructor`, this prompt would generally be implemented as a one-shot or few-shot prompt<sup><a href=\"https://arxiv.org/abs/2210.03350\">1</a></sup> to encourage thinking through follow-up questions. With `instructor`, we use a zero-shot prompt!\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Measuring and Narrowing the Compositionality Gap in Language Models](https://arxiv.org/abs/2210.03350)\n"
  },
  {
    "path": "docs/prompting/zero_shot/simtom.md",
    "content": "---\ntitle: \"SimToM (Simulated Theory of Mind)\"\ndescription: \"SimToM (Simulated Theory of Mind) is a two-step prompting technique that encourages a model to consider a specific perspective.\"\n---\n\nHow can we encourage the model to focus on relevant information?\n\nSimToM (Simulated Theory of Mind) is a two-step prompting technique that encourages a model to consider a specific perspective.\n\nThis can be useful for complex questions with multiple entities. For example, if the prompt contains information about two individuals, we can ask the model to answer our query from the perspective of one of the individuals.\n\nThis is implemented in two steps. Given an entity:\n\n1. Identify and isolate information relevant to the entity\n2. Ask the model to answer the query from the entity's perspective\n\n!!! example \"Sample Template\"\n\n    **Step 1**: Given the following context, list the facts that <*entity*> would know. Context: <*context*>\n\n    **Step 2**: You are <*entity*>. Answer the following question based only on these facts you know: <*facts*>. Question: <*query*>\n\n## Implementation\n\n```python hl_lines=\"24-25\"\nimport openai\nimport instructor\nfrom pydantic import BaseModel, Field\nfrom typing import Iterable\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\nclass KnownFact(BaseModel):\n    fact: str = Field(description=\"A fact that the given entity would know\")\n\n\nclass Response(BaseModel):\n    location: str\n\n\ndef generate_known_facts(entity, context, query) -> Iterable[KnownFact]:\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Iterable[KnownFact],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"Given the following context, list\n                the facts that {entity} would know:\n\n                Context:\n                {context}\n                {query}\n\n                List only the facts relevant to {entity}.\n                \"\"\",\n            }\n        ],\n    )\n\n\ndef answer_question_based_on_facts(entity, query, known_facts) -> Response:\n    return client.create(\n        model=\"gpt-4o\",\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"You are {entity}. Answer the following question\n                based only on these facts you know:\n                {\" \".join([str(fact) for fact in known_facts])}\"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Question: {query}\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    entity = \"Alice\"\n    context = \"\"\"Alice puts the book on the table.\n        Alice leaves the room.\n        Bob moves the book to the shelf.\n        \"\"\"\n    query = f\"Where does {entity} think the book is?\"\n\n    known_facts = generate_known_facts(entity, context, query)\n    response = answer_question_based_on_facts(entity, query, known_facts)\n\n    for fact in known_facts:\n        print(fact)\n        #> fact='Alice puts the book on the table.'\n        #> fact='Alice leaves the room. Bob moves the book to the shelf.'\n    print(response.location)\n    #> On the table\n```\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities](https://arxiv.org/abs/2311.10227)\n"
  },
  {
    "path": "docs/prompting/zero_shot/style_prompting.md",
    "content": "---\ntitle: \"Style Prompting\"\ndescription: \"To contrain a model's response to fit the boundaries of our task, we can specify a style.\"\n---\n\nHow can we constrain model outputs through prompting alone?\n\nTo contrain a model's response to fit the boundaries of our task, we can specify a style.\n\nStylistic constraints can include:\n\n - **writing style**: write a *flowery* poem\n - **tone**: write a *dramatic* poem\n - **mood**: write a *happy* poem\n - **genre**: write a *mystery* poem\n\n## Implementation\n\n```python hl_lines=\"22\"\nimport instructor\nfrom pydantic import BaseModel\nimport openai\n\n\nclass Email(BaseModel):\n    subject: str\n    message: str\n\n\nclient = instructor.from_provider(\"openai/gpt-5-nano\")\n\n\ndef generate_email(subject, to, sender, tone):\n    return client.create(\n        model=\"gpt-4o\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"\n                Write an email about {subject} to {to} from {sender}.\n                The email should be {tone}.\n                \"\"\",\n            }\n        ],\n        response_model=Email,\n    )\n\n\nif __name__ == \"__main__\":\n    email = generate_email(\n        subject=\"invitation to all-hands on Monday at 6pm\",\n        to=\"John Smith\",\n        sender=\"Jane Doe\",\n        tone=\"formal\",\n    )\n\n    print(email.subject)\n    #> Invitation to All-Hands Meeting\n    print(email.message)\n    \"\"\"\n    Dear Mr. Smith,\n\n    I hope this message finds you well. I am writing to formally invite you to our upcoming all-hands meeting scheduled for Monday at 6:00 PM. This meeting is an important opportunity for us to come together, discuss key updates, and align on our strategic goals.\n\n    Please confirm your availability at your earliest convenience. Your presence and contributions to the discussion would be greatly valued.\n\n    Thank you and I look forward to your confirmation.\n\n    Warm regards,\n\n    Jane Doe\n    \"\"\"\n```\n\n## Stylistic Constraint Examples\n\n| Constraint     | Possible Phrases                                                                  |\n|----------------|-----------------------------------------------------------------------------------|\n| Writing Style  | Functional, Flowery, Candid, Prosaic, Ornate, Poetic                              |\n| Tone           | Dramatic, Humorous, Optimistic, Sad, Formal, Informal                             |\n| Mood           | Angry, Fearful, Happy, Sad                                                        |\n| Genre          | Historical Fiction, Literary Fiction, Science Fiction, Mystery, Dystopian, Horror |\n\n!!! info \"More Stylistic Constraints\"\n\n    To see even more examples of these stylistic constraints and additional constraints (**characterization**, **pacing**, and **plot**), check out [this](https://arxiv.org/abs/2302.09185) paper.\n\n## References\n\n<sup id=\"ref-1\">1</sup>: [Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints](https://arxiv.org/abs/2302.09185)\n\n"
  },
  {
    "path": "docs/repository-overview.md",
    "content": "---\ntitle: Repository Overview\ndescription: Learn the structure of the Instructor repository and the purpose of each major directory.\n---\n\n# Repository Overview\n\nThis page explains the layout of the Instructor codebase and what each key directory contains.\n\n## Directory Summary\n\n### `instructor/`\nCore library with clients, adapters, and utilities for structured outputs.\n\n### `cli/`\nCommand-line interface code used for tasks like job management and usage tracking.\n\n### `docs/`\nDocumentation source files for the website built with MkDocs.\n\n### `examples/`\nPractical examples and cookbooks demonstrating how to use Instructor.\n\n### `tests/`\nTest suite and evaluation scripts ensuring the library functions correctly.\n\n"
  },
  {
    "path": "docs/start-here.md",
    "content": "---\ntitle: Start Here - Instructor for Beginners\ndescription: A beginner-friendly introduction to using Instructor for structured outputs from LLMs\n---\n\n# Start Here: Instructor for Beginners\n\nWelcome! This guide will help you understand what Instructor does and how to start using it in your projects, even if you're new to working with language models.\n\n## What is Instructor?\n\nInstructor is a Python library that helps you get structured, predictable data from language models like GPT-4 and Claude. It's like giving the LLM a form to fill out instead of letting it respond however it wants.\n\n### Where Instructor Fits\n\nHere's how Instructor fits into your application:\n\n```mermaid\nflowchart LR\n    A[Your Application] --> B[Instructor]\n    B --> C[LLM Provider]\n    C --> B\n    B --> A\n\n    style B fill:#e2f0fb,stroke:#b8daff,color:#004085\n```\n\n### The Problem Instructor Solves\n\nWithout Instructor, getting structured data from LLMs can be challenging:\n\n1. **Unpredictable outputs**: LLMs might format responses differently each time\n2. **Format errors**: Getting JSON or specific data structures can be error-prone\n3. **Validation headaches**: Checking if the response matches what you need\n\nInstructor solves these problems by:\n\n1. Defining exactly what data you want using Python classes\n2. Making sure the LLM returns data in that structure\n3. Validating the output and automatically fixing issues\n\n## A Simple Example\n\nLet's see Instructor in action with a basic example:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\n# Define the structure you want\nclass Person(BaseModel):\n    name: str\n    age: int\n    city: str\n\n# Connect to the LLM with Instructor\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\n# Extract structured data\nperson = client.create(\n    response_model=Person,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract: John is 30 years old and lives in New York.\"}\n    ]\n)\n\n# Now you have a structured object\nprint(f\"Name: {person.name}\")  # Name: John\nprint(f\"Age: {person.age}\")    # Age: 30\nprint(f\"City: {person.city}\")  # City: New York\n```\n\nThat's it! Instructor handled all the complexity of getting the LLM to format the data correctly.\n\n**Ready to get started?** [Follow our step-by-step guide →](./getting-started.md)\n\n## Key Concepts\n\nHere are the main concepts you need to know:\n\n### 1. Response Models\n\nResponse models define the structure you want the LLM to return. They are built using Pydantic, which is a data validation library.\n\n```python\nfrom pydantic import BaseModel, Field\n\nclass User(BaseModel):\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n    # The descriptions help the LLM understand what to extract\n```\n\n### 2. Client Setup\n\nThe `from_provider` function connects Instructor to your LLM provider. It automatically handles provider-specific configurations:\n\n```python\n# For OpenAI\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\n# For Anthropic\nclient = instructor.from_provider(\"anthropic/claude-3-5-haiku-latest\")\n\n# For Google Gemini\nclient = instructor.from_provider(\"google/gemini-3-flash\")\n```\n\n### 3. Modes\n\nModes control how Instructor gets structured data from the LLM. Different providers support different modes, and Instructor automatically selects the best one. You can also specify a mode manually if needed.\n\n[Learn more about client setup →](./concepts/from_provider.md)\n\n## Common Use Cases\n\nHere are some popular ways people use Instructor:\n\n1. **Data extraction**: Pull structured information from text documents\n2. **Form filling**: Convert free-text into form fields\n3. **Classification**: Sort content into predefined categories\n4. **Content generation**: Create structured content like articles or product descriptions\n5. **API integration**: Format LLM outputs to match API requirements\n\n## Next Steps\n\nNow that you understand the basics, here are some suggested next steps:\n\n1. **Try the [Getting Started Guide](getting-started.md)** for a more in-depth tutorial\n2. **Explore the [Cookbook Examples](examples/index.md)** for practical use cases\n3. **Learn about [Validation](concepts/validation.md)** to ensure data quality\n4. **Check out [Streaming](concepts/partial.md)** for handling large responses\n5. **Understand [Providers](integrations/index.md)** to use different LLM services\n\n## Common Questions\n\n### Do I need to understand Pydantic?\n\nWhile knowing Pydantic helps, you don't need to be an expert. The basic patterns shown above will get you started. You can learn more advanced features as you need them.\n\n### Which LLM provider should I use?\n\nOpenAI is the most popular choice for beginners because of its reliability and wide support. As you grow more comfortable, you can explore other providers like Anthropic Claude, Gemini, or open-source models.\n\n### Is Instructor hard to learn?\n\nNo! If you're familiar with Python classes and working with APIs, you'll find Instructor straightforward. The core concepts are simple, and you can gradually explore advanced features.\n\n### How does Instructor compare to other libraries?\n\nInstructor focuses specifically on structured outputs with a simple, clean API. Unlike larger frameworks that try to do everything, Instructor does one thing very well: getting structured data from LLMs.\n\n## Getting Help\n\nIf you get stuck:\n\n- Check the [FAQ](faq.md) for common questions\n- Browse the [Examples](examples/index.md) for similar use cases\n- Join our [Discord community](https://discord.gg/bD9YE9JArw) for real-time help\n- Look for related topics in the [Concepts](concepts/index.md) section\n\nWelcome aboard, and happy extracting!\n"
  },
  {
    "path": "docs/templates/provider_template.md",
    "content": "---\ntitle: [Provider Name]\ndescription: Guide to using instructor with [Provider Name]\n---\n\n# Structured outputs with [Provider Name], a complete guide w/ instructor\n\n[Brief introduction to the provider, what models they offer, and why someone would use them]\n\n## Quick Start\n\nFirst, install the required packages:\n\n```bash\npip install \"instructor[provider-specific-extras]\"\n```\n\nYou'll need to set up authentication:\n\n```bash\nexport PROVIDER_API_KEY=your_api_key_here\n# Add any other environment variables needed\n```\n\n## Basic Example\n\nHere's how to extract structured data using [Provider Name]:\n\n```python\n# Standard library imports\nimport os\nfrom typing import Optional\n\n# Third-party imports\nimport instructor\nfrom provider_sdk import ClientClass\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"PROVIDER_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Initialize the client with explicit mode\nclient = instructor.from_provider(\n    ClientClass(\n        api_key=os.environ.get(\"PROVIDER_API_KEY\", \"your_api_key_here\"),\n        # Other configuration options\n    ),\n    mode=instructor.Mode.PROVIDER_SPECIFIC_MODE,\n)\n\n# Define your data structure with proper annotations\nclass UserExtract(BaseModel):\n    \"\"\"Model for extracting user information from text.\"\"\"\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n\n# Extract structured data\ntry:\n    user = client.create(\n        model=\"provider-model-name\",  # Use latest stable model version\n        response_model=UserExtract,\n        messages=[\n            {\"role\": \"system\", \"content\": \"Extract structured user information from the text.\"},\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\n\n    print(user.model_dump_json(indent=2))\n    # Expected output:\n    # {\n    #   \"name\": \"Jason\",\n    #   \"age\": 25\n    # }\nexcept Exception as e:\n    print(f\"Error: {e}\")\n```\n\n## Async Example\n\nFor asynchronous use cases:\n\n```python\n# Standard library imports\nimport os\nimport asyncio\nfrom typing import Optional\n\n# Third-party imports\nimport instructor\nfrom provider_sdk import AsyncClientClass\nfrom pydantic import BaseModel, Field\n\n# Set up environment (typically handled before script execution)\n# os.environ[\"PROVIDER_API_KEY\"] = \"your-api-key\"  # Uncomment and replace with your API key if not set\n\n# Define your data structure with proper annotations\nclass UserExtract(BaseModel):\n    \"\"\"Model for extracting user information from text.\"\"\"\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n\n# Initialize the async client with explicit mode\nclient = instructor.from_provider(\n    AsyncClientClass(\n        api_key=os.environ.get(\"PROVIDER_API_KEY\", \"your_api_key_here\"),\n    ),\n    mode=instructor.Mode.PROVIDER_SPECIFIC_MODE,\n)\n\nasync def extract_data(text: str) -> UserExtract:\n    \"\"\"\n    Asynchronously extract structured data from text.\n\n    Args:\n        text: The input text to extract from\n\n    Returns:\n        A structured UserExtract object\n    \"\"\"\n    try:\n        user = await client.create(\n            model=\"provider-model-name\",  # Use latest stable model version\n            response_model=UserExtract,\n            messages=[\n                {\"role\": \"system\", \"content\": \"Extract structured user information from the text.\"},\n                {\"role\": \"user\", \"content\": text},\n            ],\n        )\n        return user\n    except Exception as e:\n        print(f\"Error during extraction: {e}\")\n        raise\n\n# Example usage\nasync def main():\n    result = await extract_data(\"Extract jason is 25 years old\")\n    print(result.model_dump_json(indent=2))\n\n# Run the async function\nif __name__ == \"__main__\":\n    asyncio.run(main())\n\n# Expected output:\n# {\n#   \"name\": \"Jason\",\n#   \"age\": 25\n# }\n```\n\n## Supported Modes\n\n[Provider Name] supports the following instructor modes:\n\n- `Mode.MODE_1` - Description of when to use this mode\n- `Mode.MODE_2` - Description of when to use this mode\n- [Additional modes as needed]\n\n## Streaming Support\n\nYou can stream results with [Provider Name]:\n\n```python\n# Streaming partial results example code\n```\n\n## Provider-Specific Features\n\n[Describe any special features or considerations specific to this provider]\n\n## Models\n\n[Provider Name] offers the following models:\n\n- `model-1` - Description of capabilities\n- `model-2` - Description of capabilities\n- [More models as appropriate]"
  },
  {
    "path": "docs/tutorials/1-introduction.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Working with structured outputs\\n\",\n    \"\\n\",\n    \"If you've seen my [talk](https://www.youtube.com/watch?v=yj-wSRJwrrc&t=1s) on this topic, you can skip this chapter.\\n\",\n    \"\\n\",\n    \"tl;dr\\n\",\n    \"\\n\",\n    \"When we work with LLMs you find that many times we are not building chatbots, instead we're working with structured outputs in order to solve a problem by returning machine readable data. However the way we think about the problem is still very much influenced by the way we think about chatbots. This is a problem because it leads to a lot of confusion and frustration. In this chapter we'll try to understand why this happens and how we can fix it.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import traceback\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"RED = \\\"\\\\033[91m\\\"\\n\",\n    \"RESET = \\\"\\\\033[0m\\\"\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## The fundamental problem with JSON and Dictionaries\\n\",\n    \"\\n\",\n    \"Lets say we have a simple JSON object, and we want to work with it. We can use the `json` module to load it into a dictionary, and then work with it. However, this is a bit of a pain, because we have to manually check the types of the data, and we have to manually check if the data is valid. For example, lets say we have a JSON object that looks like this:\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"data = [{\\\"first_name\\\": \\\"Jason\\\", \\\"age\\\": 10}, {\\\"firstName\\\": \\\"Jason\\\", \\\"age\\\": \\\"10\\\"}]\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"We have a `name` field, which is a string, and an `age` field, which is an integer. However, if we were to load this into a dictionary, we would have no way of knowing if the data is valid. For example, we could have a string for the age, or we could have a float for the age. We could also have a string for the name, or we could have a list for the name.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Jason is 10\\n\",\n      \"None is 10\\n\",\n      \"Next year Jason will be 11 years old\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Traceback (most recent call last):\\n\",\n      \"  File \\\"/var/folders/l2/jjqj299126j0gycr9kkkt9xm0000gn/T/ipykernel_24047/2607506000.py\\\", line 10, in <module>\\n\",\n      \"    age_next_year = age + 1\\n\",\n      \"                    ~~~~^~~\\n\",\n      \"TypeError: can only concatenate str (not \\\"int\\\") to str\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"for obj in data:\\n\",\n    \"    name = obj.get(\\\"first_name\\\")\\n\",\n    \"    age = obj.get(\\\"age\\\")\\n\",\n    \"    print(f\\\"{name} is {age}\\\")\\n\",\n    \"\\n\",\n    \"for obj in data:\\n\",\n    \"    name = obj.get(\\\"first_name\\\")\\n\",\n    \"    age = obj.get(\\\"age\\\")\\n\",\n    \"    try:\\n\",\n    \"        age_next_year = age + 1\\n\",\n    \"        print(f\\\"Next year {name} will be {age_next_year} years old\\\")\\n\",\n    \"    except TypeError:\\n\",\n    \"        traceback.print_exc()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"You see that while we were able to program with a dictionary, we had issues with the data being valid. We would have had to manually check the types of the data, and we had to manually check if the data was valid. This is a pain, and we can do better.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Pydantic to the rescue\\n\",\n    \"\\n\",\n    \"Pydantic is a library that allows us to define data structures, and then validate them.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"Person(name='Sam', age=30)\"\n      ]\n     },\n     \"execution_count\": 5,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"from pydantic import BaseModel, Field, ValidationError\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Person(BaseModel):\\n\",\n    \"    name: str\\n\",\n    \"    age: int\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"person = Person(name=\\\"Sam\\\", age=30)\\n\",\n    \"person\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"Person(name='Sam', age=30)\"\n      ]\n     },\n     \"execution_count\": 6,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"# Data is correctly casted to the right type\\n\",\n    \"person = Person.model_validate({\\\"name\\\": \\\"Sam\\\", \\\"age\\\": \\\"30\\\"})\\n\",\n    \"person\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Traceback (most recent call last):\\n\",\n      \"  File \\\"/var/folders/l2/jjqj299126j0gycr9kkkt9xm0000gn/T/ipykernel_24047/3040264600.py\\\", line 5, in <module>\\n\",\n      \"    assert person.age == 20\\n\",\n      \"           ^^^^^^^^^^^^^^^^\\n\",\n      \"AssertionError\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"assert person.name == \\\"Sam\\\"\\n\",\n    \"assert person.age == 30\\n\",\n    \"\\n\",\n    \"try:\\n\",\n    \"    assert person.age == 20\\n\",\n    \"except AssertionError:\\n\",\n    \"    traceback.print_exc()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Validation Error:\\n\",\n      \"Field: name, Error: Field required\\n\",\n      \"Field: age, Error: Input should be a valid integer, unable to parse string as an integer\\n\",\n      \"\\u001b[91m\\n\",\n      \"Original Traceback Below\\u001b[0m\\n\"\n     ]\n    },\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Traceback (most recent call last):\\n\",\n      \"  File \\\"/var/folders/l2/jjqj299126j0gycr9kkkt9xm0000gn/T/ipykernel_24047/621989455.py\\\", line 3, in <module>\\n\",\n      \"    person = Person.model_validate({\\\"first_name\\\": \\\"Sam\\\", \\\"age\\\": \\\"30.2\\\"})\\n\",\n      \"             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n\",\n      \"  File \\\"/opt/homebrew/Caskroom/miniconda/base/envs/instructor/lib/python3.11/site-packages/pydantic/main.py\\\", line 509, in model_validate\\n\",\n      \"    return cls.__pydantic_validator__.validate_python(\\n\",\n      \"           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n\",\n      \"pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person\\n\",\n      \"name\\n\",\n      \"  Field required [type=missing, input_value={'first_name': 'Sam', 'age': '30.2'}, input_type=dict]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.6/v/missing\\n\",\n      \"age\\n\",\n      \"  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='30.2', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.6/v/int_parsing\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"# Data is validated to get better error messages\\n\",\n    \"try:\\n\",\n    \"    person = Person.model_validate({\\\"first_name\\\": \\\"Sam\\\", \\\"age\\\": \\\"30.2\\\"})\\n\",\n    \"except ValidationError as e:\\n\",\n    \"    print(\\\"Validation Error:\\\")\\n\",\n    \"    for error in e.errors():\\n\",\n    \"        print(f\\\"Field: {error['loc'][0]}, Error: {error['msg']}\\\")\\n\",\n    \"\\n\",\n    \"    print(f\\\"{RED}\\\\nOriginal Traceback Below{RESET}\\\")\\n\",\n    \"    traceback.print_exc()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"By introducing pydantic into any python codebase you can get a lot of benefits. You can get type checking, you can get validation, and you can get autocomplete. This is a huge win, because it means you can catch errors before they happen. This is even more useful when we rely on language models to generate data for us.\\n\",\n    \"\\n\",\n    \"You can also define validators that are run on the data. This is useful because it means you can catch errors before they happen. For example, you can define a validator that checks if the age is greater than 0. This is useful because it means you can catch errors before they happen.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Fundamental problem with asking for JSON from OpenAI\\n\",\n    \"\\n\",\n    \"As we shall see below, the correct json format would be something of the format below:\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"{\\n\",\n    \"    \\\"name\\\": \\\"Jason\\\",\\n\",\n    \"    \\\"age\\\": 10\\n\",\n    \"}\\n\",\n    \"```\\n\",\n    \"\\n\",\n    \"However, we get errorenous outputs like:\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"{\\n\",\n    \"  \\\"jason\\\": 10\\n\",\n    \"}\\n\",\n    \"```\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"json that we want:\\n\",\n      \"\\n\",\n      \"{\\n\",\n      \"    \\\"name\\\": \\\"Jason\\\",\\n\",\n      \"    \\\"age\\\": 10\\n\",\n      \"}\\n\",\n      \"\\n\",\n      \"error!!\\n\",\n      \"{\\n\",\n      \"  \\\"jason\\\": 10\\n\",\n      \"}\\n\",\n      \"correctly parsed person=Person(name='Jason', age=10)\\n\",\n      \"correctly parsed person=Person(name='jason', age=10)\\n\",\n      \"error!!\\n\",\n      \"{\\n\",\n      \"  \\\"Jason\\\": {\\n\",\n      \"    \\\"age\\\": 10\\n\",\n      \"  }\\n\",\n      \"}\\n\",\n      \"error!!\\n\",\n      \"{\\n\",\n      \"  \\\"Jason\\\": {\\n\",\n      \"    \\\"age\\\": 10\\n\",\n      \"  }\\n\",\n      \"}\\n\",\n      \"error!!\\n\",\n      \"{\\n\",\n      \"  \\\"Jason\\\": {\\n\",\n      \"    \\\"age\\\": 10\\n\",\n      \"  }\\n\",\n      \"}\\n\",\n      \"error!!\\n\",\n      \"{\\n\",\n      \"  \\\"Jason\\\": {\\n\",\n      \"    \\\"age\\\": 10\\n\",\n      \"  }\\n\",\n      \"}\\n\",\n      \"correctly parsed person=Person(name='Jason', age=10)\\n\",\n      \"correctly parsed person=Person(name='Jason', age=10)\\n\",\n      \"error!!\\n\",\n      \"{\\n\",\n      \"  \\\"jason\\\": 10\\n\",\n      \"}\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from openai import OpenAI\\n\",\n    \"\\n\",\n    \"client = OpenAI()\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n    \"    messages=[\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"user\\\",\\n\",\n    \"            \\\"content\\\": \\\"Please give me jason is 10 as a json object ```json\\\\n\\\",\\n\",\n    \"        },\\n\",\n    \"    ],\\n\",\n    \"    n=10,\\n\",\n    \"    temperature=1,\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"print(\\\"json that we want:\\\")\\n\",\n    \"print(\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"{\\n\",\n    \"    \\\"name\\\": \\\"Jason\\\",\\n\",\n    \"    \\\"age\\\": 10\\n\",\n    \"}\\n\",\n    \"\\\"\\\"\\\"\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"for choice in resp.choices:\\n\",\n    \"    json = choice.message.content\\n\",\n    \"    try:\\n\",\n    \"        person = Person.model_validate_json(json)\\n\",\n    \"        print(f\\\"correctly parsed {person=}\\\")\\n\",\n    \"    except Exception as e:\\n\",\n    \"        print(\\\"error!!\\\")\\n\",\n    \"        print(json)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Introduction to Function Calling\\n\",\n    \"\\n\",\n    \"The json could be anything! We could add more and more into a prompt and hope it works, or we can use something called [function calling](https://platform.openai.com/docs/guides/function-calling) to directly specify the schema we want.\\n\",\n    \"\\n\",\n    \"**Function Calling**\\n\",\n    \"\\n\",\n    \"In an API call, you can describe _functions_ and have the model intelligently\\n\",\n    \"choose to output a _JSON object_ containing _arguments_ to call one or many\\n\",\n    \"functions. The Chat Completions API does **not** call the function; instead, the\\n\",\n    \"model generates _JSON_ that you can use to call the function in **your code**.\\n\",\n    \"\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"PersonBirthday(name='Jason Liu', age=30, birthday=datetime.date(1994, 3, 26))\"\n      ]\n     },\n     \"execution_count\": 10,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"import datetime\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class PersonBirthday(BaseModel):\\n\",\n    \"    name: str\\n\",\n    \"    age: int\\n\",\n    \"    birthday: datetime.date\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"schema = {\\n\",\n    \"    \\\"properties\\\": {\\n\",\n    \"        \\\"name\\\": {\\\"type\\\": \\\"string\\\"},\\n\",\n    \"        \\\"age\\\": {\\\"type\\\": \\\"integer\\\"},\\n\",\n    \"        \\\"birthday\\\": {\\\"type\\\": \\\"string\\\", \\\"format\\\": \\\"YYYY-MM-DD\\\"},\\n\",\n    \"    },\\n\",\n    \"    \\\"required\\\": [\\\"name\\\", \\\"age\\\"],\\n\",\n    \"    \\\"type\\\": \\\"object\\\",\\n\",\n    \"}\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n    \"    messages=[\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"user\\\",\\n\",\n    \"            \\\"content\\\": f\\\"Extract `Jason Liu is thirty years old his birthday is yesterday` into json today is {datetime.date.today()}\\\",\\n\",\n    \"        },\\n\",\n    \"    ],\\n\",\n    \"    functions=[{\\\"name\\\": \\\"Person\\\", \\\"parameters\\\": schema}],\\n\",\n    \"    function_call=\\\"auto\\\",\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"PersonBirthday.model_validate_json(resp.choices[0].message.function_call.arguments)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"But it turns out, pydantic actually not only does our serialization, we can define the schema as well as add additional documentation!\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'properties': {'name': {'title': 'Name', 'type': 'string'},\\n\",\n       \"  'age': {'title': 'Age', 'type': 'integer'},\\n\",\n       \"  'birthday': {'format': 'date', 'title': 'Birthday', 'type': 'string'}},\\n\",\n       \" 'required': ['name', 'age', 'birthday'],\\n\",\n       \" 'title': 'PersonBirthday',\\n\",\n       \" 'type': 'object'}\"\n      ]\n     },\n     \"execution_count\": 11,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"PersonBirthday.model_json_schema()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"We can even define nested complex schemas, and documentation with ease.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'$defs': {'Address': {'properties': {'address': {'description': 'Full street address',\\n\",\n       \"     'title': 'Address',\\n\",\n       \"     'type': 'string'},\\n\",\n       \"    'city': {'title': 'City', 'type': 'string'},\\n\",\n       \"    'state': {'title': 'State', 'type': 'string'}},\\n\",\n       \"   'required': ['address', 'city', 'state'],\\n\",\n       \"   'title': 'Address',\\n\",\n       \"   'type': 'object'}},\\n\",\n       \" 'description': 'A Person with an address',\\n\",\n       \" 'properties': {'name': {'title': 'Name', 'type': 'string'},\\n\",\n       \"  'age': {'title': 'Age', 'type': 'integer'},\\n\",\n       \"  'address': {'$ref': '#/$defs/Address'}},\\n\",\n       \" 'required': ['name', 'age', 'address'],\\n\",\n       \" 'title': 'PersonAddress',\\n\",\n       \" 'type': 'object'}\"\n      ]\n     },\n     \"execution_count\": 12,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"class Address(BaseModel):\\n\",\n    \"    address: str = Field(description=\\\"Full street address\\\")\\n\",\n    \"    city: str\\n\",\n    \"    state: str\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class PersonAddress(Person):\\n\",\n    \"    \\\"\\\"\\\"A Person with an address\\\"\\\"\\\"\\n\",\n    \"\\n\",\n    \"    address: Address\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"PersonAddress.model_json_schema()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"These simple concepts become what we built into `instructor` and most of the work has been around documenting how we can leverage schema engineering.\\n\",\n    \"Except now we use `instructor.patch()` to add a bunch more capabilities to the OpenAI SDK.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# The core idea around Instructor\\n\",\n    \"\\n\",\n    \"1. Using function calling allows us use a llm that is finetuned to use json_schema and output json.\\n\",\n    \"2. Pydantic can be used to define the object, schema, and validation in one single class, allow us to encapsulate everything neatly\\n\",\n    \"3. As a library with 100M downloads, we can leverage pydantic to do all the heavy lifting for us and fit nicely with the python ecosystem\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 13,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"PersonAddress(name='Jason Liu', age=30, address=Address(address='123 Main St', city='San Francisco', state='CA'))\"\n      ]\n     },\n     \"execution_count\": 13,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"import instructor\\n\",\n    \"import datetime\\n\",\n    \"\\n\",\n    \"# patch the client to add `response_model` to the `create` method\\n\",\n    \"client = instructor.patch(OpenAI(), mode=instructor.Mode.MD_JSON)\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-3.5-turbo-1106\\\",\\n\",\n    \"    messages=[\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"user\\\",\\n\",\n    \"            \\\"content\\\": f\\\"\\\"\\\"\\n\",\n    \"            Today is {datetime.date.today()}\\n\",\n    \"\\n\",\n    \"            Extract `Jason Liu is thirty years old his birthday is yesterday`\\n\",\n    \"            he lives at 123 Main St, San Francisco, CA\\\"\\\"\\\",\\n\",\n    \"        },\\n\",\n    \"    ],\\n\",\n    \"    response_model=PersonAddress,\\n\",\n    \")\\n\",\n    \"resp\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"By defining `response_model` we can leverage pydantic to do all the heavy lifting. Later we'll introduce the other features that `instructor.patch()` adds to the OpenAI SDK.\\n\",\n    \"but for now, this small change allows us to do a lot more with the API.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Is instructor the only way to do this?\\n\",\n    \"\\n\",\n    \"No. Libraries like Marvin, Langchain, and Llamaindex all now leverage the Pydantic object in similar ways. The goal is to be as light weight as possible, get you as close as possible to the openai api, and then get out of your way.\\n\",\n    \"\\n\",\n    \"More importantly, we've also added straight forward validation and reasking to the mix.\\n\",\n    \"\\n\",\n    \"The goal of instructor is to show you how to think about structured prompting and provide examples and documentation that you can take with you to any framework.\\n\",\n    \"\\n\",\n    \"For further exploration:\\n\",\n    \"\\n\",\n    \"- [Marvin](https://www.askmarvin.ai/)\\n\",\n    \"- [Langchain](https://python.langchain.com/docs/modules/model_io/output_parsers/pydantic)\\n\",\n    \"- [LlamaIndex](https://gpt-index.readthedocs.io/en/latest/examples/output_parsing/openai_pydantic_program.html)\\n\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.8\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 4\n}\n"
  },
  {
    "path": "docs/tutorials/2-tips.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"8bb7d0d0-2b7f-4e9e-8565-467dc5c6fd22\",\n   \"metadata\": {},\n   \"source\": [\n    \"# General Tips on Prompting\\n\",\n    \"\\n\",\n    \"Before we get into some big applications of schema engineering I want to equip you with the tools for success.\\n\",\n    \"This notebook is to share some general advice when using prompts to get the most of your models.\\n\",\n    \"\\n\",\n    \"Before you might think of prompt engineering as massaging this wall of text, almost like coding in a notepad. But with schema engineering you can get a lot more out of your prompts with a lot less work.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"8a785c25-b08d-4ab4-bbd7-22e3b090c2ed\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Classification\\n\",\n    \"\\n\",\n    \"For classification we've found theres generally two methods of modeling.\\n\",\n    \"\\n\",\n    \"1. using Enums\\n\",\n    \"2. using Literals\\n\",\n    \"\\n\",\n    \"Use an enum in Python when you need a set of named constants that are related and you want to ensure type safety, readability, and prevent invalid values. Enums are helpful for grouping and iterating over these constants.\\n\",\n    \"\\n\",\n    \"Use literals when you have a small, unchanging set of values that you don't need to group or iterate over, and when type safety and preventing invalid values is less of a concern. Literals are simpler and more direct for basic, one-off values.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"id\": \"fdf5e1d9-31ad-4e8a-a55e-e2e70fff598d\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'age': 17, 'name': 'Harry Potter', 'house': <House.Gryffindor: 'gryffindor'>}\"\n      ]\n     },\n     \"execution_count\": 1,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"import instructor\\n\",\n    \"from openai import OpenAI\\n\",\n    \"\\n\",\n    \"from enum import Enum\\n\",\n    \"from pydantic import BaseModel, Field\\n\",\n    \"from typing_extensions import Literal\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"# Tip: Do not use auto() as they cast to 1,2,3,4\\n\",\n    \"class House(Enum):\\n\",\n    \"    Gryffindor = \\\"gryffindor\\\"\\n\",\n    \"    Hufflepuff = \\\"hufflepuff\\\"\\n\",\n    \"    Ravenclaw = \\\"ravenclaw\\\"\\n\",\n    \"    Slytherin = \\\"slytherin\\\"\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"    house: House\\n\",\n    \"\\n\",\n    \"    def say_hello(self):\\n\",\n    \"        print(\\n\",\n    \"            f\\\"Hello, I'm {self.name}, I'm {self.age} years old and I'm from {self.house.value.title()}\\\"\\n\",\n    \"        )\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Harry Potter\\\"}],\\n\",\n    \"    response_model=Character,\\n\",\n    \")\\n\",\n    \"resp.model_dump()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"id\": \"c609eb44\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Hello, I'm Harry Potter, I'm 17 years old and I'm from Gryffindor\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"resp.say_hello()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"id\": \"03db160c-81e9-4373-bfec-7a107224b6dd\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'age': 11, 'name': 'Harry Potter', 'house': 'Gryffindor'}\"\n      ]\n     },\n     \"execution_count\": 3,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"    house: Literal[\\\"Gryffindor\\\", \\\"Hufflepuff\\\", \\\"Ravenclaw\\\", \\\"Slytherin\\\"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Harry Potter\\\"}],\\n\",\n    \"    response_model=Character,\\n\",\n    \")\\n\",\n    \"resp.model_dump()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"803e0ce6-6e7e-4d86-a7a8-49ebaad0a40b\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Arbitrary properties\\n\",\n    \"\\n\",\n    \"Often times there are long properties that you might want to extract from data that we can not specify in advanced. We can get around this by defining an arbitrary key value store like so:\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"id\": \"0e7938b8-4666-4df4-bd80-f53e8baf7550\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'age': 38,\\n\",\n       \" 'name': 'Severus Snape',\\n\",\n       \" 'house': 'Slytherin',\\n\",\n       \" 'properties': [{'key': 'role', 'value': 'Potions Master'},\\n\",\n       \"  {'key': 'patronus', 'value': 'Doe'},\\n\",\n       \"  {'key': 'loyalty', 'value': 'Dumbledore'},\\n\",\n       \"  {'key': 'played_by', 'value': 'Alan Rickman'}]}\"\n      ]\n     },\n     \"execution_count\": 4,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"class Property(BaseModel):\\n\",\n    \"    key: str = Field(description=\\\"Must be snake case\\\")\\n\",\n    \"    value: str\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"    house: Literal[\\\"Gryffindor\\\", \\\"Hufflepuff\\\", \\\"Ravenclaw\\\", \\\"Slytherin\\\"]\\n\",\n    \"    properties: list[Property]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Snape from Harry Potter\\\"}],\\n\",\n    \"    response_model=Character,\\n\",\n    \")\\n\",\n    \"resp.model_dump()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"b3e62f68-a79f-4f65-9c1f-726e4e2d340a\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Limiting the length of lists\\n\",\n    \"\\n\",\n    \"In later chapters we'll talk about how to use validators to assert the length of lists but we can also use prompting tricks to enumerate values. Here we'll define a index to count the properties.\\n\",\n    \"\\n\",\n    \"In this following example instead of extraction we're going to work on generation instead.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"id\": \"69a58d01-ab6f-41b6-bc0c-b0e55fdb6fe4\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'age': 38,\\n\",\n       \" 'name': 'Severus Snape',\\n\",\n       \" 'house': 'Slytherin',\\n\",\n       \" 'properties': [{'index': '1',\\n\",\n       \"   'key': 'position_at_hogwarts',\\n\",\n       \"   'value': 'Potions Master'},\\n\",\n       \"  {'index': '2', 'key': 'patronus_form', 'value': 'Doe'},\\n\",\n       \"  {'index': '3', 'key': 'loyalty', 'value': 'Albus Dumbledore'},\\n\",\n       \"  {'index': '4', 'key': 'played_by', 'value': 'Alan Rickman'},\\n\",\n       \"  {'index': '5', 'key': 'final_act', 'value': 'Protecting Harry Potter'}]}\"\n      ]\n     },\n     \"execution_count\": 5,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"class Property(BaseModel):\\n\",\n    \"    index: str = Field(..., description=\\\"Monotonically increasing ID\\\")\\n\",\n    \"    key: str = Field(description=\\\"Must be snake case\\\")\\n\",\n    \"    value: str\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"    house: Literal[\\\"Gryffindor\\\", \\\"Hufflepuff\\\", \\\"Ravenclaw\\\", \\\"Slytherin\\\"]\\n\",\n    \"    properties: list[Property] = Field(\\n\",\n    \"        ...,\\n\",\n    \"        description=\\\"Numbered list of arbitrary extracted properties, should be exactly 5\\\",\\n\",\n    \"    )\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Snape from Harry Potter\\\"}],\\n\",\n    \"    response_model=Character,\\n\",\n    \")\\n\",\n    \"resp.model_dump()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"bbc1d900-617a-4e4d-a401-6d10a5153cda\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Defining Multiple Entities\\n\",\n    \"\\n\",\n    \"Now that we see a single entity with many properties we can continue to nest them into many users\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"id\": \"1f2a2b14-a956-4f96-90c9-e11ca04ab7d1\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"age=11 name='Harry Potter' house='Gryffindor'\\n\",\n      \"age=11 name='Hermione Granger' house='Gryffindor'\\n\",\n      \"age=11 name='Ron Weasley' house='Gryffindor'\\n\",\n      \"age=11 name='Draco Malfoy' house='Slytherin'\\n\",\n      \"age=11 name='Neville Longbottom' house='Gryffindor'\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from collections.abc import Iterable\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"    house: Literal[\\\"Gryffindor\\\", \\\"Hufflepuff\\\", \\\"Ravenclaw\\\", \\\"Slytherin\\\"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Five characters from Harry Potter\\\"}],\\n\",\n    \"    response_model=Iterable[Character],\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"for character in resp:\\n\",\n    \"    print(character)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"id\": \"a3091aba\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"age=11 name='Harry Potter' house='Gryffindor'\\n\",\n      \"age=11 name='Hermione Granger' house='Gryffindor'\\n\",\n      \"age=11 name='Ron Weasley' house='Gryffindor'\\n\",\n      \"age=17 name='Draco Malfoy' house='Slytherin'\\n\",\n      \"age=11 name='Luna Lovegood' house='Ravenclaw'\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from collections.abc import Iterable\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"    house: Literal[\\\"Gryffindor\\\", \\\"Hufflepuff\\\", \\\"Ravenclaw\\\", \\\"Slytherin\\\"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"Five characters from Harry Potter\\\"}],\\n\",\n    \"    stream=True,\\n\",\n    \"    response_model=Iterable[Character],\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"for character in resp:\\n\",\n    \"    print(character)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"f6ed3144-bde1-4033-9c94-a6926fa079d2\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Defining Relationships\\n\",\n    \"\\n\",\n    \"Not only can we define lists of users, but with lists of properties we can also easily define lists of references. It's one of the more interesting things I've learned about prompting.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"id\": \"6de8768e-b36a-4a51-9cf9-940d178552f6\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"id=1 name='Harry Potter' friends_array=[2, 3, 4, 5, 6]\\n\",\n      \"id=2 name='Hermione Granger' friends_array=[1, 3, 4, 5]\\n\",\n      \"id=3 name='Ron Weasley' friends_array=[1, 2, 4, 6]\\n\",\n      \"id=4 name='Neville Longbottom' friends_array=[1, 2, 3, 5]\\n\",\n      \"id=5 name='Luna Lovegood' friends_array=[1, 2, 4, 6]\\n\",\n      \"id=6 name='Draco Malfoy' friends_array=[1, 3, 5]\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"class Character(BaseModel):\\n\",\n    \"    id: int\\n\",\n    \"    name: str\\n\",\n    \"    friends_array: list[int] = Field(\\n\",\n    \"        description=\\\"Relationships to their friends using the id\\\"\\n\",\n    \"    )\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"    messages=[{\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"5 kids from Harry Potter\\\"}],\\n\",\n    \"    stream=True,\\n\",\n    \"    response_model=Iterable[Character],\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"for character in resp:\\n\",\n    \"    print(character)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"523b5797-71a5-4a96-a4b7-21280fb73015\",\n   \"metadata\": {},\n   \"source\": [\n    \"With the tools we've discussed, we can find numerous real-world applications in production settings. These include extracting action items from transcripts, generating fake data, filling out forms, and creating objects that correspond to generative UI. These simple tricks will be highly useful.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"a9d20fd9-0cd0-4300-a8c1-d16388969e8e\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Missing Data\\n\",\n    \"\\n\",\n    \"The Maybe pattern is a concept in functional programming used for error handling. Instead of raising exceptions or returning None, you can use a Maybe type to encapsulate both the result and potential errors.\\n\",\n    \"\\n\",\n    \"This pattern is particularly useful when making LLM calls, as providing language models with an escape hatch can effectively reduce hallucinations.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"id\": \"c04f44aa-dc4b-4499-a151-e812512e77e6\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from typing import Optional\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Character(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: str\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class MaybeCharacter(BaseModel):\\n\",\n    \"    result: Optional[Character] = Field(default=None)\\n\",\n    \"    error: bool = Field(default=False)\\n\",\n    \"    message: Optional[str]\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"id\": \"a2155190-e104-4ed6-a17f-e0732499dd51\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"def extract(content: str) -> MaybeCharacter:\\n\",\n    \"    return client.create(\\n\",\n    \"        model=\\\"gpt-3.5-turbo\\\",\\n\",\n    \"        response_model=MaybeCharacter,\\n\",\n    \"        messages=[\\n\",\n    \"            {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"Extract `{content}`\\\"},\\n\",\n    \"        ],\\n\",\n    \"    )\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"id\": \"a7b59afa-9bf0-4dc0-a5ca-de584514f33b\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"MaybeCharacter(result=Character(age=17, name='Harry Potter'), error=False, message=None)\"\n      ]\n     },\n     \"execution_count\": 11,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"extract(\\\"Harry Potter\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"id\": \"b5ddd5c1-ca75-49a9-95ad-181170435291\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"ename\": \"ValueError\",\n     \"evalue\": \"404 Error\",\n     \"output_type\": \"error\",\n     \"traceback\": [\n      \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n      \"\\u001b[0;31mValueError\\u001b[0m                                Traceback (most recent call last)\",\n      \"\\u001b[1;32m/Users/jasonliu/dev/instructor/docs/tutorials/2-tips.ipynb Cell 20\\u001b[0m line \\u001b[0;36m4\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/docs/tutorials/2-tips.ipynb#X25sZmlsZQ%3D%3D?line=0'>1</a>\\u001b[0m user \\u001b[39m=\\u001b[39m extract(\\u001b[39m\\\"\\u001b[39m\\u001b[39m404 Error\\u001b[39m\\u001b[39m\\\"\\u001b[39m)\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/docs/tutorials/2-tips.ipynb#X25sZmlsZQ%3D%3D?line=2'>3</a>\\u001b[0m \\u001b[39mif\\u001b[39;00m user\\u001b[39m.\\u001b[39merror:\\n\\u001b[0;32m----> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/docs/tutorials/2-tips.ipynb#X25sZmlsZQ%3D%3D?line=3'>4</a>\\u001b[0m     \\u001b[39mraise\\u001b[39;00m \\u001b[39mValueError\\u001b[39;00m(user\\u001b[39m.\\u001b[39mmessage)\\n\",\n      \"\\u001b[0;31mValueError\\u001b[0m: 404 Error\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"user = extract(\\\"404 Error\\\")\\n\",\n    \"\\n\",\n    \"if user.error:\\n\",\n    \"    raise ValueError(user.message)\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.6\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 5\n}\n"
  },
  {
    "path": "docs/tutorials/3-0-applications-rag.ipynb",
    "content": "{\n  \"cells\": [\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"# Applying Structured Output to RAG applications\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"**What is RAG?**\\n\",\n        \"\\n\",\n        \"Retrieval Augmented Generation (RAG) models are the bridge between large language models and external knowledge databases. They fetch the relevant data for a given query. For example, if you have some documents and want to ask questions related to the content of those documents, RAG models help by retrieving data from those documents and passing it to the LLM in queries.\\n\",\n        \"\\n\",\n        \"**How do RAG models work?**\\n\",\n        \"\\n\",\n        \"The typical RAG process involves embedding a user query and searching a vector database to find the most relevant information to supplement the generated response. This approach is particularly effective when the database contains information closely matching the query but not more than that.\\n\",\n        \"\\n\",\n        \"![Image](https://python.useinstructor.com/blog/img/dumb_rag.png)\\n\",\n        \"\\n\",\n        \"**Why is there a need for them?**\\n\",\n        \"\\n\",\n        \"Pre-trained large language models do not learn over time. If you ask them a question they have not been trained on, they will often hallucinate. Therefore, we need to embed our own data to achieve a better output.\\n\",\n        \"\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Simple RAG\\n\",\n        \"\\n\",\n        \"**What is it?**\\n\",\n        \"\\n\",\n        \"The simplest implementation of RAG embeds a user query and do a single embedding search in a vector database, like a vector store of Wikipedia articles. However, this approach often falls short when dealing with complex queries and diverse data sources.\\n\",\n        \"\\n\",\n        \"- **Query-Document Mismatch:** It assumes that the query and document embeddings will align in the vector space, which is often not the case.\\n\",\n        \"- **Text Search Limitations:** The model is restricted to simple text queries without the nuances of advanced search features.\\n\",\n        \"- **Limited Planning Ability:** It fails to consider additional contextual information that could refine the search results.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Improving the RAG model\\n\",\n        \"\\n\",\n        \"**What's the solution?**\\n\",\n        \"\\n\",\n        \"Enhancing RAG requires a more sophisticated approach known as query understanding.\\n\",\n        \"\\n\",\n        \"This process involves analyzing the user's query and transforming it to better match the backend's search capabilities.\\n\",\n        \"\\n\",\n        \"By doing so, we can significantly improve both the precision and recall of the search results, providing more accurate and relevant responses.\\n\",\n        \"\\n\",\n        \"![Image](https://python.useinstructor.com/blog/img/query_understanding.png)\\n\",\n        \"\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Practical Examples\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"In the examples below, we're going to use the [`instructor`](https://github.com/jxnl/instructor) library to simplify the interaction between the programmer and language models via the function-calling API.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"import instructor\\n\",\n        \"\\n\",\n        \"from openai import OpenAI\\n\",\n        \"from pydantic import BaseModel, Field\\n\",\n        \"\\n\",\n        \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\"\n      ],\n      \"execution_count\": 1,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Example 1) Improving Extractions\\n\",\n        \"\\n\",\n        \"One of the big limitations is that often times the query we embed and the text\\n\",\n        \"we are searching for may not have a direct match, leading to suboptimal results.\\n\",\n        \"A common method of using structured output is to extract information from a\\n\",\n        \"document and use it to answer a question. Directly, we can be creative in how we\\n\",\n        \"extract, summarize and generate potential questions in order for our embeddings\\n\",\n        \"to do better.\\n\",\n        \"\\n\",\n        \"For example, instead of using just a text chunk we could try to:\\n\",\n        \"\\n\",\n        \"1. extract key words and themes\\n\",\n        \"2. extract hypothetical questions\\n\",\n        \"3. generate a summary of the text\\n\",\n        \"\\n\",\n        \"In the example below, we use the `instructor` library to extract the key words\\n\",\n        \"and themes from a text chunk and use them to answer a question.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"class Extraction(BaseModel):\\n\",\n        \"    topic: str\\n\",\n        \"    summary: str\\n\",\n        \"    hypothetical_questions: list[str] = Field(\\n\",\n        \"        default_factory=list,\\n\",\n        \"        description=\\\"Hypothetical questions that this document could answer\\\",\\n\",\n        \"    )\\n\",\n        \"    keywords: list[str] = Field(\\n\",\n        \"        default_factory=list, description=\\\"Keywords that this document is about\\\"\\n\",\n        \"    )\"\n      ],\n      \"execution_count\": 2,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from pprint import pprint\\n\",\n        \"from collections.abc import Iterable\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"text_chunk = \\\"\\\"\\\"\\n\",\n        \"## Simple RAG\\n\",\n        \"\\n\",\n        \"**What is it?**\\n\",\n        \"\\n\",\n        \"The simplest implementation of RAG embeds a user query and do a single embedding search in a vector database, like a vector store of Wikipedia articles. However, this approach often falls short when dealing with complex queries and diverse data sources.\\n\",\n        \"\\n\",\n        \"**What are the limitations?**\\n\",\n        \"\\n\",\n        \"- **Query-Document Mismatch:** It assumes that the query and document embeddings will align in the vector space, which is often not the case.\\n\",\n        \"    - Query: \\\"Tell me about climate change effects on marine life.\\\"\\n\",\n        \"    - Issue: The model might retrieve documents related to general climate change or marine life, missing the specific intersection of both topics.\\n\",\n        \"- **Monolithic Search Backend:** It relies on a single search method and backend, reducing flexibility and the ability to handle multiple data sources.\\n\",\n        \"    - Query: \\\"Latest research in quantum computing.\\\"\\n\",\n        \"    - Issue: The model might only search in a general science database, missing out on specialized quantum computing resources.\\n\",\n        \"- **Text Search Limitations:** The model is restricted to simple text queries without the nuances of advanced search features.\\n\",\n        \"    - Query: \\\"what problems did we fix last week\\\"\\n\",\n        \"    - Issue: cannot be answered by a simple text search since documents that contain problem, last week are going to be present at every week.\\n\",\n        \"- **Limited Planning Ability:** It fails to consider additional contextual information that could refine the search results.\\n\",\n        \"    - Query: \\\"Tips for first-time Europe travelers.\\\"\\n\",\n        \"    - Issue: The model might provide general travel advice, ignoring the specific context of first-time travelers or European destinations.\\n\",\n        \"\\\"\\\"\\\"\\n\",\n        \"\\n\",\n        \"extractions = client.create(\\n\",\n        \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n        \"    stream=True,\\n\",\n        \"    response_model=Iterable[Extraction],\\n\",\n        \"    messages=[\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"system\\\",\\n\",\n        \"            \\\"content\\\": \\\"Your role is to extract chunks from the following and create a set of topics.\\\",\\n\",\n        \"        },\\n\",\n        \"        {\\\"role\\\": \\\"user\\\", \\\"content\\\": text_chunk},\\n\",\n        \"    ],\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"for extraction in extractions:\\n\",\n        \"    pprint(extraction.model_dump())\"\n      ],\n      \"execution_count\": 3,\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"{'hypothetical_questions': ['What is the basic concept behind simple RAG?',\\n\",\n            \"                            'How does simple RAG work for information '\\n\",\n            \"                            'retrieval?'],\\n\",\n            \" 'keywords': ['Simple RAG',\\n\",\n            \"              'Retrieval-Augmented Generation',\\n\",\n            \"              'user query',\\n\",\n            \"              'embedding search',\\n\",\n            \"              'vector database',\\n\",\n            \"              'Wikipedia articles',\\n\",\n            \"              'information retrieval'],\\n\",\n            \" 'summary': 'The simplest implementation of Retrieval-Augmented Generation '\\n\",\n            \"            '(RAG) involves embedding a user query and conducting a single '\\n\",\n            \"            'embedding search in a vector database, like a vector store of '\\n\",\n            \"            'Wikipedia articles, to retrieve relevant information. This method '\\n\",\n            \"            'may not be ideal for complex queries or varied data sources.',\\n\",\n            \" 'topic': 'Simple RAG'}\\n\",\n            \"{'hypothetical_questions': ['What are the drawbacks of using simple RAG '\\n\",\n            \"                            'systems?',\\n\",\n            \"                            'How does query-document mismatch affect the '\\n\",\n            \"                            'performance of RAG?',\\n\",\n            \"                            'Why is a monolithic search backend a limitation '\\n\",\n            \"                            'for RAG?'],\\n\",\n            \" 'keywords': ['limitations',\\n\",\n            \"              'query-document mismatch',\\n\",\n            \"              'simple RAG',\\n\",\n            \"              'monolithic search backend',\\n\",\n            \"              'text search',\\n\",\n            \"              'planning ability',\\n\",\n            \"              'contextual information'],\\n\",\n            \" 'summary': 'Key limitations of the simple RAG include query-document '\\n\",\n            \"            'mismatch, reliance on a single search backend, constraints of '\\n\",\n            \"            'text search capabilities, and limited planning ability to '\\n\",\n            \"            'leverage contextual information. These issues can result in '\\n\",\n            \"            'suboptimal search outcomes and retrieval of irrelevant or broad '\\n\",\n            \"            'information.',\\n\",\n            \" 'topic': 'Limitations of Simple RAG'}\\n\"\n          ]\n        }\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Now you can imagine if you were to embed the summaries, hypothetical questions,\\n\",\n        \"and keywords in a vector database (i.e. in the metadata fields of a vector\\n\",\n        \"database), you can then use a vector search to find the best matching document\\n\",\n        \"for a given query. What you'll find is that the results are much better than if\\n\",\n        \"you were to just embed the text chunk!\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Example 2) Understanding 'recent queries' to add temporal context\\n\",\n        \"\\n\",\n        \"One common application of using structured outputs for query understanding is to identify the intent of a user's query. In this example we're going to use a simple schema to separately process the query to add additional temporal context.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from datetime import date\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class DateRange(BaseModel):\\n\",\n        \"    start: date\\n\",\n        \"    end: date\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Query(BaseModel):\\n\",\n        \"    rewritten_query: str\\n\",\n        \"    published_daterange: DateRange\"\n      ],\n      \"execution_count\": 4,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"In this example, `DateRange` and `Query` are Pydantic models that structure the user's query with a date range and a list of domains to search within.\\n\",\n        \"\\n\",\n        \"These models **restructure** the user's query by including a <u>rewritten query</u>, a <u>range of published dates</u>, and a <u>list of domains</u> to search in.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Using the new restructured query, we can apply this pattern to our function calls to obtain results that are optimized for our backend.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"def expand_query(q) -> Query:\\n\",\n        \"    return client.create(\\n\",\n        \"        model=\\\"gpt-3.5-turbo\\\",\\n\",\n        \"        response_model=Query,\\n\",\n        \"        messages=[\\n\",\n        \"            {\\n\",\n        \"                \\\"role\\\": \\\"system\\\",\\n\",\n        \"                \\\"content\\\": f\\\"You're a query understanding system for the Metafor Systems search engine. Today is {date.today()}. Here are some tips: ...\\\",\\n\",\n        \"            },\\n\",\n        \"            {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"query: {q}\\\"},\\n\",\n        \"        ],\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"query = expand_query(\\\"What are some recent developments in AI?\\\")\\n\",\n        \"query\"\n      ],\n      \"execution_count\": 5,\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"Query(rewritten_query='Recent developments in artificial intelligence', published_daterange=DateRange(start=datetime.date(2024, 1, 1), end=datetime.date(2024, 3, 31)))\"\n            ]\n          },\n          \"execution_count\": 5,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"This isn't just about adding some date ranges. We can even use some chain of thought prompting to generate tailored searches that are deeply integrated with our backend.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"class DateRange(BaseModel):\\n\",\n        \"    chain_of_thought: str = Field(\\n\",\n        \"        description=\\\"Think step by step to plan what is the best time range to search in\\\"\\n\",\n        \"    )\\n\",\n        \"    start: date\\n\",\n        \"    end: date\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Query(BaseModel):\\n\",\n        \"    rewritten_query: str = Field(\\n\",\n        \"        description=\\\"Rewrite the query to make it more specific\\\"\\n\",\n        \"    )\\n\",\n        \"    published_daterange: DateRange = Field(\\n\",\n        \"        description=\\\"Effective date range to search in\\\"\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"def expand_query(q) -> Query:\\n\",\n        \"    return client.create(\\n\",\n        \"        model=\\\"gpt-4-1106-preview\\\",\\n\",\n        \"        response_model=Query,\\n\",\n        \"        messages=[\\n\",\n        \"            {\\n\",\n        \"                \\\"role\\\": \\\"system\\\",\\n\",\n        \"                \\\"content\\\": f\\\"You're a query understanding system for the Metafor Systems search engine. Today is {date.today()}. Here are some tips: ...\\\",\\n\",\n        \"            },\\n\",\n        \"            {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"query: {q}\\\"},\\n\",\n        \"        ],\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"expand_query(\\\"What are some recent developments in AI?\\\")\"\n      ],\n      \"execution_count\": 6,\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"Query(rewritten_query='latest advancements in artificial intelligence', published_daterange=DateRange(chain_of_thought='Since the user is asking for recent developments, it would be relevant to look for articles and papers published within the last year. Therefore, setting the start date to a year before today and the end date to today will cover the most recent advancements.', start=datetime.date(2023, 3, 31), end=datetime.date(2024, 3, 31)))\"\n            ]\n          },\n          \"execution_count\": 6,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Using Weights and Biases to track experiments\\n\",\n        \"\\n\",\n        \"While running a function like this production is quite simple, a lot of time will be spend on iterating and improving the model. To do this, we can use Weights and Biases to track our experiments.\\n\",\n        \"\\n\",\n        \"In order to do so we wand manage a few things\\n\",\n        \"\\n\",\n        \"1. Save input and output pairs for later\\n\",\n        \"2. Save the JSON schema for the response_model\\n\",\n        \"3. Having snapshots of the model and data allow us to compare results over time, and as we make changes to the model we can see how the results change.\\n\",\n        \"\\n\",\n        \"This is particularly useful when we might want to blend a mix of synthetic and real data to evaluate our model. We can use the `wandb` library to track our experiments and save the results to a dashboard.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {\n        \"scrolled\": true\n      },\n      \"source\": [\n        \"import json\\n\",\n        \"import instructor\\n\",\n        \"\\n\",\n        \"from openai import AsyncOpenAI\\n\",\n        \"from datetime import date\\n\",\n        \"from pydantic import BaseModel, Field\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class DateRange(BaseModel):\\n\",\n        \"    chain_of_thought: str = Field(\\n\",\n        \"        description=\\\"Think step by step to plan what is the best time range to search in\\\"\\n\",\n        \"    )\\n\",\n        \"    start: date\\n\",\n        \"    end: date\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Query(BaseModel):\\n\",\n        \"    rewritten_query: str = Field(\\n\",\n        \"        description=\\\"Rewrite the query to make it more specific\\\"\\n\",\n        \"    )\\n\",\n        \"    published_daterange: DateRange = Field(\\n\",\n        \"        description=\\\"Effective date range to search in\\\"\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"    def report(self):\\n\",\n        \"        dct = self.model_dump()\\n\",\n        \"        dct[\\\"usage\\\"] = self._raw_response.usage.model_dump()\\n\",\n        \"        return dct\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"# We'll use a different client for async calls\\n\",\n        \"# To highlight the difference and how we can use both\\n\",\n        \"aclient = instructor.patch(AsyncOpenAI())\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"async def expand_query(\\n\",\n        \"    q, *, model: str = \\\"gpt-4-1106-preview\\\", temp: float = 0\\n\",\n        \") -> Query:\\n\",\n        \"    return await aclient.create(\\n\",\n        \"        model=model,\\n\",\n        \"        temperature=temp,\\n\",\n        \"        response_model=Query,\\n\",\n        \"        messages=[\\n\",\n        \"            {\\n\",\n        \"                \\\"role\\\": \\\"system\\\",\\n\",\n        \"                \\\"content\\\": f\\\"You're a query understanding system for the Metafor Systems search engine. Today is {date.today()}. Here are some tips: ...\\\",\\n\",\n        \"            },\\n\",\n        \"            {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"query: {q}\\\"},\\n\",\n        \"        ],\\n\",\n        \"    )\"\n      ],\n      \"execution_count\": 7,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"# % pip install pandas wandb\\n\",\n        \"import pandas as pd\\n\",\n        \"from typing import Any\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"def flatten_dict(\\n\",\n        \"    d: dict[str, Any], parent_key: str = \\\"\\\", sep: str = \\\"_\\\"\\n\",\n        \") -> dict[str, Any]:\\n\",\n        \"    \\\"\\\"\\\"\\n\",\n        \"    Flatten a nested dictionary.\\n\",\n        \"\\n\",\n        \"    :param d: The nested dictionary to flatten.\\n\",\n        \"    :param parent_key: The base key to use for the flattened keys.\\n\",\n        \"    :param sep: Separator to use between keys.\\n\",\n        \"    :return: A flattened dictionary.\\n\",\n        \"    \\\"\\\"\\\"\\n\",\n        \"    items = []\\n\",\n        \"    for k, v in d.items():\\n\",\n        \"        new_key = f\\\"{parent_key}{sep}{k}\\\" if parent_key else k\\n\",\n        \"        if isinstance(v, dict):\\n\",\n        \"            items.extend(flatten_dict(v, new_key, sep=sep).items())\\n\",\n        \"        else:\\n\",\n        \"            items.append((new_key, v))\\n\",\n        \"    return dict(items)\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"def dicts_to_df(list_of_dicts: list[dict[str, Any]]) -> pd.DataFrame:\\n\",\n        \"    \\\"\\\"\\\"\\n\",\n        \"    Convert a list of dictionaries to a pandas DataFrame.\\n\",\n        \"\\n\",\n        \"    :param list_of_dicts: List of dictionaries, potentially nested.\\n\",\n        \"    :return: A pandas DataFrame representing the flattened data.\\n\",\n        \"    \\\"\\\"\\\"\\n\",\n        \"    # Flatten each dictionary and create a DataFrame\\n\",\n        \"    flattened_data = [flatten_dict(d) for d in list_of_dicts]\\n\",\n        \"    return pd.DataFrame(flattened_data)\"\n      ],\n      \"execution_count\": 8,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"import asyncio\\n\",\n        \"import time\\n\",\n        \"import pandas as pd\\n\",\n        \"import wandb\\n\",\n        \"\\n\",\n        \"model = \\\"gpt-4-1106-preview\\\"\\n\",\n        \"temp = 0\\n\",\n        \"\\n\",\n        \"run = wandb.init(\\n\",\n        \"    project=\\\"query\\\",\\n\",\n        \"    config={\\\"model\\\": model, \\\"temp\\\": temp},\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"test_queries = [\\n\",\n        \"    \\\"latest developments in artificial intelligence last 3 weeks\\\",\\n\",\n        \"    \\\"renewable energy trends past month\\\",\\n\",\n        \"    \\\"quantum computing advancements last 2 months\\\",\\n\",\n        \"    \\\"biotechnology updates last 10 days\\\",\\n\",\n        \"]\\n\",\n        \"start = time.perf_counter()\\n\",\n        \"queries = await asyncio.gather(\\n\",\n        \"    *[expand_query(q, model=model, temp=temp) for q in test_queries]\\n\",\n        \")\\n\",\n        \"duration = time.perf_counter() - start\\n\",\n        \"\\n\",\n        \"with open(\\\"schema.json\\\", \\\"w+\\\") as f:\\n\",\n        \"    schema = Query.model_json_schema()\\n\",\n        \"    json.dump(schema, f, indent=2)\\n\",\n        \"\\n\",\n        \"with open(\\\"results.jsonlines\\\", \\\"w+\\\") as f:\\n\",\n        \"    for query in queries:\\n\",\n        \"        f.write(query.model_dump_json() + \\\"\\\\n\\\")\\n\",\n        \"\\n\",\n        \"df = dicts_to_df([q.report() for q in queries])\\n\",\n        \"df[\\\"input\\\"] = test_queries\\n\",\n        \"df.to_csv(\\\"results.csv\\\")\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"run.log({\\\"schema\\\": wandb.Table(dataframe=pd.DataFrame([{\\\"schema\\\": schema}]))})\\n\",\n        \"run.log(\\n\",\n        \"    {\\n\",\n        \"        \\\"usage_total_tokens\\\": df[\\\"usage_total_tokens\\\"].sum(),\\n\",\n        \"        \\\"usage_completion_tokens\\\": df[\\\"usage_completion_tokens\\\"].sum(),\\n\",\n        \"        \\\"usage_prompt_tokens\\\": df[\\\"usage_prompt_tokens\\\"].sum(),\\n\",\n        \"        \\\"duration (s)\\\": duration,\\n\",\n        \"        \\\"average duration (s)\\\": duration / len(queries),\\n\",\n        \"        \\\"n_queries\\\": len(queries),\\n\",\n        \"    }\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"run.log(\\n\",\n        \"    {\\n\",\n        \"        \\\"results\\\": wandb.Table(dataframe=df),\\n\",\n        \"    }\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"files = wandb.Artifact(\\\"data\\\", type=\\\"dataset\\\")\\n\",\n        \"files.add_file(\\\"schema.json\\\")\\n\",\n        \"files.add_file(\\\"results.jsonlines\\\")\\n\",\n        \"files.add_file(\\\"results.csv\\\")\\n\",\n        \"\\n\",\n        \"run.log_artifact(files)\\n\",\n        \"run.finish()\"\n      ],\n      \"execution_count\": 9,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"The output of Weights and Biases would return something like the below table.\\n\",\n        \"\\n\",\n        \"| Metric                   | Value  |\\n\",\n        \"|--------------------------|--------|\\n\",\n        \"| average duration (s)     | 1.5945 |\\n\",\n        \"| duration (s)             | 6.37799|\\n\",\n        \"| n_queries                | 4      |\\n\",\n        \"| usage_completion_tokens  | 376    |\\n\",\n        \"| usage_prompt_tokens      | 780    |\\n\",\n        \"| usage_total_tokens       | 1156   |\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Example 3) Personal Assistants, parallel processing\\n\",\n        \"\\n\",\n        \"A personal assistant application needs to interpret vague queries and fetch information from multiple backends, such as emails and calendars. By modeling the assistant's capabilities using Pydantic, we can dispatch the query to the correct backend and retrieve a unified response.\\n\",\n        \"\\n\",\n        \"For instance, when you ask, \\\"What's on my schedule today?\\\", the application needs to fetch data from various sources like events, emails, and reminders. This data is stored across different backends, but the goal is to provide a consolidated summary of results.\\n\",\n        \"\\n\",\n        \"It's important to note that the data from these sources may not be embedded in a search backend. Instead, they could be accessed through different clients like a calendar or email, spanning both personal and professional accounts.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from typing import Literal\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class SearchClient(BaseModel):\\n\",\n        \"    query: str = Field(description=\\\"The search query that will go into the search bar\\\")\\n\",\n        \"    keywords: list[str]\\n\",\n        \"    email: str\\n\",\n        \"    source: Literal[\\\"gmail\\\", \\\"calendar\\\"]\\n\",\n        \"    date_range: DateRange\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Retrieval(BaseModel):\\n\",\n        \"    queries: list[SearchClient]\"\n      ],\n      \"execution_count\": 10,\n      \"outputs\": []\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Now, we can utilize this with a straightforward query such as \\\"What do I have today?\\\".\\n\",\n        \"\\n\",\n        \"The system will attempt to asynchronously dispatch the query to the appropriate backend.\\n\",\n        \"\\n\",\n        \"However, it's still crucial to remember that effectively prompting the language model is still a key aspect.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"retrieval = client.create(\\n\",\n        \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n        \"    response_model=Retrieval,\\n\",\n        \"    messages=[\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"system\\\",\\n\",\n        \"            \\\"content\\\": f\\\"\\\"\\\"You are Jason's personal assistant.\\n\",\n        \"                He has two emails jason@work.com jason@personal.com\\n\",\n        \"                Today is {date.today()}\\\"\\\"\\\",\\n\",\n        \"        },\\n\",\n        \"        {\\\"role\\\": \\\"user\\\", \\\"content\\\": \\\"What do I have today for work? any new emails?\\\"},\\n\",\n        \"    ],\\n\",\n        \")\\n\",\n        \"print(retrieval.model_dump_json(indent=4))\"\n      ],\n      \"execution_count\": 11,\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"{\\n\",\n            \"    \\\"queries\\\": [\\n\",\n            \"        {\\n\",\n            \"            \\\"query\\\": \\\"work\\\",\\n\",\n            \"            \\\"keywords\\\": [\\n\",\n            \"                \\\"work\\\",\\n\",\n            \"                \\\"today\\\"\\n\",\n            \"            ],\\n\",\n            \"            \\\"email\\\": \\\"jason@work.com\\\",\\n\",\n            \"            \\\"source\\\": \\\"gmail\\\",\\n\",\n            \"            \\\"date_range\\\": {\\n\",\n            \"                \\\"chain_of_thought\\\": \\\"Check today's work schedule\\\",\\n\",\n            \"                \\\"start\\\": \\\"2024-03-31\\\",\\n\",\n            \"                \\\"end\\\": \\\"2024-03-31\\\"\\n\",\n            \"            }\\n\",\n            \"        },\\n\",\n            \"        {\\n\",\n            \"            \\\"query\\\": \\\"new emails\\\",\\n\",\n            \"            \\\"keywords\\\": [\\n\",\n            \"                \\\"email\\\",\\n\",\n            \"                \\\"new\\\"\\n\",\n            \"            ],\\n\",\n            \"            \\\"email\\\": \\\"jason@work.com\\\",\\n\",\n            \"            \\\"source\\\": \\\"gmail\\\",\\n\",\n            \"            \\\"date_range\\\": {\\n\",\n            \"                \\\"chain_of_thought\\\": \\\"Check for new emails today\\\",\\n\",\n            \"                \\\"start\\\": \\\"2024-03-31\\\",\\n\",\n            \"                \\\"end\\\": \\\"2024-03-31\\\"\\n\",\n            \"            }\\n\",\n            \"        }\\n\",\n            \"    ]\\n\",\n            \"}\\n\"\n          ]\n        }\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"To make it more challenging, we will assign it multiple tasks, followed by a list of queries that are routed to various search backends, such as email and calendar. Not only do we dispatch to different backends, over which we have no control, but we are also likely to render them to the user in different ways.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"retrieval = client.create(\\n\",\n        \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n        \"    response_model=Retrieval,\\n\",\n        \"    messages=[\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"system\\\",\\n\",\n        \"            \\\"content\\\": f\\\"\\\"\\\"You are Jason's personal assistant.\\n\",\n        \"                He has two emails jason@work.com jason@personal.com\\n\",\n        \"                Today is {date.today()}\\\"\\\"\\\",\\n\",\n        \"        },\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"user\\\",\\n\",\n        \"            \\\"content\\\": \\\"What meetings do I have today and are there any important emails I should be aware of\\\",\\n\",\n        \"        },\\n\",\n        \"    ],\\n\",\n        \")\\n\",\n        \"print(retrieval.model_dump_json(indent=4))\"\n      ],\n      \"execution_count\": 12,\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"{\\n\",\n            \"    \\\"queries\\\": [\\n\",\n            \"        {\\n\",\n            \"            \\\"query\\\": \\\"Jason's meetings\\\",\\n\",\n            \"            \\\"keywords\\\": [\\n\",\n            \"                \\\"meeting\\\",\\n\",\n            \"                \\\"appointment\\\",\\n\",\n            \"                \\\"schedule\\\",\\n\",\n            \"                \\\"calendar\\\"\\n\",\n            \"            ],\\n\",\n            \"            \\\"email\\\": \\\"jason@work.com\\\",\\n\",\n            \"            \\\"source\\\": \\\"calendar\\\",\\n\",\n            \"            \\\"date_range\\\": {\\n\",\n            \"                \\\"chain_of_thought\\\": \\\"Since today's date is 2024-03-31, we should look for meetings scheduled for this exact date.\\\",\\n\",\n            \"                \\\"start\\\": \\\"2024-03-31\\\",\\n\",\n            \"                \\\"end\\\": \\\"2024-03-31\\\"\\n\",\n            \"            }\\n\",\n            \"        }\\n\",\n            \"    ]\\n\",\n            \"}\\n\"\n          ]\n        }\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Example 4) Decomposing questions\\n\",\n        \"\\n\",\n        \"Lastly, a lightly more complex example of a problem that can be solved with structured output is decomposing questions. Where you ultimately want to decompose a question into a series of sub-questions that can be answered by a search backend. For example\\n\",\n        \"\\n\",\n        \"\\\"Whats the difference in populations of jason's home country and canada?\\\"\\n\",\n        \"\\n\",\n        \"You'd ultimately need to know a few things\\n\",\n        \"\\n\",\n        \"1. Jason's home country\\n\",\n        \"2. The population of Jason's home country\\n\",\n        \"3. The population of Canada\\n\",\n        \"4. The difference between the two\\n\",\n        \"\\n\",\n        \"This would not be done correctly as a single query, nor would it be done in parallel, however there are some opportunities try to be parallel since not all of the sub-questions are dependent on each other.\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"class Question(BaseModel):\\n\",\n        \"    id: int = Field(..., description=\\\"A unique identifier for the question\\\")\\n\",\n        \"    query: str = Field(..., description=\\\"The question decomposed as much as possible\\\")\\n\",\n        \"    subquestions: list[int] = Field(\\n\",\n        \"        default_factory=list,\\n\",\n        \"        description=\\\"The subquestions that this question is composed of\\\",\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class QueryPlan(BaseModel):\\n\",\n        \"    root_question: str = Field(..., description=\\\"The root question that the user asked\\\")\\n\",\n        \"    plan: list[Question] = Field(\\n\",\n        \"        ..., description=\\\"The plan to answer the root question and its subquestions\\\"\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"retrieval = client.create(\\n\",\n        \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n        \"    response_model=QueryPlan,\\n\",\n        \"    messages=[\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"system\\\",\\n\",\n        \"            \\\"content\\\": \\\"You are a query understanding system capable of decomposing a question into subquestions.\\\",\\n\",\n        \"        },\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"user\\\",\\n\",\n        \"            \\\"content\\\": \\\"What is the difference between the population of jason's home country and canada?\\\",\\n\",\n        \"        },\\n\",\n        \"    ],\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"print(retrieval.model_dump_json(indent=4))\"\n      ],\n      \"execution_count\": 13,\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"{\\n\",\n            \"    \\\"root_question\\\": \\\"What is the difference between the population of Jason's home country and Canada?\\\",\\n\",\n            \"    \\\"plan\\\": [\\n\",\n            \"        {\\n\",\n            \"            \\\"id\\\": 1,\\n\",\n            \"            \\\"query\\\": \\\"What is the population of Jason's home country?\\\",\\n\",\n            \"            \\\"subquestions\\\": []\\n\",\n            \"        },\\n\",\n            \"        {\\n\",\n            \"            \\\"id\\\": 2,\\n\",\n            \"            \\\"query\\\": \\\"What is the population of Canada?\\\",\\n\",\n            \"            \\\"subquestions\\\": []\\n\",\n            \"        },\\n\",\n            \"        {\\n\",\n            \"            \\\"id\\\": 3,\\n\",\n            \"            \\\"query\\\": \\\"What is the difference between two population numbers?\\\",\\n\",\n            \"            \\\"subquestions\\\": [\\n\",\n            \"                1,\\n\",\n            \"                2\\n\",\n            \"            ]\\n\",\n            \"        }\\n\",\n            \"    ]\\n\",\n            \"}\\n\"\n          ]\n        }\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"I hope in this section I've exposed you to some ways we can be creative in modeling structured outputs to leverage LLMS in building some lightweight components for our systems.\\n\"\n      ]\n    }\n  ],\n  \"metadata\": {\n    \"kernelspec\": {\n      \"display_name\": \"Python 3 (ipykernel)\",\n      \"language\": \"python\",\n      \"name\": \"python3\"\n    },\n    \"language_info\": {\n      \"codemirror_mode\": {\n        \"name\": \"ipython\",\n        \"version\": 3\n      },\n      \"file_extension\": \".py\",\n      \"mimetype\": \"text/x-python\",\n      \"name\": \"python\",\n      \"nbconvert_exporter\": \"python\",\n      \"pygments_lexer\": \"ipython3\",\n      \"version\": \"3.11.8\"\n    }\n  },\n  \"nbformat\": 4,\n  \"nbformat_minor\": 4\n}"
  },
  {
    "path": "docs/tutorials/3-1-validation-rag.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"5a01f3ac-5306-4a1b-9e47-a5d254bce93a\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Understanding Validators\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"9dcc78ac-ed6d-49e3-b71b-fb2fb25f16a8\",\n   \"metadata\": {},\n   \"source\": [\n    \"Pydantic offers an customizable and expressive validation framework for Python. Instructor leverages Pydantic's validation framework to provide a uniform developer experience for both code-based and LLM-based validation, as well as a reasking mechanism for correcting LLM outputs based on validation errors. To learn more check out the Pydantic [docs](https://docs.pydantic.dev/latest/) on validators.\\n\",\n    \"\\n\",\n    \"Then we'll bring it all together into the context of RAG from the previous notebook.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"064c286b\",\n   \"metadata\": {},\n   \"source\": [\n    \"Validators will enable us to control outputs by defining a function like so:\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"def validation_function(value):\\n\",\n    \"    if condition(value):\\n\",\n    \"        raise ValueError(\\\"Value is not valid\\\")\\n\",\n    \"    return mutation(value)\\n\",\n    \"```\\n\",\n    \"\\n\",\n    \"Before we get started lets go over the general shape of a validator:\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"7cfc6c66\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Defining Validator Functions\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 18,\n   \"id\": \"d4bb6258-b03a-4621-8a73-29056a20ec0f\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from typing import Annotated\\n\",\n    \"from pydantic import BaseModel, AfterValidator, WithJsonSchema\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def name_must_contain_space(v: str) -> str:\\n\",\n    \"    if \\\" \\\" not in v:\\n\",\n    \"        raise ValueError(\\\"Name must contain a space.\\\")\\n\",\n    \"    return v\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def uppercase_name(v: str) -> str:\\n\",\n    \"    return v.upper()\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"FullName = Annotated[\\n\",\n    \"    str,\\n\",\n    \"    AfterValidator(name_must_contain_space),\\n\",\n    \"    AfterValidator(uppercase_name),\\n\",\n    \"    WithJsonSchema(\\n\",\n    \"        {\\n\",\n    \"            \\\"type\\\": \\\"string\\\",\\n\",\n    \"            \\\"description\\\": \\\"The user's full name\\\",\\n\",\n    \"        }\\n\",\n    \"    ),\\n\",\n    \"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class UserDetail(BaseModel):\\n\",\n    \"    age: int\\n\",\n    \"    name: FullName\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 19,\n   \"id\": \"23f8cadd\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"UserDetail(age=30, name='JASON LIU')\"\n      ]\n     },\n     \"execution_count\": 19,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"UserDetail(age=30, name=\\\"Jason Liu\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 20,\n   \"id\": \"e4f53ecf\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'properties': {'age': {'title': 'Age', 'type': 'integer'},\\n\",\n       \"  'name': {'description': \\\"The user's full name\\\",\\n\",\n       \"   'title': 'Name',\\n\",\n       \"   'type': 'string'}},\\n\",\n       \" 'required': ['age', 'name'],\\n\",\n       \" 'title': 'UserDetail',\\n\",\n       \" 'type': 'object'}\"\n      ]\n     },\n     \"execution_count\": 20,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"UserDetail.model_json_schema()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 21,\n   \"id\": \"2284a7e8\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"1 validation error for UserDetail\\n\",\n      \"name\\n\",\n      \"  Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/value_error\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"try:\\n\",\n    \"    person = UserDetail.model_validate({\\\"age\\\": 24, \\\"name\\\": \\\"Jason\\\"})\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(e)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"3c0302ca\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Using Field\\n\",\n    \"\\n\",\n    \"We can also use the `Field` class to define validators. This is useful when we want to define a validator for a field that is primitive, like a string or integer which supports a limited number of validators.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 22,\n   \"id\": \"3242856f\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"2 validation errors for UserDetail\\n\",\n      \"age\\n\",\n      \"  Input should be greater than 0 [type=greater_than, input_value=-10, input_type=int]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/greater_than\\n\",\n      \"name\\n\",\n      \"  Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/value_error\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from pydantic import Field\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"Age = Annotated[int, Field(gt=0)]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class UserDetail(BaseModel):\\n\",\n    \"    age: Age\\n\",\n    \"    name: FullName\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"try:\\n\",\n    \"    person = UserDetail(age=-10, name=\\\"Jason\\\")\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(e)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"4f689121\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Providing Context\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"id\": \"ec043c23\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"1 validation error for Response\\n\",\n      \"message\\n\",\n      \"  Assertion failed, `hurt` was found in the message `I will hurt them.` [type=assertion_error, input_value='I will hurt them.', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/assertion_error\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from pydantic import ValidationInfo\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def message_cannot_have_blacklisted_words(v: str, info: ValidationInfo) -> str:\\n\",\n    \"    blacklist = info.context.get(\\\"blacklist\\\", [])\\n\",\n    \"    for word in blacklist:\\n\",\n    \"        assert word not in v.lower(), f\\\"`{word}` was found in the message `{v}`\\\"\\n\",\n    \"    return v\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"ModeratedStr = Annotated[str, AfterValidator(message_cannot_have_blacklisted_words)]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Response(BaseModel):\\n\",\n    \"    message: ModeratedStr\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"try:\\n\",\n    \"    Response.model_validate(\\n\",\n    \"        {\\\"message\\\": \\\"I will hurt them.\\\"},\\n\",\n    \"        context={\\n\",\n    \"            \\\"blacklist\\\": {\\n\",\n    \"                \\\"rob\\\",\\n\",\n    \"                \\\"steal\\\",\\n\",\n    \"                \\\"hurt\\\",\\n\",\n    \"                \\\"kill\\\",\\n\",\n    \"                \\\"attack\\\",\\n\",\n    \"            }\\n\",\n    \"        },\\n\",\n    \"    )\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(e)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"37e3a638-c9c9-44cd-bcd0-ad1a39f448db\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Using OpenAI Moderation\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"88d0b816-7ec8-42b0-9b91-c9aab382c960\",\n   \"metadata\": {},\n   \"source\": [\n    \"To enhance our validation measures, we'll extend the scope to flag any answer that contains hateful content, harassment, or similar issues. OpenAI offers a moderation endpoint that addresses these concerns, and it's freely available when using OpenAI models.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"65f46eb5\",\n   \"metadata\": {},\n   \"source\": [\n    \"With the `instructor` library, this is just one function edit away:\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 13,\n   \"id\": \"82521112-5301-4442-acce-82b495bd838f\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"1 validation error for Response\\n\",\n      \"message\\n\",\n      \"  Value error, `I want to make them suffer the consequences` was flagged for harassment, harassment_threatening, violence, harassment/threatening [type=value_error, input_value='I want to make them suffer the consequences', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/value_error\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from typing import Annotated\\n\",\n    \"from pydantic import AfterValidator\\n\",\n    \"from instructor import openai_moderation\\n\",\n    \"\\n\",\n    \"import instructor\\n\",\n    \"from openai import OpenAI\\n\",\n    \"\\n\",\n    \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\\n\",\n    \"\\n\",\n    \"# This uses Annotated which is a new feature in Python 3.9\\n\",\n    \"# To define custom metadata for a type hint.\\n\",\n    \"ModeratedStr = Annotated[str, AfterValidator(openai_moderation(client=client))]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Response(BaseModel):\\n\",\n    \"    message: ModeratedStr\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"try:\\n\",\n    \"    Response(message=\\\"I want to make them suffer the consequences\\\")\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(e)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"faa5116e\",\n   \"metadata\": {},\n   \"source\": [\n    \"## General Validator\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"id\": \"49d8b772\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from instructor import llm_validator\\n\",\n    \"\\n\",\n    \"HealthTopicStr = Annotated[\\n\",\n    \"    str,\\n\",\n    \"    AfterValidator(\\n\",\n    \"        llm_validator(\\n\",\n    \"            \\\"don't talk about any other topic except health best practices and topics\\\",\\n\",\n    \"            client=client,\\n\",\n    \"        )\\n\",\n    \"    ),\\n\",\n    \"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class AssistantMessage(BaseModel):\\n\",\n    \"    message: HealthTopicStr\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"AssistantMessage(\\n\",\n    \"    message=\\\"I would suggest you to visit Sicily as they say it is very nice in winter.\\\"\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"050e72fe-4b13-4002-a1d0-94f7b88b784b\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Avoiding hallucination with citations\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e3f2869e-c8a3-4b93-82e7-55eb70930900\",\n   \"metadata\": {},\n   \"source\": [\n    \"When incorporating external knowledge bases, it's crucial to ensure that the agent uses the provided context accurately and doesn't fabricate responses. Validators can be effectively used for this purpose. We can illustrate this with an example where we validate that a provided citation is actually included in the referenced text chunk:\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 27,\n   \"id\": \"638fc368-5cf7-4ae7-9d3f-efea1b84eec0\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"1 validation error for AnswerWithCitation\\n\",\n      \"citation\\n\",\n      \"  Value error, Citation `Blueberries contain high levels of protein` not found in text, only use citations from the text. [type=value_error, input_value='Blueberries contain high levels of protein', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/value_error\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from pydantic import ValidationInfo\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def citation_exists(v: str, info: ValidationInfo):\\n\",\n    \"    context = info.context\\n\",\n    \"    if context:\\n\",\n    \"        context = context.get(\\\"text_chunk\\\")\\n\",\n    \"        if v not in context:\\n\",\n    \"            raise ValueError(\\n\",\n    \"                f\\\"Citation `{v}` not found in text, only use citations from the text.\\\"\\n\",\n    \"            )\\n\",\n    \"    return v\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"Citation = Annotated[str, AfterValidator(citation_exists)]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class AnswerWithCitation(BaseModel):\\n\",\n    \"    answer: str\\n\",\n    \"    citation: Citation\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"try:\\n\",\n    \"    AnswerWithCitation.model_validate(\\n\",\n    \"        {\\n\",\n    \"            \\\"answer\\\": \\\"Blueberries are packed with protein\\\",\\n\",\n    \"            \\\"citation\\\": \\\"Blueberries contain high levels of protein\\\",\\n\",\n    \"        },\\n\",\n    \"        context={\\\"text_chunk\\\": \\\"Blueberries are very rich in antioxidants\\\"},\\n\",\n    \"    )\\n\",\n    \"except Exception as e:\\n\",\n    \"    print(e)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"3064b06b-7f85-40ec-8fe2-4fa2cce36585\",\n   \"metadata\": {},\n   \"source\": [\n    \"Here we assume that there is a \\\"text_chunk\\\" field that contains the text that the model is supposed to use as context. We then use the `field_validator` decorator to define a validator that checks if the citation is included in the text chunk. If it's not, we raise a `ValueError` with a message that will be returned to the user.\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"If we want to pass in the context through the `chat.completions.create`` endpoint, we can use the `validation_context` parameter\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n    \"    response_model=AnswerWithCitation,\\n\",\n    \"    messages=[\\n\",\n    \"        {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"Answer the question `{q}` using the text chunk\\\\n`{text_chunk}`\\\"},\\n\",\n    \"    ],\\n\",\n    \"    validation_context={\\\"text_chunk\\\": text_chunk},\\n\",\n    \")\\n\",\n    \"```\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"64d15ad2\",\n   \"metadata\": {},\n   \"source\": [\n    \"In practice there are many ways to implement this: we could use a regex to check if the citation is included in the text chunk, or we could use a more sophisticated approach like a semantic similarity check. The important thing is that we have a way to validate that the model is using the provided context accurately.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"5bbbaa11-32d2-4772-bc31-18d1d6d6c919\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Reasking with validators\\n\",\n    \"\\n\",\n    \"For most of these examples all we've done we've mostly only defined the validation logic. Which can be separate from generation, however when we are given validation errors, we shouldn't end there! Instead instructor allows us to collect all the validation errors and reask the llm to rewrite their answer.\\n\",\n    \"\\n\",\n    \"Lets try to use a extreme example to illustrate this point:\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 15,\n   \"id\": \"97f544e7-2552-465c-89a9-a4820f00d658\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"{\\n\",\n      \"  \\\"question\\\": \\\"What is the meaning of life?\\\",\\n\",\n      \"  \\\"answer\\\": \\\"According to the devil, the meaning of life is a life of sin and debauchery.\\\"\\n\",\n      \"}\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"class QuestionAnswer(BaseModel):\\n\",\n    \"    question: str\\n\",\n    \"    answer: str\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"question = \\\"What is the meaning of life?\\\"\\n\",\n    \"context = (\\n\",\n    \"    \\\"The according to the devil the meaning of life is a life of sin and debauchery.\\\"\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n    \"    response_model=QuestionAnswer,\\n\",\n    \"    messages=[\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"system\\\",\\n\",\n    \"            \\\"content\\\": \\\"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\\\",\\n\",\n    \"        },\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"user\\\",\\n\",\n    \"            \\\"content\\\": f\\\"using the context: `{context}`\\\\n\\\\nAnswer the following question: `{question}`\\\",\\n\",\n    \"        },\\n\",\n    \"    ],\\n\",\n    \")\\n\",\n    \"\\n\",\n    \"print(resp.model_dump_json(indent=2))\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 20,\n   \"id\": \"0328bbc5\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"Retrying, exception: 1 validation error for QuestionAnswer\\n\",\n      \"answer\\n\",\n      \"  Assertion failed, The statement promotes sin and debauchery, which can be considered objectionable. [type=assertion_error, input_value='The meaning of life, acc... of sin and debauchery.', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/assertion_error\\n\",\n      \"Traceback (most recent call last):\\n\",\n      \"  File \\\"/Users/jasonliu/dev/instructor/instructor/patch.py\\\", line 277, in retry_sync\\n\",\n      \"    return process_response(\\n\",\n      \"           ^^^^^^^^^^^^^^^^^\\n\",\n      \"  File \\\"/Users/jasonliu/dev/instructor/instructor/patch.py\\\", line 164, in process_response\\n\",\n      \"    model = response_model.from_response(\\n\",\n      \"            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n\",\n      \"  File \\\"/Users/jasonliu/dev/instructor/instructor/function_calls.py\\\", line 137, in from_response\\n\",\n      \"    return cls.model_validate_json(\\n\",\n      \"           ^^^^^^^^^^^^^^^^^^^^^^^^\\n\",\n      \"  File \\\"/Users/jasonliu/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py\\\", line 532, in model_validate_json\\n\",\n      \"    return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)\\n\",\n      \"           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n\",\n      \"pydantic_core._pydantic_core.ValidationError: 1 validation error for QuestionAnswer\\n\",\n      \"answer\\n\",\n      \"  Assertion failed, The statement promotes sin and debauchery, which can be considered objectionable. [type=assertion_error, input_value='The meaning of life, acc... of sin and debauchery.', input_type=str]\\n\",\n      \"    For further information visit https://errors.pydantic.dev/2.5/v/assertion_error\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"from instructor import llm_validator\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"NotEvilAnswer = Annotated[\\n\",\n    \"    str,\\n\",\n    \"    AfterValidator(llm_validator(\\\"don't say objectionable things\\\", client=client)),\\n\",\n    \"]\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class QuestionAnswer(BaseModel):\\n\",\n    \"    question: str\\n\",\n    \"    answer: NotEvilAnswer\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"resp = client.create(\\n\",\n    \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n    \"    response_model=QuestionAnswer,\\n\",\n    \"    max_retries=2,\\n\",\n    \"    messages=[\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"system\\\",\\n\",\n    \"            \\\"content\\\": \\\"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\\\",\\n\",\n    \"        },\\n\",\n    \"        {\\n\",\n    \"            \\\"role\\\": \\\"user\\\",\\n\",\n    \"            \\\"content\\\": f\\\"using the context: `{context}`\\\\n\\\\nAnswer the following question: `{question}`\\\",\\n\",\n    \"        },\\n\",\n    \"    ],\\n\",\n    \")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 21,\n   \"id\": \"814d3554\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"{\\n\",\n      \"  \\\"question\\\": \\\"What is the meaning of life?\\\",\\n\",\n      \"  \\\"answer\\\": \\\"The meaning of life is subjective and can vary depending on one's beliefs and perspectives. According to the devil, it is a life of sin and debauchery. However, this viewpoint may not be universally accepted and should be evaluated critically.\\\"\\n\",\n      \"}\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"print(resp.model_dump_json(indent=2))\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.6\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 5\n}\n"
  },
  {
    "path": "docs/tutorials/4-validation.ipynb",
    "content": "{\n  \"cells\": [\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"# Validators\"\n      ],\n      \"id\": \"5a01f3ac-5306-4a1b-9e47-a5d254bce93a\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Instead of framing \\\"self-critique\\\" or \\\"self-reflection\\\" in AI as new concepts, we can view them as validation errors with clear error messages that the system can use to self correct.\\n\",\n        \"\\n\",\n        \"Pydantic offers an customizable and expressive validation framework for Python. Instructor leverages Pydantic's validation framework to provide a uniform developer experience for both code-based and LLM-based validation, as well as a reasking mechanism for correcting LLM outputs based on validation errors. To learn more check out the Pydantic [docs](https://docs.pydantic.dev/latest/) on validators.\\n\",\n        \"\\n\",\n        \"Note: For the majority of this notebook we won't be calling openai, just using validators to see how we can control the validation of the objects.\"\n      ],\n      \"id\": \"9dcc78ac-ed6d-49e3-b71b-fb2fb25f16a8\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Validators will enable us to control outputs by defining a function like so:\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"```python\\n\",\n        \"def validation_function(value):\\n\",\n        \"    if condition(value):\\n\",\n        \"        raise ValueError(\\\"Value is not valid\\\")\\n\",\n        \"    return mutation(value)\\n\",\n        \"```\\n\",\n        \"\\n\",\n        \"Before we get started lets go over the general shape of a validator:\"\n      ],\n      \"id\": \"064c286b\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from pydantic import BaseModel\\n\",\n        \"from typing import Annotated\\n\",\n        \"from pydantic import AfterValidator\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"def name_must_contain_space(v: str) -> str:\\n\",\n        \"    if \\\" \\\" not in v:\\n\",\n        \"        raise ValueError(\\\"Name must contain a space.\\\")\\n\",\n        \"    return v.lower()\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class UserDetail(BaseModel):\\n\",\n        \"    age: int\\n\",\n        \"    name: Annotated[str, AfterValidator(name_must_contain_space)]\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"person = UserDetail(age=29, name=\\\"Jason\\\")\"\n      ],\n      \"execution_count\": 61,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for UserDetail\\nname\\n  Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"\\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 4\\u001b[0m line \\u001b[0;36m1\\n\\u001b[1;32m     <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#W3sZmlsZQ%3D%3D?line=10'>11</a>\\u001b[0m     age: \\u001b[39mint\\u001b[39m\\n\\u001b[1;32m     <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#W3sZmlsZQ%3D%3D?line=11'>12</a>\\u001b[0m     name: Annotated[\\u001b[39mstr\\u001b[39m, AfterValidator(name_must_contain_space)]\\n\\u001b[0;32m---> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#W3sZmlsZQ%3D%3D?line=13'>14</a>\\u001b[0m person \\u001b[39m=\\u001b[39m UserDetail(age\\u001b[39m=\\u001b[39;49m\\u001b[39m29\\u001b[39;49m, name\\u001b[39m=\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mJason\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m)\\n\",\n            \"File \\u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[39m=\\u001b[39m \\u001b[39mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m __pydantic_self__\\u001b[39m.\\u001b[39;49m__pydantic_validator__\\u001b[39m.\\u001b[39;49mvalidate_python(data, self_instance\\u001b[39m=\\u001b[39;49m__pydantic_self__)\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for UserDetail\\nname\\n  Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\"\n          ]\n        }\n      ],\n      \"id\": \"d4bb6258-b03a-4621-8a73-29056a20ec0f\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"**Validation Applications**\\n\",\n        \"\\n\",\n        \"Validators are essential in tackling the unpredictabile nature of LLMs.\\n\",\n        \"\\n\",\n        \"Straightforward examples include:\\n\",\n        \"\\n\",\n        \"* Flagging outputs containing blacklisted words.\\n\",\n        \"* Identifying outputs with tones like racism or violence.\\n\",\n        \"\\n\",\n        \"For more complex tasks:\\n\",\n        \"\\n\",\n        \"* Ensuring citations directly come from provided content.\\n\",\n        \"* Checking that the model's responses align with given context.\\n\",\n        \"* Validating the syntax of SQL queries before execution.\"\n      ],\n      \"id\": \"417fafe5-4616-4372-b9e9-78e89afff536\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Setup and Dependencies\"\n      ],\n      \"id\": \"1bd2104b-7eed-4619-a47d-c3d197f9d483\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Using the [instructor](https://github.com/jxnl/instructor) library, we streamline the integration of these validators. `instructor` manages the parsing and validation of outputs and automates retries for compliant responses. This simplifies the process for developers to implement new validation logic, minimizing extra overhead.\"\n      ],\n      \"id\": \"e94449ab-50a9-4325-972c-f64fcdadee00\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"To use instructor in our api calls, we just need to patch the openai client:\"\n      ],\n      \"id\": \"a7a84adc\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"import instructor\\n\",\n        \"from openai import OpenAI\\n\",\n        \"\\n\",\n        \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\"\n      ],\n      \"execution_count\": 5,\n      \"outputs\": [],\n      \"id\": \"1aa2c503-82f8-4735-aae3-373b55fb1064\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Software 2.0: Rule-based validators\"\n      ],\n      \"id\": \"45cd244f-d59c-4431-be2d-aa356a6fefa0\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Deterministic validation, characterized by its rule-based logic, ensures consistent outcomes for the same input. Let's explore how we can apply this concept through some examples.\"\n      ],\n      \"id\": \"3494e664-c5b3-42ea-9c19-aa301a041bdb\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Flagging bad keywords\"\n      ],\n      \"id\": \"717ecefd-0355-4ba4-a642-95d281b0f075\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"To begin with, we aim to prevent engagement in topics involving explicit violence.\"\n      ],\n      \"id\": \"3a15013e-42f3-4d3b-b395-d6edbdec34e5\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"We will define a blacklist of violent words that cannot be mentioned in any messages:\"\n      ],\n      \"id\": \"13d61a81\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"blacklist = {\\n\",\n        \"    \\\"rob\\\",\\n\",\n        \"    \\\"steal\\\",\\n\",\n        \"    \\\"hurt\\\",\\n\",\n        \"    \\\"kill\\\",\\n\",\n        \"    \\\"attack\\\",\\n\",\n        \"}\"\n      ],\n      \"execution_count\": 63,\n      \"outputs\": [],\n      \"id\": \"59330d7d-082a-4240-98c4-eaee18f02728\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"To validate if the message contains a blacklisted word we will use a [field_validator](https://python.useinstructor.com/blog/2023/10/23/good-llm-validation-is-just-good-validation/#using-field_validator-decorator) over the 'message' field:\"\n      ],\n      \"id\": \"7ce06bbf\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from pydantic import BaseModel, field_validator\\n\",\n        \"from pydantic.fields import Field\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Response(BaseModel):\\n\",\n        \"    message: str\\n\",\n        \"\\n\",\n        \"    @field_validator(\\\"message\\\")\\n\",\n        \"    def message_cannot_have_blacklisted_words(cls, v: str) -> str:\\n\",\n        \"        for word in v.split():\\n\",\n        \"            if word.lower() in blacklist:\\n\",\n        \"                raise ValueError(f\\\"`{word}` was found in the message `{v}`\\\")\\n\",\n        \"        return v\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"Response(message=\\\"I will hurt him\\\")\"\n      ],\n      \"execution_count\": 64,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for Response\\nmessage\\n  Value error, `hurt` was found in the message `I will hurt him` [type=value_error, input_value='I will hurt him', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"\\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 17\\u001b[0m line \\u001b[0;36m1\\n\\u001b[1;32m     <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X23sZmlsZQ%3D%3D?line=10'>11</a>\\u001b[0m                 \\u001b[39mraise\\u001b[39;00m \\u001b[39mValueError\\u001b[39;00m(\\u001b[39mf\\u001b[39m\\u001b[39m\\\"\\u001b[39m\\u001b[39m`\\u001b[39m\\u001b[39m{\\u001b[39;00mword\\u001b[39m}\\u001b[39;00m\\u001b[39m` was found in the message `\\u001b[39m\\u001b[39m{\\u001b[39;00mv\\u001b[39m}\\u001b[39;00m\\u001b[39m`\\u001b[39m\\u001b[39m\\\"\\u001b[39m)\\n\\u001b[1;32m     <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X23sZmlsZQ%3D%3D?line=11'>12</a>\\u001b[0m         \\u001b[39mreturn\\u001b[39;00m v\\n\\u001b[0;32m---> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X23sZmlsZQ%3D%3D?line=13'>14</a>\\u001b[0m Response(message\\u001b[39m=\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mI will hurt him\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m)\\n\",\n            \"File \\u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[39m=\\u001b[39m \\u001b[39mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m __pydantic_self__\\u001b[39m.\\u001b[39;49m__pydantic_validator__\\u001b[39m.\\u001b[39;49mvalidate_python(data, self_instance\\u001b[39m=\\u001b[39;49m__pydantic_self__)\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for Response\\nmessage\\n  Value error, `hurt` was found in the message `I will hurt him` [type=value_error, input_value='I will hurt him', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\"\n          ]\n        }\n      ],\n      \"id\": \"9bb87f47-db98-4f1d-80cb-ad5f39df8793\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Flagging using OpenAI Moderation\"\n      ],\n      \"id\": \"37e3a638-c9c9-44cd-bcd0-ad1a39f448db\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"To enhance our validation measures, we'll extend the scope to flag any answer that contains hateful content, harassment, or similar issues. OpenAI offers a moderation endpoint that addresses these concerns, and it's freely available when using OpenAI models.\"\n      ],\n      \"id\": \"88d0b816-7ec8-42b0-9b91-c9aab382c960\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"With the `instructor` library, this is just one function edit away:\"\n      ],\n      \"id\": \"65f46eb5\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from typing import Annotated\\n\",\n        \"from pydantic.functional_validators import AfterValidator\"\n      ],\n      \"execution_count\": 1,\n      \"outputs\": [],\n      \"id\": \"b2ad8c19-6a94-4e4a-aa3e-dce149e8a479\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from instructor import openai_moderation\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Response(BaseModel):\\n\",\n        \"    message: Annotated[str, AfterValidator(openai_moderation(client=client))]\"\n      ],\n      \"execution_count\": 6,\n      \"outputs\": [],\n      \"id\": \"82521112-5301-4442-acce-82b495bd838f\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Now we have a more comprehensive flagging for violence and we can outsource the moderation of our messages.\"\n      ],\n      \"id\": \"90542190-a4f2-4242-8261-2f0ace323022\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"Response(message=\\\"I want to make them suffer the consequences\\\")\"\n      ],\n      \"execution_count\": 7,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for Response\\nmessage\\n  Value error, `I want to make them suffer the consequences` was flagged for harassment, harassment_threatening, violence, harassment/threatening [type=value_error, input_value='I want to make them suffer the consequences', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.5/v/value_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"Cell \\u001b[0;32mIn[7], line 1\\u001b[0m\\n\\u001b[0;32m----> 1\\u001b[0m \\u001b[43mResponse\\u001b[49m\\u001b[43m(\\u001b[49m\\u001b[43mmessage\\u001b[49m\\u001b[38;5;241;43m=\\u001b[39;49m\\u001b[38;5;124;43m\\\"\\u001b[39;49m\\u001b[38;5;124;43mI want to make them suffer the consequences\\u001b[39;49m\\u001b[38;5;124;43m\\\"\\u001b[39;49m\\u001b[43m)\\u001b[49m\\n\",\n            \"File \\u001b[0;32m~/.virtualenvs/pampa-labs/lib/python3.10/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[38;5;66;03m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[38;5;241m=\\u001b[39m \\u001b[38;5;28;01mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m \\u001b[43m__pydantic_self__\\u001b[49m\\u001b[38;5;241;43m.\\u001b[39;49m\\u001b[43m__pydantic_validator__\\u001b[49m\\u001b[38;5;241;43m.\\u001b[39;49m\\u001b[43mvalidate_python\\u001b[49m\\u001b[43m(\\u001b[49m\\u001b[43mdata\\u001b[49m\\u001b[43m,\\u001b[49m\\u001b[43m \\u001b[49m\\u001b[43mself_instance\\u001b[49m\\u001b[38;5;241;43m=\\u001b[39;49m\\u001b[43m__pydantic_self__\\u001b[49m\\u001b[43m)\\u001b[49m\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for Response\\nmessage\\n  Value error, `I want to make them suffer the consequences` was flagged for harassment, harassment_threatening, violence, harassment/threatening [type=value_error, input_value='I want to make them suffer the consequences', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.5/v/value_error\"\n          ]\n        }\n      ],\n      \"id\": \"54a9de1b-c6e7-4a5f-854c-506083a06a9d\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"And as an extra, we get flagging for other topics like religion, race etc.\"\n      ],\n      \"id\": \"f138f9f8-495a-4a09-96a0-c71d01561855\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"Response(message=\\\"I will mock their religion\\\")\"\n      ],\n      \"execution_count\": 26,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for Response\\nmessage\\n  Value error, `I will mock their religion` was flagged for ['harassment'] [type=value_error, input_value='I will mock their religion', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.5/v/value_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"Cell \\u001b[0;32mIn[26], line 1\\u001b[0m\\n\\u001b[0;32m----> 1\\u001b[0m \\u001b[43mResponse\\u001b[49m\\u001b[43m(\\u001b[49m\\u001b[43mmessage\\u001b[49m\\u001b[38;5;241;43m=\\u001b[39;49m\\u001b[38;5;124;43m\\\"\\u001b[39;49m\\u001b[38;5;124;43mI will mock their religion\\u001b[39;49m\\u001b[38;5;124;43m\\\"\\u001b[39;49m\\u001b[43m)\\u001b[49m\\n\",\n            \"File \\u001b[0;32m~/.virtualenvs/pampa-labs/lib/python3.10/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[38;5;66;03m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[38;5;241m=\\u001b[39m \\u001b[38;5;28;01mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m \\u001b[43m__pydantic_self__\\u001b[49m\\u001b[38;5;241;43m.\\u001b[39;49m\\u001b[43m__pydantic_validator__\\u001b[49m\\u001b[38;5;241;43m.\\u001b[39;49m\\u001b[43mvalidate_python\\u001b[49m\\u001b[43m(\\u001b[49m\\u001b[43mdata\\u001b[49m\\u001b[43m,\\u001b[49m\\u001b[43m \\u001b[49m\\u001b[43mself_instance\\u001b[49m\\u001b[38;5;241;43m=\\u001b[39;49m\\u001b[43m__pydantic_self__\\u001b[49m\\u001b[43m)\\u001b[49m\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for Response\\nmessage\\n  Value error, `I will mock their religion` was flagged for ['harassment'] [type=value_error, input_value='I will mock their religion', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.5/v/value_error\"\n          ]\n        }\n      ],\n      \"id\": \"feb77670-afd7-4947-89f8-a9446f6fb12c\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Filtering very long messages\"\n      ],\n      \"id\": \"886f122b-22c9-440e-99cf-2e594b3df99b\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"In addition to content-based flags, we can also set criteria based on other aspects of the input text. For instance, to maintain user engagement, we might want to prevent the assistant from returning excessively long texts. \\n\",\n        \"\\n\",\n        \"Here, noticed that `Field` has built-in validators for `min_length` and `max_length`. to learn more checkout [Field Constraints](https://docs.pydantic.dev/latest/concepts/fields)\"\n      ],\n      \"id\": \"692b1164-4bd5-4943-b9ab-2edec00d4f7d\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"class AssistantMessage(BaseModel):\\n\",\n        \"    message: str = Field(..., max_length=100)\"\n      ],\n      \"execution_count\": 68,\n      \"outputs\": [],\n      \"id\": \"45ffdbd4-deae-4a46-9637-1b5339904f53\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"AssistantMessage(\\n\",\n        \"    message=\\\"Certainly! Lorem ipsum is a placeholder text commonly used in the printing and typesetting industry. Here's a sample of Lorem ipsum text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam euismod velit vel tellus tempor, non viverra eros iaculis. Sed vel nisl nec mauris bibendum tincidunt. Vestibulum sed libero euismod, eleifend tellus id, laoreet elit. Donec auctor arcu ac mi feugiat, vel lobortis justo efficitur. Fusce vel odio vitae justo varius dignissim. Integer sollicitudin mi a justo bibendum ultrices. Quisque id nisl a lectus venenatis luctus. Please note that Lorem ipsum text is a nonsensical Latin-like text used as a placeholder for content, and it has no specific meaning. It's often used in design and publishing to demonstrate the visual aspects of a document without focusing on the actual content.\\\"\\n\",\n        \")\"\n      ],\n      \"execution_count\": 69,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for AssistantMessage\\nmessage\\n  String should have at most 100 characters [type=string_too_long, input_value=\\\"Certainly! Lorem ipsum i... on the actual content.\\\", input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/string_too_long\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"\\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 29\\u001b[0m line \\u001b[0;36m1\\n\\u001b[0;32m----> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X41sZmlsZQ%3D%3D?line=0'>1</a>\\u001b[0m AssistantMessage(message\\u001b[39m=\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mCertainly! Lorem ipsum is a placeholder text commonly used in the printing and typesetting industry. Here\\u001b[39;49m\\u001b[39m'\\u001b[39;49m\\u001b[39ms a sample of Lorem ipsum text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam euismod velit vel tellus tempor, non viverra eros iaculis. Sed vel nisl nec mauris bibendum tincidunt. Vestibulum sed libero euismod, eleifend tellus id, laoreet elit. Donec auctor arcu ac mi feugiat, vel lobortis justo efficitur. Fusce vel odio vitae justo varius dignissim. Integer sollicitudin mi a justo bibendum ultrices. Quisque id nisl a lectus venenatis luctus. Please note that Lorem ipsum text is a nonsensical Latin-like text used as a placeholder for content, and it has no specific meaning. It\\u001b[39;49m\\u001b[39m'\\u001b[39;49m\\u001b[39ms often used in design and publishing to demonstrate the visual aspects of a document without focusing on the actual content.\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m)\\n\",\n            \"File \\u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[39m=\\u001b[39m \\u001b[39mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m __pydantic_self__\\u001b[39m.\\u001b[39;49m__pydantic_validator__\\u001b[39m.\\u001b[39;49mvalidate_python(data, self_instance\\u001b[39m=\\u001b[39;49m__pydantic_self__)\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for AssistantMessage\\nmessage\\n  String should have at most 100 characters [type=string_too_long, input_value=\\\"Certainly! Lorem ipsum i... on the actual content.\\\", input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/string_too_long\"\n          ]\n        }\n      ],\n      \"id\": \"66430dc5-b78c-45e2-a53b-ddc392b20583\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Avoiding hallucination with citations\"\n      ],\n      \"id\": \"050e72fe-4b13-4002-a1d0-94f7b88b784b\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"When incorporating external knowledge bases, it's crucial to ensure that the agent uses the provided context accurately and doesn't fabricate responses. Validators can be effectively used for this purpose. We can illustrate this with an example where we validate that a provided citation is actually included in the referenced text chunk:\"\n      ],\n      \"id\": \"e3f2869e-c8a3-4b93-82e7-55eb70930900\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from pydantic import ValidationInfo\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class AnswerWithCitation(BaseModel):\\n\",\n        \"    answer: str\\n\",\n        \"    citation: str\\n\",\n        \"\\n\",\n        \"    @field_validator(\\\"citation\\\")\\n\",\n        \"    @classmethod\\n\",\n        \"    def citation_exists(cls, v: str, info: ValidationInfo):\\n\",\n        \"        context = info.context\\n\",\n        \"        if context:\\n\",\n        \"            context = context.get(\\\"text_chunk\\\")\\n\",\n        \"            if v not in context:\\n\",\n        \"                raise ValueError(f\\\"Citation `{v}` not found in text\\\")\\n\",\n        \"        return v\"\n      ],\n      \"execution_count\": 70,\n      \"outputs\": [],\n      \"id\": \"638fc368-5cf7-4ae7-9d3f-efea1b84eec0\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Here we assume that there is a \\\"text_chunk\\\" field that contains the text that the model is supposed to use as context. We then use the `field_validator` decorator to define a validator that checks if the citation is included in the text chunk. If it's not, we raise a `ValueError` with a message that will be returned to the user.\"\n      ],\n      \"id\": \"3064b06b-7f85-40ec-8fe2-4fa2cce36585\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"AnswerWithCitation.model_validate(\\n\",\n        \"    {\\n\",\n        \"        \\\"answer\\\": \\\"Blueberries are packed with protein\\\",\\n\",\n        \"        \\\"citation\\\": \\\"Blueberries contain high levels of protein\\\",\\n\",\n        \"    },\\n\",\n        \"    context={\\\"text_chunk\\\": \\\"Blueberries are very rich in antioxidants\\\"},\\n\",\n        \")\"\n      ],\n      \"execution_count\": 71,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for AnswerWithCitation\\ncitation\\n  Value error, Citation `Blueberries contain high levels of protein` not found in text [type=value_error, input_value='Blueberries contain high levels of protein', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"\\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 34\\u001b[0m line \\u001b[0;36m1\\n\\u001b[0;32m----> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=0'>1</a>\\u001b[0m AnswerWithCitation\\u001b[39m.\\u001b[39;49mmodel_validate(\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=1'>2</a>\\u001b[0m     {\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=2'>3</a>\\u001b[0m         \\u001b[39m\\\"\\u001b[39;49m\\u001b[39manswer\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m: \\u001b[39m\\\"\\u001b[39;49m\\u001b[39mBlueberries are packed with protein\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m, \\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=3'>4</a>\\u001b[0m         \\u001b[39m\\\"\\u001b[39;49m\\u001b[39mcitation\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m: \\u001b[39m\\\"\\u001b[39;49m\\u001b[39mBlueberries contain high levels of protein\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=4'>5</a>\\u001b[0m     },\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=5'>6</a>\\u001b[0m     context\\u001b[39m=\\u001b[39;49m{\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mtext_chunk\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m: \\u001b[39m\\\"\\u001b[39;49m\\u001b[39mBlueberries are very rich in antioxidants\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m}, \\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X50sZmlsZQ%3D%3D?line=6'>7</a>\\u001b[0m )\\n\",\n            \"File \\u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:503\\u001b[0m, in \\u001b[0;36mBaseModel.model_validate\\u001b[0;34m(cls, obj, strict, from_attributes, context)\\u001b[0m\\n\\u001b[1;32m    501\\u001b[0m \\u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    502\\u001b[0m __tracebackhide__ \\u001b[39m=\\u001b[39m \\u001b[39mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 503\\u001b[0m \\u001b[39mreturn\\u001b[39;00m \\u001b[39mcls\\u001b[39;49m\\u001b[39m.\\u001b[39;49m__pydantic_validator__\\u001b[39m.\\u001b[39;49mvalidate_python(\\n\\u001b[1;32m    504\\u001b[0m     obj, strict\\u001b[39m=\\u001b[39;49mstrict, from_attributes\\u001b[39m=\\u001b[39;49mfrom_attributes, context\\u001b[39m=\\u001b[39;49mcontext\\n\\u001b[1;32m    505\\u001b[0m )\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for AnswerWithCitation\\ncitation\\n  Value error, Citation `Blueberries contain high levels of protein` not found in text [type=value_error, input_value='Blueberries contain high levels of protein', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\"\n          ]\n        }\n      ],\n      \"id\": \"0f3030b6-e6cf-45bf-a366-12de996fea40\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Software 3.0: Probabilistic validators\"\n      ],\n      \"id\": \"06e54533-3304-4fa0-9828-9591d5dcdefd\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"For scenarios requiring more nuanced validation than rule-based methods, we use probabilistic validation. This approach incorporates LLMs into the validation workflow for a sophisticated assessment of outputs.\\n\",\n        \"\\n\",\n        \"The `instructor` library offers the `llm_validator` utility for this purpose. By specifying the desired directive, we can use LLMs for complex validation tasks. Let's explore some intriguing use cases enabled by LLMs.\\n\",\n        \"\\n\",\n        \"### Keeping an agent on topic\\n\",\n        \"\\n\",\n        \"When creating an agent focused on health improvement, providing answers and daily practice suggestions, it's crucial to ensure strict adherence to health-related topics. This is important because the knowledge base is limited to health topics, and veering off-topic could result in fabricated responses.\\n\",\n        \"\\n\",\n        \"To achieve this focus, we'll follow a similar process as before, but with an important addition: integrating an LLM into our validator.\"\n      ],\n      \"id\": \"1907df5b-472f-45ac-9181-45235e3cd0c3\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"This LLM will be tasked with determining whether the agent's responses are exclusively related to health topics. For this, we will use the `llm_validator` from `instructor` like so:\"\n      ],\n      \"id\": \"546625ac\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from instructor import llm_validator\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class AssistantMessage(BaseModel):\\n\",\n        \"    message: Annotated[\\n\",\n        \"        str,\\n\",\n        \"        AfterValidator(\\n\",\n        \"            llm_validator(\\n\",\n        \"                \\\"don't talk about any other topic except health best practices and topics\\\",\\n\",\n        \"                client=client,\\n\",\n        \"            )\\n\",\n        \"        ),\\n\",\n        \"    ]\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"AssistantMessage(\\n\",\n        \"    message=\\\"I would suggest you to visit Sicily as they say it is very nice in winter.\\\"\\n\",\n        \")\"\n      ],\n      \"execution_count\": 73,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for AssistantMessage\\nmessage\\n  Assertion failed, The statement is not related to health best practices or topics. [type=assertion_error, input_value='I would suggest you to v...is very nice in winter.', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/assertion_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"\\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 38\\u001b[0m line \\u001b[0;36m1\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X56sZmlsZQ%3D%3D?line=4'>5</a>\\u001b[0m \\u001b[39mclass\\u001b[39;00m \\u001b[39mAssistantMessage\\u001b[39;00m(BaseModel):\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X56sZmlsZQ%3D%3D?line=5'>6</a>\\u001b[0m     message: Annotated[\\u001b[39mstr\\u001b[39m, \\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X56sZmlsZQ%3D%3D?line=6'>7</a>\\u001b[0m                        AfterValidator(\\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X56sZmlsZQ%3D%3D?line=7'>8</a>\\u001b[0m                            llm_validator(\\u001b[39m\\\"\\u001b[39m\\u001b[39mdon\\u001b[39m\\u001b[39m'\\u001b[39m\\u001b[39mt talk about any other topic except health best practices and topics\\u001b[39m\\u001b[39m\\\"\\u001b[39m, \\n\\u001b[1;32m      <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X56sZmlsZQ%3D%3D?line=8'>9</a>\\u001b[0m                                          openai_client\\u001b[39m=\\u001b[39mclient))]\\n\\u001b[0;32m---> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#X56sZmlsZQ%3D%3D?line=10'>11</a>\\u001b[0m AssistantMessage(message\\u001b[39m=\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mI would suggest you to visit Sicily as they say it is very nice in winter.\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m)\\n\",\n            \"File \\u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[39m=\\u001b[39m \\u001b[39mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m __pydantic_self__\\u001b[39m.\\u001b[39;49m__pydantic_validator__\\u001b[39m.\\u001b[39;49mvalidate_python(data, self_instance\\u001b[39m=\\u001b[39;49m__pydantic_self__)\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for AssistantMessage\\nmessage\\n  Assertion failed, The statement is not related to health best practices or topics. [type=assertion_error, input_value='I would suggest you to v...is very nice in winter.', input_type=str]\\n    For further information visit https://errors.pydantic.dev/2.4/v/assertion_error\"\n          ]\n        }\n      ],\n      \"id\": \"8cf00cad-c4c0-49dd-9be5-fb02338a5a7f\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Important that for these examples we're not waiting for the messages, to get this message we would need to call the openai with `response_model=AssistantMessage`.\"\n      ],\n      \"id\": \"1dce5a7a-024e-4742-a124-fe51973df5f2\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Validating agent thinking with CoT\"\n      ],\n      \"id\": \"a6ec4afa-0be7-469e-93c0-5c729a06d4fc\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Using probabilistic validation, we can also assess the agent's reasoning process to ensure it's logical before providing a response. With [chain of thought](https://learnprompting.org/docs/intermediate/chain_of_thought) prompting, the model is expected to think in steps and arrive at an answer following its logical progression. If there are errors in this logic, the final response may be incorrect.\\n\",\n        \"\\n\",\n        \"Here we will use Pydantic's [model_validator](https://docs.pydantic.dev/latest/concepts/validators/#model-validators) which allows us to apply validation over all the properties of the `AIResponse` at once.\\n\",\n        \"\\n\",\n        \"To make this easier we'll make a simple validation class that we can reuse for all our validation:\"\n      ],\n      \"id\": \"424d915b-f332-48f3-a75e-6e1cd6d12075\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from typing import Optional\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Validation(BaseModel):\\n\",\n        \"    is_valid: bool = Field(\\n\",\n        \"        ..., description=\\\"Whether the value is valid based on the rules\\\"\\n\",\n        \"    )\\n\",\n        \"    error_message: Optional[str] = Field(\\n\",\n        \"        ...,\\n\",\n        \"        description=\\\"The error message if the value is not valid, to be used for re-asking the model\\\",\\n\",\n        \"    )\"\n      ],\n      \"execution_count\": 74,\n      \"outputs\": [],\n      \"id\": \"65340b8c-2ea3-4457-a6d4-f0e652c317b4\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"The function we will call will integrate an LLM and will ask it to determine whether the answer the model provided follows from the chain of thought: \"\n      ],\n      \"id\": \"de2104f1\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"def validate_chain_of_thought(values):\\n\",\n        \"    chain_of_thought = values[\\\"chain_of_thought\\\"]\\n\",\n        \"    answer = values[\\\"answer\\\"]\\n\",\n        \"    resp = client.create(\\n\",\n        \"        model=\\\"gpt-4-1106-preview\\\",\\n\",\n        \"        messages=[\\n\",\n        \"            {\\n\",\n        \"                \\\"role\\\": \\\"system\\\",\\n\",\n        \"                \\\"content\\\": \\\"You are a validator. Determine if the value follows from the statement. If it is not, explain why.\\\",\\n\",\n        \"            },\\n\",\n        \"            {\\n\",\n        \"                \\\"role\\\": \\\"user\\\",\\n\",\n        \"                \\\"content\\\": f\\\"Verify that `{answer}` follows the chain of thought: {chain_of_thought}\\\",\\n\",\n        \"            },\\n\",\n        \"        ],\\n\",\n        \"        response_model=Validation,\\n\",\n        \"    )\\n\",\n        \"    if not resp.is_valid:\\n\",\n        \"        raise ValueError(resp.error_message)\\n\",\n        \"    return values\"\n      ],\n      \"execution_count\": 75,\n      \"outputs\": [],\n      \"id\": \"e9ab3804-6962-4a48-83da-1f8360d8379a\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"The use of the 'before' argument in this context is significant. It means that the validator will receive the complete dictionary of inputs in their raw form, before any parsing by Pydantic.\"\n      ],\n      \"id\": \"b79b94cf-15c2-432b-b0d5-aad0c2997f91\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from typing import Any\\n\",\n        \"from pydantic import model_validator\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class AIResponse(BaseModel):\\n\",\n        \"    chain_of_thought: str\\n\",\n        \"    answer: str\\n\",\n        \"\\n\",\n        \"    @model_validator(mode=\\\"before\\\")\\n\",\n        \"    @classmethod\\n\",\n        \"    def chain_of_thought_makes_sense(cls, data: Any) -> Any:\\n\",\n        \"        # here we assume data is the dict representation of the model\\n\",\n        \"        # since we use 'before' mode.\\n\",\n        \"        return validate_chain_of_thought(data)\"\n      ],\n      \"execution_count\": 76,\n      \"outputs\": [],\n      \"id\": \"fbc9887a-df0d-4a4b-9ef5-ea450701d85b\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"AIResponse(\\n\",\n        \"    chain_of_thought=\\\"The user suffers from diabetes.\\\",\\n\",\n        \"    answer=\\\"The user has a broken leg.\\\",\\n\",\n        \")\"\n      ],\n      \"execution_count\": 77,\n      \"outputs\": [\n        {\n          \"ename\": \"ValidationError\",\n          \"evalue\": \"1 validation error for AIResponse\\n  Value error, The statement about the user having a broken leg does not logically follow from the information provided about the user suffering from diabetes. These are two separate health conditions and one does not imply the other. [type=value_error, input_value={'chain_of_thought': 'The...user has a broken leg.'}, input_type=dict]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\",\n          \"output_type\": \"error\",\n          \"traceback\": [\n            \"\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m\",\n            \"\\u001b[0;31mValidationError\\u001b[0m                           Traceback (most recent call last)\",\n            \"\\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 47\\u001b[0m line \\u001b[0;36m1\\n\\u001b[0;32m----> <a href='vscode-notebook-cell:/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb#Y103sZmlsZQ%3D%3D?line=0'>1</a>\\u001b[0m AIResponse(chain_of_thought\\u001b[39m=\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mThe user suffers from diabetes.\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m, answer\\u001b[39m=\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m\\u001b[39mThe user has a broken leg.\\u001b[39;49m\\u001b[39m\\\"\\u001b[39;49m)\\n\",\n            \"File \\u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\\u001b[0m, in \\u001b[0;36mBaseModel.__init__\\u001b[0;34m(__pydantic_self__, **data)\\u001b[0m\\n\\u001b[1;32m    162\\u001b[0m \\u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\\u001b[39;00m\\n\\u001b[1;32m    163\\u001b[0m __tracebackhide__ \\u001b[39m=\\u001b[39m \\u001b[39mTrue\\u001b[39;00m\\n\\u001b[0;32m--> 164\\u001b[0m __pydantic_self__\\u001b[39m.\\u001b[39;49m__pydantic_validator__\\u001b[39m.\\u001b[39;49mvalidate_python(data, self_instance\\u001b[39m=\\u001b[39;49m__pydantic_self__)\\n\",\n            \"\\u001b[0;31mValidationError\\u001b[0m: 1 validation error for AIResponse\\n  Value error, The statement about the user having a broken leg does not logically follow from the information provided about the user suffering from diabetes. These are two separate health conditions and one does not imply the other. [type=value_error, input_value={'chain_of_thought': 'The...user has a broken leg.'}, input_type=dict]\\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\"\n          ]\n        }\n      ],\n      \"id\": \"a38f2b28-f5b9-4a44-bfe5-9735726ec57d\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Reasking with validators\\n\",\n        \"\\n\",\n        \"For most of these examples all we've done we've mostly only defined the validation logic.\\n\",\n        \"\\n\",\n        \"We'eve covered field validators and model validators and even used LLMs to validate our outputs. But we haven't actually used the validators to reask the model! One of the most powerful features of `instructor` is that it will automatically reask the model when it receives a validation error. This means that we can use the same validation logic for both code-based and LLM-based validation.\\n\",\n        \"\\n\",\n        \"This also means that our 'prompt' is not only the prompt we send, but the code that runs the validator, and the error message we send back to the model.\"\n      ],\n      \"id\": \"5bbbaa11-32d2-4772-bc31-18d1d6d6c919\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Integrating these validation examples with the OpenAI API is streamlined using `instructor`. After patching the OpenAI client with `instructor`, you simply need to specify a `response_model` for your requests. This setup ensures that all the validation processes occur automatically.\\n\",\n        \"\\n\",\n        \"To enable reasking you can set a maximum number of retries. When calling the OpenAI client, the system can re-attempt to generate a correct answer. It does this by resending the original query along with feedback on why the previous response was rejected, guiding the LLM towards a more accurate answer in subsequent attempts.\"\n      ],\n      \"id\": \"39e642d9-0d20-4231-a694-baa0ea03f147\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"class QuestionAnswer(BaseModel):\\n\",\n        \"    question: str\\n\",\n        \"    answer: str\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"question = \\\"What is the meaning of life?\\\"\\n\",\n        \"context = (\\n\",\n        \"    \\\"The according to the devil the meaning of life is a life of sin and debauchery.\\\"\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"resp = client.create(\\n\",\n        \"    model=\\\"gpt-4-1106-preview\\\",\\n\",\n        \"    response_model=QuestionAnswer,\\n\",\n        \"    messages=[\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"system\\\",\\n\",\n        \"            \\\"content\\\": \\\"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\\\",\\n\",\n        \"        },\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"user\\\",\\n\",\n        \"            \\\"content\\\": f\\\"using the context: `{context}`\\\\n\\\\nAnswer the following question: `{question}`\\\",\\n\",\n        \"        },\\n\",\n        \"    ],\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"resp.answer\"\n      ],\n      \"execution_count\": 79,\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"'a life of sin and debauchery'\"\n            ]\n          },\n          \"execution_count\": 79,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"id\": \"97f544e7-2552-465c-89a9-a4820f00d658\"\n    },\n    {\n      \"cell_type\": \"code\",\n      \"metadata\": {},\n      \"source\": [\n        \"from pydantic import BeforeValidator\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class QuestionAnswer(BaseModel):\\n\",\n        \"    question: str\\n\",\n        \"    answer: Annotated[\\n\",\n        \"        str,\\n\",\n        \"        BeforeValidator(llm_validator(\\\"don't say objectionable things\\\", client=client)),\\n\",\n        \"    ]\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"resp = client.create(\\n\",\n        \"    model=\\\"gpt-3.5-turbo\\\",\\n\",\n        \"    response_model=QuestionAnswer,\\n\",\n        \"    max_retries=2,\\n\",\n        \"    messages=[\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"system\\\",\\n\",\n        \"            \\\"content\\\": \\\"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\\\",\\n\",\n        \"        },\\n\",\n        \"        {\\n\",\n        \"            \\\"role\\\": \\\"user\\\",\\n\",\n        \"            \\\"content\\\": f\\\"using the context: `{context}`\\\\n\\\\nAnswer the following question: `{question}`\\\",\\n\",\n        \"        },\\n\",\n        \"    ],\\n\",\n        \")\\n\",\n        \"\\n\",\n        \"resp.answer\"\n      ],\n      \"execution_count\": 80,\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"'The meaning of life is a concept that varies depending on individual perspectives and beliefs.'\"\n            ]\n          },\n          \"execution_count\": 80,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"id\": \"0328bbc5\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"# Conclusion\"\n      ],\n      \"id\": \"a0c07b8b-ba6d-4e5d-a26c-ba72ca7d4f22\"\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"This guide explains how to use deterministic and probabilistic validation techniques with Large Language Models (LLMs). We discussed using an instructor to establish validation processes for content filtering, context relevance maintenance, and model reasoning verification. These methods enhance the performance of LLMs across different tasks.\\n\",\n        \"\\n\",\n        \"For those interested in further exploration, here's a to-do list:\\n\",\n        \"\\n\",\n        \"1. **SQL Syntax Checker**: Create a validator to check the syntax of SQL queries before executing them.\\n\",\n        \"2. **Context-Based Response Validation**: Design a method to flag responses based on the model's own knowledge rather than the provided context.\\n\",\n        \"3. **PII Detection**: Implement a mechanism to identify and handle Personally Identifiable Information in responses while prioritizing user privacy.\\n\",\n        \"4. **Targeted Rule-Based Filtering**: Develop filters to remove specific content types, such as responses mentioning named entities.\\n\",\n        \"\\n\",\n        \"Completing these tasks will enable users to acquire practical skills in improving LLMs through advanced validation methods.\"\n      ],\n      \"id\": \"344c623a-9b3b-4134-92d4-ad4eb9bb5f9e\"\n    }\n  ],\n  \"metadata\": {\n    \"kernelspec\": {\n      \"display_name\": \"Python 3 (ipykernel)\",\n      \"language\": \"python\",\n      \"name\": \"python3\"\n    },\n    \"language_info\": {\n      \"codemirror_mode\": {\n        \"name\": \"ipython\",\n        \"version\": 3\n      },\n      \"file_extension\": \".py\",\n      \"mimetype\": \"text/x-python\",\n      \"name\": \"python\",\n      \"nbconvert_exporter\": \"python\",\n      \"pygments_lexer\": \"ipython3\",\n      \"version\": \"3.11.6\"\n    }\n  },\n  \"nbformat\": 4,\n  \"nbformat_minor\": 5\n}"
  },
  {
    "path": "docs/tutorials/5-knowledge-graphs.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Knowledge Graphs for Complex Topics\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Introduction\\n\",\n    \"\\n\",\n    \"**What is a knowledge graph?**\\n\",\n    \"\\n\",\n    \"A knowledge graph, also known as a semantic network, represents real-world entities and their relationships. It consists of nodes, edges, and labels. Nodes can represent any entity, while edges define the connections between them. For example, a node representing an author like \\\"J.K. Rowling\\\" can be connected to another node representing one of her books, \\\"Harry Potter\\\", with the edge \\\"author of\\\".\\n\",\n    \"\\n\",\n    \"**Applications of knowledge graphs**\\n\",\n    \"\\n\",\n    \"Knowledge graphs have various applications, including:\\n\",\n    \"\\n\",\n    \"-  Search Engines: They enhance search results by incorporating semantic-search information from diverse sources.\\n\",\n    \"-  Recommendation Systems: They suggest products or services based on user behavior and preferences.\\n\",\n    \"-  Natural Language Processing: They aid in understanding and generating human language.\\n\",\n    \"-  Data Integration: They facilitate the integration of data from different sources by identifying relationships.\\n\",\n    \"-  Artificial Intelligence and Machine Learning: They provide contextual information to improve decision-making.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"----\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Setup and Dependencies\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Today, we're going to use the [`instructor`](https://github.com/jxnl/instructor) library to simplify the interaction between OpenAI and our code. Along with [Graphviz](https://graphviz.org) library to bring structure to our intricate subjects and have a graph visualization.\\n\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"import instructor\\n\",\n    \"from openai import OpenAI\\n\",\n    \"\\n\",\n    \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"Install the Graphviz based on your operation system https://graphviz.org/download/\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Node and Edge Classes\\n\",\n    \"\\n\",\n    \"We begin by modeling our knowledge graph with Node and Edge objects.\\n\",\n    \"\\n\",\n    \"Node objects represent key concepts or entities, while Edge objects signify the relationships between them.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from pydantic import BaseModel, Field\\n\",\n    \"from typing import Optional\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Node(BaseModel):\\n\",\n    \"    id: int\\n\",\n    \"    label: str\\n\",\n    \"    color: str\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Edge(BaseModel):\\n\",\n    \"    source: int\\n\",\n    \"    target: int\\n\",\n    \"    label: str\\n\",\n    \"    color: str = \\\"black\\\"\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## `KnowledgeGraph` Class\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"The `KnowledgeGraph` class combines nodes and edges to create a comprehensive graph structure. It includes lists of nodes and edges, where each node represents a key concept or entity, and each edge represents a relationship between two nodes.\\n\",\n    \"\\n\",\n    \"Later on, you'll see that we designed this class to match the graph object in the graphviz library, which makes it easier to visualize our graph.\\n\",\n    \"\\n\",\n    \"The `visualize_knowledge_graph` function is used to visualize a knowledge graph. It takes a `KnowledgeGraph` object as input, which contains nodes and edges. The function utilizes the `graphviz` library to generate a directed graph (`Digraph`). Each node and edge from the `KnowledgeGraph` is added to the `Digraph` with their respective attributes (id, label, color). Finally, the graph is rendered and displayed.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from graphviz import Digraph\\n\",\n    \"from IPython.display import display\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class KnowledgeGraph(BaseModel):\\n\",\n    \"    nodes: list[Node] = Field(\\n\",\n    \"        ..., default_factory=list\\n\",\n    \"    )  # A list of nodes in the knowledge graph.\\n\",\n    \"    edges: list[Edge] = Field(\\n\",\n    \"        ..., default_factory=list\\n\",\n    \"    )  # A list of edges in the knowledge graph.\\n\",\n    \"\\n\",\n    \"    def visualize_knowledge_graph(self):\\n\",\n    \"        dot = Digraph(comment=\\\"Knowledge Graph\\\")\\n\",\n    \"\\n\",\n    \"        for node in self.nodes:\\n\",\n    \"            dot.node(name=str(node.id), label=node.label, color=node.color)\\n\",\n    \"        for edge in self.edges:\\n\",\n    \"            dot.edge(\\n\",\n    \"                str(edge.source), str(edge.target), label=edge.label, color=edge.color\\n\",\n    \"            )\\n\",\n    \"\\n\",\n    \"        return display(dot)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Generating the Knowledge Graph\\n\",\n    \"\\n\",\n    \"### generate_graph function\\n\",\n    \"\\n\",\n    \"The ``generate_graph`` function uses OpenAI's model to create a KnowledgeGraph object from an input string.\\n\",\n    \"\\n\",\n    \"It requests the model to interpret the input as a detailed knowledge graph and uses the response to form the KnowledgeGraph object.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"def generate_graph(input) -> KnowledgeGraph:\\n\",\n    \"    return client.create(\\n\",\n    \"        model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"        messages=[\\n\",\n    \"            {\\n\",\n    \"                \\\"role\\\": \\\"user\\\",\\n\",\n    \"                \\\"content\\\": f\\\"Help me understand the following by describing it as small knowledge graph: {input}\\\",\\n\",\n    \"            }\\n\",\n    \"        ],\\n\",\n    \"        response_model=KnowledgeGraph,\\n\",\n    \"    )\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"image/svg+xml\": [\n       \"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\" standalone=\\\"no\\\"?>\\n\",\n       \"<!DOCTYPE svg PUBLIC \\\"-//W3C//DTD SVG 1.1//EN\\\"\\n\",\n       \" \\\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\\\">\\n\",\n       \"<!-- Generated by graphviz version 9.0.0 (20230911.1827)\\n\",\n       \" -->\\n\",\n       \"<!-- Pages: 1 -->\\n\",\n       \"<svg width=\\\"1303pt\\\" height=\\\"133pt\\\"\\n\",\n       \" viewBox=\\\"0.00 0.00 1303.11 132.50\\\" xmlns=\\\"http://www.w3.org/2000/svg\\\" xmlns:xlink=\\\"http://www.w3.org/1999/xlink\\\">\\n\",\n       \"<g id=\\\"graph0\\\" class=\\\"graph\\\" transform=\\\"scale(1 1) rotate(0) translate(4 128.5)\\\">\\n\",\n       \"<polygon fill=\\\"white\\\" stroke=\\\"none\\\" points=\\\"-4,4 -4,-128.5 1299.11,-128.5 1299.11,4 -4,4\\\"/>\\n\",\n       \"<!-- 1 -->\\n\",\n       \"<g id=\\\"node1\\\" class=\\\"node\\\">\\n\",\n       \"<title>1</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"blue\\\" cx=\\\"633.01\\\" cy=\\\"-106.5\\\" rx=\\\"88.71\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"633.01\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Mechanics</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 2 -->\\n\",\n       \"<g id=\\\"node2\\\" class=\\\"node\\\">\\n\",\n       \"<title>2</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"80.01\\\" cy=\\\"-18\\\" rx=\\\"80.01\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"80.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Particles</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;2 -->\\n\",\n       \"<g id=\\\"edge1\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;2</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M558.22,-96.54C504.74,-89.92 431.09,-80.38 366.51,-70.5 278.43,-57.03 256.67,-52.03 169.01,-36 163.02,-34.9 156.8,-33.75 150.56,-32.58\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"151.31,-29.16 140.84,-30.75 150.02,-36.04 151.31,-29.16\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"385.76\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">studies</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 3 -->\\n\",\n       \"<g id=\\\"node3\\\" class=\\\"node\\\">\\n\",\n       \"<title>3</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"272.01\\\" cy=\\\"-18\\\" rx=\\\"93.83\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"272.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Wave&#45;Particle Duality</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;3 -->\\n\",\n       \"<g id=\\\"edge2\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;3</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M573.16,-92.84C543.25,-86.39 506.54,-78.28 473.76,-70.5 427.73,-59.57 376.03,-46.35 336.49,-36.05\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"337.44,-32.68 326.88,-33.54 335.67,-39.45 337.44,-32.68\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"499.14\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">describes</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4 -->\\n\",\n       \"<g id=\\\"node4\\\" class=\\\"node\\\">\\n\",\n       \"<title>4</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"454.01\\\" cy=\\\"-18\\\" rx=\\\"70.29\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"454.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum States</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge3\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M599.76,-89.43C570.51,-75.3 527.79,-54.65 496.15,-39.36\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"497.77,-36.26 487.24,-35.06 494.72,-42.56 497.77,-36.26\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"582.89\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">involves</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 5 -->\\n\",\n       \"<g id=\\\"node5\\\" class=\\\"node\\\">\\n\",\n       \"<title>5</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"633.01\\\" cy=\\\"-18\\\" rx=\\\"90.25\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"633.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Uncertainty Principle</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;5 -->\\n\",\n       \"<g id=\\\"edge4\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;5</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M633.01,-88.41C633.01,-76.76 633.01,-61.05 633.01,-47.52\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"636.51,-47.86 633.01,-37.86 629.51,-47.86 636.51,-47.86\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"661.14\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">introduces</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6 -->\\n\",\n       \"<g id=\\\"node6\\\" class=\\\"node\\\">\\n\",\n       \"<title>6</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"838.01\\\" cy=\\\"-18\\\" rx=\\\"96.9\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"838.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Schrodinger&#39;s Equation</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;6 -->\\n\",\n       \"<g id=\\\"edge5\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;6</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M670.14,-89.84C703.64,-75.7 753.13,-54.82 789.7,-39.39\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"790.88,-42.69 798.73,-35.58 788.16,-36.24 790.88,-42.69\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"782.89\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">defined by</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 7 -->\\n\",\n       \"<g id=\\\"node7\\\" class=\\\"node\\\">\\n\",\n       \"<title>7</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"1053.01\\\" cy=\\\"-18\\\" rx=\\\"99.97\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"1053.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Entanglement</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;7 -->\\n\",\n       \"<g id=\\\"edge6\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;7</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M697.03,-93.66C732.19,-87.06 776.56,-78.56 816.01,-70.5 871.58,-59.15 934.29,-45.49 981.24,-35.09\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"981.73,-38.56 990.74,-32.98 980.22,-31.73 981.73,-38.56\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"916.39\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">predicts</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 8 -->\\n\",\n       \"<g id=\\\"node8\\\" class=\\\"node\\\">\\n\",\n       \"<title>8</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"1233.01\\\" cy=\\\"-18\\\" rx=\\\"62.1\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"1233.01\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Superposition</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;8 -->\\n\",\n       \"<g id=\\\"edge7\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;8</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M712.85,-98.26C816.88,-88.08 1004.13,-67.23 1162.01,-36 1166.63,-35.09 1171.4,-34.08 1176.18,-33.03\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"1176.84,-36.47 1185.82,-30.84 1175.3,-29.64 1176.84,-36.47\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"1075.14\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">introduces</text>\\n\",\n       \"</g>\\n\",\n       \"</g>\\n\",\n       \"</svg>\\n\"\n      ],\n      \"text/plain\": [\n       \"<graphviz.graphs.Digraph at 0x106e7f650>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"generate_graph(\\\"Explain quantum mechanics\\\").visualize_knowledge_graph()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Advanced: Accumulating Knowledge Graphs\\n\",\n    \"\\n\",\n    \"When dealing with larger datasets, or knowledge that grows over time, processing them all at once can be challenging due to limitations in prompt length or the complexity of the content. In such cases, an iterative approach to building the knowledge graph can be beneficial. This method involves processing the text in smaller, manageable chunks and updating the graph with new information from each chunk.\\n\",\n    \"\\n\",\n    \"### What are the benefits of this approach?\\n\",\n    \"\\n\",\n    \"-  Scalability: This approach can handle large datasets by breaking them down into smaller, more manageable pieces.\\n\",\n    \"\\n\",\n    \"-  Flexibility: It allows for dynamic updates to the graph, accommodating new information as it becomes available.\\n\",\n    \"\\n\",\n    \"-  Efficiency: Processing smaller chunks of text can be more efficient and less prone to errors or omissions.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### What has changed?\\n\",\n    \"\\n\",\n    \"The previous example provided a basic structure, while this new example introduces additional complexity and functionality. The Node and Edge classes now have a __hash__ method, allowing them to be used in sets and simplifying duplicate handling.\\n\",\n    \"\\n\",\n    \"The KnowledgeGraph class has been enhanced with two new methods: ``update`` and ``draw``.\\n\",\n    \"\\n\",\n    \"In the KnowledgeGraph class, the nodes and edges fields are now optional, offering greater flexibility.\\n\",\n    \"\\n\",\n    \"The ``update`` method enables the merging and removal of duplicates from two graphs.\\n\",\n    \"\\n\",\n    \"The ``draw`` method includes a prefix parameter, making it easier to create different graph versions during iterations.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"class Node(BaseModel):\\n\",\n    \"    id: int\\n\",\n    \"    label: str\\n\",\n    \"    color: str\\n\",\n    \"\\n\",\n    \"    def __hash__(self) -> int:\\n\",\n    \"        return hash((id, self.label))\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"class Edge(BaseModel):\\n\",\n    \"    source: int\\n\",\n    \"    target: int\\n\",\n    \"    label: str\\n\",\n    \"    color: str = \\\"black\\\"\\n\",\n    \"\\n\",\n    \"    def __hash__(self) -> int:\\n\",\n    \"        return hash((self.source, self.target, self.label))\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"class KnowledgeGraph(BaseModel):\\n\",\n    \"    # Optional list of nodes and edges in the knowledge graph\\n\",\n    \"    nodes: Optional[list[Node]] = Field(..., default_factory=list)\\n\",\n    \"    edges: Optional[list[Edge]] = Field(..., default_factory=list)\\n\",\n    \"\\n\",\n    \"    def update(self, other: \\\"KnowledgeGraph\\\") -> \\\"KnowledgeGraph\\\":\\n\",\n    \"        # This method updates the current graph with the other graph, deduplicating nodes and edges.\\n\",\n    \"        return KnowledgeGraph(\\n\",\n    \"            nodes=list(set(self.nodes + other.nodes)),  # Combine and deduplicate nodes\\n\",\n    \"            edges=list(set(self.edges + other.edges)),  # Combine and deduplicate edges\\n\",\n    \"        )\\n\",\n    \"\\n\",\n    \"    def visualize_knowledge_graph(self):\\n\",\n    \"        dot = Digraph(comment=\\\"Knowledge Graph\\\")\\n\",\n    \"\\n\",\n    \"        for node in self.nodes:\\n\",\n    \"            dot.node(str(node.id), node.label, color=node.color)\\n\",\n    \"        for edge in self.edges:\\n\",\n    \"            dot.edge(\\n\",\n    \"                str(edge.source), str(edge.target), label=edge.label, color=edge.color\\n\",\n    \"            )\\n\",\n    \"\\n\",\n    \"        return display(dot)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": []\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Generate iterative graphs\\n\",\n    \"\\n\",\n    \"The updated `generate_graph` function is specifically designed to handle a list of inputs iteratively. It updates the graph with each new piece of information.\\n\",\n    \"\\n\",\n    \"Upon closer inspection, this pattern resembles a common programming technique known as a \\\"reduce\\\" or \\\"fold\\\" function. A simple example of this would be iterating over a list to find the sum of all the elements squared.\\n\",\n    \"\\n\",\n    \"Here's an example in Python:\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"cur_state = 0\\n\",\n    \"for i in [1, 2, 3, 4, 5]:\\n\",\n    \"    cur_state += i**2\\n\",\n    \"print(cur_state)\\n\",\n    \"```\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"def generate_graph(input: list[str]) -> KnowledgeGraph:\\n\",\n    \"    # Initialize an empty KnowledgeGraph\\n\",\n    \"    cur_state = KnowledgeGraph()\\n\",\n    \"\\n\",\n    \"    # Iterate over the input list\\n\",\n    \"    for i, inp in enumerate(input):\\n\",\n    \"        new_updates = client.create(\\n\",\n    \"            model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"            messages=[\\n\",\n    \"                {\\n\",\n    \"                    \\\"role\\\": \\\"system\\\",\\n\",\n    \"                    \\\"content\\\": \\\"\\\"\\\"You are an iterative knowledge graph builder.\\n\",\n    \"                    You are given the current state of the graph, and you must append the nodes and edges \\n\",\n    \"                    to it Do not provide any duplicates and try to reuse nodes as much as possible.\\\"\\\"\\\",\\n\",\n    \"                },\\n\",\n    \"                {\\n\",\n    \"                    \\\"role\\\": \\\"user\\\",\\n\",\n    \"                    \\\"content\\\": f\\\"\\\"\\\"Extract any new nodes and edges from the following:\\n\",\n    \"                    # Part {i}/{len(input)} of the input:\\n\",\n    \"\\n\",\n    \"                    {inp}\\\"\\\"\\\",\\n\",\n    \"                },\\n\",\n    \"                {\\n\",\n    \"                    \\\"role\\\": \\\"user\\\",\\n\",\n    \"                    \\\"content\\\": f\\\"\\\"\\\"Here is the current state of the graph:\\n\",\n    \"                    {cur_state.model_dump_json(indent=2)}\\\"\\\"\\\",\\n\",\n    \"                },\\n\",\n    \"            ],\\n\",\n    \"            response_model=KnowledgeGraph,\\n\",\n    \"        )  # type: ignore\\n\",\n    \"\\n\",\n    \"        # Update the current state with the new updates\\n\",\n    \"        cur_state = cur_state.update(new_updates)\\n\",\n    \"\\n\",\n    \"        # Draw the current state of the graph\\n\",\n    \"        cur_state.visualize_knowledge_graph()\\n\",\n    \"\\n\",\n    \"    # Return the final state of the KnowledgeGraph\\n\",\n    \"    return cur_state\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Examples Use Case\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"In this approach, we process the text in manageable chunks, one at a time.\\n\",\n    \"\\n\",\n    \"This method is particularly beneficial when dealing with extensive text that may not fit into a single prompt.\\n\",\n    \"\\n\",\n    \"It is especially useful in scenarios such as constructing a knowledge graph for a complex topic, where the information is distributed across multiple documents or sections.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 13,\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"image/svg+xml\": [\n       \"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\" standalone=\\\"no\\\"?>\\n\",\n       \"<!DOCTYPE svg PUBLIC \\\"-//W3C//DTD SVG 1.1//EN\\\"\\n\",\n       \" \\\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\\\">\\n\",\n       \"<!-- Generated by graphviz version 9.0.0 (20230911.1827)\\n\",\n       \" -->\\n\",\n       \"<!-- Pages: 1 -->\\n\",\n       \"<svg width=\\\"401pt\\\" height=\\\"133pt\\\"\\n\",\n       \" viewBox=\\\"0.00 0.00 400.90 132.50\\\" xmlns=\\\"http://www.w3.org/2000/svg\\\" xmlns:xlink=\\\"http://www.w3.org/1999/xlink\\\">\\n\",\n       \"<g id=\\\"graph0\\\" class=\\\"graph\\\" transform=\\\"scale(1 1) rotate(0) translate(4 128.5)\\\">\\n\",\n       \"<polygon fill=\\\"white\\\" stroke=\\\"none\\\" points=\\\"-4,4 -4,-128.5 396.9,-128.5 396.9,4 -4,4\\\"/>\\n\",\n       \"<!-- 3 -->\\n\",\n       \"<g id=\\\"node1\\\" class=\\\"node\\\">\\n\",\n       \"<title>3</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"orange\\\" cx=\\\"44.19\\\" cy=\\\"-18\\\" rx=\\\"44.19\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"44.19\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Physicist</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4 -->\\n\",\n       \"<g id=\\\"node2\\\" class=\\\"node\\\">\\n\",\n       \"<title>4</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"red\\\" cx=\\\"152.19\\\" cy=\\\"-18\\\" rx=\\\"45.72\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"152.19\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Professor</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1 -->\\n\",\n       \"<g id=\\\"node3\\\" class=\\\"node\\\">\\n\",\n       \"<title>1</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"blue\\\" cx=\\\"152.19\\\" cy=\\\"-106.5\\\" rx=\\\"31.39\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"152.19\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Jason</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;3 -->\\n\",\n       \"<g id=\\\"edge2\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;3</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M134.35,-91.22C117.49,-77.71 91.92,-57.23 72.32,-41.53\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"74.68,-38.94 64.69,-35.42 70.3,-44.4 74.68,-38.94\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"112.69\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge3\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M152.19,-88.41C152.19,-76.76 152.19,-61.05 152.19,-47.52\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"155.69,-47.86 152.19,-37.86 148.69,-47.86 155.69,-47.86\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"156.69\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 2 -->\\n\",\n       \"<g id=\\\"node4\\\" class=\\\"node\\\">\\n\",\n       \"<title>2</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"304.19\\\" cy=\\\"-18\\\" rx=\\\"88.71\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"304.19\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Mechanics</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;2 -->\\n\",\n       \"<g id=\\\"edge1\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;2</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M173.96,-93.11C197.83,-79.53 236.58,-57.47 265.61,-40.95\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"267.16,-44.1 274.12,-36.11 263.7,-38.01 267.16,-44.1\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"276.69\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">knows about</text>\\n\",\n       \"</g>\\n\",\n       \"</g>\\n\",\n       \"</svg>\\n\"\n      ],\n      \"text/plain\": [\n       \"<graphviz.graphs.Digraph at 0x129342f10>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/svg+xml\": [\n       \"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\" standalone=\\\"no\\\"?>\\n\",\n       \"<!DOCTYPE svg PUBLIC \\\"-//W3C//DTD SVG 1.1//EN\\\"\\n\",\n       \" \\\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\\\">\\n\",\n       \"<!-- Generated by graphviz version 9.0.0 (20230911.1827)\\n\",\n       \" -->\\n\",\n       \"<!-- Pages: 1 -->\\n\",\n       \"<svg width=\\\"401pt\\\" height=\\\"221pt\\\"\\n\",\n       \" viewBox=\\\"0.00 0.00 400.90 221.00\\\" xmlns=\\\"http://www.w3.org/2000/svg\\\" xmlns:xlink=\\\"http://www.w3.org/1999/xlink\\\">\\n\",\n       \"<g id=\\\"graph0\\\" class=\\\"graph\\\" transform=\\\"scale(1 1) rotate(0) translate(4 217)\\\">\\n\",\n       \"<polygon fill=\\\"white\\\" stroke=\\\"none\\\" points=\\\"-4,4 -4,-217 396.9,-217 396.9,4 -4,4\\\"/>\\n\",\n       \"<!-- 3 -->\\n\",\n       \"<g id=\\\"node1\\\" class=\\\"node\\\">\\n\",\n       \"<title>3</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"orange\\\" cx=\\\"44.19\\\" cy=\\\"-106.5\\\" rx=\\\"44.19\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"44.19\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Physicist</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 5 -->\\n\",\n       \"<g id=\\\"node2\\\" class=\\\"node\\\">\\n\",\n       \"<title>5</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"yellow\\\" cx=\\\"152.19\\\" cy=\\\"-18\\\" rx=\\\"33.44\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"152.19\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Smart</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4 -->\\n\",\n       \"<g id=\\\"node3\\\" class=\\\"node\\\">\\n\",\n       \"<title>4</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"red\\\" cx=\\\"152.19\\\" cy=\\\"-106.5\\\" rx=\\\"45.72\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"152.19\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Professor</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4&#45;&gt;5 -->\\n\",\n       \"<g id=\\\"edge1\\\" class=\\\"edge\\\">\\n\",\n       \"<title>4&#45;&gt;5</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M152.19,-88.41C152.19,-76.76 152.19,-61.05 152.19,-47.52\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"155.69,-47.86 152.19,-37.86 148.69,-47.86 155.69,-47.86\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"160.44\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">are</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1 -->\\n\",\n       \"<g id=\\\"node4\\\" class=\\\"node\\\">\\n\",\n       \"<title>1</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"blue\\\" cx=\\\"152.19\\\" cy=\\\"-195\\\" rx=\\\"31.39\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"152.19\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Jason</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;3 -->\\n\",\n       \"<g id=\\\"edge3\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;3</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M134.35,-179.72C117.49,-166.21 91.92,-145.73 72.32,-130.03\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"74.68,-127.44 64.69,-123.92 70.3,-132.9 74.68,-127.44\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"112.69\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge4\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M152.19,-176.91C152.19,-165.26 152.19,-149.55 152.19,-136.02\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"155.69,-136.36 152.19,-126.36 148.69,-136.36 155.69,-136.36\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"156.69\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 2 -->\\n\",\n       \"<g id=\\\"node5\\\" class=\\\"node\\\">\\n\",\n       \"<title>2</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"304.19\\\" cy=\\\"-106.5\\\" rx=\\\"88.71\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"304.19\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Mechanics</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;2 -->\\n\",\n       \"<g id=\\\"edge2\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;2</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M173.96,-181.61C197.83,-168.03 236.58,-145.97 265.61,-129.45\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"267.16,-132.6 274.12,-124.61 263.7,-126.51 267.16,-132.6\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"276.69\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">knows about</text>\\n\",\n       \"</g>\\n\",\n       \"</g>\\n\",\n       \"</svg>\\n\"\n      ],\n      \"text/plain\": [\n       \"<graphviz.graphs.Digraph at 0x1293494d0>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/svg+xml\": [\n       \"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\" standalone=\\\"no\\\"?>\\n\",\n       \"<!DOCTYPE svg PUBLIC \\\"-//W3C//DTD SVG 1.1//EN\\\"\\n\",\n       \" \\\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\\\">\\n\",\n       \"<!-- Generated by graphviz version 9.0.0 (20230911.1827)\\n\",\n       \" -->\\n\",\n       \"<!-- Pages: 1 -->\\n\",\n       \"<svg width=\\\"468pt\\\" height=\\\"310pt\\\"\\n\",\n       \" viewBox=\\\"0.00 0.00 467.78 309.50\\\" xmlns=\\\"http://www.w3.org/2000/svg\\\" xmlns:xlink=\\\"http://www.w3.org/1999/xlink\\\">\\n\",\n       \"<g id=\\\"graph0\\\" class=\\\"graph\\\" transform=\\\"scale(1 1) rotate(0) translate(4 305.5)\\\">\\n\",\n       \"<polygon fill=\\\"white\\\" stroke=\\\"none\\\" points=\\\"-4,4 -4,-305.5 463.78,-305.5 463.78,4 -4,4\\\"/>\\n\",\n       \"<!-- 3 -->\\n\",\n       \"<g id=\\\"node1\\\" class=\\\"node\\\">\\n\",\n       \"<title>3</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"orange\\\" cx=\\\"220.07\\\" cy=\\\"-106.5\\\" rx=\\\"44.19\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"220.07\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Physicist</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6 -->\\n\",\n       \"<g id=\\\"node2\\\" class=\\\"node\\\">\\n\",\n       \"<title>6</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"pink\\\" cx=\\\"114.07\\\" cy=\\\"-283.5\\\" rx=\\\"31.9\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"114.07\\\" y=\\\"-278.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Sarah</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 7 -->\\n\",\n       \"<g id=\\\"node3\\\" class=\\\"node\\\">\\n\",\n       \"<title>7</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"purple\\\" cx=\\\"39.07\\\" cy=\\\"-195\\\" rx=\\\"39.07\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"39.07\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Student</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;7 -->\\n\",\n       \"<g id=\\\"edge7\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;7</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M100.66,-267.04C89.46,-254.12 73.29,-235.47 60.32,-220.51\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"63.31,-218.62 54.12,-213.35 58.02,-223.2 63.31,-218.62\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"88.57\\\" y=\\\"-234.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4 -->\\n\",\n       \"<g id=\\\"node5\\\" class=\\\"node\\\">\\n\",\n       \"<title>4</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"red\\\" cx=\\\"112.07\\\" cy=\\\"-106.5\\\" rx=\\\"45.72\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"112.07\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Professor</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge6\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M113.87,-265.08C113.52,-234.94 112.81,-172.8 112.4,-136.2\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"115.9,-136.32 112.28,-126.36 108.9,-136.4 115.9,-136.32\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"141.07\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">student of</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1 -->\\n\",\n       \"<g id=\\\"node6\\\" class=\\\"node\\\">\\n\",\n       \"<title>1</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"blue\\\" cx=\\\"219.07\\\" cy=\\\"-195\\\" rx=\\\"31.39\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"219.07\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Jason</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;1 -->\\n\",\n       \"<g id=\\\"edge5\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;1</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M131.41,-268.22C148.13,-254.44 173.65,-233.42 192.85,-217.6\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"195.04,-220.33 200.54,-211.27 190.59,-214.92 195.04,-220.33\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"193.69\\\" y=\\\"-234.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">knows</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 5 -->\\n\",\n       \"<g id=\\\"node4\\\" class=\\\"node\\\">\\n\",\n       \"<title>5</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"yellow\\\" cx=\\\"112.07\\\" cy=\\\"-18\\\" rx=\\\"33.44\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"112.07\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Smart</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4&#45;&gt;5 -->\\n\",\n       \"<g id=\\\"edge2\\\" class=\\\"edge\\\">\\n\",\n       \"<title>4&#45;&gt;5</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M112.07,-88.41C112.07,-76.76 112.07,-61.05 112.07,-47.52\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"115.57,-47.86 112.07,-37.86 108.57,-47.86 115.57,-47.86\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"120.32\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">are</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;3 -->\\n\",\n       \"<g id=\\\"edge4\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;3</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M219.27,-176.91C219.4,-165.26 219.58,-149.55 219.74,-136.02\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"223.23,-136.4 219.85,-126.36 216.23,-136.32 223.23,-136.4\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"224.57\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge3\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M201.4,-179.72C184.8,-166.29 159.67,-145.99 140.3,-130.32\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"142.73,-127.79 132.75,-124.22 138.33,-133.23 142.73,-127.79\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"180.57\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 2 -->\\n\",\n       \"<g id=\\\"node7\\\" class=\\\"node\\\">\\n\",\n       \"<title>2</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"371.07\\\" cy=\\\"-106.5\\\" rx=\\\"88.71\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"371.07\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Mechanics</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;2 -->\\n\",\n       \"<g id=\\\"edge1\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;2</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M240.85,-181.61C264.71,-168.03 303.46,-145.97 332.5,-129.45\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"334.04,-132.6 341,-124.61 330.58,-126.51 334.04,-132.6\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"342.57\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">knows about</text>\\n\",\n       \"</g>\\n\",\n       \"</g>\\n\",\n       \"</svg>\\n\"\n      ],\n      \"text/plain\": [\n       \"<graphviz.graphs.Digraph at 0x128ff4d50>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    },\n    {\n     \"data\": {\n      \"image/svg+xml\": [\n       \"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\" standalone=\\\"no\\\"?>\\n\",\n       \"<!DOCTYPE svg PUBLIC \\\"-//W3C//DTD SVG 1.1//EN\\\"\\n\",\n       \" \\\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\\\">\\n\",\n       \"<!-- Generated by graphviz version 9.0.0 (20230911.1827)\\n\",\n       \" -->\\n\",\n       \"<!-- Pages: 1 -->\\n\",\n       \"<svg width=\\\"669pt\\\" height=\\\"310pt\\\"\\n\",\n       \" viewBox=\\\"0.00 0.00 669.50 309.50\\\" xmlns=\\\"http://www.w3.org/2000/svg\\\" xmlns:xlink=\\\"http://www.w3.org/1999/xlink\\\">\\n\",\n       \"<g id=\\\"graph0\\\" class=\\\"graph\\\" transform=\\\"scale(1 1) rotate(0) translate(4 305.5)\\\">\\n\",\n       \"<polygon fill=\\\"white\\\" stroke=\\\"none\\\" points=\\\"-4,4 -4,-305.5 665.5,-305.5 665.5,4 -4,4\\\"/>\\n\",\n       \"<!-- 3 -->\\n\",\n       \"<g id=\\\"node1\\\" class=\\\"node\\\">\\n\",\n       \"<title>3</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"orange\\\" cx=\\\"421.78\\\" cy=\\\"-106.5\\\" rx=\\\"44.19\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"421.78\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Physicist</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 9 -->\\n\",\n       \"<g id=\\\"node2\\\" class=\\\"node\\\">\\n\",\n       \"<title>9</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"red\\\" cx=\\\"91.78\\\" cy=\\\"-106.5\\\" rx=\\\"38.56\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"91.78\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Canada</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 8 -->\\n\",\n       \"<g id=\\\"node3\\\" class=\\\"node\\\">\\n\",\n       \"<title>8</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"blue\\\" cx=\\\"91.78\\\" cy=\\\"-195\\\" rx=\\\"91.78\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"91.78\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">University of Toronto</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 8&#45;&gt;9 -->\\n\",\n       \"<g id=\\\"edge8\\\" class=\\\"edge\\\">\\n\",\n       \"<title>8&#45;&gt;9</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M91.78,-176.91C91.78,-165.26 91.78,-149.55 91.78,-136.02\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"95.28,-136.36 91.78,-126.36 88.28,-136.36 95.28,-136.36\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"103.41\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is in</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6 -->\\n\",\n       \"<g id=\\\"node4\\\" class=\\\"node\\\">\\n\",\n       \"<title>6</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"pink\\\" cx=\\\"277.78\\\" cy=\\\"-283.5\\\" rx=\\\"31.9\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"277.78\\\" y=\\\"-278.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Sarah</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;8 -->\\n\",\n       \"<g id=\\\"edge3\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;8</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M253.3,-271.58C238.02,-264.74 217.97,-255.69 200.28,-247.5 178.95,-237.62 155.37,-226.46 135.65,-217.05\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"137.33,-213.98 126.8,-212.82 134.31,-220.29 137.33,-213.98\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"227.03\\\" y=\\\"-234.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">student at</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 7 -->\\n\",\n       \"<g id=\\\"node5\\\" class=\\\"node\\\">\\n\",\n       \"<title>7</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"purple\\\" cx=\\\"240.78\\\" cy=\\\"-195\\\" rx=\\\"39.07\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"240.78\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Student</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;7 -->\\n\",\n       \"<g id=\\\"edge9\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;7</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M270.65,-265.82C265.51,-253.81 258.47,-237.34 252.51,-223.41\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"255.77,-222.13 248.62,-214.31 249.33,-224.88 255.77,-222.13\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"267.28\\\" y=\\\"-234.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4 -->\\n\",\n       \"<g id=\\\"node7\\\" class=\\\"node\\\">\\n\",\n       \"<title>4</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"red\\\" cx=\\\"313.78\\\" cy=\\\"-106.5\\\" rx=\\\"45.72\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"313.78\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Professor</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge7\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M281.3,-265.4C287.5,-235.29 300.41,-172.53 307.95,-135.85\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"311.31,-136.91 309.89,-126.41 304.45,-135.5 311.31,-136.91\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"326.78\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">student of</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1 -->\\n\",\n       \"<g id=\\\"node8\\\" class=\\\"node\\\">\\n\",\n       \"<title>1</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"blue\\\" cx=\\\"420.78\\\" cy=\\\"-195\\\" rx=\\\"31.39\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"420.78\\\" y=\\\"-189.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Jason</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 6&#45;&gt;1 -->\\n\",\n       \"<g id=\\\"edge6\\\" class=\\\"edge\\\">\\n\",\n       \"<title>6&#45;&gt;1</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M298.88,-269.74C322.96,-255.17 362.56,-231.22 390.06,-214.58\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"391.56,-217.77 398.3,-209.6 387.93,-211.78 391.56,-217.77\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"380.41\\\" y=\\\"-234.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">knows</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 5 -->\\n\",\n       \"<g id=\\\"node6\\\" class=\\\"node\\\">\\n\",\n       \"<title>5</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"yellow\\\" cx=\\\"313.78\\\" cy=\\\"-18\\\" rx=\\\"33.44\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"313.78\\\" y=\\\"-12.95\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Smart</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 4&#45;&gt;5 -->\\n\",\n       \"<g id=\\\"edge2\\\" class=\\\"edge\\\">\\n\",\n       \"<title>4&#45;&gt;5</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M313.78,-88.41C313.78,-76.76 313.78,-61.05 313.78,-47.52\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"317.28,-47.86 313.78,-37.86 310.28,-47.86 317.28,-47.86\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"322.03\\\" y=\\\"-57.2\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">are</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;3 -->\\n\",\n       \"<g id=\\\"edge5\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;3</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M420.98,-176.91C421.12,-165.26 421.3,-149.55 421.45,-136.02\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"424.95,-136.4 421.57,-126.36 417.95,-136.32 424.95,-136.4\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"426.28\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;4 -->\\n\",\n       \"<g id=\\\"edge4\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;4</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M403.12,-179.72C386.51,-166.29 361.39,-145.99 342.01,-130.32\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"344.45,-127.79 334.47,-124.22 340.05,-133.23 344.45,-127.79\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"382.28\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">is</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 2 -->\\n\",\n       \"<g id=\\\"node9\\\" class=\\\"node\\\">\\n\",\n       \"<title>2</title>\\n\",\n       \"<ellipse fill=\\\"none\\\" stroke=\\\"green\\\" cx=\\\"572.78\\\" cy=\\\"-106.5\\\" rx=\\\"88.71\\\" ry=\\\"18\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"572.78\\\" y=\\\"-101.45\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">Quantum Mechanics</text>\\n\",\n       \"</g>\\n\",\n       \"<!-- 1&#45;&gt;2 -->\\n\",\n       \"<g id=\\\"edge1\\\" class=\\\"edge\\\">\\n\",\n       \"<title>1&#45;&gt;2</title>\\n\",\n       \"<path fill=\\\"none\\\" stroke=\\\"black\\\" d=\\\"M442.56,-181.61C466.43,-168.03 505.18,-145.97 534.21,-129.45\\\"/>\\n\",\n       \"<polygon fill=\\\"black\\\" stroke=\\\"black\\\" points=\\\"535.76,-132.6 542.72,-124.61 532.3,-126.51 535.76,-132.6\\\"/>\\n\",\n       \"<text text-anchor=\\\"middle\\\" x=\\\"545.28\\\" y=\\\"-145.7\\\" font-family=\\\"Times,serif\\\" font-size=\\\"14.00\\\">knows about</text>\\n\",\n       \"</g>\\n\",\n       \"</g>\\n\",\n       \"</svg>\\n\"\n      ],\n      \"text/plain\": [\n       \"<graphviz.graphs.Digraph at 0x129349610>\"\n      ]\n     },\n     \"metadata\": {},\n     \"output_type\": \"display_data\"\n    }\n   ],\n   \"source\": [\n    \"text_chunks = [\\n\",\n    \"    \\\"Jason knows a lot about quantum mechanics. He is a physicist. He is a professor\\\",\\n\",\n    \"    \\\"Professors are smart.\\\",\\n\",\n    \"    \\\"Sarah knows Jason and is a student of his.\\\",\\n\",\n    \"    \\\"Sarah is a student at the University of Toronto. and UofT is in Canada.\\\",\\n\",\n    \"]\\n\",\n    \"\\n\",\n    \"graph: KnowledgeGraph = generate_graph(text_chunks)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Conclusion\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"metadata\": {},\n   \"source\": [\n    \"This tutorial shows how to generate and visualize a knowledge graph for complex topics. It also demonstrates how to extract graphic knowledge from the language model or provided text. The tutorial highlights the iterative process of building the knowledge graph by processing text in smaller chunks and updating the graph with new information.\\n\",\n    \"\\n\",\n    \"Using this approach, we can extract various things, including:\\n\",\n    \"\\n\",\n    \"1) People and their relationships in a story.\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"class People(BaseModel):\\n\",\n    \"    id: str\\n\",\n    \"    name: str\\n\",\n    \"    description: str\\n\",\n    \"\\n\",\n    \"class Relationship(BaseModel):\\n\",\n    \"    id: str\\n\",\n    \"    source: str\\n\",\n    \"    target: str\\n\",\n    \"    label: str\\n\",\n    \"    description: str\\n\",\n    \"\\n\",\n    \"class Story(BaseModel):\\n\",\n    \"    people: List[People]\\n\",\n    \"    relationships: List[Relationship]\\n\",\n    \"```\\n\",\n    \"\\n\",\n    \"2) Task dependencies and action items from a transcript.\\n\",\n    \"\\n\",\n    \"```python\\n\",\n    \"class Task(BaseModel):\\n\",\n    \"    id: str\\n\",\n    \"    name: str\\n\",\n    \"    description: str\\n\",\n    \"\\n\",\n    \"class Participant(BaseModel):\\n\",\n    \"    id: str\\n\",\n    \"    name: str\\n\",\n    \"    description: str\\n\",\n    \"\\n\",\n    \"class Assignment(BaseModel):\\n\",\n    \"    id: str\\n\",\n    \"    source: str\\n\",\n    \"    target: str\\n\",\n    \"    label: str\\n\",\n    \"    description: str\\n\",\n    \"\\n\",\n    \"class Transcript(BaseModel):\\n\",\n    \"    tasks: List[Task]\\n\",\n    \"    participants: List[Participant]\\n\",\n    \"    assignments: List[Assignment]\\n\",\n    \"```\\n\",\n    \"\\n\",\n    \"3) Key concepts and their relationships from a research paper.\\n\",\n    \"4) Entities and their relationships from a news article.\\n\",\n    \"\\n\",\n    \"As an exercise, try to implement one of the above examples.\\n\",\n    \"\\n\",\n    \"All of them will follow an idea of iteratively extracting more and more information and accumulating it into some state.\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \"Python 3 (ipykernel)\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.6\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 4\n}\n"
  },
  {
    "path": "docs/tutorials/6-chain-of-density.ipynb",
    "content": "{\n \"cells\": [\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"df019bc4-bdc3-4351-9f03-294be147bf01\",\n   \"metadata\": {},\n   \"source\": [\n    \"# Chain Of Density Summarization\"\n   ]\n  },\n  {\n   \"attachments\": {},\n   \"cell_type\": \"markdown\",\n   \"id\": \"2b2ec7b8-96f0-44ae-afad-2d578a7164aa\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Introduction\\n\",\n    \"\\n\",\n    \"**What is Chain Of Density summarization?**\\n\",\n    \"\\n\",\n    \"Summarizing extensive texts with AI can be challenging. Initially, an AI produces a summary, then refines it through multiple iterations, adding missing article entities. Each iteration adds new article entities to the summary, keeping length consistent, leading to an entity-dense, informative summary called Chain Of Density.\\n\",\n    \"\\n\",\n    \"It was first introduced in the paper - From Sparse to Dense : GPT-4 Summarization with Chain of Density prompting. \\n\",\n    \"\\n\",\n    \"This was done in the original paper by asking GPT-4 to generate all of the rewritten summaries in a single go with the following prompt below. \"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"3850682a-91ac-43ec-8279-fa12cfb88c2f\",\n   \"metadata\": {},\n   \"source\": [\n    \"> Article: {{ARTICLE}}\\n\",\n    \">\\n\",\n    \"> You will generate increasingly concise, entity-dense summaries of the\\n\",\n    \"> above Article.\\n\",\n    \">\\n\",\n    \"> Repeat the following 2 steps 5 times.\\n\",\n    \">\\n\",\n    \"> Step 1. Identify 1-3 informative Entities (\\\";\\\" delimited) from the\\n\",\n    \"> Article which are missing from the previously generated summary.\\n\",\n    \"> Step 2. Write a new, denser summary of identical length which covers\\n\",\n    \"> every entity and detail from the previous summary plus the Missing\\n\",\n    \"> Entities.\\n\",\n    \">\\n\",\n    \"> A Missing Entity is:\\n\",\n    \"> - Relevant: to the main story.\\n\",\n    \"> - Specific: descriptive yet concise (5 words or fewer).\\n\",\n    \"> - Novel; not in the previous summary.\\n\",\n    \"> - Faithful: present in the Article.\\n\",\n    \"> - Anywhere: located anywhere in the Article.\\n\",\n    \">\\n\",\n    \"> Guidelines:\\n\",\n    \"> - The first summary should be long (4-5 sentences, -80 words) yet\\n\",\n    \"> highly non-specific, containing little information beyond the\\n\",\n    \"> entities marked as missing. Use overly verbose language and fillers\\n\",\n    \"> (e.g., \\\"this article discusses\\\") to reach -80 words.\\n\",\n    \"> - Make every word count: re-write the previous summary to improve\\n\",\n    \"> flow and make space for additional entities.\\n\",\n    \"> - Make space with fusion, compression, and removal of uninformative\\n\",\n    \"> phrases like \\\"the article discusses\\\"\\n\",\n    \"> - The summaries should become highly dense and concise yet\\n\",\n    \"> self-contained, e.g., easily understood without the Article.\\n\",\n    \"> - Missing entities can appear anywhere in the new summary.\\n\",\n    \"> - Never drop entities from the previous summary. If space cannot be\\n\",\n    \"> made, add fewer new entities.\\n\",\n    \">\\n\",\n    \"> Remember, use the exact same number of words for each summary.\\n\",\n    \">\\n\",\n    \"> Answer in JSON. The JSON should be a list (length 5) of dictionaries\\n\",\n    \"> whose keys are \\\"Missing_Entities\\\" and \\\"Denser_Summary\\\"\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"758c99e8-2c9e-4a2b-9ae2-cebce820dde2\",\n   \"metadata\": {},\n   \"source\": [\n    \"While the original paper used a single prompt to generate the iterative generations, we can go one step better with `Instructor` and break down the process into smaller API calls - with validation along the way.\\n\",\n    \"\\n\",\n    \"The process can be broken down as seen below.\"\n   ]\n  },\n  {\n   \"attachments\": {\n    \"e3835897-9292-49af-a248-95eaa1d0b86a.png\": {\n     \"image/png\": \"iVBORw0KGgoAAAANSUhEUgAAAt0AAAJfCAIAAAAsLf12AAABVmlDQ1BJQ0MgUHJvZmlsZQAAKJF1kDFLQmEUhh/LMDJCKGhpcMywEC2iocAcQqgQTbKWuF5NA7XL1aj+QAQ1NzcFLU2BUxDR4B5UCBEh1A8IXExu52qlFh04vA8vL9/3cqDLqmhaxgpkcwU9vDDvjK2tO21v2BmkFwcoal7zh0KLEuFbO6d6j8XUu3HzrdrcbGXpObDsiZXPj4JXL3/zHdOXSOZV0Q/ZMVXTC2AZFQ7tFjST94SHdCklfGxyqsmnJsebfNnIrIQDwiVhh5pWEsKPwu54m59q42xmR/3qYLbvT+aiEdEB2REieJkmjI8ppME/2clGNsA2GvvobJEiTQEnfnE0MiSFg+RQmcAt7MUj6zNv/Pt2LS+bhZmofHXY8jYO4OJW6pVanqsCw69w86QpuvJzUUvVmt/0eZtsL0LPiWG8r4LNBfUHw6gVDaN+Bt1luK5+AlHfZIkD0Yd0AAAAOGVYSWZNTQAqAAAACAABh2kABAAAAAEAAAAaAAAAAAACoAIABAAAAAEAAALdoAMABAAAAAEAAAJfAAAAAHBWW9YAAEAASURBVHgB7N15oH5T9T/wr1n6miLJEAmZU8aIykxCqAhlTClTGTJkKEqUoYTMGQqZhwxJMouUDJnKPGQqGcv0e2X9fvt3eqb7fO597jPcu54/zt1nn73X3vt9zj3rfdZae++J3nzzzf/JXyKQCCQCiUAikAgkAn2AwMR90IfsQiKQCCQCiUAikAgkAv9BIHlJPgeJQCKQCCQCiUAi0C8IJC/plzuR/UgEEoFEIBFIBBKB5CX5DCQCiUAikAgkAolAvyCQvKRf7kT2IxFIBBKBRCARSASSl+QzkAgkAolAIpAIJAL9gkDykn65E9mPRCARSAQSgUQgEUheks9AIpAIJAKJQCKQCPQLAslL+uVOZD8SgUQgEUgEEoFEIHlJPgOJQCKQCCQCiUAi0C8IJC/plzuR/UgEEoFEIBFIBBKB5CX5DCQCiUAikAgkAolAvyCQvKRf7kT2IxFIBBKBRCARSASSl+QzkAgkAolAIpAIJAL9gkDykn65E9mPRCARSAQSgUQgEehrXvLqq6/GHZJ4/fXX824lAolAIpAIJAKJwNhGoB95yY033rjddtvNO++8Cy644CuvvOIGbLrppkcccUQH78TPfvazz3zmMx0UmKISgUQgEUgEEoFEYOQITDpyEZ2S8NJLL1166aVHH3307bffPttss+28887TTjvt5JNPTv7LL7981113KfDkk0/KnH766UfS6NVXX73rrrvONddcIxGSdROBRCARSAQSgUSg4whM9Oabb3Zc6DAEvvHGG4stttjTTz+91FJLbbnllqusssrEE0+MiJx22mkPPvjgWWed9dxzz4XYeeaZ54orrhhGE1GFS2jJJZfUEF7y29/+dthysmIikAgkAolAIpAIdByBfrGXYCEzzTQTuoA3zDHHHE4N9dxzz917771jzCwoO+200yKLLFK1c9x9991nnHHGjDPOuN5666mupOrnnXfeNNNMs9JKK4WQGsgmm2yy3/zmN4ceeqhjzaU8TQQSgUQgEUgEEoHeItAvvAQK55xzzrHHHnvCCScwliy33HJcLeuvvz7Oseiii+67777PPPMM8lEF6+STT9599925dZhSTjrppOuuu+6FF15Q5c4771SM3UVISpCVai3p6aabTthKeIhqLuVpIpAIJAKJQCKQCPQQgT6Ke51qqqmEu/7ud787/PDDn3jiiTXXXPPKK6/EUXCLqaee+rXXXqvCdOuttyIlSyyxxMUXX7zOOus88sgjYlC4e5ASTAW5IWGrrbaqVqmmX3zxRc1VczKdCCQCiUAikAgkAj1HoF94yb333otqgIOfZe21177ooovmm2++E088sQDEQSONTxxwwAEShx12GP5x0003LbPMMtw9CA2ecd9997nErMKJs9dee91yyy0PPfRQkVBNiFyZYoopqjmZTgQSgUQgEUgEEoGeI9Avfpzjjjvu1FNPXXHFFZdeemmkAZ8wAUckbADEZBKxrmeeeebxxx+/yy67MKtssskm22+/vUnFs8wyi2BYJUXIOrKaLLvsstdee610wxAT+fw4k07aL2PXn/wlAolAIpAIJAKJAAT6RTfvuOOOU0455R/+8IeDDz5Yt4S+brvttjvssEPcJATlkEMO+chHPoJ5fP3rX8c2rG6Cymy22WYf/ehHo8xTTz3FlWNVEmW22GILgSkHHXSQaNm4WnNUUoGazDxNBBKBRCARSAQSgd4i0C/zhFujYBYx982jjz7KR7PCCisobObOGmus8fjjj6+22mpcPzfffLP0AgssIB4FcXn++eeFpLSQ+ec//3mSSSZBblqUyUuJQCKQCCQCiUAi0GUEBoOXNATFGmvCUMTGmlnzwQ9+cOWVVzapeKGFFhIP27B8ZiYCiUAikAgkAolAnyMwwLykHlnOIIGxVoytv5Q5iUAikAgkAolAItD/CPTLfJyOILX44oubNsyJ0xFpKSQRSAQSgUQgEUgEuozAmOIlIk7MLhbT2mUQs7lEIBFIBBKBRCAR6AgCY8qP0xFEUkgikAgkAolAIpAI9AqBMWUv6RWI2W4ikAgkAolAIpAIdASB5CUdgTGFJAKJQCKQCCQCiUAHEOiXddUaDuW3v/1tw/zMTATGAAJlScAxMJYcQiKQCCQCnUKgj3gJFvKDH/wgBmbhVwmrknRqnCknEegrBOIJ16V4yJdccsk999yzr3qYnUkEEoFEoCcI9AUv2W+//ex3Y/yWag0U8lOyJ09DNtp9BNDxq6++Wruzzz771ltvLZEEpft3IVtMBBKB/kGgx/Nx1lprLR+OPhkxkuQi/fNYZE96ggCC/pOf/ETTCEqyk57cgmw0EUgEeo5AL3kJUsJ8vdxyyyUj6flzkB3oHwSCnSQ16Z87kj1JBBKBbiLQM14SpCQ/Crt5s7OtQUGAc2fjjTdOajIo9yv7mQgkAh1EoDfxJUlKOngLU9TYQ4AF8ZRTTkFNDC25+9i7vzmiRCARaIFAD+wlzNQ6lG/bFnclLyUCEAirCYKSjs58HhKBRGD8INADXmLewcMPPzx+IM6RJgLDRiCmqp1//vnDlpAVE4FEIBEYLAS6vd6r92xMhhwsmLK3iUBPEBAVXlY66UkHstFEIBFIBLqMQLd5SUyD7PIgs7lEYEAR4MExi55DZ0D7n91OBBKBRGBCEeiqHyeN0hN6e7J8IoCUWAc5XTn5JCQCicA4QaDb9hILlowTZHOYiUBHEIig1zSZdATMFJIIJAL9j0BXeYnF5vnL+x+U7GEikAgkAolAIpAI9ASBrvKSjODryT3ORscAArGHzhgYSA4hEUgEEoHWCHSVl+hKrsTQ+n7k1USgHoH0ftZjkjmJQCIwVhHoNi8ZqzjmuBKB0UOA9zM23B69JlJyIpAIJAJ9gkD3eInAPTMe+2TY2Y1EIBFIBBKBRCAR6EMEusdL+nDwY7hLf//734866qjXX399DI8xh5YIJAKJQCIw9hBIXjL27ul/RnTLLbfsv//+f/nLX8bm8MbfqDJmfPzd8xxxIjBOEUheMjZv/JtvvmlgL7744tgcXo4qEUgEEoFEYIwikLxkVG7sQw89NCpy2xb6xhtvtF12ggv2fHQT3OOskAgkAolAIjAgCAwqL7nhhhtMOf7whz/85JNP9hvU11xzzbLLLttbw/urr74Klqmmmqrj4PTD6Do+qBSYCCQCiUAi0CcIDB4vQUS+9KUvffrTn/7rX//6yCOP9AmO1W688sorTu+8885qZpfT//rXv7Q4Gryk4ejciFVXXfXf//53l4eZzSUCiUAikAiMMQQmHazxMJNsueWWzz333Nve9rYVV1zxi1/84kwzzdRXQ3jppZeCLZ144om/+tWv/va3v0066aR77bXXEkss0c1+jpK9pNnoZpxxRjzsjjvuyKng3bzL2VYikAgkAmMPgUHiJaeeeuo3vvEN92DnnXfGTkbDGFB/gxGLf/zjH7PPPvuQzV1++eU77LADzhRC7nrrN9tssy288MKTTz55veRRzQmrxtvf/vZOtdJ6dGiihu65555O8RIhLNdffz0ImWGWXnrpTo1iJHKef/75hx9++N3vfvf0008/EjlZNxFIBBKBRKAFAgPjxwlS4rv8+9///pRTTvnoo482GxVTAVvFym/9DjjggHAumJ/y1FNPqYI3HHjggT/72c+q1e+9916+IdEqm2++OZNMXKKHPvGJTyy++OIrrbTS+9///v3226/4KZAVxW699dbqhBdTc4OU6CQJG220ERMC5Xr00Ud/4AMfKM01bKtcbZjQk5tuuun3v/99+8E0DBtEAaoIxK723Xffj3zkI+uvv34Z/jPPPBNrnPz617/+5je/+cQTT5TyNYnWowvS1rB722677R577FGkWV5viy22uPHGG0tONaHbpjdbdl2Azk477XTsscf+8Ic/LAUg8PTTT5fTESZiylIR4rFxQ93W+++/v2SWBEK8wAILIEmLLLLIF77wBSM94YQTPvnJT4ZdSjFEivXOU1qqZCIRSAQSgURgOAh4O3fnd+WVV3qPD68tqoLhYcEFF3zggQd23XVXaT+eETrv4IMPvuiii/75z3+GZHYCoSeuzvPWT4IPxaXjjz9emlZba621JPwwhqhCJUcOgZHAYFyiF51SkNtvv33UQlDEbVCcUczRp/yDDz4YcrAWWo1mFfji0iGHHBL51WOztqplatIXXnihoZQWr7322poCDU8RMrXKJVpz0UUXJQSGIeriiy921SmywhYSmeutt15UMZ3nzDPP3HjjjT//+c/fd999MluPDj8j4Xvf+56Sf/7znz/72c/uvffeGI/l3eRvtdVWIdYxkDzttNNKTjWh29ETuv/cc8+9/fbbSYgCt912m0uIYynvXq+yyip//OMf5Xi6HG1ut9lmm6EveqvRNdZY49lnn5WP1b3wwgvu1Fe/+tWTTz5ZDsbjdt98883SfnfffbdbGU071ty7s88+WyaskFeASMMWwwOphkLCd7/7XfkelTjt7NHoCO+szJSWCCQCiUB/IvA/XevWSHgJG4n3cmiUSy+99OMf/7jTml9opmAt22yzDU1JDymz2mqrGeORRx4pveaaazouv/zyjlSg/ChDzYS+54hxCXugbySoImaGgAiPoXt23HFH+ajPb37zGzNTpBGX1157rQoja4p8+r6a2aKtmmLV01/84hdEUZlnnXUWp0awisKEqiVr0vvssw+tGZm6F0OmjHEFnIBM5h9XDdwlRzlByyh7Vh/EQk78gMlBU/hQw9ERq/C3vvUtdyGkOWW1QiwktBs9YY9xahQvv/xyTYfj1C4wCvi5a1dddVW1THSbUadkolZKEh60lSXGvZCDie65554Sfscdd5zyoMBfkQY5yCUi+9bF/8tymLX02e+II47AcjAbVwFeGgo0kM7IgT9pc889dyFJaFw0jTyVWh1MJC/pIJgpKhFIBPocgcHw4/DrswUJ8nD0iXzFFVcIKZWea665aO6vfe1r0rgLF0kY0s8///zVV1+dBpIvPNbRZ7cjrUP7Mg+o+Kc//UmOT3xHapIR3hcwI4FQiQ996EMcHPK1Ne2000r4IQcKaA65OeOMMz72sY/5yJZPS1GZbxX5v4dYOyT8I7LY/DliJJq1Va1bTauFBs0xxxyXXXbZuuuu60kKP1HIqZasTxtRceLoHhOOMsgKTNiBpHE73Esxlxx5Jehy+ZS0YlgIvsLwYHQHHXQQIwHdHD6LhqObeOL/PEiIGhOLRIg65phjkBinM8wwg6Mfde642267lb69lf3/D+4O2wYXGELzuc99jssJG4jLTGISGEac6sZhhx0mzdEGGQn8SW8l8BWUyC2WjrssgaCgGhLKbL311u4ydxsaIYePBgKMIl/+8pfnn3/+mOCNthYfjSrCSpZaaimF/d7znvcoicCFw07OeeedpwxT0EILLfRWkTwkAolAIpAIDBOBweAloY3EhUSMiLFSJFTLdNNNhy4wzsthFWDnkGCERx0QC1+9dKqvZJmPP/64oxwOnckmm2yZZZYRTIA64DfCShgPzOthGuHIYI8hlv1f+ZofPS1n9913d6TwqK4ogBIFY4jTaaaZRiK4iISghB/84Act2opa9cfgPXxVU089NV4VA1EM6xLtUV++mqO50oFf/vKXsMISaFZzhYz39NNPh0CJJmHawVfkkIC6mVYjseGGG8IEGzj00EPVcgvgJr/Z6FzCaTSERVH8KB1VHVXQOJ1BfcS1oFkkK9zsN/PMMwP2uuuu23TTTYWhME3x6fCOvfOd71SFmcoRY9hll11wF+mJJpooRLkFwUXca8YP/i+nJVjE1eAi8eRgEkwySiJnwHTfUQpicaYIYXGJJSYkB0mNdByDVxmpOCfMSWfkq1stk+lEIBFIBBKBYSAwGPNx1l57bVoNFWDJoFwpg5iLS2cb8wUXXODo+56nQ8I3t+/sGiwiJBM/wDlcooROOeUUVgFa6l3vehfN51etEt/K0UrJp1Ol6TBf1Ww29BzzCQ2KCYmQ/clPfsJDoYAgUJd8iwtGoYnpOTb/xx57rFlbRX5NYtZZZ5XDc0H/iZxQXVrTDAkbbLABMuQDvaZKOdUHmljTvumZkUzMQY9o4lJAIkgeuPgjnCrpx11FPoMHkwBDQuFbBZ+Goyt8UZjqnHPOSRp7lW6rjiLovKBRmX5uTRhX4rT++O1vfxsFYaGRcH/dMhiKVnHKaCHoRLwLloP0zDfffO64nE022STksMess846wMc+kTmTg3Cy0gQ7kKcIJqADI6blGQCO28qUYgixIo6eG75H7jvf+Y7+f/3rX4e8n0ciaFYIBMhPf/pTtDhOsT2st7SViUQgEUgEEoFhItA1P9NI4kt0kjGDZqIGqBPmEIoKaYjO+9anEqgr1nsJX9i0SBmXD2I6DAURHVIymUNwCMYS0lTh2SmXJFyNKEsauprPCEFTKu8nppJdIa5SeHKIKoUj9vatgrOFx0Q3mrXF2VEq1iQiXDeEo19xVSIkR0RqTZU4jRCKn//8505ZL5SvDl+m+BtqGNGh2osE5iVcx6mKmAEe44hYqK7/pVj96Ch1xUTnlDLKu1kYgNFR4dAOOUJVSpmGieitwgJXeeiwQK0jnQoTZX64U5BigW4HG5Icz4bMCFZFJniCQjJDCDlG6mbxQ8nEI3n64irOgZOJSkHFdI8EhdnkiFVAP+N+4S6acwnLjIpxNECBLCpGXc9e9Wpn0xlf0lk8U1oikAj0MwIT6dwwGc0EVqMAfPvyQUxgvaGLM34IPRFVQKNQP4z/PqzN/WFs55WgpdAUXowaa4HCk0wyCdcPf4E2+B1MAcV1OGt8Q/sWR4MERlBINT2gBX3xhzujXDLhxQd6WdnCnBTGBqrrU5/6lGiJWEekRVsRLlOkVRM6zzxQs3wct45MgQ7VktW028rlYX6yITAvYXLsBAZIxfKJGGN4gnhedLtUjCe1xp4B3ve+971hlIqSDUfH48NQUbUoUO0Qft/73he1TNyWaDHSKKZ7dLxfuN5kspCJiSmWiShWc6wxZpSrcZcjJqZmXMrEVQnVOWvYaXS41JXpSWDpcXP9RBeVS9UEiow+MuaVu1+92pG0/x2BOxFl1RGBKSQRSAQSgb5FYCzwElYNUasCUaFMl6AgrAUl4JFWwzz4WUogQs3NEGgikoAWD58FToPB8GJI1JQc+Wk326r2lhvr8MMPxwnCM4VArLDCCsZYXVilWr6ksQRrt1hxhE2oZA4j8Ze//MUNYmBgBWmzOjuH2FL8r8p12qzbtWKYjcBbMCJSo9do8pLRwzYlJwKJQL8hMBjxJa1RK6uEKUaHmZfhFxYF3GJIrUap/OhHP/JJbXGUd7zjHRGA0rrFYV/tZlvVTjK3mMTrh5dYF19ITTOWVq0lDRNHYT01+RN6yjGkCutR+xUZmcLO1H6V7pfkYWGI+sxnPtP9prPFRCARSATGJAJjgZc0vDGM6hNkV2fkj9kcDaV1NrObbdX0XBRFTU79KZsTCxP/DroWc15GyEsINFeZvURcbX1zA51jXIxP4aIa6IFk5xOBRCAR6BMExiwv6RN8B7Eb/FkRlMP7E0EeMTNo2GMJHxlX2rAl9GdFNrlLLrmEEWhIm1x/9j97lQgkAolAHyIwGOuX9CFw/dwli3MIZBl2D1k1TFrhAsNO+H3EzLbp9GnWognDLom6bVZgQPPxrTE5rgG9HdntRCARGBsIJC8ZG/fxv0Zx1FFH/fjHP/6vrAk8iQm0ookty2HVtQmsXVvc5CbOjljXpPbaIJ/PO++8um++0iAPIvueCCQCiUB/IZB+nP66Hx3pjRhe29SNUBQmYXWTYQsxAUfdmCRsiREzekZodBl2T0avIkZi/vDYG9foIZaSE4FEIBEYEoHkJUNCNHgFYrmO3vbbfGCRszxK0Y2xqrzH6rh6+/Bk64lAIjCeEUg/zhi8+9Zz6/kMW4EpEzQfagzehhxSIpAIJAKJwIQjMBZ4yR577GFnuPqxW43KCuixklj91cixGYplSZtdrc8XumFxdBNf6y/1T07PeYnVxixTa5WU/sEke5IIJAKJQCIwEAiMBV5iYfjiL6iCbtcbu+vZ4aUFjbCnbuxCXK3YIo0A2WNlJIEXLYR36hJaIDqkU9KGIcea9GqNvdVKhgFFVkkEEoFEIBGYIATGAi+xTFlDm4dVz21va2pJqMmGuNhD5+677254qWHm2WefTez//u//lquxiIXt4k4//fSSOaEJm93Yt9bWxNWK1oA/88wzW5CqKKyAIdgEp9RVsdrDkj9kwvr9u+22W014iv2Ghpx1bGOd+++/30aA0VtLoGrLdjNDtlhfoCN41ovNnEQgEUgEEoGBQGAw4l7tk0JD29+1fk6mbe79GmJttStrXvnFVUJsb0v1siVsvvnmNkKTT39bPaxh9YaZVqkvO7zY5t4S+La+KyXXX3/96sZvJb+asLmxxbhs32PNe14kE3HFTtpfkCXGImbLLLNMKWy5VZvazDLLLDLpe2uK+Lm64oor8iVNPvnk0mDZcccdo4otBu2M+MEPftDCaFNNNZVMQ7Mk/wYbbKBjUcZOQDbtM/W3ZiPAuHr00Ufbz892wWWBV7aoTTbZxNIjGlLGPnaaQFPmn39+s4hjLz07EpMfuwtNO+20eFtsKNhsRf+GCBA+JJ4WfQcIE5c9jyAwwtXeYsh5TAQSgUQgEegrBAaDl3Cd7L777sstt1x1Kxw4UpPbbrstHcxCEF/5tCa9a/tZu6kpQIPOMcccFKRN4OjOWFVdPqvAzDPPvNJKK9ksBl+RQ63uueee9rLfb7/9GupsZfwYBq666qrll1/+2muvVV6OxTnsGIwNcFu0JiW4xSGHHIJtvCXpPwebvVH5jrHCh316yyU2nuOOO86y8ew9eoUcXH/99eGdsTaJ6A3cgo0kSIlN+BAs2/KttdZahx12GCFhL7n55ptZOyyMFmKhFIuRGH5pqJrQFl6CZxRegqloK7gdp5h4HeUtuaYtvz/84Q/2KzYNGHqqmBWskww/wVfuuOMOlMtA3AIbA6EsLRBg8mmN5/e+9z2kROtwxuEQOx2odj7TiUAikAgkAmMAgcHw4wiYgHXMMeEvoCmd+pQXOyJBGTtSjZttthnHDcbASBBswz4vhx56qKu77LILUmKLFqwlHC60pnyanjQ7yNtDmN3l4osv9rkvv9lPSYYWGnHBBReMDYf5LNCRBRZYYMjPd6o3SAmCRa3edtttNLGE/tPZn/3sZ4ni0Imm7d+rY2gWq8/ee+9N3yvGrCJoRoHf/e53jmHDwANYGpAVLEHJ4C6BVTh3WBdCZqy7ytwS9ozIrB7tMOy0OMXQPh1W3Y7NFupASgiHntbDAKPz8GT/YPUR4nPwwQcDX5WwlIgR/upXv4pQ4m0MXTrZAgHGnhZ4ujVIiTI33HCD1tE15Ckeg2r/M50IJAKJQCIw6AgMBi8JXWt3tCOOOIKtgjr35e0z3VwbrpnZZ5/dbcBRfNCjF5wjHCKXXnqpTOqZtjY/5fzzz/fVfswxx7CFcIuwu4QfhzcE1VhvvfWI+spXvmLrPo6ep556qtl9FY/iEpkmwZrvQ+myxDh+/OMfxxL0qllF+aGtKVcaWnWnW2+9tclEmqbaV1llFWUMwVEUCxMRNqCryEG4bwzB0Nh4FMAtHPXckdvF0c9gYRKGokAsXFRMGq6GrUJCi/8p3einJLNHLK/OSKN7SuFGzB7MFdKoAPfNRz7yEYPVhML4ivyyoa5Yn3XWWWexxRaTiT2ccMIJ6JexKLz//vuHFachAswqzfDUc+E7BIKC5WbJJZeEGLYUY5Sfv0QgEUgEEoExg8Bg8JKAGwMI4wflvf322/NQCBNhZgjTCFsC6wjmIfJAeXu7OGIhnCNUrDSjSySkWSxCW6MsDC14zE9+8pNvfOMb4aeIuorV/6JWxJTQiwwMDAY0PeMNlwp2wpxTXytyYg2uueeeu2zzpudh/5hiiilC0VLeok/0hLci/BoxXYgDiDvmmWeeoewPOuggwyfz2WefrW8r/FkR6hu9RYNAx6wiLhWBQAvqa5Uc1EexAw44QB/wANYjfA50TESMIuHkMlhMDvNDrRr2IVYuWXjhhbEoZdAXPcFpIh62GQLN8IS2GwRqlCj6ud122x1//PGlz5lIBBKBRCARGDsICJjozk/Q4ic/+cnhtUUF0sc1PwYS+pJAGtcllhIkI+SLb2BWkf7Sl77kknzWEQnxE4gLplItKd+836jIAuGU/SNOGx55cFAEl1g1KG/f7tKU7pFHHqmuphvWiky9CvkMCYJg9Mcpc0hcDQly/HCjyERBnD7wwAP1YhlUXOLaqF7CSGTGEB577DHp6q+mcLVipC1gb4BRBcJYjnxGETmcMvXlo89oU/USsoL9qAJ/ES3omjRThzItEGiG52mnnaZ6Qana0DhJ+9+BwDgZbA4zEUgExjkCg2EvKUZ7So5+EojAsEEj8hpgiGEVEH9Q7BA77LCDj36shUNBYfk+r31w++wmgT+IvwZJ4oxglmAPKK4Nb3+WAGEQLYgnP4IvfgUwgFNOOQURET9BQlhZ0BRWjWbVxV4IdD3ppJMwIeqWHBLQiyjPb0J56wM3U7hj5Iuoddx5553DLBQluTZwhdgJL1xLke9o9Xec4Je//KW0sZ9zzjliXwzTqSgWnS8lGyY4g3RPeT0R9Bo2HjHCcgR51Kz1Yk5vmEZq+iBTGIqYkgsvvBARZHdh1kLFtNgCgWZ4xjBZicJvVbrtbpZwnJKZiUQgEUgEEoGBRmAivKw7AxA9IFJBkMTwmqOGeWGa+SCKggzhvvL5RyJig+qiqiOfEAGkokPM07GxHGWP39DxlHHpFVECPzGhklOTUN7PJCDQGQ4+JAYzyqA4Qm4Lz6ipWE7Ni/EzTzh4VclvmMCuKHVOKyQDl8IAeKxwIBYaTeuDeI6aiqJV6OyYFBOXeF54Vbi9Yj5OTfl2TjESbjIloYpaoV8cWAiHmUSmRLGIFJCr0gTeGilYYlZzufQWALUItMDzwAMPRGhUh62J0wBn+OEVGsmISmf6PzHC/53+H2D2MBFIBBKBgkBXeQlnSs0nb+nHQCfQFGyGNUKYyGgMhHxWBwG55r+EfLQJS/jEJz4R9owhGz355JN5dkSu8NEMWbhZAWTO8ieiYkXkKIMn8Z2xQkk0qzK8/IZ4mu9jerBJ2iETH2LZ4mmKlVqG19Cg1BLWYxbSsDn9oAwz+5kIJAKJAAS6x0s0ZuLMmOQlXXuSsB/mHzygeKzabFo0K2tTTPZps0qzYmxRgl0Ye2J6UbNio5TPSiRohslkXE3GwUvgGXHQowRsik0EEoFEoE8Q6HZ8CYt0n4x8ELshbsPSIxNKSoTLWLulTOUd4cD5nkTn9ISU6Dk6YvW2cUVKRni/snoikAgkAoOFQFd5SYRwDhZAY6C3AmyNYo011hgDYxmfQ4hl9Mbn2HPUiUAiMN4Q6CovGW/g9sl4hZUwMDRb47VPOpndaI2AFXdaF8iriUAikAiMDQS6zUtMyRkbwA3QKOyxZ2WzAepwdrUGATPIPvrRj9Zk5mkikAgkAmMSga7GvQouGatTcvr54TD/ts1pO/08inHbtwx6Hbe3PgeeCIxPBLpqL/HNJ8QkQ1+7/KglKeky4J1tzg4JnRWY0hKBRCAR6GcEuspLAGHN+HTl9PMDkX3rKwTSWNJXtyM7kwgkAl1AoNu8JNzkaTLpwq3NJsYGArGr89gYS44iEUgEEoEhEehqfEn0JqJM7AuToXxD3p4sMJ4RyGVex/Pdz7EnAuMWgW7bSwCNjvgEFACbVpNx+9jlwIdEIEnJkBBlgUQgERiTCEzak1HFitqoSVpNeoJ/NtrnCCQp6fMblN1LBBKB0UNgkn322Wf0pLeQvPzyy9tUdo899nCUblEyLyUC4wcBRsTtttvOPsy5S9/4uek50kQgEagi0DNeohNBTSQ22mgj7MQyG3POOWe1c5lOBMYPAsFIrr/++iWXXPLII48cPwPPkSYCiUAiUEWgB3Gv1eZLmuHaOg2xgY73cuTn2tsFn0yMSQSuvvrq2PvGiq4efrPoMxh8TN7oHFQikAi0j0C/8JLosU9Gb+pIl/d1+4PJkonAQCBQNrAMCh78OxnJQNy77GQikAiMNgL9xUtGe7TjUD6qZyG7DFYYh7c+h5wIJAKJwCAi0IN5woMIU/Y5EUgEEoFEIBFIBLqAQPKSLoCcTSQCiUAikAgkAolAWwgkL2kLpiyUCCQCiUAikAgkAl1AIHlJF0DOJhKBRCARSAQSgUSgLQSSl7QFUxZKBBKBRCARSAQSgS4gkLykCyBnE4lAIpAIJAKJQCLQFgLJS9qCKQslAolAIpAIJAKJQBcQSF7SBZCziUQgEUgEEoFEIBFoC4HkJW3BlIUSgUQgEUgEEoFEoAsIJC/pAsjZRCKQCCQCiUAikAi0hUDykrZgykKJQCKQCCQCiUAi0AUEkpd0AeQeN2Gv2h73IJtPBBKBRCARSATaQyB5SXs4ZalEIBFIBBKBRCARGH0EkpeMPsbZQiKQCCQCiUAikAi0h0DykvZwylKJQCKQCCQCiUAiMPoIJC8ZfYyzhUQgEUgEEoFEIBFoD4HkJe3hlKUSgUQgEUgEEoFEYPQRmOjNN98c/VayhV4iMPvssz/88MO97EG2nQgkAolAIpAItIdA2kvawylLJQKJQCKQCCQCicDoI5C8ZPQxzhYSgUQgEUgEEoFEoD0Ekpe0h1OWSgQSgUQgEUgEEoHRRyB5yehjnC0kAolAIpAIJAKJQHsIJC9pD6cslQgkAolAIpAIJAKjj0DyktHHOFtIBBKBRCARSAQSgfYQSF7SHk5ZKhFIBBKBRCARSARGH4HkJaOPcbaQCCQCiUAikAgkAu0hkLykPZyyVCKQCCQCiUAikAiMPgLJS0Yf42whEUgEEoFEIBFIBNpDIHlJezhlqUQgEUgEEoFEIBEYfQSSl4w+xtlCIpAIJAKJQCKQCLSHQPKS9nDKUolAIpAIJAKJQCIw+ggkLxl9jLOFRCARSAQSgUQgEWgPgeQl7eGUpRKBRCARSAQSgURg9BFIXjL6GPdBC7/97W/7oBfZhUQgEUgEEoFEYAgEkpcMAVBeTgQSgUQgEUgEEoGuIZC8pGtQZ0OJQCKQCCQCiUAiMAQCyUuGACgvJwKJQCKQCCQCiUDXEEhe0jWos6FEIBFIBBKBRCARGAKB5CVDADQGLn/wgx8cA6PIISQCiUAikAiMBwSSl4yHu5xjTAQSgUQgEUgEBgOB5CWDcZ+yl4lAIpAIJAKJwHhAIHnJeLjLOcZEIBFIBBKBRGAwEEheMhj3KXuZCCQCiUAikAiMBwSSl4yHu5xjTAQSgUQgEUgEBgOB5CWDcZ+yl4lAIpAIJAKJwHhAIHnJeLjLOcZEIBFIBBKBRGAwEEheMhj3KXuZCCQCiUAikAiMBwSSl4yHu5xjTAQSgUQgEUgEBgOB5CWDcZ+yl4lAIpAIJAKJwHhAIHnJeLjLOcZEIBFIBBKBRGAwEEheMhj3KXuZCCQCiUAikAiMBwSSl4yHu5xjTAQSgUQgEUgEBgOB5CWDcZ+yl4lAIpAIJAKJwHhAIHnJeLjLOcZEIBFIBBKBRGAwEEheMhj3KXuZCCQCiUAikAiMBwSSl4yHu5xjTAQSgUQgEUgEBgOB5CWDcZ9G2Murr756hBKyeiKQCCQCiUAi0AUEkpd0AeRsIhFIBBKBRCARSATaQiB5SVswZaFEIBFIBBKBRCAR6AICyUu6AHI2kQgkAolAIpAIJAJtIZC8pC2YslAikAgkAolAIpAIdAGBid58880uNJNNdA2B3/72t9FWiXX93e9+J2fJJZes6cOee+5Zk5OniUAikAgkAolAbxFIXtJb/DvcOlKy8cYbv/3tb6/KffHFF6unJf3www+XdCYSgUQgEUgEEoF+QGDSfuhE9qFTCHz0ox+de+6577vvviEFbr311kOWyQKJQCKQCCQCiUCXEUh7SZcBH/XmmExwjmY2ktJ8GksKFJlIBBKBRCAR6B8EBiPu9dVXXw3IJF5//fX+ga8Pe8Jk8u53v7t1x9JY0hqfvJoIJAKJQCLQKwT6mpfceOON22233bzzzrvgggu+8sorMNp0002POOKIzoL15JNPHnXUUXvssccPfvCDJ554orPCeyJtn332qQkxqenGcsstV5OTp4lAIpAIJAKJQD8g0I/xJS+99NKll1569NFH33777bPNNtvOO+887bTTTj755PB6+eWX77rrLgWQCZnTTz/9CEG89dZbP/3pTxM7xxxzPPjgg6eccsrll18+wwwzjFBsb6uHyaRZlAljiQK97WG2nggkAolAIpAINESg73jJG2+8seyyyz799NNLLbXUMcccs8oqq0w88cSIyIknnog33HPPPTfddNP5559vMPPMM88VV1zRcFTtZ5555plIyemnn77MMsuQvO666yJDY0BtM5m0E2XSPlBZMhFIBBKBRCAR6AICfcdLsJCZZpoJLxFKwobhFArnnnvu3nvvHXCwoOy0006LLLLIXHPNVQC6++67zzjjjBlnnHG99dZTXb7q55133jTTTLPSSiuFkFK4mth2223RIKRE5hRTTOH497//vVpgQNMtTCa5bMmA3tPsdiKQCCQC4wGBfpyPwzpy7LHHnnDCCdiJSIhdd911/vnnv/LKKxdddNF99933mWeeOe2006r35uSTT9599925dZ577jms5brrrnvhhRfWX3/9O++8UzF2FyEpQVaqterT+++/v0CTG264YdZZZ62/OnA59RNz3vWud62zzjrJSwbuVmaHE4FEIBEYPwj0Y9zrVFNNJdzVKqWHH364QNQ111wTKeHQwS2mnnrq1157rXp7BIggJUssscTFF19M6T7yyCP8MmeddRZSgqkgNyRstdVW1So1aRG1jz76KDlIyYorrjg2SIkxhsmkOti//e1vSUqqgGQ6EUgEEoFEoN8Q6Dtecu+996IIYJpsssnWXnvtiy66aL755hNcUoCLOcPW5zjggANkHnbYYfiH0BC+GO4ehAatiZBPPh1OnL322uuWW2556KGHioRq4rjjjhOnsvTSS2M/8hGUyy67rFpgoNM1E3NyevBA383sfCKQCCQC4wGBvosvQRROPfVUdgtcgUMHnzABZ7HFFoubwWQSsa7iVY8//vhddtmFWWWTTTbZfvvtTSqeZZZZkAwlRcg6spqIHbn22mulG4aY2BuI5n7b297G+4MP8fjgJVtsscUXv/jFb37zm9HiQB+ZTKoLrKWxZKDvZnY+EUgEEoHxgMAkFHNfjXPhhRe2ctr9999/wQUX3Hzzzf/+97833HDD3XbbbZJJJtFPpyeddNLZZ599zjnn4CIf/vCHTes1qXijjTayxknM733qqacYTjASFheRJRY2FZUSka01I51ooon+8Y9/sLU8++yzn/vc5xTecsstWV8OPvjgbbbZZtJJ+4601fS/nVPc7ve//72SjCXLL798O1WyTCKQCCQCiUAi0CsE+jHutQUWZhFz37Bq8NGssMIKSoqNXWONNR5//PHVVlsNEUFlpBdYYAHxKF//+teff/55ISktBLrEMURsTMaJkhZHaSdOtrXY/rk6++yz60wuPN8/dyR7kggkAolAItAMgQEzCXDHiHKtDsbc4AsvvFAYithY+RaEXXnllU0q/te//uV0SFKiDDbjWP2NJVJiXBlWUr25mU4EEoFEIBHoZwQGzF7SJpQ77rijuAorxrZZPoslAolAIpAIJAKJQD8g0HfzcToCyuKLL27aMCdOR6SlkEQgEUgEEoFEIBHoDgJjk5eIODG72EIm3QExW0kEEoFEIBFIBBKBjiAwNv04HYEmhSQCiUAikAgkAolAlxEYm/aSLoOYzSUCiUAikAgkAolARxBIXtIRGFNIIpAIJAKJQCKQCHQAgQGYJ2z/uQ4MNEUkAn2JgDV5+7Jf2alEIBFIBHqDQN/xEizkBz/4QYDxhz/8QeKDH/xgb7DJVhOBUUYgnnCNxEO+5JJL5l4Bowx5ik8EEoF+R6CPeMl+++1nsxuAWac1YMtPyX5/fLJ/HUIAHb/66qsJszhvrIOXBKVD0KaYRCARGDAE+mI+zlprreXD0ScjRpJcZMCeoOxupxFA0H/yk5+QiqAkO+k0uikvEUgE+h2B3vMSpIT5ernllktG0u8PS/aviwgEO0lq0kXIs6lEIBHoCwR6zEuClORHYV88C9mJPkOAc2fjjTdOatJntyW7kwgkAqOLQC/jS5KUjO69TekDjgAL4imnnIKaGEdy9wG/mdn9RCARaBeBntlLmKn1Md+27d6oLDdeEQirCYKSjs7x+gjkuBOB8YVAz3iJeQcPP/zw+AI7R5sIDAuBmKp2/vnnD6t2VkoEEoFEYJAQ6M16r96zMRlykKDKviYCPUJAVHhZ6aRHXchmE4FEIBHoEgK94SUxDbJLQ8xmEoEBR4AHxyx6Dp0BH0d2PxFIBBKBoRHogR8njdJD35YskQj8NwJIiXWQ05Xz36jkWSKQCIxBBHpjL7FgyRjEMoeUCIwaAhH0miaTUQM4BScCiUC/INADXmKxef7yfgEg+5EIJAKJQCKQCCQCfYNAD3hJRvD1zd3PjgwYArGHzoB1OrubCCQCicCEINADXqJ7uRLDhNyjLJsI/AeB9H7mc5AIJALjAYHe8JLxgGyOMRHoLAK8n7HhdmfFprREIBFIBPoKgW7zEoF7Zjz2FQTZmUQgEUgEEoFEIBHoEwS6zUu6POxbb731wx/+8J/+9KdRbfdHP/rRQgst9JnPfOaRRx4Z1YZS+Ogh8MYbb/z6179+8803R6+JlJwIJAKJQCIwJAJjnJdcf/31uMJIIm2ff/75f/zjHy1wfPDBBw888MDnnntOWyussALd1qLwgF569dVXH3300SE7PyRWQ0oYRoG//e1vujeMijVVLrnkkk033fTaa6+tye+r05E8yX01kOxMIpAIJALNEBhIXkIV2VvnpZdeOvHEE08//fSjjjrq6KOPbqic7r33XiN/+eWXm41/yHy7uW6zzTYtioWq+Na3voWRzDTTTHQbDdei/CBeOvjggz/+8Y//61//at35IbFqXX0YV//9738vvvjixxxzzDDq1lR54okn5Ew33XQ1+XmaCCQCiUAi0E0EJu1mYx1pCx1ZccUV2SdqpE0xxRRf+MIXajLvv/9+OarU5Ld/SvM99NBDNeUxoXvuued73/veZJNN9uSTT7q62mqrvfvd777ooos+97nPbbXVVqeddtqyyy5bU2twT4GA2z3++ONzzjlnGYVQoeOPP37//fefbbbZInNIrErdTiVee+01ou64446RC3zmmWcIwSxHLiolJAKJQCKQCAwbgcHjJaiAT+SJJ54YM+BD2WijjZZeemnqZIkllqhHQRmZ008/ff2lIXOEGjDM8E08/fTTX/nKV2hlunmBBRawHPgZZ5zBEjP55JMfcMABVYY07bTTnnrqqausssphhx02ZnjJ3//+9+Beu+++uyAMp1NPPfUJJ5xw+eWXX3HFFZ///OcvvfTSSSedtB2shsR8ggqwkP31r39VBUP60pe+FNtTf/rTn2aymiA5UTi8dTPMMMMw6maVRCARSAQSgU4hMJC8hPvG+O2zY/8/jGH22WdvBkeQBpaMZgUa5mMkn/3sZ8WLlKv2JXnb294muDW+p88+++xf/OIXNLECVHIpJsERsPnmm1Pe1cwupylsHRj5pz+D0HHHHVe8YLGo13zzzcdqwlCx1157fehDH7rhhhs22GCD6vzVFlgNiYNu33jjjbfccsv73vc+t6BZeTFDGsVKo4C7zFKFFH7gAx9wbFardf6zzz6r7iSTTNK6WF5NBBKBRCARGFUE/kunjmpLHRceur84EYp8XhufzvPOO+9EE00UmabklKtXXnnl4YcfLphgqaWW+trXvjbrrLO6xB5Ai6MjzACsI1tvvXWQEooqmI3YkXnmmacIRD44a1TEiihRCdaUmWeeOQr4dpez7bbbTjPNNNwc0n6+6U866aQvfvGL2o2cmmNNH3bccUeWoSjjU54B5le/+pUm1l13Xa4i+bfffvtNN93EXFGjSg2fzUC3r7nmmpomymlDgRwZxkWawULJ0FCEICWBw/e//32tl16R9qlPfWrVVVd9//vfLz0kVqV1piY2J1Ol5p9/foAwd7n005/+9Gc/+9mdd95Ziq2//vo1QyuXMJIgJRgn5PFFNw44pYCEAgJ9TMV6xzvesfrqq3sGyu2rFivpp556qkVwCarHEuanPDeiJ4e1rNTNRCKQCCQCiUDHEKCMu/mj8D75yU92pMUNN9xQMGZVFC/Dd77zHUzFjxuFbpZYb731Shlul7cuzrboootKrLTSSi7RhdK0foRKSNOa99133wMPPOCqoFc5JBchEk4NxHH55Zd3NX6IiyiT7bff/pxzzkFuZOIupdZaa60lR9xJyakmGvYhCohuid4uuOCC0dDFF1/sUpz+/ve/r8qR1gHF8Jia/HLaQuC+++7LOxOtwI0JAXtgGjnllFNkMo0UIZH44x//iB22iVVUQXpCPr9bJBAClphIuyMYBrEsIjVt1ZwqA2SZwMHDqleFubD0hMByXGONNTC/arGaNDkg9Ri4R4xhd9999+uvvx5lXnnlFU0Q5Rb7SbAV1VTvwqlHTtNdaCibSAQSgUSghwgM5HycIGVmrs4999xVgkalHXHEETPOOON222238MILS7sqEiXKnHvuuSwl9Ar9yu/gFX/XXXexB1BjChxyyCFHHnlkeHzYNvgR5phjDvliKRyLLyNEsceYe0KHoQgsFjKJpdiYHM466yyWkh/+8Icyi2mEx4dZhUVh7bXXDgk1x4Z9UIZq1BAFvM8++/j0Z7GQiYs4RpW3v/3tpsn85S9/CYEQ0AH0SB8ip+bYWiDS8OUvf1kVODCWTDXVVIsssgijhVZkCrWpkSaS49BDD20TK3VRIrHJPGJmUbkFLCIyjQshCOSNVHOCeMKOVdNc9ZTLJmJB9C1sWuUqSmFFGacWSCX8tttu23PPPSVQwwCtlKwmNE3OZpttttNOOzFWMYogSWGT23vvvdnPVEeG8Da1qn6rqpBMJwKJQCKQCIwQgQHmJWIesYHq+M0XpfMEY+688870pSgQV2kUrg12eKYUp5wIvA+2GvFFjjcoHwGP9BbeQKnjAX/+85/ZQkJyzPigzuM0GMCUU07p1AxhmpsCk95ll10Ew1JXIm2vu+662DC5BFGaZ6vMbrvtFhVDVPXYrA++3SO0Ey+hJqlMtViJHDkd6HKxNYstttjHPvYxKh+r8B0/11xzMZY081k0E0hh417ackRuKHJNFK9KoBFQyFcMlZbACaorarTGSnlmDEdNcN985CMfOfPMM+EvSEVgMi4orvbFF190NECX3DKFh/y5NeVmMWxgq+GO8WzwDZHslGNujz32cMfD+9NCptggPeEjW2eddTwqDCeO4b4RN8Mf5C6oHje9hZzuXNLV7jSUrSQCiUAi0DUEBpWXxGoTQgcKUvQTxbPyyitTRb7sN9lkE5EHYbH45S9/ScU6ZQygEaMKm4pprtLyI4f6oeYFIlCctFFkIisS//znPx0pdQxAOghHqO0oUIRQtIS8853vVB5T0RNqXuQE6wvHU8isP5bqNX3QcwLxLQMxOn1jaVhmmWVCAl7iUz6sBYxDRqewKBZsqb6JyGkmMPBUhiuH6o0mmAeilkAZiRdeeMERKbHtIluRtKAcHC44itPWWLlBahmCgGUVUTGuInN5gkbouf4zBSEQLrFYYCcxzZvkFj/Vi70EO2TCCU7GllZCYfCVsHOYTN5MVBhsTO9C7AwkplO5xdieKsxpa665JnsYM9tBBx3EWdZMTtfykRIcGpjJTrqGeTaUCCQCXUBg8OJemdZ5T8IW4vubwqCWKBI6z9E7mtq48MILqU92EZEiTCMnn3xyxIqaSCLxzW9+s4qs+AanaArXgATPhSMrS4RzRjSlMAvzbjh6NBFKWkL4hZJhsxFjIV1+hMgXNsElEZl8FsWjVIqVRLM+oEdsEmJLuWZK4UjoAN+QX5zGNz0zQ7ifagqX02YCowNiPhgMFOYL8wMCv4bT0NloiigNkMrhu3F817veJfaWmSQYQGusHnvsMYRPFa4cP9WrP0E5rE1f/epXRQ3jFqZcCfdxs/ShWqw+TSDzhv6LByJE9BIfEG7hIUFx3AWrm3gqMD9Py3ve8556CZEjpPfHP/6xjqFliBGvn3zMiQlNAjLhdWpWvfv5sSm3KWl+DEJh4up+N7LFRCARSAQ6i8Dg2UsYDHbddVfRD4Cg55guKA+OG3qUasFRxHYgJT6+uXLoS+qN8hZuojzKEqtcFBCpK9/BohpNP4lM3IVxotgPgqZ471s0hdhvf/vbUQzdiegWNEX5CEQoYn2yM66w9mMPLAHyP/GJT5Sr9QlzYhv2gY9D3EO4P0qtmHCErMjBe2huDYluQb/CPlRK1ieaCWR5QkG++93vliosBzERWs573/teo+AWwdW4M9CF4CVcSChLMUu0xgprcY/o+7BAlIZAJyKVFUp0rbk52sUn0B0FmIhiubNSuD5h4DKNy92XYDJxFF+iM0xHOKgb4U4RHty0XkLkeIS22GILzaFEOonZMFPxx8U2k64yupS6fEwcZ216mkqtjidKfBVqwkrHdtLxJlJgIpAIJAJdRmCiYoTvTsO+XH1w023Dbs637AUXXDDLLLPQCtRwzZppLP+YB/XpM7o0QbXTiHaxiXBI+kl1Qa8CYF1iIeC8KIUleByqM1RVZIfg2mA2x06ipNb9wmNCY4Ex+EdVTqS5liTC8VF/tVlO9EG0BNqkk7QsvwZGwtMRNhIBMQI8ac0Wvol64S0E8llEkG/UMiK/YuOxQAgceDFsT8gmUfI5qqq1WmOFkbCFkG/pOSNCOwyHCceaKJdddpmnQmCy+xIdYPgxGao1mVBSB5AGJhMmJdYO6jmqO4LLD+UqvS2XmiXcU/3hZvLARBk3AmvEgzEwAxchxG6ENuHEiEu9HauZ5JHn+9/xBFaJtVDcanxPNJG2k5FDnRISgUSghwgMHi8ZCVjnnXeemI+rrroqhFCHbC0m1LQIyBhJc+qKkxWSIhw11jsZhjS2BNoaraHFVcd+7A7IGhRep34QOEF94BE79thjf/Ob30RQCGVPtRtOuIqIwgxYUNiZJohvTVAfJrSwLqEgmBkuGHURYgSLDaxZfPGENtFO+XpeIgcLESxcXz3ZST0mmZMIJAIDgcD44iVxS3zpinVgMmlm4ejgnWNaZ2P3UVu+v4ctHC8R48IO1Cld2HGB7Q/NDBrhICwZEfTafsXelkSYmGeqrqtu9qchL2FBadGHZCctwMlLiUAi0J8IDF58ychxREeER3SBlPjOjik8IyclRs2HIkSjU6RkNAS2f2s4VgRwDBYpMTpOQ5GzJZ6m/fGOUskIfW0hPONOWoCTlxKBRKA/ERiPvKRrdyK8Fa0jXrvWmWxoTCJQQl9bjA474SlrUWBcXRKgZur+uBpyDjYRGCwEkpeM4v0Sy0l6zFIZxWZS9DhGYMgV3swzx11scziOQfqvoQusFu/Flvlfuf99UjO97r8v5lkikAiMLgLJS0YRX3G1vEUmHo9iGyl6fCNgJjPm0QwDlwSgsNsN6fFpJmGs5puo1Wxo1j72LWG6eLMCmZ8IJAKjisC44CXmgFhmtEymqAJqiq8F0UWBVDPbSVs31jIqrUtuueWWpvKO3mSf1q3n1fGAAMLRcD5OjB0pGcT11kxQN3WurCHU2fsYuyW02A46Vhkui/e0aN1/d2xZ1aLMkJcss2TpvyGLDVnAUs6xGOCQJRWYoMLtCMwyiUDl4RwTAABAAElEQVQHERgXvMTKY+ae1K/0AEdvFnum2Kl1gjD13rScqImjQ9bqYJjqkG1lgfGJQMMQE5mWkhtEUuIm2srKyr9lN8oWt9UqizWbFVjwxhI7LarY59JVhkzrADX8Vol2Y9eFFnJcspKNvajM72tdrPVVm1RYp7haxmpJliyq5rST5jUWSNROSWXqCw+v0TabG7fFjj766LJ717gFYRgDH7x16IcxSCtlqdXw9RHrx7f44mzYXHzJlW1ZGpbJzESgJwjw3cw777xY+ID6bi655JKjjjqqHej8G9rJ0k5Gxd5py0ybPwgfaVFdZEnMxbPjgRhYe1aAq1o+eEk7Zs54pTB4sMGgO3YLt19BVdSQaav28SjpMyEsNGaoscJaN9IeF2UnryGFRAGmXz0HnbUTfQ4hdpYHbFa3vvDwGm0mP/MDAeY0C3IutNBCCcgEITAu7CXxRdXQoxzvoIaUpQWOFqRvcTUvJQLdRMBe09UQE74bK+daucRqsN3sRqfaYsMYcjuFaMtaPgJBrr32WhEhkWMHK0FdrdcwRCDCiWNLKQsEW0enpudlxb+a/HJqzwSbHmAzsdOWLTOZbbwTJiha1q4RnMuCzzRnqWWrYFtT2DZPxoI2seOW5lonEFALLVrmmPnWe8y2G7aotL1GQ3tPi8IT1GjrLg37qu9DTLFhdQSOO97yhscdd1xPLBBIsD1KJ1RTeNLav5Vl4HbD4Giza0pHHHxF7AAlxoW95J577nFLalasj5sUlKXhpRZ3MXhJfHW1KJaXEoEuI8B3E2YSHhy8xLJ+A+fKscMRDwvbw5A+UAW4bPAY5e2s5N+cQ9ZGnoE5wmEnSIvNsGFY+6fcCAo7dk6wp6OdmCKfZ5YQl3zdhqMnNpR2FQu54oorWESUj0WW6U4bEZDJvGEDBNaaUjik+QTi37G0tIWLBJnZx6C0XhK4COcyBqb/dncSZmuLg7jqxWKV51KymqgflH2aYicHnafPDLnFskAtCrdo1H4d9pcAr1A8T1Sz1Zh0wyWj4IazbdkOO+wQPbdzghw2IZRx9913N1i7T1jA+utf/3p1KSBDtufG6quvXrMrCPsW/mej1uAE+mmRbmN0in+rUkg5StcwbAgssQN8FclI85dVhxPbYng8bPpRDGZoENMax70qWjcELXLQ2As9NrjQPXc8NjStaYKQ2BK1Jt+gLJOodUtS+Yctj7q1yG1S5qGK8p66BRdcsKbuuDj1D9nNnxeHTUa62aK2vEH8PFj17fJGu8TwW3+pRY7PMrVsPNuiTF5KBDqLgP8dT11DmfJtd1B/yf+anSbr8/s8xyvbiBCFIftJ5ylJayppjRYvccpJ+uc//7l8TAJfwUvMDS6iBGHEf64vYHt8yqd44j2gCvfHtttuazPIKC+ChEz5mIcjp0/kx5ZYFLZM3/FFuAR9iRHK1x97LEjgH9UCkWYAiK7asNNWGNUCmBnVK8cG14Zg28642mxQ0RmETOGqnIbpZoWbNeq1aQiGz2Um8bWvfa2hWJkGS0MzVinmR8fL9AkHQB1THapuh0xETQHfhNLxo4DhoJhov/+X93//Uv9vyZvNqjMMKuWqgBj55a3O7KS6G42ZuU1unPv+i1/84rbbbovWVQQgmwfuridoFoMZCTYMd0kH8CSneusq/PG8aIs0+bZRI80ApW3vZbBxVYfl1NzBuOQIN3wFa8FsdMNYZHLSxYOhhzLxsFLek0maqwbisSz54y0xwH4c7ljvBS+FhtEhHgUGNLezsEtW00jjqi4F+45vI5MtSzGs3zOksAeR/JIv4WMF9ZaIutVa1WKZTgS6jAD1bDJwfaNetdQwNV9/qZ9z/PPqXvlgbdFVL3ELB1NO/m1FSPif9QkuyiR2h/ZyQFk4NWKX6ZBDeNhL6EVXOV9U953KdgIo23SLQvUhq5g9ByhLW3ZTdcqovuOOO4ajITZviq/z6ktGGZuDYlTLL7/8OuusEw3ZPqJ+CD61w1rgSClWCzA5eMMwdMX+3gRSoi0GFZ1xjPdSVVR9ulnhho0a7HbbbUcItyAcJBoaJKIVehR1AJS92dkVbr75ZvmeQIyBv8zbUhNRXUIBe2AdcMABvE6K7b333mYn2DnLesohrRyt0OO2OmV08UHopsSlMHWX0A3swfCxLkChmz413XfhvZtssol807t0z3wlYELV/WWgigmVMX1BfA9nHG6Hx8Q+nXYli4ZYy3QVR1x//fW/9a1vMXTZ1XX++eePq7iORDEORWY5GiaVwaqnezK1iI+ibjIZezxX5HBOXXPNNVEFB8VU2Es8sfx6Rc54SwwkL/Ei8Iza2tdbgx0Px2SfLHfOVU+eLYXl+2dghnWJvdSzJeEpxIhdYpL12gpe4iUS1T2CnLXKeHpYGsn3BLuEvbJhkoZK//SnP41aWHPUymMi0J8IMBHTKCJOBouahH6tGvlbwEtbICUbbbSRMj5bHYUg+FcVdkML8qfIQRRC/0n7LPHfLRHeWy78k046yfvB14idj3xMu+SnvO9jEkiTH6G4OkbnRQHHMONHb7EZgSaUHzXjLUF7bbHFFnQtBVPd27zULQmsJd4n5FDA8cLh3PGeoVy5P5T0Xmo9KGXCFxAkySd4UXWloWqiYeH6Rrmo9MrrkfNCZzhiWiwcHL4n+1l+4xvf8Bb1dvXj8FLdWJAVlCJ8NPQ6PkGjU/aogKEpgBeW93C1qwqr5RNRTA/2sNhiiznCObZ8j4ckXEVq6aEj8N1xiWCTlL1RBI/BBiJ+qNx06h9oGJU75VlCxY4//nh18ZIYEbsafolLyfSTVl6mtIYUo4Ps6/7WxQYHDfkM9iwhwW4ugqJFj5MqKFT0U1RQ3DhOLg+eZyw4sf5oooHQsZ41kLzExxDizKQhdsx/DjMpM2y8a9wvT5gXBK8hpr/wwgtLy4w3iH88bx/PqJe1h8nLmrnP1fgvRW7wegwXccZkPRMuqcIe433HXvepT31KAc3FQq7xj6FM/hKBvkXAZ7envaHzu2/77D9U3yhs6of9338iMtGstzEr2Mcx+hWxAtz/zPVeETQZHcCTxaayzTbbUAMhNj5jYrqK/2u6jWmdPlBeFK0XCPO76bsMNnQY3elrmKogn4MmQi+jM/b+lPBC0GHKklhx9NJ0M33jLeH1wivUrOeRj/SYQSNNZfoWMuSIn/Wq8WEdsb34TetBqU6Oox1J9YHhISy70UT9sb5ww0b1n0XKa9Zr0HC4IertGSFctyldNpVgHpgZRhJvV3NSGAB87Hmp8jYqD1g5YQ0iGYNkovBmru9n5Jii5f4yrrDBKOlhRpJipWM3Rb6oFE4xhQsb44hheJCDZHjVSyAHIY0vRgIC+BA+6hb7eQa0QlkwmCGjmqAI2FrcGqyxGGlCAv6EL2pUYTlhzolLNcdgmbgXB9bKK6/savAMTfu09pwACiGDxvbbb+/JYe/xgPni/fWvfy0hNoXVrfrVXSN/zJ6iad38jTy+xAPhX8VDHN12g536+dCJHE+AJ9KLxqnbHFFvCqAXnKMSnt0o6X39n5r/L8ebK04ZVMKdTI5/V1RXfvHyMgBGsZIT0vKYCIwqAi3iS4Zsd7ACTahV/2Jeyj5e43/NB3GLMVJCiqECUYY69C8sZ4011gCaTNo63gMcInS/S9SDfFpBgfIfrZYvbPmM+croQMQBeKWIc5RP+UUkSvVto6QXhaMvch2W1hZVp3z8vItKsML/y/v/f+P1oick+GR3ge/D+ycCQZz6wna19aAUQ49IiIEL24zglf/fzH+n6gs3bJTKJ9PtKLWJRRHKaTWBkURMiUxhJXAIqLGr0hn0RXVEh1h3zS2j41HnUqAqsKRpbuV9FmoCIyEZIK66Wao7pdTZNjbbbLODDz5Y1AgQ3AjEFBkKIcw8CIrgoQ033JDXDNn1JLgEZDcXi41YE62gg/z1LvlqdaoheoFAPQ9RjggEuAJqBaqXSplIEK5vJZPZSR/UJVm3GZa4C11FXORoSHBCXFKGEUVvnepbkTBOEv/T5XGOnJf4LnGrUEg9Zyd0d536eT48i545aTfYVR4cjhin8dry/+9zJx5oT5LAcpc8647cxioSpbyXlP9qz5znyX8XOfHPGWkPuvKEeOD8WjyRXQY2mxvzCIyElwBngKiJj4GiUZjBgyu0uL9YS00Zeo6QahU5JaaS+SG0PgUT/8L0tG/xalhlIRY+eatypBkkyj++m+IL3rdvvJFcZW31ivDzFvJiiS8c+rJGSDnVE2EHdKoYCEZf+d5F1Z7IYWlwbD0oBZAnbzDf2V59Tlv/ago3bBSdikhMr0RmBiON9y0TQmvhrsZYQgd7ndLBpiaFLoe/V7cWhxQSBTSHG7EiaJ0EY0QaWtStl+x+xS96Va1bclCEGtgpFNzUg1T4VrWiNALK2FOTWT0ls0q53BccSAGSq/lywuomwbHoq5tkg6ViRLTU9Koqf6ymJzKwbtqCPE9MZCyWw25U7BLTKCuciCFuV9Y2by4WWpRzjjnmEOXnn4dwtNd3BjKBeWhxySWXZIFkd5WJv3NIe6Gw7jrlFfISRO3Z2RjNBEbV9A27x3P9V2iUH1SLfMneYmg+F6m4k5ryeZoIjAYC/ndYreng4QmP6j70YyLx8IRkrXYQYJb3ihOJzCPgG4ZS53Qoc0HbkdA/ZdAmr1yvR34QQXuG4wXbYsW2+p5zkHGRCBDx+lXdx2EJVq0vPEA5PFD2iufbiojgAer5AHS1y4Rr5PYS9BYviS8SdlrsMoaAt8r03MdUrijAFBZ0OOwcvkgi31HISHxRedFjptxD6sr3X1TFhA3WYkTxleAqs2G0iO36kuA6rRbOdCIwegiM0F6iYyItPMPkjF4nU3IiME4Q8LFKBYyTwXZ5mINnLwmuxw4mlHWaaaapUj+mVMHSwuw5erEN7L4aCc8eKywOF2FCnHXWWVGNUlemhXpYXwQiyWQasY4kw1ossyPGm1lFGJfQM7N4IoRWMUZgd6ud2YyloUwkAsNGYIT2kmiX+5JTYCQGy2H3PysmAmMGAd+lrD78dC0mKI2ZwXZ/IIPKS0YJKZSFI5MBlndZExw3At88eRKj1GKKTQTaRKAjvERbDOncmryQbbabxRKBRKAGAevEmzaB37eYIVxTJU/bR6DBgj/tVx57JZlDxJeIkBKS/Y53vKPFis5jb+w5ov5HoCMvQS9T26kY7JinJqyqQsGElPb/nc0eDhYCoRpEKA5Wtweltz1Yv0S4Rp+jw1PjgUtS0ue3abx1L5bY6sioRb8O4jqwEzp2yyGaVtpi7ZMJFTjGyrMKC78bY4PqznDso2RSRazV2Z0Wx1Ur3eYlORdgXD1eOdjOIsD/0hGBsQ6seWodkdbnQszV7IceikjALJlwqp2xgIf5gNZA68naWVakzfCI6u2YoHSb6xFPkMwsHAh0m5dEqzzleQMSgUSghwhw4mA5Yk162IfhNW3Cqnlw7dS1joViJuu2U7j9MoiOlb4miO5YtcKCAtZkqy4rwFyxwQYbWLHaTEDLnFinINZKb78nIyxppquFvEYoJKsnAh1HoAe8pCM+8o4DkQITgT5HwDyazvYw4kvM0Oms2NGWZuVTu96YDTdkQ7HvjAWHhiw5QQXM7LNmksXB2qwlmn7ttde2zqm1DKxzX2pZwtEoTOuwyL3lsyz71mIt9lKrUwmWG63XrLDeKeEpJxEYCQIZ9zoS9LJuItBVBGxH19n2LMdprTZiB8jBGozEus/MIZgHlW9xzIawiCzpOCnRkKXGHM8880yLK9LuPrRij56GfeCgsaegPtt2x3IX1TJWoLY/y0EHHfTe975XvkB7i5Rb4CCikqslRyMtPIJYax+MhvCUmQiMBIHe8BKO7QF6D44E36ybCHQKAQHjHf+vIVAM7EiWke3U6IaUYw1vywtZSjV4iS3WRB1aZ8jiQ83qYi0jdOJoSwiItYssYmTJAPuq2FAtNmOzphHSY5Gk1jSCjYS/xv6xNaTEBj0MJPbYClJiCOQ7WjOptcCaweJewlNmnnnmmvw4FdRi1XzrPFmxqaZA4FZWQBAAC1s2OVH/ZjCVVZpqauVpItAFBHrAS+ITrQtjyyYSgTGDAG+L9RJGYzioCckCTfpqsTUsxDqHLCLWPzR738BZJvAD69WaKEeDCssQk9EaENu7WGixlGGKsJGN4FMkw2Kd9pSPS7F8opIWjC/LJLJzWDlaH6KMwniDDuABdjLnxLF/ln1AW+tvE4Ks4I4/2Z2YgUeHbYIRFIEokquBzDiEnCmnnDJarDna2JbfBzOrMgxkQo4t67jkrKiB6Ng+xjrx8847r+rGZeO3oHEMM3bqwHjQFLtw6Ek4uSSMyEjPPvvsKIls2VAsZyPW4J+nXUWgy+vLRnO2EMvFsHuCfDY6oAjQx1aRH73O99WufmXvXFrTwFkmYuCx2Z6dVmQK2hgSDXOhy0rh3BZ0NpbD5iHO1I5oquM9ZWM5mfRx2fh377331ooq3lQYQ7RFl1vcSNolJpAhO8AWpaQfOZtvvrkEchPSbB1sdFUJGlIg9rmt5kdaSI2rBFYvif+VyZPl6GdTMC4tCeEvisVeu7bxQ2hi07sQooBisakvWOzw8p/Ks83G0zQOt4ir4pnpPkGg2/sJx7D9B3oP9gkE2Y1EoM8RiH1tRrWToRQdR7WVdoSzlGAP9ox97LHHpOlLp5wOpS7HikzcpeQ0S7AQ2LYzrsbO4cxONslCSuhs+aHXxXlgG9/97neJtXl4lDdRhXtLDhWuAAZTbUX+kUceWc1pmA5eQoKBKMDGoyJbi7T9dY0rWE7U1UMlCweqEYhjqSuiBeeIpi+44AI5Bx54oCNRYFFFPyNHK/JLJzUUuAmttWjkVVddBQEFrrnmGlWs8hJCABJdrWk9TxOBbiLQg/k4zEHhJs/Zwl21jGVjg4zAKDlxCiQl0KTk9CqBBAgKOe6444Q+sEnoBv+CGSulP+E6CaeD5VwFmpRLNQkeilg3jHGF34Qmth/4V77yFZNQ9t13X+9ZFgJ2i/XXX5+T6Pjjj1f98MMP5xaRsGUui4KtdKl826CY38tRUuRzdkQHTFomsOTXJGJWsEUaw52EECgQobhzzjknCaJVogoGo4fbbLNNs1UxJplkEiVxCB4o25Ga12PpkRVWWIH5h0CcJuY5KhbB0RHWWjb+tZsxB5aSYvvsu656FGCLUgUarFDrrrsuZ9Niiy3miMdEx/KYCHQfgd7wEuOMKJOkJt2/5dniYCEQO+11Yc34EmjSW3xEeAgU/fGPf7zDDjugI4wWYl84IzCV6Ngss8wiYVtNep0GbbEi2aSTThpXGVdUYQwQmGJrcdGdH/vYx6hqJOCSSy7BbFZddVVkiM6muTfZZBPBHMojMTb+PPXUU1WZe+65UUOJ6AN9L1pFmkEC6YnM+iMGsOaaayIB1ilhqEB6kAYTg5VEAhyNUWCHl+Fuu+224oor2uS8XkjkBJuxElqcEihKRutGMf/88+vYZZddxvgBEEK0omklq9OADRBiTGJoFnj1B6c566yzeMeAYBozWxF7jPAUOLCgNOtJ5icCo45AN40zNW3lrus1gORpIlCDgP+RLns8ex5ownQhCINbgbfl6KOPBgjXhhXJ5JToCpfC7+BoLbIa0MqpmFMFzHxhKpAQc1q8JOwBNDGfiHw/kRmMGSr6UnIq1kRaGKy0WuwTVLi0ebwhXMRrFHNksykt1idQAcwyQluEyla9JGYaqx4/bh0LtdVXLzn33Xefkhw9olk5uXRY/+MqYhRdVUDP0QuixKk4ZRAqEu644w458TM5GQKiSQhEmywTJ5/RheEEI2EiCgRK3UwkAt1EoNv7CdfwLP+x/KZeN+HZqbmap4nAeEYgLCVdniZDMTNR9Pxf0tzXGWaYoXr3+R1MqWUekKmTTAXm6ehq61m13BMcHFNMMQVicfrpp5sag3iZ6Cv8glWGz4gQL9wyDYdw+RxAM84441NPPcWUgtzw7Fjnw5YohMS0XiQGZTRpBWHi+Kj2s1kaD6ifvMPogi6Y7FMcLs2qy4eAXpn027AMLsJpVZ1/ZOzmUVeHxnyCpRlCaU6OoXEPcVqdccYZSB6zygILLICBcXs1bCgzE4HRRqDHvMTwgpqwkXbBUj3aaKb8RKAjCNC7bOlEdZmUROd7woc6gltrIZdffjmHBW3NE8QewD0ULpXWtfJqIpAIdBmB3vMSA/YedGQ4wU54RtN20uWHIJvrHwQKI7GyRQ+ZOp3d2w70zx3JniQCiUCXEegLXlLGHLaTCCwvKw5FeHkpk4lEYIwhIPoh9r6xoquHXxRkz6l5n3hzxtiNzuEkAolAOwj0Fy+JHnsnelNHuryv2xlMlkkEBgiB4N86HBQ8+HfPGUkBMCw3PXEklT5kIhFIBMYhAv3IS8bhbcghJwJ9iEB6c/rwpmSXEoExj0DP1i8Z88jmABOBQUeAsYTBkuFk0AeS/U8EEoEBQiDtJQN0s7KriUC3EYhAk1hDrNttZ3uJQCIwLhFIXjIub3sOOhFoGwHeHGUz0KRtwLJgIpAIjAiB5CUjgi8rJwLjAQFrl/V8pbXxgHOOMRFIBCCQ8SXj5TFgkI8P3/Ey4Bxn5xBASmKdt86JTEmJQCKQCDRGIHlJY1wyNxFIBAoCMXs51j8smZlIBBKBRGA0EEheMhqopsxEYKwhYLU3KzLn3Jyxdl9zPIlA/yGQvKT/7kn2KBHoPwSYTGwTkd6c/rsz2aNEYKwhkLxkrN3RHE8iMEoIxH496c0ZJXhTbCKQCAQCyUvySUgEEoF2EQhvTruls1wikAgkAhOOQPKSCccsayQC4xUB3hzb+qTJZLze/xx3ItANBJKXdAPlbCMRGDMIpMlkzNzKHEgi0J8IJC/pz/uSvUoE+hSBMJnkWjh9enuyW4nA4COQvGTw72GOIBHoLgJMJn/4wx9yznB3Uc/WEoHxgkDykvFyp3OciUCnEMg5w51CMuUkAolAPQLJS+oxyZxEIBEYAgFzhtNkMgRGeTkRSASGhUDykmHBNpiVKJLB7Hj2uh8RGDPLrP3jH/844YQT+hHi7FMiMC4RSF4yLm97DjoRGDECsczaGIgyueuuu/baa69XX321BSR///vfW1zNS4nAwCHQz4908pKBe5yyw4lAvyCw5JJLjpmV6V966aVmsP773/9eZJFFTjzxxGYFMj8RGCwE+vyRTl4yWI9T9jYR6CMERjvKhIfl0UcfffPNN0d1zK+99hr5k08+ebNW7r//fpf+9re/NStQ8pGbPfbYowXFKSVbJH75y1+ee+65LQq0eenuu+/+/ve/32bhrhUbEqIhC3Stq2O4ofYf6Z6AkLykJ7Bno4nAGEHA8q+jYTK57rrrVlhhhYUXXnjppZf+0Ic+dNNNN7WJl4pXX311tfAhhxzCrvP0009XM6vpf/3rX07f9ra34UCXX3559VKk//KXv0i88MIL9ZdqcrRy0kknXX/99TX5E3R62WWX2bq5WuXGG28cRnDYH//4x8MOO+zll1+uiup5uiFEl1xyyZNPPhl9G7JAz4cwBjrQ/iPdk8FO2pNWs9FEIBEYGwhYy2TjjTfu7FgeeOCBz372s2TuuOOOU0011cEHH/zFL36xTcWsP88888wdd9wx2WSTkfD8888feeSRyy233AwzzNCskyJLkBJXWRfOPPPMQw89dL311qsWjpe4nlQzG6ZfeeUV+YTcfPPNjD1I22c+85mGJVtkEqLFo446Ck+aaKKJvvrVrxr7DTfcMKGOpGAkhsMghHutvfbaSyyxRIt2u3OpHqIPf/jDW2211XzzzXfeeecBecgC3enn2G6l/Ue6JzgkL+kJ7D1o1JoTPWg1mxzrCMTyr6JfO/iATTzxf+y43/rWtzbbbDMJvvCDDjqIuppyyimHhHPNNdek0X//+98ztCh86qmnUs8HHHAABd+sLp0dTpy999573nnnLR/upfxzzz0n/e53v7vk1Cd22mmnX//612GVufDCCxGd97///bPPPnt9yWY5yAdSde+990aB/ffff7bZZltooYXYaUjjlGlWsSb/n//856abbnr77bcHLzniiCOmnXZa8TE9j3NsBhGUfv7zn1900UVf+9rXWIbqMSwFuNLe+9731oy3nL7++utG/bvf/c7z48mJp6hcHTLx4osvYqjTTTddw5JQffjhh2eZZZbpp5++YYHWmU888QT855xzzmDArQu3efXWW2/1pO22225lpAg99BZffPEhJbTzSA8pZBQL8N3mb5wg4DU3Tkaaw+wmAldeeeUnP/nJzrZIp77xxhsh8/Of/7xHF3top4l77rlHYZYPhRGaRRddlLaLit7F55xzzi9+8QsOjqqoU045BYmp5khrnc3D0AgRRkMmyS3kLL/88gsuuOCnP/1pJbEoOrIqkM77zne+ozMrrbTSaaedVr1U0mJK1FVAZ+aZZx5qrFw6/fTTXSqn1YRAAVfPPvtsWjPyH3zwQYUJWW211SSuueaaavn6NGuKkco32C996Uv33XdflDHeb3/72wxXrFZXXXVVfcWaHNwCCEwyagGtXmxriEgbskBpsaYtze26665AM16/GvRKrUgY7He/+11VqvkYj56bliWzRricY4899i3B/zl84xvfcDdlPvTQQ+By01nX8OCqtGrak8AaFNV1zIz0eLA5+z71qU+pbr79U089Va1S0iVfbzFszX3uc59j/4v8ffbZh9g777yzlCcQL4/T+lG0eKSLhD5J/E+f9CO70QUEPMRdaCWbGIcI4CW02mgMnPXCc7vhhhu2L5yC8YJW/uSTT1b3z3/+szQuQgc4XXbZZR19aBaBgjmwgTilsWgdb3AKQDE/bpRtt90WpYgCzeSgTaFvVGGwKcIlcCzbCcn/8pe/HFzhkUceqRYoaTYhaQExFFjJlECnVJdglzIKXCHUKt0jX2EsxKhNeI5aIYfrx1Vf1VVR9WkhNYoJynH0C6iPOeaYOKV0V1llFWnspEaXV0Xtu+++yii8+eabS/Bk1YttARElfcEFFwxZIFqsb+vSSy+N3tL6QRqqfatJQ09hfK7ko1AIJRhRunrhTB3Ke0Lw1x/+8Ifw/8IXvvDXv/5VQhVGIM9GPa8twj1pqoMFIf7mN78pffTRR1988cUSHkXVJYI6u5ueVa0QiBQiIi6dddZZONPHP/5xaS3i6BqVwK4EXcn0bERb/gGdYrdO60fR4pEuXe2fRMa9jqItKkUnAuMEgY5PGGYt4BYxM2XnnXeG4Xbbbdc+kvSoV7bgDFpkqaWWErggyoTiWWCBBRj5qQSiaFnxHyHTBJCpp5460j/+8Y+935VhITCzBg9YddVVvfqZxxVrIYcnqLiKvN+rvf3Rj350yy23MAass8460dCkkzZ2oE8xxRQqOoYLpggJBxZlv9FGGyEH1157LaOLCF/gbLHFFpjHgQce+Pjjj++yyy5RJeSEc6qmM0VmSUQBdIRWIz9sJ1dccYUC++23n0Zp/Z/+9KfU6hlnnFFqVRN6EjwmSIlLgnvqxcpsBpERiSJqgWEUILlhWyuuuKKmXWWUYk5wm6rdq0kDkzOFb4t3L8Kc+e8QI0YRfLF+IBFGzQIHHAyVu01FHjfmNwRF6BKBAXVNQ3EKSW15rtZff32uSeaNlVdeOZ5nlhLPpGJRHaXwqGDGHjaxRCxALnFveZJ599CR3/zmN24E28wHPvCBLbfckr9G5m233aaYh3mHHXYgWRRRQ4iaPdLRyX47Ji/ptzuS/UkEBg8Bb+cOdpomEA8hSMKLGzshmeaomWXTorl1113XVSYTqlrArDSd6tV/3HHHiRFhzJBD8WMtIURUAWN7pGksL3pzahAaZnPlfb/GJTqstZwiIViFiJCvfOUrvlOpOl/AFBIOQf9hD+9617uicMNjKO/Q6z58OWKiGC7FLoWFzDHHHDpjOGBnzDc0826UwX5C0Ub5CDuIzqBWAk0aNhfznw3c97oeKs9s8Pa3v10r9G5UgYYC3AcNJTA44TTTTDON+3X88cev/tavodioTlQVIpnveMc7qF72kiELNGxrkkkmART7kFsvznexxRZzhHzD3iosasdgcQWmKTBS9jgo4thQeMS7lFuGXRka4qu8im4xGqT1hm3JZG5BRktolLTJR4av+u67785zhDczpEX1888/PyI/yhMo4CYCrlGZd77znYqRgJFErxAyXjw2RaSEoQhhUqDhKIrAmkc62u23Y2Pa3m+9zP4kAolAPyMg6NWsnE5Fv/K/GKyPv1/96le+NVksfLJzrAhcQFZcZa5HL8SoNsTEvGI6lXKda665CFHGm93rnh6iJBjGdVUYI7ow66yz0sSsF2IFQpS3tjTd411Px5sCLZRScz6mf/aznzF4tJATEmaaaSahHtI+3HEIEx8oId2mMvlZXG0xM6j0QeKxxx7jjtEuThNa3Lc1/qFva6yxhu9psCAiWqHnJICjn9tss41LyyyzDAmCNB3ZAFSnp017Cfk1R5jIwdJEZSJDSANd+L73vY9YAqlwCMBcQGhMkqqp7tQXvM7AKmB8z3veI7OhWE24VANR5DjikUMWaNiWuno744wzUszY2+GHH64zNDd7j0s1Pw8Djhszz//01s+4cFDFGgpnJ3MJ+Zh55plDFF+eBAuKW+mezj333EEd4mrNEdmqsd+ojsaB1LMEJfHRJXBVXb4et9JziDDppOc8hLsXOsAugqfiQ1w/Cm+yySYK4GHSngTASjQcRbNHmodRlb77eRDzN04Q8M8wqiPF080dGNUmUnjfItDBEBOeFNESHlduewYGQ6Ybtt9+e5+Y0uFr519oAQWngOqISJTxNo/ADgJ592XyhkQECXMI9qNwxA+KJuELoNrl+HHkM4Arz4jilMGghZxoi8JQEplwZKXQNGO74VBCUcDx2WefbfGfgklE044Gq6u4lHQJvI0CuJruyUdTIrgHlQncSknjVUAHHKsRFaUnEuZUi+UsOfrPlaC34dmJ6pQ9nlfK1CQieoa6Lfm+3d24erFRoAYimdibhgomLQo0bAueDB4kMJIxXTBCGLJbUPpTTURQjlCYcK4Jai5xMw2Fe2BIZv0qQtAaOaJZS0W2GaSqnJaSEpii26RAyUSeVMc4S46Kqnv2YC7YRdBJ3EHE1B0UdMJQF0+UiiJdhJ6UunEJyyk5DUeBtajrV/NIN+xzEdWrRMa99gr5HrTroRzVVsn3xdyiCe9o/3UtCuSlwUWAakRNOth/Wqo6q0Wa/Zx8z5h3euv3Kb3ofV1Txgd0Tfeow1AYlGjEino+feAqRveLtKgGURZi0UKOitS/WAeaNQI15MRsGv8dVBQFH4G3mFNNZ6qndKGAGBYddp3IL4k4RQL00wCr03ZcklMlEG4KYwCmxZJRld8iXcVcEzXyG1bEbIIhaYtvAk8yWIq/Wrgqth4iJatQtyjQrC1d9VQILcJIdAZ6Ld5F7FgNX0TNhHMmVsciLb7EGDWEewn1iOEHh64pCUAGjGqmBxu3UB3pZB0BWhBHT2AVpahSzfH+ZPqqimqYbjaKFo90Qzk9zJxI231nw8kOjQ4ClgEIC/PoiP8f8r1JfVA2ky/yiyPZO7dZgcwfaARG+wEbXHC4QoQOiFsUC+LDl0ISGVCCQAd3XKXn/BHCNnm+mC7EBjHb4CUlqKIU60hiVNtqX7jBch4JauEvc09ZKfi82hwgCgIuhjpmGH4c1bGT8MK0KaF1sfZH0VpOr64mL+kV8j1od1TVBl7PN8wA7oXbbGwCAuaff36rOjYrEPmM9haVYpht4bVtLcFV//BmHrKIDlmydYFOyWndyti46tXsO7KDC6yNDVhyFIlAIjBBCOR8nAmCa7wUZvU1EbE6WkZRi2o7VjOraQZkp8L4fQpwq9fvGh9GyJoQsKqEkmYnH/kmIwzdNZuDiOYzx6+00maiXk6bFYdXjDGJZ2F4dXteq+OzhXs+ouxAIpAIdB+BnI/TfcwHoEXB+ZzivJ4R6a3HdDyrLAtts94HERHMz17to1nQeM12blYiUhdxaSah5GM20gIbeeixmeFtMhJTGKqbgyAl/b/JiCB/kynaNwgX0PohYdqqBUL6oSfZh0QgERhcBJKXDO69G8Wei1zDS1hHgpcIEzNjTexIi70hxJ3p0P/+7/8KiZeIaXjVLor/cmr+WzWzJs3tMsJNRoQHNtscBGfilKlpsdlpCznNqjTLj9kTza7W5Is/0Mn2eQkvtYBBKz4VOcxCQI65miVz4BKeFnajIafUDty4ssOJQCIwJALpxxkSovFYQFyedSOsGxiDjzlmYt2dsosI+OLliamJBR3aUZo5xER5e6jaIiQumfIgOoz7JoiLyQiR31AONiM/FlowL9TaAGJEzGIorWhX+AJ/gZUJwkJTLkWCfYUQ+ixUu/JCDq08ofN4CYpQUz5OsRAKnoUGAxtSTkMJ3FhsM+wxNVdZngzH8GPRJ72yUgJHVU2xciqkptnqVaL6Dcf6j7GgQlRxa6prOpkzYkkD0whdFeNsvSYtCjc2JbI0MXoJt2YYnrL6/ni0WOYEA6LC9VczJxFIBMY4Aj2cC5RNdxkBM9Pab9ECgspzqZhJL2GRH3UF28f6DeLteXloxCIwFniI2YzWhIjVIGhldf2UN8NegjenhRzcJeZtKmm9oCI8EpqTL84/lrtutoKFPiuPByiMZxQhol7kOGUHMq9PADwy4VQESUzzq9k5pZmcIrCa4HIiHIWqZlo8W6a1Sq2BIVGzHUa1ZEkrw/+FeaAvELZGhUvmNJZ5ibQ1BMpKD8gQyTGBVkkzQp2aKdr+/h2l6Y4ktD5yOe6vscciFiBtOKVz5K0UCSi12a3VearlUiYSgUSg+wikvWSM885hDy8WymQasVQzSwMtTpSlhJ555hkRJMIIzKzxOVsWyY54jggfwQB8spurZtsO+1tGrEl8+1r6qYUcXowyedI/Q7XzDTd9qBYo6Wabg8SsRWymuskIS0aznVOaySkNVRNR2JQ/g7UmAcaDGfAoWcDbckkNt8OoVi9pUEOYwcl2XzLtkWGiE/LBreZGIH9uBE+NFS2jiqVLJaxa6YgbWWscDeLEwWPYTtrZvyPkdOrYOji6zVaEXdufDD9j6DKXEno1D0ObctosVh9nbbFXO/S2Wb1rxRi94kZ3rcVsKBHoDQLdp0LZYq8Q8C0bzpd2OsBa4Js1lo+k6lSx7CAJsUZQWAKcMp+EhYM+dhoWCAsNmTIqVJaE+Ay1gqSrfpYhai0n+qZiLDzFlRNGCIrK0ofcMSGnuthiw+GwgigZKx3RMZb+DKOFTN4NBgmWCTLZgRhLwsyDRYXwqh2oXk7D5tiH1OVkYc+Q0KLeQo+FKbxgsbqoS5ZUqq58VSMtFlwyfD0PAw/PiB6yPClJmkvRyWIjcQuQFcpVc1ggHhN91mIUVt3KGTUNjdJpR1Z9hSE/YPQwnrSwG41SnwX0gJSBykKcnk8UEICgiwd7lBodhlj3V6/0dhh1s0oiMEAIZNxrb+hg/7fKAGBHj4hUsB2XDsdqIhS5eBHrlFAen/jEJyw0IhDke9/7nt2wlLH1pVVMLBDk253xgxEFE/LJy0rhvS8qFj/YbLPNlGwmh8HA1fodNBpu+qBks1/95iCWdFO4ZpMRRIddodnOKcrXy2nYYoBjSzbSFPCJbyBsHiKFgyU02w6jRlqEnjCNoDURssMhJWQH5hAGO1TF1lg8WxyJPdLs8GJrOs25HVq0yKm9NrRIbJv7d9R0oB9Ojbd0w+wku9LY+kQOC5x4WA9YPCSlzLATzeKsWX08upprHaY9vHbbGQWKLFiHsa00YWtl65kKt3rggQeabQxUCjdMsMOxtFkMVFQ7qtqwTGSitqKy2EQ9S/5bq7u3tKjVwUu+E0RH+ddrEWjfweZSVN8hMEAcKrs6QgR8FLZvL9GWT3ZVbNlQ2o0oBx9tPiuZHOTToMowafiUl/DqlInN+HD3BpQTP5+/vj4pWt98voBbyIm26jfIaLjpg0ZL3+oTVLvWw2ZgZeuGm4x4/0bEjMLWDiGEySdsG6JkQmaNnPqG5IiBMDTNMXLYzZzMsih1s+0wGsrRFp5XLol74AaCZ8DIL0MtuSpERg4uEiWtQaciO3+cTtD+HaWtjiRGaC9BO8JihMUylYX5TcdoShayAMENtYeZx0nkkx3O2Ios7n7VVVc167844maXbLhDQ3PkkewR1UqUpBTloHf1FT3Dgnggr5/CkOsLlBzMhuFN98reJQ1HobxHxSw2PfEBoLeCh7QuyKaICmNk9Z8XUOyRDH5gQfojFsq/pC6V2CPVBWw5elriIQ8A9b9IrkkAnOsw/mUUltCQMs0k11R3iljXZ0ZOzSWfLraqwd3DpFpqeYdEPx29ZzzVTJvFSmSAIG39j19EZWJAEcj9cQb0xg2n2/7Pq6+2IUV49fN92FCqWtKEl6ItIt8kF69gaTqeYz4y413ju8fnfvE4uCQn3v4t5ChWv0FGs00formGR4P1CqtuDhL+mlKY4okXd80bU4Gyc4p0vZwioZowTFBUcyLdbDuMaLqmPAVQhVeHub2UkV/z+gZgC41b4mS92Vvv31HTgRGeTugzVtMc/YrRFrUkQSdhDPxi0lQ1g5DwHWn7AEcxUcDh3cNOQEfVQQynRB0Ql6B0KGlNQ3HaLM7arSQcXeBxIxzni/1lPOfBj7kCQ9M3vOOE+9+pUgGE0u2jg2tGgQo3jIn23Bp4dJIomJgihy2FO0/HkBiiZAap0pY+IygyCYyKTHeIBXoXVE8ZadLiasMjfxkJfphf9XlrJplNxaD8HyHlbo34Jy2yQhHe4lKLOG7GJK3rsDtosIYp0E1O+c/VkMyG/zsNR5SZg4hA8pJBvGvD7LN/7wniJcNsZtSqYUhiNeghr2NGBW+u4EOj1mDHBOsnfUmtUjaOXrhVrtaxZv5bEPcZdqJFqou7jc747+ujcuYZG7lcHaZE6Uix0j7ft912W0Yj2qjcblrK5GdtcWxFc5a9ceqRoNFZ70JtszbJ9KvaHhp2TxmPVrlEvhz7Y5MTxgP2G1cjAok5Td+CENCjpVY1UYyFQo7805EmjJcVsGYU+HfIt8Axr5xieCQ5akmHMo74JLofHZHJVMDtKAGToLCE0NZGjcDJR1+iJ8HJVOQD9S/jElJFSLWfNWmMwWa8SuoVP2BhA80ki8jG24KrsdzgglHXt0eLSyQrhscgSRF5xqYYPZHpkj7HqQGyBRpdnIYhsHqnavqfp2MDgeQlY+M+tjWKEdrY22ojC41vBCiVkQOATqEXVTmoAKNFNYcRqFgU5DNKUaW8iirqQ/lhscJxsIFq3fq0utU46+AlhFDnLCLh2qAjFWO6CO3ralgv6qXJKUHWYZ/QKxXrRxGcI/yGBLITREx0dECXeDHko5VksnZI083BYKrU1jDxJOUVCKrEwBPkLFw8iAKbU7AThdGmht2OzBgyUQGpus0kxzR1JeMHmSBD9913X7NLjFg6FtDhWJqIujGcAKRqo+UVVYa9kPmH3w1HSWNJi3s3Ni7lPOG+i/jJDiUCA4qAOIaO9FzUs1nWVVFUUc3OSu973/uocO4Gy+g54tzTTTcd9R+1+HokRK0KrxY9SqVxe1UF1qRr4qwFVSggupY9RmTx6quvzlRz2WWXOZr7jStI+6ZnyKmRU071LdJCki3Zx8U5zTTT1I+CEFpZAIoJwGQyw8RCvVNPPfUGG2yADTBFiNSO7Sd1kkyS5557bglXGUJ4WDiJBLRqKLbMNHWc+UGfESnFxKagJixPDBusShLcf0KJEZfoYc1RhDVaQIL4D3gy87CgNJMcdREd1EEaYUK/JISDNLt07733Rhy3Oe2rrrqqOG5NgFoct6h5/EPF6r3GRcD+f9q773h5imJ9/JcLKqCAgATJIEqSIFGiEkVykhwFESSD5AxXopKTSAZJkoMIShABRbIEJQcVUESyV1Cvv7fW99evdXfPnvM5Z/PW/jGnp6e7uvqZOdPPVFV3L7fccgKBPRU4nCohPI/9ikDykn69s9mvRKADCJjJMvZWMQk0gqWhiJpmmmmkfQuWHG4L2/EYMs0a4wvwpW6ERkSmmmoq+aJPjOu+3S1b7NTYz4pQ6tYmtMiDQHmRKEEClBFUGyyBBOMuCYZPU8+MmhbviTXyMRhxGLUCUQGN7rnnnqgG+oJbsOXU9mLWWWdlnMCrTDwh08wXqxiHqrpGK+YBvYiRWBnzkoiNDrJ/oBeiatAdyi+11FIWyUU70BH8gOlFgLngDIyH9YXRSO/og1RhWkZ6fKVWbTnyBeFClRuLPrrMqTSUZGhDWHivFXeU1wT6hSMyfjS4RA1auWtgIVxdsa5m2uM0tRChI1gmPANnRK2u2pnZTwiMV/mv3k8dy77UIuBfmoXc/3ntpcxJBMaOgMGPEHOYxy7K2M9mUOSgKb6VF1xwwZITCWOwL2yWkpIfduyqqa0oTlVOKR8Js5ywEHJ8jseeUFUKsC6YOc8g5D9IFf9KToV2sl6IGkEOqgTyYuBJxngaaj3W9KvthbEWt+CgwXsQCz4OvAFpEHZd1ypgeNZurD3IvYJwmHtfOa26So1yatQ/+eST2VfogFGZKqwjMbm9lIkEhdlvLr/8cmErdEAdBLIwWlQVK6fUiDUCSk6MKZRscEkfFatUgE0LJzOZXzgO60iRVhJw5n7C80pOJvoVgeQl/Xpn6/QreUkdUDKreQg0kZc0T6kmS+KRsd6xdeqM2Yw0Ijb4OMoixaUxcRJCWQWZlpyhEqgGR48gZb4P6wWTyXsSLpuhqgxgPsrCGGNzDHFFA9j9QevyBIPW4exvIpAItAgBlnyTQVokvEvE2gfRz0yfxvrwYnDQGE2HNWYwpXA5+TUWOOBXBd8wqMRe5QMOxSB0P+NLBuEuj6aPrKzXXnttXa88n/ptt93G0mttEpbk0UjvUB2LZvo2bdD4sAUa1B3wS80Keu0PGOedd14dYQXpj+50vBd2SuJLipiejiuTCrQageQlrUa4V+VbjoxfmZ+7qgPMzsLuTAHgnzY10ZQ/66Dzi1cV685TX12+6St1MwXA3oQlZ9gCpWQmqhAwfzX2K6jKH8xTHpnZZpvNdJvB7H5ze20Ffev5mhbeXLEprWsRSD9O196aTipmruAZZ5xRVwNmZ5MdzB0wUdPsRKsLWKph4YUXHt2eHXWbaF0mUoVp6Ro6IiYA8bLEk/hEUYrmOmp32AKt063XJQsCXXTRRXu9F03U39SeyrjOJkoeNFGxS/mcc845aB0f2P4mLxnYW9+o42bxmaOIdtQtZPaBpb733XffKCBSj/mkbskmZvLWmyE5OoG25GXdsXBCVLdKFVGiBCzGZatkjMp3rTU0GxSwvn7jpke+KVpjOb17FcgCQntX/6ZrHnsNNl3sAAr0LWQ21rCROgOITL92OXlJv97ZMfXLvrjmK6655pq1Ew1CrrkDhbXMP//8liuIlR7qtmq6oIHf9EshCLbdsj4EW0uUdGrmgkmD5gpZVVNQi9WWrOzpVynK6pBWwjbXsXLuaCkgzMUC4XxJZlqaEmI8wDAcvc54o/ikrRIhx/cWIoJA+LKvnFlqrxORNFbQalCgtCWuk0w+C4UFP5q5ysJsJQbTNKKMlbz5tkr5AUmAPZ04A3KvO9LNJCUdgb1TjSYv6RTy3d6uIZyKVlWy5oFYUaZUyzA4tQiSdZlWXXVVo77lBOy6cv6/f+wNuEXdXlmQwCqZNoixEqUC4upFsUnwAVkGimHG0kzCPviGbGFv8UdyKnnJPffcg5RwHtUlJVxONuaYeeaZLXCJ5dCTBGtfWpBK0C4fk4bQDgtjW/iBvwbvqSQlrlraUvlhC7DJW9HBmlEWeMDYRKXssMMOJoJa98L6E6wvhDO9DMXkNJS/RCARSAQSgWERyLjXYSEa0AKxILTB2CJLbAl2RDNXxareL7/8MkR8vthZjXPEJewEaTBIs+TXBSvWWdpoo414TzbZZBPrW8vBNpASOfZUCwcNmagDCWIGLUhlkUcMgP0Wp8FdYmXxKvkWiUIpZPpYtwCURKxGhUJZQxMpoRUupcWQ7Mg2UyUEo8Kchi2glv4iVZxWIn/RsmjLAqAWxWIvYWQqNqSqJvr+FK1synJqfQ9UdjARSASGRSDtJcNCNKAFYvzGFZgB7MTRYLlMK08fcsghQlIMTnWjZXl5gIgfMJlY0cG62lbJVBgdUTfs/6wdfowWStr+w9Ldf/rTn3hJOGiYNzCYqjUl464wkNDTBMJYT4JvhVPGpQgHQWjszmopz0ceeSTKhzEDK5KweLYVsbh+KMbqYwVMZRoUYCWyFYu2mIc0ysZzyimnqEJJthMmGetpmqCEQrEDBUmKRvv+mE6cvr/F2cFEoJ0IJC9pJ9q91Fbwkgkm+NcTUkVKhDciFjazELcRXcIDrNvNhFC3h7EIihgRbhqcAA9gV+AesmsXn0vsHxbhKTF/geGBWYIothCFb7jhBjuD1JVctj1TXisMLVRlLMFpMAPMQy1hubYLier2T5Fg8jHvxpYcLjmVWfYwa1Ag5nyavyNgheMGeQpticJUuKgsKG6RbLEmEphKg4CbUKZvjihm3y+n1jc3KzuSCHQ/AunH6f571BkNg5fE0FulAUrBjMFyMOOMM9rJYvHFFxfTaqjmNKkqGacMIYZtfh+nDBJCNARqCH21MhtLhvE7hnCeI81hBkgG24NlUQRtWMy7wQxkxKh22zMkScQJfhB0Cvvh7olNv8xnpoOcL37xi5wvYSOxJ1k4klxqUCBidc3f0QVVqIpjIUMCaRkMRrgpWgDST8c0lvTT3cy+JALdgEDuj9MNd6FNOozT/jjGciEgMXJX6Wd+jfk4ZsdYBUQxwzPDgzhTno6qknVP2TMwBtWF0KIglksy5LNGxORkcal8PSMUNcJtz6LFUIZLyCqc9ny3C0kE0uqOXwSLKNOggJlEl112GQ8OasUkI+iV/rZy5Y0a4aZodQHp6UzclCcug0t6+iam8olAVyGQvKSrbkdrlRknXtJaVf4tnbHBTB8ujz//+c8sEJiNxc2GcgYNpQ83Sju3PaOtGUBmJk833XTYGEgZdYbSre/zGUvSidP3dzk7mAi0GYHkJW0GvJPNGUStyJmftp28B/3VNl6iQ/lE9dddzd4kAh1GIONeO3wDsvlEoEcRsEpeGkt69N6l2olANyOQca/dfHdSt0SgexHIjfq6996kZolALyOQ9pJevnupeyLQIQQysqRDwGeziUD/I5D2kv6/x9nDRKAVCOSGOK1ANWUmAolAxr327TPA/R99Y2+PhP3qJGo3o8+4xb59CFrTsTSWtAbXlJoIJAL/QiB5SX8+B0iJpcxsYlfZvVhbrDIn0rlYZy0mmdMAgVyzpAE4eSkRSATGiED6ccYIYJdW/8IXvmChdESk8ldX17TG14UlM4dCoP/mBlvQz7I0Q/U38xOBRKDNCCQvaTPg7WvOXnpV9pK6bacTpy4smTkUAuYG9xmXvfXWW0888cSh+hv5b7zxRuMCeTURSASahUCX8hIrkUcPJawR3qzeDpQcJpPYha5Br/tsgGnQ07zUFAQYSzwzfcZlbVPArNgAH8v7Wo/Y3gsNyuSlRCARaBYC3cVLbDNrC1n7tM0zzzzWF9dJW9XbMKVZvf3hD3+44YYb2mpuk002Ofroo22K2yzJ3SlnWJPJ0ksv3Z2ap1bdiQBjSRueGbs52lvgxhtvfPbZZ5GGVkPx97///SMf+UiDVn7961+7+tprrzUoE5e8YWg+bLEGBexeuf/++zs2KDPCSyeccMLjjz8+wsJZrBcRGPZ5G7ZAd/a6K3iJf8Krr77arrPrrbfefffdt+eee/osi33U7Gpro1oFXnjhhbGbUu13T8j888///vvvn3LKKUstYzaxqwAAQABJREFUtVR/+5Ubm0x8+CrQnc9latWFCISxpNXPzBlnnLHaaqvttNNO2223nW2fP//5zyNDI0TDi+K8886rNH7YXNp+19/+9rcbSGCUDY+nnY+Ury353HPPyXz77bdrL1Xl3HLLLSPXtqpunNoP0h7atoSsvHrmmWeOgp9dfPHF3quVclqXvuKKK2jeOvktlewbGFDFSF/Vlt1Ju3aYqH3eHnzwQfFSpQu1Bcqlbk50fl01/292lPVML7bYYrZytdGrzWbj/cKe8dRTT2Eq1113HRBtTH/bbbeNBU3/PB4+28lef/315Ew00UQ4ylgEdn9dJhP8o/JN3f06p4ZdiABSYsRtw9QtuyEefPDBNnHkwD3ppJMEf3znO9/52te+5rUwLCyIxYEHHohAMLtG4fPPP/93v/vd8ssv36DuBx98gJfomlbmnHPOa6+9duKJJ64sz2zjdCTRWkY4hVErg9l444234447Tj311JWihk2Hndib6v7777dZ9+c+9znbbuNVSyyxxDjtEAk9b1FrBLCa/OEPf/j4xz++9957D9v6qAt4o9p9c9tttx21hA5WNL7sscced911lyetVg3gux1gtKto7dXO5tQ+b9i8x+aGG27gdqBbbYFxfSA70sHO8xKvG0jhJRjDzDPPHG8ftlDvpkBkhhlmsL88/+5ss81WMOLrvfzyyz/xiU94VgJo1b1Q7Di/wgorNHiFPfDAAyWo4phjjplqqqmKzL5MhMnkmWeeqe1dn0UJ1HYwc5qIAFJS/nGaKLZWFAOJ31FHHXXZZZfFJ/hxxx3X4D+6UoKn3cfGnXfeGbzESxkv+epXv2p0ryxWlcZLEBGTny+55BLOI6P4rLPOWlkmLLXTTjttZWZl+qGHHjKwPf3005H5rW99y1sLjXj33XdHPgx4yyFh0WXjio7MMccctCKT8dgbb4S85MILL+T7xsZUfOutt5544glfdAsuuKBuhhG6UvMGaR8zXqoITYMy5RJtxyn+BqRuK6tYweedd965++67V1555SKzbYm1115bWz6A67ZoJDLG1zWk1S3fhswGz9v3v/99I6MPeARljA9kGzoyZBP/7IKffwDx8L6T/DNvtNFGDz/8MD/LzTff7AXxjW98Y4MNNqjSkZ1TSTEojl5hLC6+kBhanPphKipWVSmnHEMHHXSQB9H/qsKbb765/5BytS8Td9xxhxfcv7H5f4eFFlro8MMP78vOZqdagYCnpc0PDBtqPKwSbOy+/kfYL+RJRYxEeT4d6ZdeeinqskBgHoZ8L5xKaV4yWEVljjRjg38c7yLpZZdd1tvGeybKPP/88zjTVVddxcQSOb6jNOSLyOvIi6X2lUIH4wQh3k4+jaJW1XGZZZZR4Ctf+QpRxx57bGWXWY9OP/30qvJxWtupffbZhwQN0cTL0xuvbsWSScKRRx6JtZQcCa9QjXpVwuqII47wcta7Sy+9tLJMZXrXXXe1YFJlTknXwuWS1zslzzrrrFJMwJ+cuDVuEzQo4KmrUqyUj4Sr7N/K33777eUGVZVpcCpmqOoqCz3Dj7vvs+3444+vulp1atwxVEWmr2LBBqVAyS85kcA7q3JqT3nxjFAeBg9zaFil1bDPW4MCP/jBD9zK8nTp5u67716rQ2dz/quzzVe27gmDpleAp9NTG5fYHv2DVRbzplDAbfOvzkwaj/K5554r4UZ6Rr3I1lhjjcoqddMeYl5Dd2iVVVZp/OjXrd5bmfz08Kn89Zb+qW1nEfDktFkBIR1YhX//eCEYrX36j0QHjIG2vrz9UxvYGEvUEtmKecj3ijDEeqWwZBRpW2yxhW9ip8YD/giJV199FcOI/xcfoOTgLlHeECjfkK+AfNFvkR9MyEjmUuSUo75oVz6LiNZVLJcqE0ayGFnJ5wmqvOSdxsVgsERuyInX41CdMuTECw1ovrsq5dRN4y5axB7KVZLV1ZDh0LvU1e23354lQ4IZphSrTOy1117xoj711FNVRDLi6lBwGdFholYUM1qr5XY7PfTQQzVEmnsnwYdS2VBlGoEIrdwI1VWsvFqbpowQ5osuushd0DV3lvwrr7xSSZkMSxIEcgXy6LnkFyQSb0OYKMmb4y74BnYrPZAeTrdGrZdffllhY3xp1LDi+9kS23iwMYuGCMGjjz5KT3VLsdrETTfdRBSxikkEaajVaqjnzaPL+0nsUAVwHdKiXZESmhjJQ1KrZ0tzhnfZDmlpadIFtibh94R96EMfWnPNNdlRuXi9kop4tkRpjwXTrgTLymSTTcbmxuGKxzDYssGGn8KjjGfg+GJ/sJYioW6C95fX2bNCAY3WLdM3mVUTc9pjkO8b9Aa8IxHu2mYQeFLQBf/yLNL+SX2JlpCRxpqYbaeA/2gE5ZVXXjHAODWcOEUaDAyGYfP+hL0XOUaaCCjBZjgX5J988smvv/66t5BBggGfHD/599xzj6j8rbfe2iuLF1imkTXkxIweR/aJIjkSKJExTHdMZZpwwgmHcqbI91KKKl76lUJ4Segj0MQgJ9/4jXwM1Sk+L+9SxRwxjEo5ddNUIt9LFdoR4ImloWiMGV6w3qVMF2uttdYkk0yi+gQT1Hf9E6LjBmZDuGIANPA3gIs0MiNCU2eZW4TvIASqiDI0WAYpISr6UldzNxcsokMM/3PNNZeKYkTqloxM3aEYDTGMYL3y4+XPJoemOEUNfeW67zvssANYGJNkIluMZ6ibWBPWIwwGPuEoMbTjOi4pNu+88zr6eVx1zZilO1iCIcmjIgR1s802gxLLilaiZNVRaEg8517Rc889t6vxtNRqNdTzRhNPJovdUAVEPsEq2kXUJIBfpUbHTzvPS84++2zh9+YD+0TwUth33319gsw000wBDe9jvBGwZrNp3B6PoLuLTGC43lleE0p6OBwxX/fey0W6rkOaB5S/udJT6NHx/xBNRIt9eeR319PSNa+Pks5EItAAgQh37ewDw8BgDBZ7UfzlDRQWc2b496lqlPVmD+OEkQaBMN3Pe99rRHW8pLwHDN4xTkwxxRSa8KHpLcFC4LtFvJq08j6ElPeyIhzLN1rEUmyG7crJGkEsglX4fDdMYjDqGtrJMdSJopDfQH+XDIdBbryvVInCeuQURTPqGFYRrMadUguHYIORIM0wr1aIqjqOP/74IlcMUUZf1E0fqWqusumKRnqmBcMwKmYkhuE000xTVb2cGolVNAbHhlxezo3hwiAfe+wxa8OYbWTU17RXsTAmpATsBmZ3yiRNv9JEZcIwgV8aNQQPGf7dCFdF9lRRusoq0kgMjigBTzMtcAtCnE4++eQglYinwnjEHUbD+GZ2R+KqAgiNoxFHNznXpHGXGG6CQnl+mDrk+8B2hF7ohus4ZUlyO4LHOK368ZS56mnZb7/9IEmC51CZulrJr33edER+mR9eW8DTzqaiDMX8Fxh8G0dfKdn+X+d5yW677eahjzAoDxlABUMddthhgYVICKTBf4g3I1Oq2y8EyePODGisZRNTjAfOvfQx4S4S5cnjnfVw16JJODlk+j9cffXVV1xxRa8t9piIe6ot3085xUZSEv3Uu+xLixBoW7hr0d//su9mM2u830tmvMc5X/yzoxc+6Mul2gSLiEzDTwzqXs0GDAJFg37pS19CKfhEjDQ+b3w7KumtEvOMIgbTC8dVBMV3LWu894MhVo4xw8Dp49hotM466xjPRD8IxvdeMi6GGrGSIau+WTnsDb7m2fBd8p1tDPZp7rOKUzUKD3WkRuijoRgXg1IYub2vvLVUNJw37pQylIkOeiVycAf3qm2UrYhuYefALbAuABqwdcG9MEIbVn3v6Yg3c231yPECl2ATYsPAXUxTwGMaw+UFbqIDhxHuy5UQsPCbMIRzVPm89ENZhrKXhInLWMCAQUn8CZfSkV122YXaQ+kp39PlCBx1URPPCbYKcyZ2EIEanQ3uSGYEwypcDPB03njjjUOOQV1hliHzOhmcSEY7jCzBdYrxhiE/hipMSxV1h4oRBrKRS+SyGwFzOPhEH0qr6IVj5fMWz3B8qNctgPRgVNie/wUFEFbHrvthl938Y65Egf3n+1oKPb2Y+OrcvG222QaXjDSHn/98BfCVxt3h62ULRYD8H3qG/Lt6KBtX6ZurQPPrm+5kR1qNwL+CXdseH21giAfV0cDAMysQQdoApr/hfQ8ryFDdZyRQ3khQ4saY6EOmgTOCE21sGfIJ0UeFJYzEMrEfL5YoL4DAqOOSEcXoYtgIZfAVn/jyDQnxCopiwi9UjDLqUsAYIMdoXZQxzKA+5ZSQqp+QAlX03ZF/wVX9NdyWYj6mhdY17pTCjCtFTonkKEJKIkJA9A7/MGj5cgvdvBjBondG7lJYSEdlaE7J56sSEoFKRo53LP2Hhcsnu65xSxU0IpxFxSKZoVej5bQyETFDlGTb4IxzCXXT5RIMVFlYGmjmVTAYsGfE/aIwBXBNYGIMymAkJTDZYEG4/qqI/TglHLN061HbEI5DGImkCYShAjF1A6eUINbDAEmnWGlU8XR5nCJddWTkIwGjLfmQ8bTU1UqZ2udNJuTj4axbAGmDWzyiSnoaS1vdk+jJ/YQZtdxj0Pv0YYPyAeH7iQnEv2vX8b5uUsh3CXW8d7pJqdSlSxHwtHgF+9Ruv36oCdOmF70ByTvdJ6N/c2ZOn85GKW9tHpbGk2YNLbwYlasTeR177cZHdvTIcMsWwu9DppFp4YUXll9iTeS4qpXiEfbNw78gU/nK2bNyjBzF9cz34WvHlxKrQ8RkGAVFt/ie9iWtawwJmkYajIh1scWcUCVN+C7Xa2UorzvFcsAZxBDCPNygU2rhZ16M3pYiOQy6pSO1jfo6N1Bx6FRdMuPUGCYTXRCv4HnwIY42MXJXlfQBqblKeFkacJph4aqSg6mwfMMHy/nUpz6FoNCNVYMfv6pknCqp0YKMTMhMOeWU4l1qy8e4W4WDgTlyyq2vrBhX9c7PcONYi1KRUFmxpGurUMPV8LCUYpHwaLlZXFT4EC+BfwFk0V3GmLVeCle2WPu8IVtuVpFfWyDkLLrooquuumpj02Npsc2JnuQltRj5P3FHGf1qL2VOIpAIjAIBK2cI4WJsH0XdrFKFgBgUjiTOl+mmm85HqmG+Ma+qqt7BU54R3NQsXOMizRmBkMIy5rVCMdEPAlQhhpXyByE37BB1eUYrWu+4TKxC91Fbvh7rO8AcwQ3vTBN1w/aQEn6iiK5touSmiOoTXsJcxpSHa8cHSlOgSSGJwMAikKa1gb312fFBQEAsDusX9tOdne183GtTcOHuDatmU6SlkERgkBEID076+wb5Gci+9zECYla4EcVud20f609G71p1h1LM5KiupX5D6Zz5iUB3ImAODg9Od+qWWiUCicAYEeAjE7MyVHjTGIU3pXqf2EuagkUKSQQSAaEPZpJnWEk+CYlAvyIQUduV+811W0/7JL6k22BNfRKBXkSgO8NKrI0Rq0X1IqSpcyLQhQiYJ1U5ianbNEx7SbfdkdQnEegMAkgJD063hZXwhZuUEYtsdgaXbDUR6DsEupmUALvr4kusd9R3z0B2KBH4fwh0s3+k/Uu7juSxiJVebW87ksLjVMba9pY+sm7HONXKwolAItBqBDrMS7AQC+dFJx966CGJLlyrv9X3IOUPCALxhJeH3LpG3WOcYCwRVtI9+pRHwopS0tbcLDmjTliiyupkrC8h4eKLLyY2ecmo8cyKiUCLEOgYL/EeNH9ar2I9QYlu/pRsEfopdjARQMdjbzNrl8V2RZ0lBPHP2JGlXYd9AIKXVC4kOmyVoQrY88XqsZZIt8eWVTit7+kusJowxogEtBr6UBUzPxFIBNqJQAd4iYB/H47sIhhJcpF23uxsq0sQ8NjHk4+O4AQcKOFD6Qg7QZK03rUTgy007q7FfnjjdPswD2uV+vixyvhWW23laPVFC6vb89Y+fDYWIY1wizFa83vBBRcUyFK51Pc4tZWFE4FEoIkItDvuFSlhvvYS9HEWr+YmdiZFJQI9hwAuYvNYVhPkAEdpv/4cqd02MdgGIja4QZjMGrDrCkzslhLIwMq2ZxZxt6swv4xMG/+asGM9Bru5Mn7IUcvqz3PNNZdNYexMbntOW9vIxz8sc2lvEaTEliu2CCZWdTTFrp9IiU1YRJwoEG0RWKy5kZPHRCARaAMCbbWXBCnpyEdhG6DMJhKBUSPgn2LppZfedNNNSWjnPwgm1FVhLrqPlMDhzjvvlLaHma1YbK2HUjgVA2vrO4aNNddc09pQO+ywgz3wdt55Z1uo2FTFVsPK2BHX3nhMI9J2v7MJXPEB8dewl9ifHFkxH8El25tVbrxCpgCU2MXNvvYmAS233HLk5C8RSATaiUD77CVJStp5X7OtnkOA+ZAdsZ1Wk3AhtZMGjeSmoBdICcMGe8mXvvSlq6++Gs8QC6Iu6wXPyxZbbIHD4RPhdmHeOPfcc9VCU5hA7r//fhvLffWrX1Xe3vT2l7f7brSL1hxzzDEkxCRJR1veV6pkK1pWlsi55JJLJHbdddfKAplOBBKBNiDQJl7ShZ9lbQA3m0gExgmBQk0MyeNUcRSFuzas5IILLuBkEaMqpqREvbBkPPLII/fddx9ziAARFATbOPTQQ3X8j3/8o+OWW27Jd8O8oZid6F2yea8dQHh2FlpoIUf0JRZne/zxxwMuppFwEvEB4THYD5rC7uKqoJNTTjmFGyinBwZWeUwE2olAm3iJr8Bu+yxrJ8rZViIwQgRQE9EeZfL8CGuNolgXhpVEL5gxbHCPiGAVd999N1uIKBBzeh9++GEFBIKIZr3xxhsFtH7xi1/ENvAJe7UHR7EpPO6i2I9+9KNXXnnlqKOOYj6xTbzOHnTQQbFZ/IsvvhgN4T0MJNJi3eyuRRTSY4bOl7/8ZXYa+WF0icJ5TAQSgbYh0A5ewlgSkyHb1qtsKBHoXQQ4KcpKJy3qRTfbL1dYYQWxqGbQCHH93ve+59QnDYIyyyyzQOPggw8WcWINElEjIlEYS7beeusjjzwygkLwCTOB33vvvbvuumvttdfGbHwRucS/g8qwlwhVmWGGGQLVhRdeWAAsi8g3v/nNjTfeGGvZbbfdkBjtKqCkAlEyj4lAItBOBNqxP04s0pD2knbe12yrpxEQjNW6WfQRVhKjb3eixIzBnyLUo4SsmlAz6aSTIhCXXXbZZJNNJvqVmUTQq2ARO7ZX7ozK7IGIiI298MILRbmKKRGGwqCy4447Mof89a9/FeuK2eg4Jw6Wg9mIruWyCWYTgIgFXnXVVV3tTnxSq0SgvxFoOS/xEmRx7c4lm/r71mbvehcBwR9cD634ryHZbBdxGz06S59/h49G7Mh0003Ha4PAmTPc3Bv9q1/9Cim5+eabsZnmSk5piUAiMBIE2jFP2MfHSFTJMolAIhAIIA14CQ7RdPaAlHTbaiXjdNO5dfzGqcq4FmaSmXPOOZOUjCtuWT4RaBYCLY8vYSzhL2+WuiknEUgERo1ARHqlR7UBgBxAHEMCUxqUyUuJQCLQUgRazktaHcHXUnRSeCLQQQRiD51mKRAe1SQljfHkJzJtuDJgpXH5vJoIJAJNR6Adfpym26KbjkIKTAS6DYHmej8j1rUsB9Jtne0efWzgR5nZZpute1RKTRKBQUOg5faSQQM0+5sINAUB3k8+0KaIIsR02Z4OK2kWDsPKWWKJJZ577jlTfoYtmQUSgUSgRQi0lpcI3Gvpgon33nuvhQeuvPJKkwZbBFADsbbbsENYgwJ5KRHoBgRMWkFK0oMzwnsRq9SPsHAWSwQSgaYj0Fpe0nR1qwTaO4MP3h4W2I/pgpZ8sLxjVZkWnYqPs+ySdZ9aJD/FJgJNQYAHp9t25mtKv1JIIpAI9CsCPcxLLLX06KOP2mh08803t2K0qX2WUbLIo3CWO+64o+k37Iwzzth9992LgSR2/Cp7bTS9uRTYUgTsFmtHldi0tqUNjUX42GPGI6wkLSVjuQtZNxFIBNqMQDviXlvUpauuusoKSF/5yldsXB5NWL7a1uT2/dpss8041FdZZZUmNo30kG/tSJtuYCec0IRzVNlgLJbOpImhroktpqjWIWBjtltvvfWee+6xSFe/Bjl6OP0XZKxr656ilJwIJAKtQKCH7SWzzz47RCxEbQ/0iC9hO8FRzj77bPmnnnrqqPGK3byQjyeffDLSRKFBtv6ygvVSSy1lJLO5V7Ru3w07gQnj75JYOVDwMTXoO1OBlaNiw9UGxdp5yd6wiMIbb7zRokatPo6CHPfvX4Cz8sorA2HDDTd8/fXXW9Rox8V27c58HUcmFUgEEoFuRqCH7SW2KYfsJJNMwo8jANZGXzbiMcwgEPIr99zi1rFruR037J/OFzP99NMLWb322mvXXXfdKaecsur27Lvvvj4xDdtldwx0xyteCMvXvvY1+4fFB6jNSG1YKqjFfqfTTjttlRCbh5144ok2KXXJGk2Cc6sKVJ7iVU5tBYIG2TcV46lcPqFWeaPsFVdcYRHu2hW4r7nmmp122unkk0+25UdlE9JaOf744/Wa2nFJc6WhBx54YOaZZ7ZXWVWt0Z3aUC22IInqGB7aYVmIaaaZZtZZZ62SaXv673//+5Fpn/pjjz0W1XMT/SICEZHij+Ce22STTarqxqktTtDHuhHWwo9OP/30yoVAbEK74IILqmjmhZ8EbZG5qaaaCsdlYLCvW+P7VVeHbsvMsJJuuyOpTyKQCIwUAS/l1v2MqXbYapF8wQGGkL322ssoIlF+aISh6B//+Ee0y+0SlwzkEhaxls/5In3SSSdV6XbppZfKx1eiCsbzjW98g0CnhkndsSuY/dYNYyoSyHdTJcGpcTTammeeeULOTTfdVFus5GyzzTY+319++eUorzmhM3G1rvIcVcRutNFGRUIknnrqKfmaDvWqrhZRW2yxBfpis3j2iSgjTEdFMTqlCuqz0kor6akcvXY0tGN+6At7A37GR2Yzevm2lX/33XdZjOyLZqc0OWedddYiiywiAFnaL9hP4OCIG0V+HKEqU8fdEWhL6z6LlF5oKMrYLVb+LrvsUlmxMq1HCvz6178umciivWEpJt+PfMzSpiogKmXsJRun55xzjjJAM2/l38VnsCFcFNOvww47TGcrn6gioaUJsFNmdE0cfvjhrfu/G51KWSsRSAQSgREi0MN+HNuBIl/4hzHDTqFBxNhFxKIa5GJ3UAOwHEOdMclqEF70v/nNb3y4q6X8xz72MTDJibrCDG1YqkwsaWV2z/nnn88fJJ/txEe/vUWYQOaff/6wsnz0ox/1hR11y5FkxQxyhxxyiPCXb3/72y4ZzkuB2oQB+Nlnn+VTII2HiHqPPPKIYo2V17oytktVXoKFxrguIcKm1ggknynC0Y+taIoppjBUxxJSciJ6lxnp39f/dbj99ttZOHCLF154QXdQQFYNfWduMU6jWbpmMpSSOMqZZ57JQ0JbcP3whz/UcSaZ0047zVVcwU6tjFj77bffDTfcsOKKK+IHuMi/2vj3jzPFX643N1F15qjtt9+eZYtJI4wlbpCKymBvUaX2SFWZlVYrBhi1QLT88su7BGEoiYz2JJTqRx99NC7lNB4ksUEPPvhgxJqIzJCvR1is3uks/6B7pHetczYVxcaYiFhXT+8Y5WT1RCARSAQ6gkAP85KYEWO8sQG6L3WGDQjuv//+RpeAEpM44ogjpMWrmqeDbfhE5sqZaKKJYv90w6RB15FbxCXRsgob1/kFJNhCQo7hzaDI/u+0cooECsJ8EmUYGCIShcUlQmINz0Y1REeBZZddNorVPTI86IVahmR7cyjDgNFAeforQ21jP0JmtoXCNNQFNAhtqtsKGwbbBlcISwkL03rrrYerRUnmGQkmijjVKU4oab4wtEACtiwHEhgJfGLwNlpHeazCEjLSylgng3r8QbogB5vRNURE12xbH+hF4HDUVYVHzE2J05lmmklJYBaPEiakTONtY3EybLLQrFtuuQXXXHzxxclkOMF4NGE6FQRYa6KzLtEToUFZgmowDilgGXK90zWN8ohRHjFFzlh9KK93jCuhatcecwm1rr01qVgikAiMBIEe5iUxDhmM9ZN1RFyIN7JhxogbG8QbdXy4G+e23XbbwGLnnXeuHFcMzDFT1MDJ2MBcYT2SOeaYA1GoxS6MEPFpHlcpUOwl3EkxGccntQGPHGMt3YyOTAIRx1ArM3KiC8JKdGHqqac2LqJWwyr/3nvvnXDCCSTcdtttfBwibDhiNthgg6Fakc+igBOIAKWq8ogXD4i4E6EVrrKROKI4+oK7SJcYkTDkyIEnkwNTBCWZauT4uRpcxCgugUlwSynpXugIp5g4GGL1jhlJeZf4ev5d9b9qw04nnHBCl3ALPM8NooxTdaN83SPQYIg6uKpFIUSlmC4w2IAIvRBAw66DbBGuQFAfFh36OMVsPBuMNG4We1VIY2JxXwhRLExfuEsR3oUJxpJcQq0L70uqlAgkAuOAwAj9PaMr1tL4EuOisSTiRYp6PpSNHPJ9r0ewCJN+uVoS/DLKVP18WEeBCFgxKJbykeD7MMiVzAhrYFwJDxEC5JLWGR6YHEqxYRPUMNiXaA/GCQ1dcskl8usqf/PNN1dp7hSvYr9p3BYXjD4iNIrx5hjyVWTRwScCNHxiySWXlAlVR8G/rDgSfnw6jvQEu+qYkLSEzsqnZ1RBSmQG8shNSIOnAFvFOH3YIaKKqBHERaafhFrlx+gV+XGsG8RTCkswiigJtAgMolg0h1/S1m1CVhRzU9h7oqd4DMOPWlw2yJwEF1LIFNfslPkkhEgT6OgnZ1iQKxUbS3oU8SUZVjIWwLNuIpAIdAkCPWwvYZw3Nbd4H4KLsXZwyriELsQl8ztifZFC1nwfi6x0qjq/ieHZl/SBBx4oKiXKhJ8iPtxLLQmeoJicHJkhXyRE7IoeX/ZOWQXELlRWZJOo0qHyqm/cY445prghuFqM05/5zGeUqau8AZhZwlW8ASHQWUMvwhEhNZWSq9IsEOIkqM2AREMBEwogVXrKBIKUYBIWaAEF6mOMp7NIFGUEKyy00EIIk+lCs8wyixwtOurXdNNNh3xgHj7TyeRwkc/GQCu2EDYGIDNKSTNWcabw1FgJhq2CAUPCkK8v/hnUKj+ske1HRT+Zq666arlUN8H8o2kA8r+I97z++us1pAn9FTtinRJaESIS1kwfOhOCdsT8HfOD5O+www7FpiXKVbt8QwJo1l9/fTyGvSem8DDFDQtyXQ3bkBlhJWEpbENz2UQikAgkAi1CYLyqIaG5zfgYFefYuncl74DZwsXdUJT3ZRzjh/HepFn5Bk4jKGsKKwjeYNgz7ho7w4VRKkbC0MXUXzvVVnN+EZuipFU39txzT2O5Ed04bZayTGMhm4om5ptvPmElRAl69b3uEteMWc3RROOjj/Lxxx+/gfJbb721Edeg3lhO1VXKMBL4hefCVc4m9hJmgKqSlae6XHfHkFAS1ArXjtZx1SXVkRI461ERK1PQz9xzz23ukl/QxHK1JBAUvhVsY/LJJy+Z45oQF4IehcNOXZyDgUTgi065g3XvSFG+tMWIwpfk2LbNsf3vCH5qQGeLbpHw+KUHpwqTPE0EEoFeRKC3eclIEOdZ8F1ehiURAxgJr0ehFyMRMk5lmGpMAjJ7JQJH2APMheHlGSoitYHwFinPlSO2VLxwXcLRQJ92XkIOxIIADZEae7v4E7JoBtbo7ntYI0bOLMeu8DjxEuppMdebHzvsKSERSAQ6jsC/libr79+a//4JobBACJNJeEBa2mVhmCI5/PAS5hyLidVadEaoQIuUN8PIb4Q6dKqYAAtMgielKQqw6LgvoxYlzoZ3r65xZdQym1URKeGMa51Vsll6ppxEIBFIBEaCQA/Hl4yke6UMOvKpT32qDaSktCghUtIUmFGTkiKqI8qX1juVEDyr46ZDd0oBU6sYuiwox9dpPk6EmHRKmQbtmoaWq5U0wCcvJQKJQG8h0P/2kt66H6ltIGB2kh31rDrTQU+TwBRBMCKThPEyfY1rNE97biVjibCStkW9tKdT2UoikAgMMgLJSwb57ndv32MxFbHDHVTRBCuBsWY8mV1MjQ5aboYCIaJeRh4bO5SczE8EEoFEoHsQGBQ/TvcgnpqMBIGYJl27MeFI6jarDFONZehii0QMoAvtJbG0a7P6m3ISgUQgEegGBNJe0g13IXWoRgAjMX947KE51XLH/dxUaguijHu9dtRIS0k7UM42EoFEoL0I9IO9xADWXtBa2JplWK3tMWwD/dTloTrbDaRkKN0yPxFIBBKBRKBFCPQ8L7nrrrusNxobwrUIo7aJtcyrFTvsrdO4xX7qcuOe5tVEIBFIBBKBQUOg53mJ9cHcs8rt9JyaPfGlL33JMN9btzN2SH788ccr1bauq1mgFlYvmXW7zIJiOfayfFwpnIlEIBFIBBKBRKCHEOjt+BILq8eaqpYEtb4qJ4h1zA466CB7o2AqBvjYA6Un7gf3jcW7qGqhTzv1ROiALets7GK9WvmWQrE0/lBdtta7jWDsFWx6bezv0xO9TiUTgUQgEUgEEoFKBHqVl/zkJz/Zdddd33rrreiMjW/8rGM277zz2nku1k976qmnRsFLGB5+/vOfk8bi0njjmEocx5JGrTbccEMb7YYQnbK9nG1cLMHuaGcZK5+aFWKTHQuND9Vle/RcdtllSAmCMiwvsSg73qabr7766lZbbQW3seifdROBRCARSAQSgWYh0Ku8xCAdIzTTiNXK7cG7//77l2XC3377bQDZp2YomKyXZUPBRx55ZK655rK5Lv7BDnH88cfbj6bsaffkk09W8hL2DEtZ+JG5/PLL77777gjQUPJLPkphrxzDv2kdqkw//fQWD6UwtkF/8zxxArsZBymxj6DWTUU599xzWUeKECvV7r333rbxa9Blhe2IWzbFdVpX4RdeeOGEE06wkqmF+UO+bXh32mmn0lYmEoFEIBFIBBKBDiLQq7zE3vTWuLTuliF/mWWWMYoXUgJN3hzHGHpZPg455BDb9fHvxLa3t912m+1/FcADeH/8hM2effbZEcOBc1hmdPbZZ8chyo0R5LHZZpsxMIQl5tRTTyX80EMPLQXqJo4++mikxCXkiS/Gvrja4nKiycMPP2xP49hn+Ac/+MENN9yAoEw55ZQMPDpSSUpUx5mYNxp3WTFbGSsZy2wMpfAuu+wSjULAQmH27pl11lnrKp+ZiUAikAgkAolA+xHo1bhXa16xQGAJsf9cGEgKfBNOOKG0sVmsxhprrHH33XejHRdeeKFMbhpDsoq8HnY7W2+99WRiDHZCieqCVKaYYop55pmHISFyHA8++GCkhCh8ggtJjrrlat3ENddcg5RgCb/4xS8URjswJGyGVsqLUcUPwuFCSS4bpES+7hRPTRFrSzZU6d13323QZYXRoB133DFqDaWwJdWjAGPSjDPOiK595CMfKQ1lIhFIBBKBRCAR6CwCvcpLCmpCJaT/8Y9/RI7h1trhYRcxn3bTTTeVH/u/x/xbg7cc/ID7ZqmllhK3gaPYkm2RRRa57777+IMee+wxW6LgK/hEyOT0CfcNfmBcX2GFFeQzq8TVukc+lCOOOMIldVGKRRddVBBJsAo7v8jHb7SI4qAmWFERoiPRIznm3TCBSATNKvN06nZZMR1hVjELqYHCImrZZpZeemlrqzM4melT4lqKDplIBBKBRCARSAQ6hUDP85JJJ50UdrhIIChYVeBI0BSDtKH6lltusbHZSiutZAB+7bXXeFIWX3xxy4rb9f7NN99cd911b7755jCN8J5Yd9yUFsaMe++911wYlhWeEfNcCBd9stpqq4kqZfk49thjOUQa3DNNixTZfvvtsZ8otvPOO59zzjnSEb9CiFNWHxEhwlyKKJoUe4n9WWjiElUdEaYoVrfLLvEWOdr5trHCbDMXX3yxSBrUincJORO5EpLzmAgkAolAIpAIdBaBXo0vKahNPPHEyMejjz7KP2K4FVIqNAT/iAInnXTSLLPMIs3OgaBEvKegCoTDrwiJxOGHHy4cdfPNN5cQCorfEMghwvihAPNGOH2qatU95bKRr2mmlwMPPLCyTETjEh5kSJTrRRddJMZ2jjnmUIxu+Af9BaheffXVq6++ukxaOWJRIadul10K+oLWROtDKazjKNcqq6wi0kVbgl1OPvlk/iZ2nZCfx0QgEUgEEoFEoFMI9Ly9BHALLLCAkRgdMcQaXzfYYANxr8gKGmH2bCC75pprMlGIHWFUEPkRFoUCOt8KusBp8q1vfYs0Rg7unmeffVYBoSEx2XjPPfeMBc2iFrIiVCUoS5FTEoRIM6tU7WASRpQddtihTJzBD0wGFr0bdaMiv9I666wjh8nEEb9RpnL2b22XFQs9xbE2VviBBx5gyLFILjXYbFiA1L3//vsd85cIJAKJQCKQCHQWgfFMW22dBiI6GQaEZbSuCZJ5LvbZZx9xFb74BYiUSFjUhKOkNG05k/HHH58dIpwjPDum8wj7ME4LyBAByrXBhOAXrhYVRYR885vfZHjAdXh2zN9hwBDtITpEMIogFTErhfqUhiLBOcIOIc1kMt1002FOAlaYc0zDqZrIw+tEsajFIYUAMWMQy7AhNDXyuZP0pXSnbpeVVN10HgIbKMyPc+KJJ1511VX0D+Hieffbb7/K+UeRn8cOIuB/R3RUFa/toD7ZdCKQCCQC7UGgH3jJuCIlnuOss866/fbbI5ID20ACmBAkQtR7773HNCKMo/AAdhEUxHJnJUYVZcFvLMbaYHs5REcAR1kbHvURzsJPxBEzrjqPa/mRKMxKJEBYNxt0YVzbzfLNQiB5SbOQTDmJQCLQWwgMIi+JO2RWC9uJKcGV84GHvXnGcjYJDKZQlmGrMEu8/PLLTCbsN8MWbnqBUSjcdB1S4CgQSF4yCtCySiKQCPQBAj0f9zrqe8BUUBmxMUI5k//7N8LCUQwdsWDrOFVpYuFRKNzE1lNUIpAIJAKJQCIwTgj0Q9zrOHU4CycCvYJAxC/3irapZyKQCCQCTUGg5bzEEu9NUTSFJAIDhcDPfvazgepvdjYRSAQSgUCgtbzEiqIJdCKQCIwOAcsEj65i1koEEoFEoHcRaC0vCVxE8PUuQKl5IpAIJAKJQCKQCLQNgZbzkvSRt+1eZkP9hMCwG0P2U2ezL4lAIpAIFARazktKS5lIBBKBcULA9orjVD4LJwKJQCLQBwi0dv0SAFlL1LHVS772wZ3ILiQClQhY6jcXe60EJNOJQCIwIAi03F6yxx575JScAXmYspvNQsB+1zbBbpa0lJMIJAKJQA8h0HJeYkqOEJMMfe2hZyJV7TgC3/3udzuuQyqQCCQCiUBHEGg5L9ErJhO793Wke9loItBzCDCW0PmAAw7oOc1T4UQgEUgExo5AO3hJrGKSJpOx362UMCAIpBNnQG50djMRSARqEWh53Gs0GZuQXXTRRbnSWu09yJxEoCDAWGKGcMaJF0AykQgkAoOGQDvsJTBFR3wCbrrppmk1GbQnLPs7cgSSlIwcqyyZCCQC/YpA+/YTDn85apJWk359mLJfY0EgSclY0Mu6iUAi0DcIjH/IIYe0rTPLLLPMX/7yl/33399Rum3tZkOJQDcjwIi48847//73v0/3TTffptQtEUgE2oNAW3mJLgU1kdhkk02wk3/+85+zzDJLe7qarSQC3YZAMJKf//zntug7/fTTu0291CcRSAQSgfYj0Ka417odY7i2TkNsoFO2Ts21t+tilZl9g8DPfvaz2PvGeoMefrPoMxi8b25udiQRSATGjkAneUlo75PRmzrS5X099o6lhESgqxAoG1gGBQ/+nYykq+5RKpMIJALdgEDneUk3oJA6JAKJQCKQCCQCiUA3INCmecLd0NXUIRFIBBKBRCARSAS6HIHkJV1+g1K9RCARSAQSgURggBBIXjJAN7t0VUzPGmusUU4zkQgkAolAIpAIdAkCyUu65EakGolAIpAIJAKJQCLwX8lL8iFIBBKBRCARSAQSgW5BIHlJt9yJ1CMRSAQSgUQgEUgEkpfkM5AIJAKJQCKQCCQC3YJA8pJuuROpRyKQCCQCiUAikAgkL8lnIBFIBBKBRCARSAS6BYHkJd1yJ1KPRCARSAQSgUQgEUheks9AIpAIJAKJQCKQCHQLAslLuuVOpB6JQCKQCCQCiUAikLwkn4FEIBFIBBKBRCAR6BYEkpd0y51IPRKBRCARSAQSgUQgecmAPgMPPfTQgPY8u50IJAKJQCLQxQgkL+nim5OqJQKJQCKQCCQCA4ZA8pIBu+HZ3UQgEUgEEoFEoIsRSF7SxTcnVUsEEoFEIBFIBAYMgeQlA3bDs7uJQCKQCCQCiUAXI5C8pItvTqqWCCQCiUAikAgMGALj/fOf/xywLmd3/4XAjDPO+Nvf/jaxSAQSgUQgEUgEugqBtJd01e1IZRKBRCARSAQSgYFGIHnJQN/+9nT+zDPPfO+999rTVraSCCQCiUAi0NMIJC/p6dvXG8offvjhDzzwQANd33rrrX/84x8NCuSlXkTg/fff75Sv8IMPPkgq3IvPTOqcCEAgeUk+Bu1A4C9/+UuDZrbeeuvtt9++QYG81HMI/PGPf1xooYWWWGKJ3XbbrfHdb0XXTj311MUWW6wVklNmIpAItBqB5CWtRnjQ5Ych5EMf+lADIH71q1/94Q9/aFAgLhne9t9//zEOck8++eS3v/3tYdsatkCz5AzbUI8WuP7665nBNthggxtvvHG11VZ7/fXX29mRJ554Io1w7QQ820oEmohA8pImgtn/ol566aVLL720sp8//elPP/e5zzlWZlamWdSdfvSjH/3rX/969dVX/+1vf6u8Kv3mm2/+7//+7zvvvFOVX3v6pz/96YILLvj5z39ee2nkOQ8//PCJJ56oxVLl3nvvHcVuQbVyisBWJMTo/N///V8rJLdIJg/OZz/7WRTw1ltvxU2/8pWvuNEtaqtW7LPPPitzJA9Vbd3MSQQSgc4iMEFnm8/WewuBk0466bLLLvv0pz/NRB+aG+MnnHDC+eabb6iOBBGZaKKJrrvuuj322OOuu+76zne+U1n4ueeec4q4VGbWTWM28q+44or777/fIIcPrb/++nVLNsgMRnLCCSf8/e9/FwCx5pprIiW/+MUvzjvvvAa1ai/VyllkkUVqizUrxwDPJ2Kkb5bAVsthIIl7akb65ZdfvvLKK3tUDj744Fa3S761D55++mmJiSeeuA3NZROJQCLQXASSlzQXzz6Xtsoqq+AlrCPBSx555JH77rvve9/73uSTTz5Uz439Ln3sYx9be+21JZSvKvnGG2/I+eQnP1mVX3n6zW9+02c3e4nMG264AcuZY445DHiVZRqn33777S233PKxxx4LPnHaaadNNtlk6JTWSeOUaVy9XG0gp5QZYeJ3v/vdDDPMMMLCH/7whyk5cl5yzTXXCPzcZJNNinxmISDPNNNMJae5CYYxVopJJ510+umnJ3n88ccv8qeZZpqddtqp8VJJgGVimW666Ro8S0Vg40TwV/cXaI1L5tVEIBHoRgS8LPI3gAgYEUfRa66EeeaZZ9111426X//61z//+c+z0js1LP34xz++5JJLbr/9dsWK8Oeff15br7zySsmJxDPPPHPTTTcZjcQfKMBcEfl15SyzzDLa5QtQ8thjj40WKwVqVxkWC3N/SKi8FOkXX3xRXdr6cJdgtillMC055bQyIUaB7+kHP/gBr82wciorljRadvzxx/M9lZxIMB5oVPddwtj0DpivvfZaVbFyqmunn356Oa1MsB7pPrqGiJT8lVZaiVmrnDIvOT3iiCPkcMZtt912cR/NkyplahOYk5JbbbXV3XffXXm1Vmc6kK9Hfh6PjTbaiHx35MEHH3z33XdLXbfm+9//Ppkbb7yx7pT+nnXWWf+u+q/DPvvsU9mRqAsoSBY5tQq4JNLWE6V3uCY5WinlM5EIJAI9hMC/bJ75G0AEvLhH1+tdd91VXZ+kTz31lMTZZ59Nzp///OcY7w2fxqdDDz20CH/00UcVM67IueiiiwQkSvAHyfRT/sILL5TgzWkgx5gUXEfJM844owiPhObkGw6/+tWvSnD0VBWIUzpLcNkow9JTymAecpyyAxlNhWoGs8FFnLq05JJLOhr4G8spAisTBmx1v/GNb1RmGuZlcmoYR0M+g5DE7rvvXlmsMk0H/i8xv4ZbCN98882uckVxjalIzwUWWAAChQRwVMkXTRxCMBKnyBmcFSZBi6ogapWtVKZRyUI11NV6XK2rMwqiMMsZfokSzTLLLKqUn4bMyoH5sssuK5MCm2++ufIS8Hn11VdlrrDCCh4PD4ZMrIX+OksUPbFDFSHQQIGf/exnRVvMmEDSKruT6UQgEegVBDLutRuNWN2s04orrki9n/zkJ8wAPCBGcadf+9rXxBOIIPnlL38511xzGZ8YJKIX4TeJUAMMwOeyceiYY44xRyNiTRRW8uWXX24gh0F+vPHGC4H+tSIRx3vuuYcE41CQEplDzf35yEc+4mrY9iuFiI+Rj83wemAkSIPYXiGTzDNzzz23HhmJFTCyRuTmUHKUqf1FYV4nnT3wwAPJxwx4lMxi3XHHHXfeeWdVWEo0JNHA7wBqCIunQY+U3HvvvRmNDN6MPW4E8udG8NSccsopocOmm24qYaKTo7Gf3woN4sTBYwzzW2yxxdJLL63jDVrUkHunXYDwwmgFpYNArc6///3vdQqp2mabbUj+0Y9+dOedd2pI02jrUUcdtdZaaxH161//WtgH9oA3nH/++Uw1888/vyoohZICaOCvIeE+hDjFpbBYphFUTEWkioS6CribBx10kGcAXbvqqqsiNJtWxOYvEUgEeg+BXiFQqWdzEfASH51AVgdDCzsHCYY6QowW0uERCEuAU+aTsHAYj52GBcIgtMYaa3BhkBC2+iOPPNJVP9b7xnJCWxWPO+446d/85jdhhPDl7aPfh3XIwZCirShfe2QFUdJo55IxzEIXYQCQaQ0V3+i+y8k0oPpwDzMPFhXCK+1AtXJq25JjZFWXr4rNQEKLtIUeCxNHhhy2gRDOYGDorStEpquK6T7Nw8Bj/KYhA4OrpLkUcoqNxC1AVgTlaA4LxGNCZy1GYdVRhKFaPOCAAwjkkVFAXWk3q67OIfbKK6+sFOVUlUqPnlsmh/KlmFspJ2RWevoEIcmv/Jl17NRsrLoKhFEq7mkYSxRmwikNZSIRSAR6CIGMe+09KtlZjRkA1llnHWYPavjQd4xZDwZyH8rM+AbgVVdd9Vvf+tYuu+xy9NFHR/yjj1e2/amnntoQ4hvd1/Mdd9whHoWVwtewqFj8QBwDaUPJ8eHuKgmxhKjxiV9ADj70pS99iZdBYIHTYeM6RVYqJnKCDuwNiy+++BRTTCGHxQJhYmsR2xvOCHYFWrHuGGKZH3AUhhlBnVaBU75WjszaX4Cz1157keYqY4OOMEWI7ozhnBvLJa2wqfz3fw9pv4yYX0YLtAalI4r1grYwhzCFofo///M/yMRmm212zjnn0HPbbbfVnNuhRSEyhGtRRaaIKaecEkuYffbZG8xYCTwZM8ASxUSS1tU5bF1Vk3LFumrLfUcRJPwCsXPPPXfaaadl9kBYWU1YWWJBGtXlR8k4zjzzzDjixRdfzM6EV5188slch0JSCKwCLUKnxbJQz4O34IILsifhZJ6NMnGsUnKmE4FEoKsR6CEOlao2EQEv91FL82GqenxJh5CIcvAVzuxhmJFpBFWGScOnvITARpnYjAHm8ccflxM/Qaw+qQ20vunZWhrIiYYMYyqG7SG+vBlg5BiQooAjaRotp7UJQ7sqYTMQlhtf9sa8KBnTZMzciYgZha0d4hIzTNg2SuBClZzahuQIAdE1zdH2sMMOIzM+613i4JAfyERdTVSaDSoFagvPKzn77bcfNxA8SfDj+HjhhRdcFSLjFD5R0jJ0Khqw4xTJcLXSpAR8Lda1MBnXGZDcQVVgJc5DX+rqzJWmDMIRrcQR81CrNB2Z6FHcO+VZgOgvH55OzTYq1XmF5PCgIa8lrgUXYW+rqwAKFSCrxZ0XQTYKe7qKzEwkAolAryAwHkW7mjelcq1BwCTbUe9d4pnhyDfAVH7g+uj3VV0Z2+Fb2Xe5IAZEhJ1jqqmm0hWsRTEfxxwcs846q/zonxwxKL7pG8hR0uhrxs3HP/5xFCE+hckRdaGWb+tPfepTCIqgCtEbLDFDIWfwZm4R3MAKEpONtT7JJJOU8iwxomRYIIR06ELJlzBkUjuCXWrlVJYsafNELOLCgFFyIoE/CbxgtzB+L7XUUoiRUZwVx2zb2rAP+RNMMEGBl8LgnXPOOeW7HWHPCLGg0NwnPvGJqubiFL0QksLysfrqq7PBIEnKowvoS1V5qM4222x4gFum3Wh6KJ1ZNdyOSjVIq0K1yNciq1uE9cjEjfCP5ZZbrhSQ4DmqnGkcl5TU37qgoTX8gExf7k4UBoJflUqVTWQ6EUgEuhOB5CXdeV9arpXx2HfqF77whZa31PoGOEEs1yEUl5nEkiQ+nc1MKcNe69sffQuCdWjOn2VY5cdB9VgpClcbvdyGNQHF+8MNxLGiRQanusuiiELFO5krqoR1ROdKHTquQKUymU4EEoGmI5C8pOmQ9obAfuIlvYF4r2nJp8YkU7XtQK91IvVNBBKB3kNgyDi73utKapwIJALNQ4ARRcSPONPmiUxJiUAikAgMj0DykuExyhKJwAAiYF6SXgvfGcC+Z5cTgUSggwjkPOEOgp9NJwLdi4CJvibaRFxw92qZmiUCiUDfIZD2kr67pSPrkM14R1awq0vFwhXNUtFq96Oeo9QsHbpKTpKSrrodqUwiMCAIJC8ZkBvdh900o8TsGzNZmtI3s3nNdF1iiSWsNx8rfTVFbApJBBKBRCARGCcEkpeME1xZuIsQsF4FbZoVmGmlc+tqmKZrzXiLv1m5pIu62mJVfvjDH5qu3OJGUnwikAgkAiNCIHnJiGDKQl2IgCXOaGV5+KboxoNjBoo12m+99VaLetm0z4qlTZHcaiEWI2HsGUsrt9xyy3e/+91KCWYIW+akMifTiUAikAi0B4HkJe3BOVtpPgLWRSU0dioeu3QGkhAlqOLyyy8XuWJfmLGLbbUEdMq68tZAG4vvyUplwDzjjDPsRGNjXiwHP7NdkRX6W61/yk8EEoFEoAqBnI9TBUieNg0Bq4bHpi2WBm+wI92o24ug17IWPmJx7733WvPeBFc75Y6r2MpVz01F2WmnnRps0WA1dGvG2/bWSvC2vBnXtsZYvhJYLOqSSy7he7KNMLWtcjtyqO1IbPG0p59+OvSx4539ZRiN7C9jp+jPfOYzYzTDjLGbWT0RSAQGE4HkJYN530fZaw4OG9CID7U1jJVAbbZiS5q6srCENddc88UXX3T1k5/8pJ17hZSqJZRBJt4gksPI56rwVZzA4vEhRxPGXbuxGDWNkXbRswUgqhHDP0JgOzf749jdxvQZ+7wQcueddx555JF22gsJtpdrwEvsmWI7GDvYcQAtvfTS6623nq1kbBGM09gKR6MEMpzY/i2kVR1tEceocNttt0W+rWtDMZ2yq4413Ut5m+HVbnPjql1p6m5eo9cCXOwzHBLoCS6FES/bBcSOPC7VAmtvHT+XkAz7FqFK6IUfygK3su8Pjgg6wMYWd8rrC1KiPCJCbAAbrTvatC/S/FkXXHCBvXltamPrZsE3Ekgh3ezsY49loce2JMydaAp0mUgEEoExIeCjMH8DiIBt22yWNq4dF3NgwIttaSUqtxSuEmXDWwV8gtslzhgvjYg4+tn8NoZGo51adoxbdtllS3VVtttuO/vDKYn6uCph/DPM2wcnTuXYipYQu92qaGdaOfiBPfCM7kVUbYpP48wAACZ/SURBVIJRQVsKIy6bb745sRL333//AQccILP87BpjVg5KVCuBIUExtaBXuQ1vbHRstI4qripDGcYM8bm2IiLTznyxAzD/iGJHH300QoBM0B/Hsl0wyYJmXLLRj+441UdKHnrooUWTWmDJFLSrgO4fccQRLD0BiOooVFSkhlOiAEimkpHPgyOB+bkUOeWI6rkLNvbTEXXhJuZGgmJgtHuiVs455xw5fieddFKpmIlEIBFIBMaCwH+NpXLW7V0ERsdL7DAX45Dhbe+99zaYDcUDFDDwF3yMZCoqHxvfM2kcc8wxcp555hnHXXfdNUoa6aNi8BKX/Iyajo8++uj+++9PAm4kHZlrr722ij7lg20wVxhHS6O1Ca0TpTAPhatMJvYlxhikQyB7Aw5xyCGH4EahapUQHbELMSFGd3YXHYkCdgqUUwrvuOOOVNUdbAx5iqE9KIW6iJqSEoUPiepw6nfWWWe5FNJCAZ2SjwuG8FpgbfmrR6Vp1ovgNCuttJKoEfmWkycBoxKDwrYkTWYpL3HaaafJrMyRlsM9dOqpp0roTpCw559/Xjf1CFWS76cvLCi4ZlX1PE0EEoFEYHQIZNzrmKxNg1bZqKzLE000ke9vvg9elfDU1OLA2DDVVFOVfHvkqoVGxHpuHDd8KK7GLN8JJvh//sRjjz1WRTvrRkMKmCeCtfDX8B+df/75nAvLL7/8pJNOKu2qCA8MgzMFWfHJ/s477/im55oZavX08DXsvPPOoRsfx8ILL8wfQdRMM83kuOKKK4ohNYSffvrpdZee05ELL7zQrFq0g02FA4VnSkU/XhtOKAkzeqIAp4nT6667joNGgn1oscUWw3j4jJz6nXfeef/++18MD1//+tddZbxhzFBAFCoFrLiKnSjDjOQ/XKIWWCpxx7C4cP1gV/ARDCto1zyd2WabTZWzzz4b2sgWDSOYl8zK6TbhJAr5CMddd92llvuF/zlKs7tEX2aZZRZAuWtxg6wfg7gAnA9LsfwlAolAIjB2BJKXjB3DAZLAk6K3GIDxafbZZzdolSG2FgVOisrMueaay/e3KamGPeEmrCkGS6EqxlH+GlTAlzc2QCbbQAx7poTELi1GaAO2Swbga6+9VqYxWJCHHHU5I0Q5MA8gBBJGzbXWWkvwRGXrkZ5uuukkzj33XEEthm3miuOOO44LRqagUUfzY6NkgyNKpDDvie4AAZ+QEHiBopkaYwjfcsstKfarX/2qzDSWSaA4GxQB+UBTyqU999zTJSBQJpaJC/KkJKcMUcw5DEWiZzh6EEGFq4DFS2S+8sor+JbuSIv7+djHPsZwEuWFy/AlwXadddZxv7hg8BU2D6RHYT+KOb788svsKww2Yk0iE4Y6KE0so9QVV1zBLkKChuIGUVKkLQ1xO2RIyfwlAolAIjBWBEZnZslavY7A6Pw4Zrvss88+pe9GdFEI5bQyYawySFfm+KQObwvjv6u8D9wKCnCmCImQiVgIvIiABuMc34HBOyTwmHAWMJb8y3Pwb0eMWA2XmBm4PIyRMhkwxGqwK4h6cYo3VLZe0kRp/V9SZphBhEQppi0S6vpuSt1IRC/0TlsYADlU5c9iq+Cv4eYQOEIlDhd9cYk+3D1Khs5YlEuIAjUwPPYPavNbEQ5emRJyiKUPtAOlcDMhEwpUAYtMKIys4DESVT9sj6kpHEn4XAQVoSD0VDJUooN0iSMJlw3NCaQMykiTEMstFZ4yMEZ4kALKL7PMMqMIV1I3f4lAIpAIVCEwnvOxUpus34MIiDAw+JnoMRbdDcZDzUpl2DBrI5wjlU0YZTk7yiSRykuN09GW4ZylxDyd0i5aw63DcXPyySf7pkcvTKsxAOtdgxkiLBbU861f2Sg30EgUY48RkCHUhguJKcI8I5zDXOhKUSUd/29F28iPvsChcnJy5SVpGtI/vCdxiS1H15glaoGFanQWvwmbliroAg01wdkENJfMYwpRjnKYWMoNYmfSI9QQ8wgQqOcXU4ok4MwGU6lPESXR4EmoLJbpRCARSASGRSB5ybAQ9WeBpvCS/oQme5UIJAKJQCLQOQQyvqRz2GfLiUAikAgkAolAIvCfCCQv+U888iwRSAQSgUQgEUgEOodA8pLOYZ8tJwKJQCKQCCQCicB/IpC85D/xyLNEIBFIBBKBRCAR6BwCyUs6h32nW7ZMSKdVyPYTgUQgEUgEEoH/QCB5yX/AkSeJQCKQCCQCiUAi0EEEkpd0EPxsOhFIBBKBRCARSAT+A4HkJf8BR54kAolAIpAIJAKJQAcRSF7SQfCz6UQgEUgEEoFEIBH4DwR6Zr1Xy2bHGtgSVvWuXcD7P7qVJ/+JwE9/+tPIKLGuv/zlL+Usuuii/1nwv+weV5WTp4lAIpAIJAKJQNsQ6HZeYvNSO6vZ+hUiNmi1oYlt6G17Zqe3JmJkd9kZZ5xx+umnJ9MQfvvtt9sUvonyOytKjzbddFM7uVSqYbeUytOSzl1hCxSZSAQSgUQgEWg/AhO0v8mRtGgfsptvvvnMM8+0vbuNTO0FP9lkk8UWYrZu/81vfqOAPclkTj755CMR2KCMPdxt1nrdddcFL7H921BjdgMh3XzJ5nx2q3/mmWeGVfLrX//6sGWyQCKQCCQCiUAi0DoEutFeYm/ShRZayE73iy222DbbbGOPU44bROTSSy998cUXr7zySlutBiK2X7/tttvGgg5LjF3jbUy/8cYbk2MPejJt777ZZpuNRWy31WUywTmG5VtpLOm2G5f6JAKJQCIwaAh0o70EC5l66qnxEqEkM888c+wRf8011xx88MFxe1hQ7Cw/33zzzTbbbOWGPfnkk5dffvknPvGJddddV3X5ql977bWTTjrpCiusEEJK4ZI48sgjl1tuuSAlMm+99VbHNddcsxTojwSTySc/+cnGJpM0lvTHvc5eJAKJQCLQ0wh0o70EoKwjZ5111rnnnoudLL300nvvvfdcc811xx13LLDAAoceeujrr7/OdlKJ+4UXXrjffvtx6zClYC333HPPu+++u9566z3xxBOKsbucdtppQVYqa0kjJU8//fQXv/jFSSaZ5M033xTCQsJDDz2E31SV7PXTYU0mF110EfrS691M/ROBRCARSAR6GoEunSc88cQT77zzzuaMnHLKKa+++upqq62GlHDo4BYIxN///vdK0B955BGkZJFFFrnpppvWWmut3/3ud2JQuHuQEkwFuSGBs6aySkmffvrp6AsHhzLvvPNOeIiQoVKgbxJhMhmqO4wlSUqGAifzE4FEIBFIBNqGQDfyEgYMVAMEJgZzqdx4441zzjnneeedV0DhoJFGJo466iiJE088Ef+47777llhiCe4ehAatCZ8Fnw4nzkEHHfTggw++9NJLRUJJzDHHHIJer/r374wzzpDPbKC5UqCfEiYZVc3K6afeZV8SgUQgEUgE+gCBbowvOfvss80NXn755c0H5tDBJ0zAEQkbcDOZRKzrFVdccc455+y1117MKsJUd9llF5OKp5tuOoGrSoqQdWQ1WXLJJe+++27poUJMQqxjGEv6z4NTOtggyiSXLSkoZSIRSAQSgUSggwh0Iy/ZbbfdrFMiyMMMXtAIfbVaya677howISgsHEsttRTmsccee2Abn/nMZ1CZrbbaqngiXnvtNa6c9ddfX5mtt94a1Tj22GPFnTQGGqeZaKKJqpxEjav03FUmk6qJOdNMMw3nV891JBVOBBKBRCAR6EsEujTutQHWZhFz3/z+97/noxG1qqRwkFVWWeWVV15ZeeWVuX7uv/9+6bnnnls8CuIiakRISgOBg3Zp2WWXrZqYk9ODB+0ZyP4mAolAItC1CHRjfEljsBhIRLmeeuqpQUoUZg654YYbDjvssA8++EDoyZZbbsnRM+WUU77//vuuJimpwrMqyiSnB1fhk6eJQCKQCCQCHUSg9+wlIwSLM0hgrBVjR1h+oIpZdL/0N40lBYpMJAKJQCKQCHQcgd6zl4wQsoUXXti0YU6cEZYfqGLFRlISA9X97GwikAgkAolA1yLQt/aSN954Q9yreT11l1Pr2vvRNsXCZJLGkrYBng0lAolAIpAIjASBbpyPMxK9hy1jP78f//jHwxYb2AJpKRnYW58dTwQSgUSgmxHoW3tJN4OeuiUCiUAikAgkAolAXQT6Nr6kbm8zMxFIBBKBRCARSAS6GYHkJd18d1K3RCARSAQSgURgsBBIXjJY9zt7mwgkAolAIpAIdDMCyUu6+e6kbolAIpAIJAKJwGAh0F3zcX76058OFvzZ24FHoGzqNPBIJACJQCKQCPwLgc7zkv/5n/+xIbBd+uKGfO5zn8s7kwgMCAKVj/2iiy6auzoPyH3PbiYCiUADBDrGS5hGvvOd73gvIyJ216Nifjg2uE95qY8RCDPhz372M4vdxboySVD6+HZn1xKBRKAxAp1Zv4SN5Lvf/a5X8NJLL510pPEdyqsDhUD510hqMlD3PTubCCQCBYEO8JJw3Fx33XVFiUwkAolAJQJrrLFGunUqAcl0IpAIDA4C7Z6Pk6RkcJ6t7OmoEcDaBV35Zxm1hKyYCCQCiUCPItBWXsKPzn2TlpIefVZS7XYiENQkZ6i1E/NsKxFIBLoBgbbyEpF9uV1cN9z11KEnEBAPLja8J1RNJROBRCARaBYCbY0vMd3gt7/9bbNUTzmJQN8jINAEO8nY8L6/0dnBRCARKAi0z17CWZ7GkoJ7JkaHwF/+8pczzzzz9ddfH131nqsl+jVNJj1311LhRCARGAsC7eMl4vjGomgT6/7oRz9accUV33rrrSbKTFHtQeC22247/PDDzz///PY01/FWTKQva691XJlUIBFIBBKBNiDQPl7i9eol24YulSb++c9/vvjiiy+99FLJicQdd9zxm9/85v3336/Kj9O3336b5fy5556re3XAM48++misblgQrr322hNOOGHYYqMo8OSTT6r13nvvjaJuL1ZJD04v3rXUORFIBMaCQPt4CS3b9pL94Q9/yCIyxxxzLLXUUksuuaTjVVddVWAKS8mUU05ZcioTL7/88uWXX/6DH/ygMnMk6b/+9a+vvvrqSEr2bpmzzz779NNPr9K/tuPmkvA+/O1vf6sqOfbT4Itvvvnm2EX1kIScldNDNytVTQQSgTEi0FZeMkZdR1j9vPPOE8jCIvK///u/88033zLLLMNqsssuu+y0007/+Mc/CMFLJptssvHHH7+uwCjzu9/9zke5r3NyRjgKfuUrX1lkkUXuvffeumL7JpP96YMPPnjhhRcef/zxV155Rb9qOx4YinH+/e9//9hjjyETbFdNQSB4yYc+9KGmSOsJIbljVE/cplQyEUgEmoVAm/bH8cHXntfrr3/96wMPPBA6E000kY/78BwZRDfddNNrrrlmqqmmOuigg0RNTj311FUIGkp//vOfYzAPPvigSwr7RZnFFlvsiiuuqCpfe7rAAgs8/PDDW2+99SOPPDIU6amt1RM5evTMM88AB9Xz+9SnPlXUfvrpp0vHr7/++ieeeAIdcVSg0jzG/vT5z3++1Bp1IuZzTT755KOWkBUTgUQgEUgEuhmBNvGS9kDgo/yb3/ymtrbaaivspHxVzzLLLPw4vDnf+9731llnnT/+8Y+f/vSnq1Q6/vjjTzzxxMrMz372s0bceeaZhyeoMn+otHjMnXfeGTFqESl54403/vznP88222zjjTfeUDqUfH3EEjbffPMCQrlUmTC9hWUIPh/+8Icr8yvT99xzzwYbbFCZM/PMMy+44IKQWWihhSaccMLouPghpqnKYoxSppOA0U9Jl3Rh4okn/shHPlKK/d///d9///c4GO2YatSdffbZiwQJDiNWnE984hNarMyvm/7Tn/505513Mmv94he/UODCCy+caaaZqkrCBMjTTz99VX5HTmFo4Z9KktcRNbLRRCARSATag8A4DAntUWgsrfAs/OpXv2LeOOyww6rGYwaSjTbaiHAf/Uamj33sY1UNTTvttHIQEV4JidVWW+2mm2468sgjGVpmnXXWUlgEyRlnnHHEEUc4ElXyI8EeE2MzK4KxRKZBXaTLtttu2zjYAqPCnDiYFBN1u/jii991111FOAPP7rvvzif1xS9+cd555xVV6pKSuNc3vvENQ3spGQmOpzXXXPOQQw559913qy6VU/Yhoan6u/zyy7N/mMUdrhZjNuUVO+uss6hx2mmnwYrxCZPDTiSM/XQ76aSTOMsWXnjhEKjjaB++ghyALmjf1Vdffc4559B8pZVWcjuIdWv23nvvooPFf1UhquTUJjCnG264AaHkEnKVtcZxueWWKyXFEsEcMtjPZptt9s4777hUF3+IuRHsdpx6F198MZcQYENgkcbSs9566+k46w4qEGG25WomEoFEIBFIBFqOgNGoDT9TYFZfffVWNySyZIYZZrj00kvrNvTVr37VVR4lR6POoYceuuuuu+61114XXXQRhqHK3//+9zgqsMUWW9QKUdeI62r8CBF9Uor5lBfsyaQhB5tR8vnnny/lOTJKydoEBxCZVubYZ599QrhQlShG4LLLLiuT2eZrX/taXDWaajrSf/jDH6oERjFzZ6ryyylSsuOOO6rO5rH99ts7SnNjKSCh76hPCHcETiDj6tprr61HRU4kSsdLMU2r+Oijj1aWZL+RiamUTHxCziWXXFJyqhKaBrIy8WPGkMBySrELLrggLrm5Ic0tcLUu/j/+8Y+j8CqrrHL//fcDociJhLsQ90s30VOF11133aoy7T9ljvJrf7vZYiKQCCQCHUGgr+wlfAp4XF03xwMPPHDLLbcYaWIajhHOJ7ioEZmowAorrMDWEv6XONZaGlg+Ntlkk49+9KMnn3yyT20GBkI4bty2II+cFJxBEYnCPIA6rL/++o7sBwoYCBtwzLDucCQhSZT06S+klC9Glf333z8sDTfffDM7kKsyqRfeEAYMwRa8EmX6roGfpYdFYc899xyqRdYC0TMsHzFxhnlAyQjvJVBPd9hhBzlMKY4CbopnirunysCgQOl4KRY3ogrDMHjMNddcqvixTICRfQUJiJzaIzWAzOAkVGjfffeNqUABi8JPPfXUfvvtJ+FW+oU9jDJy6uLP/hGGFkY1N1HETGWL2BU+p3caRSIDPaSwskymE4FEIBFIBFqNQF/xEk4BeGEGVa4NzhdWAZeOPfbYAqiRTFCCqAjjkNEIISiXJIoEy5kYp+Wwrzga0ddaay3DXkw8NrgyA0TF8A3FhJHXXntNJm6x5ZZbGk3nnHNOoaNRrO6Ra0k+Pw5aoAmOBqfsDaJVQj7mQQgjinF61VVXNZwHLxHYq0WuB2OqcZpF5+CDD4bDKaecUpefEcudcdRRR0VzhurPfOYzOA2BWIhLoPBzlZMFzZIwocYxfiwiEkHFQITVOVZ2PIoxRUgUDAEIxliktXjQcCxlDjjggMpwk6geR/1iwOD6wbTYSBgwzN92CVZ6KsEqEyX1XZe/9a1vOcUdHevi765ZkI05jffq1ltv1V9esOKp8QDE9KJTTz1VwI04JHKYjqKJPCYCiUAikAi0B4G+4iVCB9gA+CN8W4tLgKClNYw3LPyGHJ4LMRBlJo7BKUbuaaaZRsnKYBFCytxg83dwCwVQEJ4Un/sGXXEnrAg4hHzpKCyiUw4CIZNvxVHQQ7AZg6vRfaiV3JSMEVFCGKZwFuWlNcH/JYFCCTr5V/joZz/LfOJbX2b82G94oyJ93HHH8V/Qweg7ySST/P9Fqv/SBAEyJIsv0RDPhfBY66gatosaAOTsYIlxlRpFBPoirbojo4LJR5hTZcej5BRTTCERsCB/AlMuu+yyT37ykzJFk2iF/LvvvlsMbwz/UavqaCEZOV/+8pcd3R1mFYQpjCuadhdE8LiJolhWXnllonCXn/zkJ9ib8g3wdxNRPfrouwRTWcQ7YyoqMkqFD8hVRprkJTDJXyKQCCQC7USgr+bj+PI+5phjxGMyOfgZU3lAAk30YptttpEWocmYbxg2whl7OAUiQNXYXHA3gorecMnxyiuvVMylueee21DKp2BUNrI61YQAWB/xxk5hsPPPPz8nUfgRYoBnnonJJoJVSeBLWmKJJUorlYmohQBFWwgBHRCsCLmddNJJkY/CP6JijP1GZae4iDH7xhtvlBZkUzmPt7KVSD/77LMSE0wwgeAJv8oCoYahfbvttot8wbYQIDxIWHA4M1mUQY9kIgRKlo5HrVKMh4jDSyYEVMHh+Jj8ohhSMpRRR4GYJsNsY24wGxjY3SNGEbxKSAr2QCtcSuxtCb8NsY5D4e+eQkworhvhpyPicL/97W8zRIWhi6WEkkVOJhKBRCARSATajEBf2Utg5ytfzIThFinh8jDMm4LhM5qpvwyBHCt8Ma4acZESIyuDhG/uAn1ManUMpwBO4xLGE3zFmMenwI9jMJYpnpco8pUxDGtXggfEaBdzfJwSzgYTzgWntb8NN9yQUYFRpFwytYTjwxxROawLwRjKVeOriSfsQHK4cjAYab4MHTHclmJ1E9E7JfmwKgtwteBDHF40L3N3A4ESZhGL0OisRtkwKBZ8pXQ8BAYPO/fcc9EaLET3ha9+/OMfZ59wR0ARxUBXqUBVGi9RHdrsFkiJGeDhqeG+cSocRKjNfffdh4dVVmSUEm7MdFQXf64llFTED18YixEPUXiXeNmCEQor4cwqApXHisL2VjIzkQgkAolAItA6BMYrYZuta4NkcQ8mSmAMLW1lnIT72mZNmW666cI3UVlXPIpPah4KAxgbRjE/GKU4CJT3mV5ZnhzuIcOtAlaJretDMdpV1aqUUDfNVYEfYAbIk+Efd6Etd5LBGBPij8AthOviLoVG1JVTm8lzYdiWz2jE8GPoFZZLspxYuKyySqXmOijU1MRdvITFgg5RsrbjfEn8RKDD4dhFqvqO30CM06SyobppjAELoWTlbULLVDddOWJdsSIeOmG2XE7hR2MUqTKilF4IptH9yqAZ4cnmVHse3G40CE/FbtneRLeIjJGPEkXET10NW50ZBieBOK1uKOUnAolAItANCAwuL+kG9EeigwFVJCZKV3xSCJBYit12223GGWcciYS6ZcwEFssSo3gUYHGxVD/DRt3yTczEqzAV8bnhWRuLZKwFOKJwsIeQw8TC3YMMNeZqAo9QENE5TESFM4moRdewpQhDJpAJao011hAZreRY9BxL3eQlY0Ev6yYCiUDPIZC8pGdumXm2jBlMJuhI8UmNUXvOILN+mCIEspTheYwyh63OFiVwlTUi5mwPW37YAqibqb8iZqwD2xQCIa6I9UWsSaWRZlg1WlQgeUmLgE2xiUAi0J0I9FXca3dC3CytLIve9JXRuZyGjUdplv4hJ9ZvFXrcLFJCLEYVQTPNUlWIbrNEpZxEIBFIBBKBcUKg3+Jex6nzWbj9CAh95XPhHGl/09liIpAIJAKJQPcjkLyk++9RX2loDpT+mJfbV73KziQCiUAikAg0CYHkJU0CMsWMDIFY7zXWWBtZjSyVCCQCiUAiMEAIJC8ZoJvdDV21B7K5RW2Lse2GLqcOiUAikAgkAiNHIONeR45VlmwCArGCbRMEpYhEIBFIBBKBfkQg7SX9eFezT4lAIpAIJAKJQG8ikLykN+9bap0IJAKJQCKQCPQjAslL+vGuZp8SgUQgEUgEEoHeRCB5SW/et9Q6EUgEEoFEIBHoRwT6MO712WeftdGdnV8eeOABe7DZBaZLZn/Y0s/Oxra5sSeLBU8jAtRGwRNPPLFd4srTZQ+8xnu7lJKZSAQSgUQgEUgE+gyBvuIldtndaqutbG4SN8kevDaeLTfMlrm33nqr3e0r9/tFC7CEUqYy8fbbb9s+psHS7y+99JL9U7CKUss2crfddtv7778/7bTT2jqucq8WG8VtvPHGNryNwjawtUWtDWytfGr3WlvvRv53v/td+6HsueeeO++8cxGbiUQgEUgEEoFEYEAQ6Ctecu655wYpwQlsjVs1JfW4445jR5l33nnxgLi7v/jFL5T88Y9/jEBcffXVqMDRRx999tlnK/mFL3xhgw02eOyxxy655JKlllrqpz/9qY3cvvzlLwe3UCUWU//whz981113ffzjHyfwxBNPtBtteW5uueUWJKNwoEsvvRQpmWGGGTAntazhobzd5siZf/75S62LL75Yeuqppy45mUgEEoFEIBFIBAYHgb6KLzHk25jezbNd7XnnnccWUnkjn3nmGad4Scm85pprpO3Ty+lz7LHHIh+nnHIKosABtOOOOyIlruIljtdee+2hhx56xRVX3H333QogNJtvvrnEW2+9deONNyrA4IGUfPazn1XyxRdfXHXVVXEg/Mal+N1www0SJ5988rbbbrvlllteddVVWudmkjnXXHNFmSeeeOK5555jg1l77bUjJ4+JQCKQCCQCicBAIdBXvIQ14vbbb2fzsMz5Oeecs9hiix1xxBGvv/563FE72fLszDjjjHH6/PPPf//735decsklI2fTTTeNhHzuGHu4TDbZZE899VRkvvLKK3vssUekLVrqdKWVVnKqwJtvvvmd73wHyUBcFlxwQQ4g1V0666yzkJ6oEomFF144Th1tgctwIhFLs0scdthhjgcccEBluImc/CUCiUAikAgkAgOCQF/xEvdMiCtXC6vG8ccfzxty+umnL7744gwhLmEkLBxh3hB8qljcY+Go5WYvsMAC0oqxfHCpzDfffIwf//znP2sLbLHFFmiHfGYYkbYS66yzDn8QUrLddtuRgAPJZGURxypR6JF0+cU2MSeddBKWs++++1IbuSGnFMhEIpAIJAKJQCIwUAj0Gy9hyfBDAtZbbz10hBnD7WQIQRS4ZqSRBnNh/ATJxq62nD5xy1lHxJxKS+AcE044IfOGir/97W+jwNJLL73hhhtKoy8HH3zweOONxySD4swyyywy+XFQClXQi/XXX1+Y7ZxzznnTTTdtttlmJfy2JEIgfw36osyiiy560UUXySSB2Liax0QgEUgEEoFEYNAQ6DdegkaILEE4EAhxrKwjctxU8bBiPsS0MoRgEsssswwecNlllyEWPCwcQMpEgAjiIkY1puGIisVRxKkogEAcc8wxK664IkuJujH3WLAIJ44ZPcwzJNx3332O++23Hz406aSTsrh8+tOfvvPOO4888sgwjVTNWBYwe/3119NBKyr6rb766pHIYyKQCCQCiUAiMIAIjFfppGhd/8N0YRJK65oIyR988MEFF1xw5pln8oxEjjBSTGLXXXdt3PTf/va3KtIQ5Ut+SVTJKfmmB//5z382Q7jS4MGJ8+ijj84999z0Eda68sorV1Uvp5/73Oc4nm6++eaSk4lEAAJmkjOniTpKNBKBRCARGAQE2sRLQCm8ozhE2oCsmTLsHOwQft2/TBlDCw8O39A222zTBnCyiR5CAC8RcG3ieg/pnKomAolAIjBqBPpq/ZJKFIKRVOZ0czpmI+f04G6+R53S7aGHHkpS0inws91EIBFoPwLtiy/hp4h5Me3vZJe3aAKz1U2E4k455ZRdrmqqlwgkAolAIpAItBSB9vGSlnajp4ULfRWcy1zf071I5VuBQFL5VqCaMhOBRKCbEWgfL+Ejj1m73QxHR3S78sortRuTljuiQDbatQhYNfjrX/9616qXiiUCiUAi0HQE2sdLwkee33+1tzDWe42JxLVXM2eQEbDFUs7EGeQHIPueCAwgAu2bjwNcq5b98pe/bMNs4d66kVbEtwiK+JveUju1bTUCscpf8pJW45zyE4FEoKsQaJ+9RLe9YU0uiLdtV6HQWWXse5ykpLO3oAtbZ1lkLOlCxVKlRCARSARaikC75wlbKTW2x8uvwJbe1xTe0wggJf5NRJbkv0lP38dUPhFIBEaBQLt5iSiTpCajuE9ZZXAQSFIyOPc6e5oIJAK1CLQ1vqQ0H29ep/lFWDDJRCLg/8KcNb7O/L/IhyERSAQGFoHO8JKAOwJNONHLTEgb9rqUq1sO7OM4UB3HQqK/JgOLB490Ljk/UM9AdjYRSARqEegkLynaBEGJV7OPxZKfiUSgvxEo8c525kPKk5H39+3O3iUCicBIEOgKXjISRbNMIpAIJAKJQCKQCPQ9Am2dJ9z3aGYHE4FEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCCQvGQt6WTcRSAQSgUQgEUgEmolA8pJmopmyEoFEIBFIBBKBRGAsCPx/Y1C56BpaqKwAAAAASUVORK5CYII=\"\n    }\n   },\n   \"cell_type\": \"markdown\",\n   \"id\": \"ed20a4f2-ec79-44d7-9550-7ad5699c136d\",\n   \"metadata\": {},\n   \"source\": [\n    \"![image.png](attachment:e3835897-9292-49af-a248-95eaa1d0b86a.png)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e11663b3-ff06-4f4d-a17f-b215b22f99cd\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Setup and Dependencies\\n\",\n    \"\\n\",\n    \"We'll be using two new libraries for our demonstration \\n\",\n    \"\\n\",\n    \"1. `spaCy` : This provides a handful of useful utilities to do generic NLP tasks with\\n\",\n    \"2. `nltk` : This was used by the original paper to count the number of tokens in our generated summaries\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"35dd5dae-0659-4b86-b8f2-57ec56087831\",\n   \"metadata\": {},\n   \"source\": [\n    \"We'll need to install the tokenizer packages and the spacy english library before we can proceed with the rest of the lesson\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 1,\n   \"id\": \"0dbdda0a-2648-4e0f-8633-ea19bef4a460\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stderr\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"[nltk_data] Downloading package punkt to /Users/admin/nltk_data...\\n\",\n      \"[nltk_data]   Package punkt is already up-to-date!\\n\"\n     ]\n    },\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"\\u001b[38;5;2m✔ Download and installation successful\\u001b[0m\\n\",\n      \"You can now load the package via spacy.load('en_core_web_sm')\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"import nltk\\n\",\n    \"\\n\",\n    \"nltk.download(\\\"punkt\\\")\\n\",\n    \"\\n\",\n    \"!python -m spacy download en_core_web_sm --quiet\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"90874bad-06b5-4656-beec-73fe984efbcb\",\n   \"metadata\": {},\n   \"source\": [\n    \"Once that's done, let's now move on to writing some code.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"424ca094-9ae2-4da4-90f8-32ec89cddabc\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Definitions\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"68397732-fd6f-424d-8823-7818a0752aea\",\n   \"metadata\": {},\n   \"source\": [\n    \"There are a few different definitions which we'll need to understand in the tutorial. They are\\n\",\n    \"\\n\",\n    \"1. Tokens and tokenizers\\n\",\n    \"2. Entities\\n\",\n    \"3. Entity-Dense\\n\",\n    \"\\n\",\n    \"Once we've gotten a hang of these concepts, we'll walk through a simple implementation of a Chain Of Density summarizer\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"4cf72a9d-db37-4ec9-b242-171468090bc1\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Tokens and Tokenizers\\n\",\n    \"\\n\",\n    \"In the original paper, the authors used `NLTK` to split the generated summary into tokens. These represent the smallest units that each sentence could be broken into where each hold semantic meaning.\\n\",\n    \"\\n\",\n    \"Let's walk through a simple example to see how the `NLTK` tokenizer might work\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 2,\n   \"id\": \"bd6ebf95-60c6-4ec8-be17-d5ab436a67fd\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"['My', 'favourite', 'type', 'of', 'Sashimi', 'is', 'Toro']\"\n      ]\n     },\n     \"execution_count\": 2,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"import nltk\\n\",\n    \"\\n\",\n    \"sentence = \\\"My favourite type of Sashimi is Toro\\\"\\n\",\n    \"\\n\",\n    \"nltk.word_tokenize(sentence)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"281f523d-7707-4e33-af29-f233a1f7bf2a\",\n   \"metadata\": {},\n   \"source\": [\n    \"NLTK's word tokenizer does more than just split by empty whitespace. It handles a lot of nice edge cases and contractions such as `don't` or `I'm`.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 3,\n   \"id\": \"8a87b231-57b0-426c-98d5-cd7d8b512121\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"['I', \\\"'m\\\", 'fascinated', 'by', 'machine', 'learning', '!']\"\n      ]\n     },\n     \"execution_count\": 3,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sentence = \\\"I'm fascinated by machine learning!\\\"\\n\",\n    \"\\n\",\n    \"nltk.word_tokenize(sentence)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"6719c508-f575-41a5-91a2-47b2fa76cd3f\",\n   \"metadata\": {},\n   \"source\": [\n    \"We can then calculate the number of tokens by simply finding the `len` of the generated sequence.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 4,\n   \"id\": \"c905dff4-5753-4274-90fe-44aa3393ff0f\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"name\": \"stdout\",\n     \"output_type\": \"stream\",\n     \"text\": [\n      \"['I', \\\"'m\\\", 'fascinated', 'by', 'machine', 'learning', '!']\\n\",\n      \"7\\n\"\n     ]\n    }\n   ],\n   \"source\": [\n    \"sentence = \\\"I'm fascinated by machine learning!\\\"\\n\",\n    \"tokens = nltk.word_tokenize(sentence)\\n\",\n    \"print(tokens)\\n\",\n    \"print(len(tokens))\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"692316bc-10e6-421f-adba-5323376b95d6\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Entities\\n\",\n    \"\\n\",\n    \"A named entity is an object in the real-world that we identify using a name. Common examples include people, countries, products or even books that we know and love. We can use the `spaCy` library for us to be able to detect the number of entities in a given sentence.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 5,\n   \"id\": \"47a4a8f6-295d-4040-beb1-3c8e9ff3bf99\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"# First we load in the library\\n\",\n    \"import spacy\\n\",\n    \"\\n\",\n    \"# Then we initialise an NLP object.\\n\",\n    \"nlp = spacy.load(\\\"en_core_web_sm\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 6,\n   \"id\": \"51197222-2124-46f8-9a57-555d43836401\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"(Apple, U.K., $1 billion)\"\n      ]\n     },\n     \"execution_count\": 6,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sentence = \\\"Apple is looking at buying U.K. startup for $1 billion\\\"\\n\",\n    \"\\n\",\n    \"doc = nlp(sentence)\\n\",\n    \"doc.ents\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"5e2560b2-ca27-4223-84ed-e01f9542fdbd\",\n   \"metadata\": {},\n   \"source\": [\n    \"We can see that Spacy was able to identify unique and named entities that were present within the sentence using the `doc.ents` property. Let's see a few more examples.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 7,\n   \"id\": \"9c2ad5a0-2f24-442e-a46a-3a265ef873f6\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"()\"\n      ]\n     },\n     \"execution_count\": 7,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sentence = \\\"A knowledge graph, also known as a semantic network\\\\\\n\",\n    \", represents real-world entities and their relationships\\\"\\n\",\n    \"\\n\",\n    \"doc = nlp(sentence)\\n\",\n    \"doc.ents\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 8,\n   \"id\": \"dc7964d3-61f6-436e-bfb0-080cd46c41bf\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"(J.K., one, Harry Potter')\"\n      ]\n     },\n     \"execution_count\": 8,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sentence = \\\"For example, a node representing an author like 'J.K. Rowling'\\\\\\n\",\n    \"can be connected to another node representing one of her books, 'Harry Potter'\\\\\\n\",\n    \", with the edge 'author of'\\\"\\n\",\n    \"\\n\",\n    \"doc = nlp(sentence)\\n\",\n    \"doc.ents\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"11b7737d-d5a7-4aa4-bdea-b0d12d1589ed\",\n   \"metadata\": {},\n   \"source\": [\n    \"As we can see from the examples above, entities are not nouns. They're direct or indirect references to people, places, concepts.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"c8e69fa8-defa-4f47-b8cc-cfcfa4cbcfba\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Entity Density\\n\",\n    \"\\n\",\n    \"Now that we know what tokens and tokens are, we can move on to our last concept - that of entity density. Entity density is simply the mean number of entities present per token within your string of text.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 9,\n   \"id\": \"15accf59-a264-4e1c-9b77-8b486e423f95\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"nlp = spacy.load(\\\"en_core_web_sm\\\")\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def calculate_entity_density(sentence: str):\\n\",\n    \"    tokens = nltk.word_tokenize(sentence)\\n\",\n    \"    entities = nlp(sentence).ents\\n\",\n    \"    entity_density = round(len(entities) / len(tokens), 3)\\n\",\n    \"\\n\",\n    \"    return len(tokens), len(entities), entity_density\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 10,\n   \"id\": \"648206dc-a734-49eb-bd2e-8b46a914cacf\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"(17, 0, 0.0)\"\n      ]\n     },\n     \"execution_count\": 10,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sentence_1 = \\\"A knowledge graph, also known as a semantic network\\\\\\n\",\n    \", represents real-world entities and their relationships\\\"\\n\",\n    \"\\n\",\n    \"calculate_entity_density(sentence_1)\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 11,\n   \"id\": \"9fd5717f-202a-4b39-976c-a32d0f1a4b29\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"(11, 3, 0.273)\"\n      ]\n     },\n     \"execution_count\": 11,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"sentence_2 = \\\"Apple is looking at buying U.K. startup for $1 billion\\\"\\n\",\n    \"\\n\",\n    \"calculate_entity_density(sentence_2)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"1d9ac4df-5e7a-4186-83f2-bb542dba6189\",\n   \"metadata\": {},\n   \"source\": [\n    \"This gives us a quantitative method to be able to understand and compare two different sentences/summaries.\\n\",\n    \"\\n\",\n    \"We want summaries that are more entity-dense\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 12,\n   \"id\": \"ae27bcc5-da32-4aaa-9ebb-dbc21700ee14\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"((82, 11, 0.134), (71, 17, 0.239))\"\n      ]\n     },\n     \"execution_count\": 12,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"summary_1 = \\\"\\\"\\\"\\n\",\n    \"This article discusses an incident that occurred during the Chinese Grand Prix\\n\",\n    \"involving two racing drivers, Jenson Button and Pastor Maldonado. The two were \\n\",\n    \"competing for the 13th place when Button collided with Maldonado's vehicle, \\n\",\n    \"causing damage to both cars. The incident resulted in a penalty for Button, \\n\",\n    \"who was demoted to 14th place. Maldonado, on the other hand, had to retire from \\n\",\n    \"the race due to the damage his car sustained.\\n\",\n    \"\\\"\\\"\\\"\\n\",\n    \"\\n\",\n    \"summary_2 = \\\"\\\"\\\"\\n\",\n    \"Jenson Button's McLaren collided with Pastor Maldonado's Lotus during the Chinese \\n\",\n    \"Grand Prix, causing front wing damage to Button's car and rear-end damage to \\n\",\n    \"Maldonado's, forcing his retirement. Button received a five-second penalty and \\n\",\n    \"two superlicence points, dropping himto 14th. Fernando Alonso advanced two places, \\n\",\n    \"while Button was lapped by Nico Rosberg and Alonso by Sebastian Vettel and \\n\",\n    \"Kimi Raikkonen.\\n\",\n    \"\\\"\\\"\\\"\\n\",\n    \"\\n\",\n    \"calculate_entity_density(summary_1), calculate_entity_density(summary_2)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"9d59c170-a4fb-4687-8012-9cb0ed807a8c\",\n   \"metadata\": {},\n   \"source\": [\n    \"We can see that the final summary is almost twice as dense as the first summary and is hence more *entity dense*.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"112b2f52-b15a-46d5-9767-e8a95d1f674f\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Implementation\\n\",\n    \"### Data Classes\\n\",\n    \"\\n\",\n    \"Let's start by walking through some of the data models that we'll be using as the response_model for our open ai function calls. We'll need a total of two different classes\\n\",\n    \"\\n\",\n    \"1. Initial Summary: which is the lengthy and overly verbose article\\n\",\n    \"2. Rewritten Summary : which represents\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 13,\n   \"id\": \"2ac40d98-2843-4c9c-bc18-50ab1d4ffa94\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from pydantic import BaseModel, Field, field_validator\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 14,\n   \"id\": \"486e85fc-3fc8-4143-bdf4-d7cef91a37cf\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"class InitialSummary(BaseModel):\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    This is an initial summary which should be long ( 4-5 sentences, ~80 words)\\n\",\n    \"    yet highly non-specific, containing little information beyond the entities marked as missing.\\n\",\n    \"    Use overly verbose languages and fillers (Eg. This article discusses) to reach ~80 words.\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"\\n\",\n    \"    summary: str = Field(\\n\",\n    \"        ...,\\n\",\n    \"        description=\\\"This is a summary of the article provided which is overly verbose and uses fillers. \\\\\\n\",\n    \"        It should be roughly 80 words in length\\\",\\n\",\n    \"    )\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"c3b8e382-dcfc-487f-8141-6dd9093c01b0\",\n   \"metadata\": {},\n   \"source\": [\n    \"Pydantic is extremely handy because it allows us to do two things\\n\",\n    \"\\n\",\n    \"1. We can validate that our generated outputs are consistent with what we want, **and write vanilla python to validate so**\\n\",\n    \"2. We can export the generated class definition into a simple schema that fits in perfectly with OpenAI's function calling\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 15,\n   \"id\": \"609a9edd-7c4e-4586-a5be-037c4c3c7ff7\",\n   \"metadata\": {},\n   \"outputs\": [\n    {\n     \"data\": {\n      \"text/plain\": [\n       \"{'description': 'This is an initial summary which should be long ( 4-5 sentences, ~80 words)\\\\nyet highly non-specific, containing little information beyond the entities marked as missing.\\\\nUse overly verbose languages and fillers (Eg. This article discusses) to reach ~80 words.',\\n\",\n       \" 'properties': {'summary': {'description': 'This is a summary of the article provided which is overly verbose and uses fillers.         It should be roughly 80 words in length',\\n\",\n       \"   'title': 'Summary',\\n\",\n       \"   'type': 'string'}},\\n\",\n       \" 'required': ['summary'],\\n\",\n       \" 'title': 'InitialSummary',\\n\",\n       \" 'type': 'object'}\"\n      ]\n     },\n     \"execution_count\": 15,\n     \"metadata\": {},\n     \"output_type\": \"execute_result\"\n    }\n   ],\n   \"source\": [\n    \"InitialSummary.model_json_schema()\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"e910611e-2033-4db5-91b6-ebc97c11d252\",\n   \"metadata\": {},\n   \"source\": [\n    \"It's important here to provide a good description of the overall class and the respective fields. This is because all of the descriptions that we write for the individual fields and the class itself **are directly used by the llm when generating outputs**.\\n\",\n    \"\\n\",\n    \"Now, as a quick recap, when we rewrite our summaries at each step, we're performing a few things\\n\",\n    \"\\n\",\n    \"1. We identify any entities from the original article that are relevant which are **missing from our current summary**\\n\",\n    \"2. We then rewrite our summary, making sure to include as many of these new entities as possible with the goal of increasing the entity density of the new summary\\n\",\n    \"3. We then make sure that we have included all of the entities in our previous summary in the new rewritten summary.\\n\",\n    \"\\n\",\n    \"We can express this in the form of the data model seen below called `RewrittenSummary`.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 16,\n   \"id\": \"d3d589ca-00cd-42cc-9a7a-a8f0620b4ea1\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"class RewrittenSummary(BaseModel):\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    This is a new, denser summary of identical length which covers every entity\\n\",\n    \"    and detail from the previous summary plus the Missing Entities.\\n\",\n    \"\\n\",\n    \"    Guidelines\\n\",\n    \"    - Make every word count : Rewrite the previous summary to improve flow and make space for additional entities\\n\",\n    \"    - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\\n\",\n    \"    - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\\n\",\n    \"    - Make space with fusion, compression, and removal of uninformative phrases like \\\"the article discusses\\\"\\n\",\n    \"    - Missing entities can appear anywhere in the new summary\\n\",\n    \"\\n\",\n    \"    An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"\\n\",\n    \"    summary: str = Field(\\n\",\n    \"        ...,\\n\",\n    \"        description=\\\"This is a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities. It should have the same length ( ~ 80 words ) as the previous summary and should be easily understood without the Article\\\",\\n\",\n    \"    )\\n\",\n    \"    absent: list[str] = Field(\\n\",\n    \"        ...,\\n\",\n    \"        default_factory=list,\\n\",\n    \"        description=\\\"this is a list of Entities found absent from the new summary that were present in the previous summary\\\",\\n\",\n    \"    )\\n\",\n    \"    missing: list[str] = Field(\\n\",\n    \"        default_factory=list,\\n\",\n    \"        description=\\\"This is a list of 1-3 informative Entities from the Article that are missing from the new summary which should be included in the next generated summary.\\\",\\n\",\n    \"    )\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"06529289-309f-4143-979b-8d4119b7d141\",\n   \"metadata\": {},\n   \"source\": [\n    \"We'd also want our rewritten summary to have\\n\",\n    \"\\n\",\n    \"1. No missing entities => `absent` should have a length of 0\\n\",\n    \"2. New entities to be added in the next rewrite -> `missing` should have at least 1 entry\\n\",\n    \"3. A minimum length of 60 tokens and to have a density of at least 0.08 ( **NOTE**: 60 tokens and the 0.08 cut off are chosen arbitrarily, feel free to adjust them even higher if you wish. However, this might require you to add more retries in your code )\\n\",\n    \"\\n\",\n    \"We can do so using the `field_validator` that we learnt in the previous lesson. This allows us to add in a validator for a specific field to ensure it meets our requirements. \\n\",\n    \"\\n\",\n    \"This gives us the final definition of our `RewrittenSummary` class as seen below\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 17,\n   \"id\": \"8f81f281-0950-4973-81b6-e1acd8b35aa0\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"class RewrittenSummary(BaseModel):\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"    This is a new, denser summary of identical length which covers every entity\\n\",\n    \"    and detail from the previous summary plus the Missing Entities.\\n\",\n    \"\\n\",\n    \"    Guidelines\\n\",\n    \"    - Make every word count : Rewrite the previous summary to improve flow and make space for additional entities\\n\",\n    \"    - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\\n\",\n    \"    - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\\n\",\n    \"    - Make space with fusion, compression, and removal of uninformative phrases like \\\"the article discusses\\\"\\n\",\n    \"    - Missing entities can appear anywhere in the new summary\\n\",\n    \"\\n\",\n    \"    An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\\n\",\n    \"    \\\"\\\"\\\"\\n\",\n    \"\\n\",\n    \"    summary: str = Field(\\n\",\n    \"        ...,\\n\",\n    \"        description=\\\"This is a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities. It should have the same length ( ~ 80 words ) as the previous summary and should be easily understood without the Article\\\",\\n\",\n    \"    )\\n\",\n    \"    absent: list[str] = Field(\\n\",\n    \"        ...,\\n\",\n    \"        default_factory=list,\\n\",\n    \"        description=\\\"this is a list of Entities found absent from the new summary that were present in the previous summary\\\",\\n\",\n    \"    )\\n\",\n    \"    missing: list[str] = Field(\\n\",\n    \"        default_factory=list,\\n\",\n    \"        description=\\\"This is a list of 1-3 informative Entities from the Article that are missing from the new summary which should be included in the next generated summary.\\\",\\n\",\n    \"    )\\n\",\n    \"\\n\",\n    \"    @field_validator(\\\"summary\\\")\\n\",\n    \"    def min_length(cls, v: str):\\n\",\n    \"        tokens = nltk.word_tokenize(v)\\n\",\n    \"        num_tokens = len(tokens)\\n\",\n    \"        if num_tokens < 60:\\n\",\n    \"            raise ValueError(\\n\",\n    \"                \\\"The current summary is too short. Please make sure that you generate a new summary that is around 80 words long.\\\"\\n\",\n    \"            )\\n\",\n    \"        return v\\n\",\n    \"\\n\",\n    \"    @field_validator(\\\"missing\\\")\\n\",\n    \"    def has_missing_entities(cls, missing_entities: list[str]):\\n\",\n    \"        if len(missing_entities) == 0:\\n\",\n    \"            raise ValueError(\\n\",\n    \"                \\\"You must identify 1-3 informative Entities from the Article which are missing from the previously generated summary to be used in a new summary\\\"\\n\",\n    \"            )\\n\",\n    \"        return missing_entities\\n\",\n    \"\\n\",\n    \"    @field_validator(\\\"absent\\\")\\n\",\n    \"    def has_no_absent_entities(cls, absent_entities: list[str]):\\n\",\n    \"        absent_entity_string = \\\",\\\".join(absent_entities)\\n\",\n    \"        if len(absent_entities) > 0:\\n\",\n    \"            print(f\\\"Detected absent entities of {absent_entity_string}\\\")\\n\",\n    \"            raise ValueError(\\n\",\n    \"                f\\\"Do not omit the following Entities {absent_entity_string} from the new summary\\\"\\n\",\n    \"            )\\n\",\n    \"        return absent_entities\\n\",\n    \"\\n\",\n    \"    @field_validator(\\\"summary\\\")\\n\",\n    \"    def min_entity_density(cls, v: str):\\n\",\n    \"        tokens = nltk.word_tokenize(v)\\n\",\n    \"        num_tokens = len(tokens)\\n\",\n    \"\\n\",\n    \"        # Extract Entities\\n\",\n    \"        doc = nlp(v)\\n\",\n    \"        num_entities = len(doc.ents)\\n\",\n    \"\\n\",\n    \"        density = num_entities / num_tokens\\n\",\n    \"        if density < 0.08:\\n\",\n    \"            raise ValueError(\\n\",\n    \"                f\\\"The summary of {v} has too few entities. Please regenerate a new summary with more new entities added to it. Remember that new entities can be added at any point of the summary.\\\"\\n\",\n    \"            )\\n\",\n    \"\\n\",\n    \"        return v\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"3e182039-ad7f-4918-b2f9-4c567d95a890\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Putting it all together\\n\",\n    \"\\n\",\n    \"Now that we have our models, let's implement a function to summarize a piece of text using a Chain Of Density summarization\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 18,\n   \"id\": \"fc66ffcc-db30-429a-8007-4d4a24bf2426\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"from openai import OpenAI\\n\",\n    \"import instructor\\n\",\n    \"\\n\",\n    \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\\n\",\n    \"\\n\",\n    \"\\n\",\n    \"def summarize_article(article: str, summary_steps: int = 3):\\n\",\n    \"    summary_chain = []\\n\",\n    \"    # We first generate an initial summary\\n\",\n    \"    summary: InitialSummary = client.create(\\n\",\n    \"        model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"        response_model=InitialSummary,\\n\",\n    \"        messages=[\\n\",\n    \"            {\\n\",\n    \"                \\\"role\\\": \\\"system\\\",\\n\",\n    \"                \\\"content\\\": \\\"Write a summary about the article that is long (4-5 sentences) yet highly non-specific. Use overly, verbose language and fillers(eg.,'this article discusses') to reach ~80 words\\\",\\n\",\n    \"            },\\n\",\n    \"            {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"Here is the Article: {article}\\\"},\\n\",\n    \"            {\\n\",\n    \"                \\\"role\\\": \\\"user\\\",\\n\",\n    \"                \\\"content\\\": \\\"The generated summary should be about 80 words.\\\",\\n\",\n    \"            },\\n\",\n    \"        ],\\n\",\n    \"        max_retries=2,\\n\",\n    \"    )\\n\",\n    \"    prev_summary = None\\n\",\n    \"    summary_chain.append(summary.summary)\\n\",\n    \"    for _i in range(summary_steps):\\n\",\n    \"        missing_entity_message = (\\n\",\n    \"            []\\n\",\n    \"            if prev_summary is None\\n\",\n    \"            else [\\n\",\n    \"                {\\n\",\n    \"                    \\\"role\\\": \\\"user\\\",\\n\",\n    \"                    \\\"content\\\": f\\\"Please include these Missing Entities: {','.join(prev_summary.missing)}\\\",\\n\",\n    \"                },\\n\",\n    \"            ]\\n\",\n    \"        )\\n\",\n    \"        new_summary: RewrittenSummary = client.create(\\n\",\n    \"            model=\\\"gpt-4-1106-preview\\\",\\n\",\n    \"            messages=[\\n\",\n    \"                {\\n\",\n    \"                    \\\"role\\\": \\\"system\\\",\\n\",\n    \"                    \\\"content\\\": \\\"\\\"\\\"\\n\",\n    \"                You are going to generate an increasingly concise,entity-dense summary of the following article.\\n\",\n    \"\\n\",\n    \"                Perform the following two tasks\\n\",\n    \"                - Identify 1-3 informative entities from the following article which is missing from the previous summary\\n\",\n    \"                - Write a new denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities\\n\",\n    \"\\n\",\n    \"                Guidelines\\n\",\n    \"                - Make every word count: re-write the previous summary to improve flow and make space for additional entities\\n\",\n    \"                - Make space with fusion, compression, and removal of uninformative phrases like \\\"the article discusses\\\".\\n\",\n    \"                - The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.\\n\",\n    \"                - Missing entities can appear anywhere in the new summary\\n\",\n    \"                - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\\n\",\n    \"                \\\"\\\"\\\",\\n\",\n    \"                },\\n\",\n    \"                {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"Here is the Article: {article}\\\"},\\n\",\n    \"                {\\n\",\n    \"                    \\\"role\\\": \\\"user\\\",\\n\",\n    \"                    \\\"content\\\": f\\\"Here is the previous summary: {summary_chain[-1]}\\\",\\n\",\n    \"                },\\n\",\n    \"                *missing_entity_message,\\n\",\n    \"            ],\\n\",\n    \"            max_retries=3,\\n\",\n    \"            max_tokens=1000,\\n\",\n    \"            response_model=RewrittenSummary,\\n\",\n    \"        )\\n\",\n    \"        summary_chain.append(new_summary.summary)\\n\",\n    \"        prev_summary = new_summary\\n\",\n    \"\\n\",\n    \"    return summary_chain\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"0a034f57-1299-4fae-8fd5-f2d9a9ca985b\",\n   \"metadata\": {},\n   \"source\": [\n    \"### Trial Run\\n\",\n    \"\\n\",\n    \"Let's try running this on some sample text which we can import in from our repository. We've provided a sample article in a file called `article.txt`\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": 19,\n   \"id\": \"6044c72b-fdc7-4cea-893b-a408c7b60230\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"with open(\\\"./assets/article.txt\\\", \\\"r+\\\") as file:\\n\",\n    \"    article = file.readline()\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"id\": \"2302dedc-f22a-41e9-b9c2-1579a4e8f623\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"%%time\\n\",\n    \"\\n\",\n    \"summaries = summarize_article(article)\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"a17de9a7-17c0-4b5f-b788-74a7347c4952\",\n   \"metadata\": {},\n   \"source\": [\n    \"We can see that it took roughly 40 seconds to do an iterative chain of density using this article. But does our approach increase the density of each individual summary? We can check by calculating the entity density of each summary in our list of summaries using the `calculate_entity_density` function we defined above.\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"id\": \"99f7361c-2737-44ef-8515-1919e009e718\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"for index, summary in enumerate(summaries):\\n\",\n    \"    tokens, entity, density = calculate_entity_density(summary)\\n\",\n    \"    print(\\n\",\n    \"        f\\\"Article {index + 1} -> Results (Tokens: {tokens}, Entity Count: {entity}, Density: {density})\\\"\\n\",\n    \"    )\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"70571151-f378-4936-889d-0e1ca5082307\",\n   \"metadata\": {},\n   \"source\": [\n    \"We can take a look at the articles themselves to see if they qualitatively show improvement\"\n   ]\n  },\n  {\n   \"cell_type\": \"code\",\n   \"execution_count\": null,\n   \"id\": \"e7149f4d-41ca-4cb1-8438-65cd97cb4246\",\n   \"metadata\": {},\n   \"outputs\": [],\n   \"source\": [\n    \"for summary in summaries:\\n\",\n    \"    print(f\\\"\\\\n{summary}\\\\n\\\")\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"ba77b7b2-152a-4ad0-9076-4c59a454bed0\",\n   \"metadata\": {},\n   \"source\": [\n    \"As we can see, the articles progressively introduce more entities and become more entity dense. We've performed 4 rounds of summarization here but you could definitely do with maybe 2-3 if latency is a significant issue.\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"c2932bc2-7e93-4434-b9ad-a68981630961\",\n   \"metadata\": {},\n   \"source\": [\n    \"## Future Steps\"\n   ]\n  },\n  {\n   \"cell_type\": \"markdown\",\n   \"id\": \"cf93e36c-f28a-4824-8b15-b23478577ce7\",\n   \"metadata\": {},\n   \"source\": [\n    \"This guide showed how to to generate complex summaries using chain of density summarization. We spent some time covering how to apply more complex validators - using `spaCy` and `NLTK` to ensure we had a minimum number of tokens and entity density as well as how you might apply instructor in a multi-stage process.\\n\",\n    \"\\n\",\n    \"By building in validation at each step of the process, this helps to improve the performance of your LLM across various tasks.\\n\",\n    \"\\n\",\n    \"For those looking to delve deeper, here are some to-do lists to explore.\\n\",\n    \"\\n\",\n    \"- **Validate Increasing Entity Density**: `Pydantic` exposes a more complex validator that can take in an arbitrary python dictionary. Use the validation context to check the entity density of the previous summary and the new summary to validate that our model has generated a more entity-dense rewrite\\n\",\n    \"- **Fine-Tuning** : `Instructor` comes with a simple to use interface to help you fine-tune other OpenAI models for your needs. This can be accomplished by capturing the outputs of LLMs using the `Instructions` module to generate training data for fine-tuning. In this specific case, finetuning a model to generate dense summaries could decrease latency and cost significantly by replacing the iterative LLM calls that we make .\\n\",\n    \"\\n\",\n    \"By accomplishing these tasks, you'll gain practical experience in tuning your models to suit your specific tasks as well as build in more complex validation processes when working with LLMs to ensure more reliable, accurate and consistent outputs.\"\n   ]\n  }\n ],\n \"metadata\": {\n  \"kernelspec\": {\n   \"display_name\": \".venv\",\n   \"language\": \"python\",\n   \"name\": \"python3\"\n  },\n  \"language_info\": {\n   \"codemirror_mode\": {\n    \"name\": \"ipython\",\n    \"version\": 3\n   },\n   \"file_extension\": \".py\",\n   \"mimetype\": \"text/x-python\",\n   \"name\": \"python\",\n   \"nbconvert_exporter\": \"python\",\n   \"pygments_lexer\": \"ipython3\",\n   \"version\": \"3.11.6\"\n  }\n },\n \"nbformat\": 4,\n \"nbformat_minor\": 5\n}\n"
  },
  {
    "path": "docs/tutorials/7-synthetic-data-generation.ipynb",
    "content": "{\n  \"cells\": [\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"# Synthetic Data Generation\\n\",\n        \" \\n\",\n        \"RAG Applications are often tricky to evaluate, especially when you haven't obtained any user queries to begin. In this notebook, we'll see how we can use `instructor` to quickly generate synthetic questions from a dataset to benchmark your retrieval systems using some simple metrics. \"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Data Ingestion\\n\",\n        \"\\n\",\n        \"Let's first start by installing the required packages and ingesting the first 200 rows of the `ms-marco` dataset into our local database. \"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 91,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"\\u001b[2mAudited \\u001b[1m7 packages\\u001b[0m in 301ms\\u001b[0m\\n\"\n          ]\n        }\n      ],\n      \"source\": [\n        \"!uv pip install instructor openai datasets lancedb tantivy tenacity tqdm\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"We're using `lancedb` here to easily ingest large amounts of data. This is preferable since we can define our table schema using a `Pydantic` Schema and also have LanceDB automatically handle the generation of the embeddings using their `get_registry()` method that we can define as an object property.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 6,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"from lancedb import connect\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"DB_PATH = \\\"./db\\\"\\n\",\n        \"DB_TABLE = \\\"ms_marco\\\"\\n\",\n        \"\\n\",\n        \"# Create a db at the path `./db`\\n\",\n        \"db = connect(DB_PATH)\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 31,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"from lancedb.pydantic import LanceModel, Vector\\n\",\n        \"from lancedb.embeddings import get_registry\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"func = get_registry().get(\\\"openai\\\").create(name=\\\"text-embedding-3-small\\\")\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class Chunk(LanceModel):\\n\",\n        \"    passage: str = func.SourceField()\\n\",\n        \"    chunk_id: str\\n\",\n        \"    embedding: Vector(func.ndims()) = func.VectorField()\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"table = db.create_table(DB_TABLE, schema=Chunk, exist_ok=True, mode=\\\"overwrite\\\")\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 32,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"from datasets import load_dataset\\n\",\n        \"\\n\",\n        \"N_ROWS = 200\\n\",\n        \"\\n\",\n        \"dataset = load_dataset(\\\"ms_marco\\\", \\\"v1.1\\\", split=\\\"train\\\", streaming=True).take(N_ROWS)\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 33,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"dict_keys(['answers', 'passages', 'query', 'query_id', 'query_type', 'wellFormedAnswers'])\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"# from itertools import islice\\n\",\n        \"first_item = next(iter(dataset))\\n\",\n        \"first_item.keys()\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 36,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"[\\\"Since 2007, the RBA's outstanding reputation has been affected by the 'Securency' or NPA scandal. These RBA subsidiaries were involved in bribing overseas officials so that Australia might win lucrative note-printing contracts. The assets of the bank include the gold and foreign exchange reserves of Australia, which is estimated to have a net worth of A$101 billion. Nearly 94% of the RBA's employees work at its headquarters in Sydney, New South Wales and at the Business Resumption Site.\\\",\\n\",\n              \" \\\"The Reserve Bank of Australia (RBA) came into being on 14 January 1960 as Australia 's central bank and banknote issuing authority, when the Reserve Bank Act 1959 removed the central banking functions from the Commonwealth Bank. The assets of the bank include the gold and foreign exchange reserves of Australia, which is estimated to have a net worth of A$101 billion. Nearly 94% of the RBA's employees work at its headquarters in Sydney, New South Wales and at the Business Resumption Site.\\\",\\n\",\n              \" 'RBA Recognized with the 2014 Microsoft US Regional Partner of the ... by PR Newswire. Contract Awarded for supply and support the. Securitisations System used for risk management and analysis. ']\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"first_item[\\\"passages\\\"][\\\"passage_text\\\"][:3]\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 34,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"import hashlib\\n\",\n        \"from itertools import batched\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"def get_passages(dataset):\\n\",\n        \"    for row in dataset:\\n\",\n        \"        for passage in row[\\\"passages\\\"][\\\"passage_text\\\"]:\\n\",\n        \"            yield {\\n\",\n        \"                \\\"passage\\\": passage,\\n\",\n        \"                \\\"chunk_id\\\": hashlib.md5(passage.encode()).hexdigest(),\\n\",\n        \"            }\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"passages = batched(get_passages(dataset), 10)\\n\",\n        \"\\n\",\n        \"for passage_batch in passages:\\n\",\n        \"    # print(passage_batch)\\n\",\n        \"    table.add(list(passage_batch))\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Synthetic Questions\\n\",\n        \"\\n\",\n        \"Now that we have the first ~2000 passages from the MS-Marco dataset ingested into our database. Let's start generating some synthetic questions using the chunks we've ingested. \\n\",\n        \"\\n\",\n        \"Let's see how we might do so using `instructor` by defining a datamodel that can help support this use-case.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 35,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"from pydantic import BaseModel, Field\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"class QuestionAnswerPair(BaseModel):\\n\",\n        \"    \\\"\\\"\\\"\\n\",\n        \"    This model represents a pair of a question generated from a text chunk, its corresponding answer,\\n\",\n        \"    and the chain of thought leading to the answer. The chain of thought provides insight into how the answer\\n\",\n        \"    was derived from the question.\\n\",\n        \"    \\\"\\\"\\\"\\n\",\n        \"\\n\",\n        \"    chain_of_thought: str = Field(\\n\",\n        \"        description=\\\"The reasoning process leading to the answer.\\\"\\n\",\n        \"    )\\n\",\n        \"    question: str = Field(description=\\\"The generated question from the text chunk.\\\")\\n\",\n        \"    answer: str = Field(description=\\\"The answer to the generated question.\\\")\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Once we've defined this data-model, we can then use it in an instructor call to generate a synthetic question.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": null,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"{\\n\",\n            \"  \\\"chain_of_thought\\\": \\\"To form a specific question from the given text chunk, I should focus on the unique details provided about the Reserve Bank of Australia, such as its creation, functions, and assets.\\\",\\n\",\n            \"  \\\"question\\\": \\\"When was the Reserve Bank of Australia established as Australia's central bank and banknote issuing authority?\\\",\\n\",\n            \"  \\\"answer\\\": \\\"The Reserve Bank of Australia was established as Australia's central bank and banknote issuing authority on 14 January 1960.\\\"\\n\",\n            \"}\\n\"\n          ]\n        }\n      ],\n      \"source\": [\n        \"import instructor\\n\",\n        \"\\n\",\n        \"client = instructor.from_provider(\\\"openai/gpt-4o\\\")\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"def generate_question(chunk: str) -> QuestionAnswerPair:\\n\",\n        \"    return client.create(\\n\",\n        \"        model=\\\"gpt-4o\\\",\\n\",\n        \"        messages=[\\n\",\n        \"            {\\n\",\n        \"                \\\"role\\\": \\\"system\\\",\\n\",\n        \"                \\\"content\\\": \\\"You are a world class AI that excels at generating hypothetical search queries. You're about to be given a text snippet and asked to generate a search query which is specific to the specific text chunk that you'll be given. Make sure to use information from the text chunk.\\\",\\n\",\n        \"            },\\n\",\n        \"            {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"Here is the text chunk: {chunk}\\\"},\\n\",\n        \"        ],\\n\",\n        \"        response_model=QuestionAnswerPair,\\n\",\n        \"    )\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"text_chunk = \\\"\\\"\\\"\\n\",\n        \"The Reserve Bank of Australia (RBA) came into being on 14 January 1960 as Australia 's central bank and banknote issuing authority, when the Reserve Bank Act 1959 removed the central banking functions from the Commonwealth Bank. The assets of the bank include the gold and foreign exchange reserves of Australia, which is estimated to have a net worth of A$101 billion. Nearly 94% of the RBA's employees work at its headquarters in Sydney, New South Wales and at the Business Resumption Site.\\n\",\n        \"\\\"\\\"\\\"\\n\",\n        \"print(generate_question(text_chunk).model_dump_json(indent=2))\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Now that we've seen how to generate a single question, let's see how we might be able to scale this up. We can do so by taking advantage of the `asyncio` library and `tenacity` to handle retries.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 56,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"[\\\"Since 2007, the RBA's outstanding reputation has been affected by the 'Securency' or NPA scandal. These RBA subsidiaries were involved in bribing overseas officials so that Australia might win lucrative note-printing contracts. The assets of the bank include the gold and foreign exchange reserves of Australia, which is estimated to have a net worth of A$101 billion. Nearly 94% of the RBA's employees work at its headquarters in Sydney, New South Wales and at the Business Resumption Site.\\\",\\n\",\n              \" \\\"The Reserve Bank of Australia (RBA) came into being on 14 January 1960 as Australia 's central bank and banknote issuing authority, when the Reserve Bank Act 1959 removed the central banking functions from the Commonwealth Bank. The assets of the bank include the gold and foreign exchange reserves of Australia, which is estimated to have a net worth of A$101 billion. Nearly 94% of the RBA's employees work at its headquarters in Sydney, New South Wales and at the Business Resumption Site.\\\"]\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"chunks = table.to_pandas()\\n\",\n        \"chunks = [item for item in chunks[\\\"passage\\\"]]\\n\",\n        \"chunks[:2]\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 98,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"from asyncio import Semaphore\\n\",\n        \"from tenacity import retry, stop_after_attempt, wait_exponential\\n\",\n        \"import asyncio\\n\",\n        \"import instructor\\n\",\n        \"\\n\",\n        \"client = instructor.from_provider(\\\"openai/gpt-3.5-turbo\\\", async_client=True)\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"async def generate_questions(chunks: list[str], max_queries: int):\\n\",\n        \"    @retry(\\n\",\n        \"        stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10)\\n\",\n        \"    )\\n\",\n        \"    async def generate_question(\\n\",\n        \"        chunk: str, sem: Semaphore\\n\",\n        \"    ) -> tuple[QuestionAnswerPair, str]:\\n\",\n        \"        async with sem:\\n\",\n        \"            return (\\n\",\n        \"                await client.create(\\n\",\n        \"                    model=\\\"gpt-3.5-turbo\\\",\\n\",\n        \"                    messages=[\\n\",\n        \"                        {\\n\",\n        \"                            \\\"role\\\": \\\"system\\\",\\n\",\n        \"                            \\\"content\\\": \\\"You are a world class AI that excels at generating hypothetical search queries. You're about to be given a text snippet and asked to generate a search query which is specific to the specific text chunk that you'll be given. Make sure to use information from the text chunk.\\\",\\n\",\n        \"                        },\\n\",\n        \"                        {\\\"role\\\": \\\"user\\\", \\\"content\\\": f\\\"Here is the text chunk: {chunk}\\\"},\\n\",\n        \"                    ],\\n\",\n        \"                    response_model=QuestionAnswerPair,\\n\",\n        \"                ),\\n\",\n        \"                chunk,\\n\",\n        \"            )\\n\",\n        \"\\n\",\n        \"    sem = Semaphore(max_queries)\\n\",\n        \"    coros = [generate_question(chunk, sem) for chunk in chunks]\\n\",\n        \"    return await asyncio.gather(*coros)\\n\",\n        \"\\n\",\n        \"\\n\",\n        \"questions = await generate_questions(chunks[:300], 10)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Benchmarking Retrieval\\n\",\n        \"\\n\",\n        \"Now that we've generated a list of questions to query our database with, let's do a quick benchmark to see how full text search compares against that of hybrid search. We'll use two simple metrics here - Mean Reciprocal Rank ( MRR ) and Recall.\\n\",\n        \"\\n\",\n        \"Let's start by making sure we have an inverted index created on our table above that we can perform full text search on\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 64,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"table.create_fts_index(\\\"passage\\\", replace=True)\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"This allows us to then use the `.search` function on each table to query it using full text search. Let's see an example below.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 67,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"A rebuildable atomizer (RBA), often referred to as simply a “rebuildable,” is just a special type of atomizer used in the Vape Pen and Mod Industry that connects to a personal vaporizer. 1 The bottom feed RBA is, perhaps, the easiest of all RBA types to build, maintain, and use. 2  It is filled from below, much like bottom coil clearomizer. 3  Bottom feed RBAs can utilize cotton instead of silica for the wick. 4  The Genesis, or genny, is a top feed RBA that utilizes a short woven mesh wire.\\n\",\n            \"Results-Based Accountability® (also known as RBA) is a disciplined way of thinking and taking action that communities can use to improve the lives of children, youth, families, adults and the community as a whole. RBA is also used by organizations to improve the performance of their programs. RBA improves the lives of children, families, and communities and the performance of programs because RBA: 1  Gets from talk to action quickly; 2  Is a simple, common sense process that everyone can understand; 3  Helps groups to surface and challenge assumptions that can be barriers to innovation;\\n\"\n          ]\n        }\n      ],\n      \"source\": [\n        \"for entry in table.search(\\\"RBA\\\", query_type=\\\"fts\\\").limit(2).to_list():\\n\",\n        \"    print(entry[\\\"passage\\\"])\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"### Metrics\\n\",\n        \"\\n\",\n        \"Now that we've figured out how we might be able to query our table using full text search. Let's take a step back and see how we can implement some metrics to quantiatively evaluate the retrieved items. It's important to note that when we want to evaluate the quality of our listings, we always take it at some subset of k.\\n\",\n        \"\\n\",\n        \"This is important because k is often constrained by a business outcome and can help us determine how well our solution works\\n\",\n        \"\\n\",\n        \"Eg. Here are some hypothetical scenarios\\n\",\n        \"\\n\",\n        \"- k=5 : We'd like to display some recommended items based of a user query (Eg. Help me plan out a dinner with Jonathan next week -> Display 5 possible actions)\\n\",\n        \"- k=10 : We have a small carousel with recommended items for a user to buy\\n\",\n        \"- k=25 : We're using a re-ranker, is it filtering out the irrelevant chunks from the relevant chunks well?\\n\",\n        \"- k=50 : We have a pipeline that fetches information for a model to respond with, are we fetching all relevant bits of information\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"#### Reciprocal Rank\\n\",\n        \"\\n\",\n        \"Reciprocal Rank\\n\",\n        \"Imagine we're spotify and we want to suggest a couple of songs to the user. Which is a better result among the two lists of retrieved songs below? ( Note that 2 is the answer we want )\\n\",\n        \"\\n\",\n        \"- [0,1,2,3,4]\\n\",\n        \"- [0,1,3,4,2]\\n\",\n        \"\\n\",\n        \"Obviously if we're suggesting songs to the user, we want the first relevant song to be listed as early as possible! Therefore we'd prefer 1 over 2 in the example above because 2 is ordered earlier in the first case. A metric that works well for this is the Reciprocal Rank (RR).\\n\",\n        \"\\n\",\n        \"![](../img/mrr_eqn.png)\\n\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 84,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"def rr(results, labels):\\n\",\n        \"    return max(\\n\",\n        \"        [\\n\",\n        \"            round(1 / (results.index(label) + 1), 2) if label in results else 0\\n\",\n        \"            for label in labels\\n\",\n        \"        ]\\n\",\n        \"    )\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"This is an aggressive metric and once we get to an position of > 10, the value doesn't change much anymore. Most of the big changes happen at indexes < 10.\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"#### Recall\\n\",\n        \"\\n\",\n        \"Another metric that we can track is recall which measures how many of our retrieved items were retrieved. \\n\",\n        \"\\n\",\n        \"![](../img/recall_eqn.png)\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 69,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"def recall(results, relevant_chunks):\\n\",\n        \"    return sum([1 if chunk in results else 0 for chunk in relevant_chunks]) / len(\\n\",\n        \"        relevant_chunks\\n\",\n        \"    )\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Using Our Questions\\n\",\n        \"\\n\",\n        \"Now that we've seen two metrics that we can use and how we might be able to generate some synthetic questions, let's try it out on an actual question.\\n\",\n        \"\\n\",\n        \"To do so, we'll first generate a unique chunk id for our original passage that we generated the question from. \\n\",\n        \"\\n\",\n        \"We'll then compare the chunk_ids of the retrieved chunks and then compute the `mrr` and the `recall` of the retrieved results.\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 86,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"('b6d9bf888fd53590ee69a913bd9bf8a4',\\n\",\n              \" \\\"What factors influence the average salary for people with a bachelor's degree?\\\",\\n\",\n              \" \\\"However, the average salary for people with a bachelor's degree varies widely based upon several factors, including their major, job position, location and years of experience. The National Association of Colleges and Employers conducted a salary survey that determined the average starting salary for graduates of various bachelor's degree programs.\\\")\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"import hashlib\\n\",\n        \"\\n\",\n        \"sample_question, chunk = questions[0]\\n\",\n        \"\\n\",\n        \"chunk_id = hashlib.md5(chunk.encode()).hexdigest()\\n\",\n        \"chunk_id, sample_question.question, chunk\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 81,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"['b6d9bf888fd53590ee69a913bd9bf8a4',\\n\",\n              \" '7a0254c9dc709220367857dcb67f2c8d',\\n\",\n              \" '04e7e6f91463033aa87b4104ea16b477']\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"retrieved_results = (\\n\",\n        \"    table.search(sample_question.question, query_type=\\\"fts\\\").limit(25).to_list()\\n\",\n        \")\\n\",\n        \"retrieved_chunk_ids = [item[\\\"chunk_id\\\"] for item in retrieved_results]\\n\",\n        \"\\n\",\n        \"retrieved_chunk_ids[:3]\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"We can now compute the results for the retrieved items that we've obtained using full text search relative to the ground truth label that we have - the original chunk that we generated it from\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 85,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"(1.0, 1.0)\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"recall(retrieved_chunk_ids, [chunk_id]), rr(retrieved_chunk_ids, [chunk_id])\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"Scaling it up for different values of `k`, where we can see how this value changes for different subsets of the retrieved items is relatively simple. \\n\",\n        \"\\n\",\n        \"We can generate this mapping automatically using `itertools.product`\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 112,\n      \"metadata\": {},\n      \"outputs\": [],\n      \"source\": [\n        \"from itertools import product\\n\",\n        \"\\n\",\n        \"SIZES = [3, 5, 10, 15, 25]\\n\",\n        \"METRICS = [[\\\"mrr\\\", rr], [\\\"recall\\\", recall]]\\n\",\n        \"\\n\",\n        \"score_fns = {}\\n\",\n        \"\\n\",\n        \"for metric, size in product(METRICS, SIZES):\\n\",\n        \"    metric_name, score_fn = metric\\n\",\n        \"    score_fns[f\\\"{metric_name}@{size}\\\"] = (\\n\",\n        \"        lambda predictions, labels, fn=score_fn, k=size: fn(predictions[:k], labels)\\n\",\n        \"    )  # type: ignore\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"## Running an Evaluation\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"We can now use the code above to run a test to see how our full text search performs for our synthetic questions. \"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 114,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"name\": \"stdout\",\n          \"output_type\": \"stream\",\n          \"text\": [\n            \"100%|██████████| 300/300 [00:07<00:00, 41.64it/s]\\n\"\n          ]\n        }\n      ],\n      \"source\": [\n        \"import hashlib\\n\",\n        \"from tqdm import tqdm\\n\",\n        \"\\n\",\n        \"fts_results = []\\n\",\n        \"\\n\",\n        \"for sample_qn, chunk in tqdm(questions):\\n\",\n        \"    chunk_id = hashlib.md5(chunk.encode()).hexdigest()\\n\",\n        \"    cleaned_question = \\\"\\\".join(\\n\",\n        \"        char for char in sample_qn.question if char.isalnum() or char.isspace()\\n\",\n        \"    )\\n\",\n        \"    retrieved_results = (\\n\",\n        \"        table.search(cleaned_question, query_type=\\\"fts\\\").limit(25).to_list()\\n\",\n        \"    )\\n\",\n        \"    retrieved_chunk_ids = [item[\\\"chunk_id\\\"] for item in retrieved_results]\\n\",\n        \"\\n\",\n        \"    fts_results.append(\\n\",\n        \"        {\\n\",\n        \"            metric: score_fn(retrieved_chunk_ids, [chunk_id])\\n\",\n        \"            for metric, score_fn in score_fns.items()\\n\",\n        \"        }\\n\",\n        \"    )\"\n      ]\n    },\n    {\n      \"cell_type\": \"code\",\n      \"execution_count\": 115,\n      \"metadata\": {},\n      \"outputs\": [\n        {\n          \"data\": {\n            \"text/plain\": [\n              \"mrr@3        0.784267\\n\",\n              \"mrr@5        0.791267\\n\",\n              \"mrr@10       0.797633\\n\",\n              \"mrr@15       0.798133\\n\",\n              \"mrr@25       0.798433\\n\",\n              \"recall@3     0.896667\\n\",\n              \"recall@5     0.926667\\n\",\n              \"recall@10    0.973333\\n\",\n              \"recall@15    0.980000\\n\",\n              \"recall@25    0.986667\\n\",\n              \"dtype: float64\"\n            ]\n          },\n          \"execution_count\": null,\n          \"metadata\": {},\n          \"output_type\": \"execute_result\"\n        }\n      ],\n      \"source\": [\n        \"import pandas as pd\\n\",\n        \"\\n\",\n        \"df = pd.DataFrame(fts_results)\\n\",\n        \"df.mean()\"\n      ]\n    },\n    {\n      \"cell_type\": \"markdown\",\n      \"metadata\": {},\n      \"source\": [\n        \"We can see that on average full text search is able to surface the relevant item 97-98% of the time if we take `k=10` and that we have the relevant item in between the first and second item here.\\n\",\n        \"\\n\",\n        \"Now, because these are synthetic question, there's likely to be a large amount of overlap in the phrases used in the questions and the original source text, leading to the high values.\\n\",\n        \"\\n\",\n        \"In actual production applications and your domain specific dataset, it's useful to do these experiments and see what works best for your needs.\"\n      ]\n    }\n  ],\n  \"metadata\": {\n    \"kernelspec\": {\n      \"display_name\": \"venv\",\n      \"language\": \"python\",\n      \"name\": \"python3\"\n    },\n    \"language_info\": {\n      \"codemirror_mode\": {\n        \"name\": \"ipython\",\n        \"version\": 3\n      },\n      \"file_extension\": \".py\",\n      \"mimetype\": \"text/x-python\",\n      \"name\": \"python\",\n      \"nbconvert_exporter\": \"python\",\n      \"pygments_lexer\": \"ipython3\",\n      \"version\": \"3.12.3\"\n    }\n  },\n  \"nbformat\": 4,\n  \"nbformat_minor\": 2\n}\n"
  },
  {
    "path": "docs/tutorials/index.md",
    "content": "---\ntitle: Instructor Tutorials\ndescription: Interactive, step-by-step tutorials for learning how to use Instructor effectively\n---\n\n# Instructor Tutorials\n\n<div class=\"grid cards\" markdown>\n\n- :material-school: **Learning Path**\n\n    Follow our structured learning path to become an Instructor expert\n\n    [:octicons-arrow-right-16: Start Learning](#tutorial-pathway)\n\n- :material-notebook-edit: **Interactive Formats**\n\n    Run our Jupyter notebooks in your preferred environment\n\n    [:octicons-arrow-right-16: Run Options](#running-options)\n\n- :material-certificate: **Skill Building**\n\n    Gain practical skills for real-world AI applications\n\n    [:octicons-arrow-right-16: What You'll Learn](#skills-gained)\n\n- :material-help: **Support**\n\n    Get help when you need it\n\n    [:octicons-arrow-right-16: Get Help](#getting-help)\n\n</div>\n\n## Tutorial Pathway {#tutorial-pathway}\n\nOur tutorials follow a carefully designed learning path from basic concepts to advanced applications. Each tutorial builds on previous concepts while introducing new techniques.\n\n| Tutorial | Topic | Key Skills | Difficulty |\n|----------|-------|------------|------------|\n| 1. [Introduction to Structured Outputs](./1-introduction.ipynb) | Basic extraction | Pydantic models, basic prompting | 🟢 Beginner |\n| 2. [Tips and Tricks](./2-tips.ipynb) | Best practices | Advanced models, optimization | 🟢 Beginner |\n| 3. [Applications: RAG](./3-0-applications-rag.ipynb) | Retrieval-augmented generation | Information retrieval, context handling | 🟡 Intermediate |\n| 4. [Applications: RAG Validation](./3-1-validation-rag.ipynb) | Validating RAG outputs | Quality control, validation hooks | 🟡 Intermediate |\n| 5. [Validation Techniques](./4-validation.ipynb) | Deep validation | Custom validators, error handling | 🟡 Intermediate |\n| 6. [Knowledge Graphs](./5-knowledge-graphs.ipynb) | Graph building | Entity relationships, graph visualization | 🔴 Advanced |\n| 7. [Chain of Density](./6-chain-of-density.ipynb) | Summarization techniques | Iterative refinement, content density | 🔴 Advanced |\n| 8. [Synthetic Data Generation](./7-synthetic-data-generation.ipynb) | Creating datasets | Data augmentation, testing data | 🔴 Advanced |\n\n## Running Options {#running-options}\n\nChoose your preferred environment to work through these interactive Jupyter notebooks:\n\n<div class=\"grid cards\" markdown>\n\n- :material-laptop: **Run Locally**\n\n    ```bash\n    git clone https://github.com/jxnl/instructor.git\n    cd instructor\n    pip install -e \".[all]\"\n    jupyter notebook docs/tutorials/\n    ```\n\n- :material-google: **Google Colab**\n\n    Look for the \"Open in Colab\" button at the top of each notebook\n\n    Perfect for cloud execution without local setup\n\n- :simple-mybinder: **Binder**\n\n    Click the \"Launch Binder\" button to run instantly in your browser\n\n    No installation or API keys required for basic examples\n\n</div>\n\n## Skills Gained {#skills-gained}\n\nBy completing this tutorial series, you'll gain practical skills in:\n\n- **Structured Extraction**: Define Pydantic models that capture exactly the data you need\n- **Advanced Validation**: Ensure LLM outputs meet your data quality requirements\n- **Streaming Responses**: Process data in real-time with partial and iterative outputs\n- **Complex Applications**: Build RAG systems, knowledge graphs, and more\n- **Multi-Provider Support**: Work with different LLM providers using a consistent interface\n- **Production Techniques**: Learn optimization strategies for real-world applications\n\n## Setup Requirements\n\nBefore starting, make sure you have:\n\n- **Python Environment**: Python 3.8+ installed\n- **Dependencies**: Install with `pip install \"instructor[all]\"`\n- **API Keys**: Access to OpenAI API or other supported providers\n- **Basic Knowledge**: Familiarity with Python and basic LLM concepts\n\n## Getting Help {#getting-help}\n\nWe're here to support your learning journey:\n\n- **Documentation**: Check the [core concepts](../concepts/index.md) for detailed explanations\n- **FAQ**: Browse our [frequently asked questions](../faq.md)\n- **Community**: Join our [Discord server](https://discord.gg/bD9YE9JArw) for real-time help\n- **Issues**: Report problems on [GitHub](https://github.com/jxnl/instructor/issues)\n- **Examples**: See [practical examples](../examples/index.md) of Instructor in action\n\n<div class=\"grid cards\" markdown>\n\n- :material-play-circle: **Ready to Begin?**\n\n    Start your journey with our first tutorial on structured outputs\n\n    [:octicons-arrow-right-16: Start Learning](./1-introduction.ipynb){: .md-button .md-button--primary }\n\n</div>\n\n"
  },
  {
    "path": "docs/why.md",
    "content": "---\ndescription: Discover why Instructor is the simplest, most reliable way to get structured outputs from LLMs.\n---\n\n# Why use Instructor?\n\nYou've built something with an LLM, but 15% of the time it returns garbage. Parsing JSON is a nightmare. Different providers have different APIs. There has to be a better way.\n\n## The pain of unstructured outputs\n\nLet's be honest about what working with LLMs is really like:\n\n```python\n# What you want:\nuser_info = extract_user(\"John is 25 years old\")\nprint(user_info.name)  # \"John\"\nprint(user_info.age)   # 25\n\n# What you actually get:\nresponse = llm.complete(\"Extract: John is 25 years old\")\n# \"I'd be happy to help! Based on the text, the user's name is John\n# and their age is 25. Is there anything else you'd like me to extract?\"\n\n# Now you need to:\n# 1. Parse this text somehow\n# 2. Handle when it returns JSON with syntax errors\n# 3. Validate the data matches what you expect\n# 4. Retry when it fails (which it will)\n# 5. Do this differently for each LLM provider\n```\n\n## The Instructor difference\n\nHere's the same task with Instructor:\n\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nclient = instructor.from_provider(\"openai/gpt-4\")\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"John is 25 years old\"}],\n)\n\nprint(user.name)  # \"John\"\nprint(user.age)   # 25\n```\n\n**That's it.** No parsing. No retries. No provider-specific code.\n\n## Real problems Instructor solves\n\n### 1. \"It works 90% of the time\"\n\nWithout Instructor, your LLM returns perfect JSON most of the time. But that 10% will ruin your weekend.\n\n```python\n# Without Instructor: Brittle code that breaks randomly\ntry:\n    data = json.loads(llm_response)\n    user = User(**data)  # KeyError: 'name'\nexcept:\n    # Now what? Retry? Parse the text? Give up?\n    pass\n\n# With Instructor: Automatic retries with validation errors\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n    max_retries=3,  # Retries with validation errors\n)\n# Always returns valid User object or raises clear exception\n```\n\n### 2. \"Each provider is different\"\n\nEvery LLM provider has its own API. Your code becomes a mess of conditionals.\n\n```python\n# Without Instructor: Provider-specific spaghetti\nif provider == \"openai\":\n    response = openai.chat.completions.create(\n        tools=[{\"type\": \"function\", \"function\": {...}}]\n    )\n    data = json.loads(response.choices[0].message.tool_calls[0].function.arguments)\nelif provider == \"anthropic\":\n    response = anthropic.messages.create(\n        tools=[{\"name\": \"extract\", \"input_schema\": {...}}]\n    )\n    data = response.content[0].input\nelif provider == \"google\":\n    # ... different API again\n\n# With Instructor: One API for all providers\nclient = instructor.from_provider(\"openai/gpt-4\")\n# or\nclient = instructor.from_provider(\"anthropic/claude-3\")\n# or\nclient = instructor.from_provider(\"google/gemini-pro\")\n\n# Same code for all providers\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n)\n```\n\n### 3. \"Complex data structures are impossible\"\n\nNested objects, lists, enums - LLMs struggle with complex schemas.\n\n```python\n# Without Instructor: Good luck with this\nschema = {\n    \"type\": \"object\",\n    \"properties\": {\n        \"users\": {\n            \"type\": \"array\",\n            \"items\": {\n                \"type\": \"object\",\n                \"properties\": {\n                    \"name\": {\"type\": \"string\"},\n                    \"addresses\": {\n                        \"type\": \"array\",\n                        \"items\": {\n                            \"type\": \"object\",\n                            \"properties\": {\n                                \"street\": {\"type\": \"string\"},\n                                \"city\": {\"type\": \"string\"}\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n}\n\n# With Instructor: Just use Python\nfrom typing import List\n\nclass Address(BaseModel):\n    street: str\n    city: str\n\nclass User(BaseModel):\n    name: str\n    addresses: List[Address]\n\nclass UserList(BaseModel):\n    users: List[User]\n\n# Works perfectly\nresult = client.create(\n    response_model=UserList,\n    messages=[{\"role\": \"user\", \"content\": \"...\"}],\n)\n```\n\n## The cost of not using Instructor\n\nLet's talk real numbers:\n\n**Time wasted:**\n- 2-3 hours implementing JSON parsing and validation\n- 4-6 hours debugging edge cases\n- 2-3 hours for each new provider you add\n- Ongoing maintenance as APIs change\n\n**Bugs in production:**\n- Malformed JSON crashes your app\n- Missing fields cause silent failures\n- Type mismatches corrupt your database\n- Customer complaints about reliability\n\n**Developer frustration:**\n- \"It worked in testing!\"\n- \"Why is the JSON different this time?\"\n- \"How do I handle when it returns a string instead of a number?\"\n\n## What developers are saying\n\nBased on our GitHub issues and Discord:\n\n- **\"Reduced our LLM code by 80%\"** - Common feedback\n- **\"Finally, LLM outputs I can trust\"** - From production users\n- **\"The retries alone are worth it\"** - Saves hours of edge-case handling\n- **\"Works exactly the same with every provider\"** - No more provider lock-in\n\n## Start now, thank yourself later\n\nEvery day without Instructor is another day of:\n- Debugging malformed JSON\n- Writing provider-specific code\n- Handling validation manually\n- Explaining to your PM why the LLM integration is flaky\n\nInstall Instructor:\n```bash\npip install instructor\n```\n\nTry it in 30 seconds:\n```python\nimport instructor\nfrom pydantic import BaseModel\n\nclient = instructor.from_provider(\"openai/gpt-4\")\n\nclass User(BaseModel):\n    name: str\n    age: int\n\nuser = client.create(\n    response_model=User,\n    messages=[{\"role\": \"user\", \"content\": \"John is 25 years old\"}],\n)\n\nprint(user)  # User(name='John', age=25)\n```\n\n## When NOT to use Instructor\n\nLet's be clear - you might not need Instructor if:\n\n- You only need raw text responses (chatbots, creative writing)\n- You're building a one-off script with no error handling\n- You enjoy debugging JSON parsing errors at 3am\n\nFor everyone else building production LLM applications, Instructor is the obvious choice.\n\n[Get Started →](index.md#quick-start-extract-structured-data-in-3-lines){ .md-button .md-button--primary }"
  },
  {
    "path": "ellipsis.yaml",
    "content": "# Reference: https://docs.ellipsis.dev\nversion: 1.1\npr_review:\n  auto_review_enabled: true\n  auto_summarize_pr: true\n  confidence_threshold: 0.85\n  rules:\n    # Control what gets flagged during PR review with custom rules. Here are some to get you started:\n    - \"Code should be DRY (Dont Repeat Yourself)\"\n    - \"Extremely Complicated Code Needs Comments\"\n    - \"Use Descriptive Variable and Constant Names\"\n    - \"Function and Method Naming Should Follow Consistent Patterns\"\n    - \"If library code changes, expect documentation to be updated\"\n    - \"If library code changes, check if tests are updated\"\n    - \"If a new `md` file is created in `docs` make sure its added to mkdocs.yml\"\n    - \"Assertions should always have an error message that is formatted well. \"\n    - \"Make sure hub examples are added to mkdocs.yml\"\n"
  },
  {
    "path": "examples/__init__.py",
    "content": ""
  },
  {
    "path": "examples/anthropic/run.py",
    "content": "from pydantic import BaseModel\nimport anthropic\nimport instructor\n\n# Patching the Anthropics client with the instructor for enhanced capabilities\nclient = instructor.from_anthropic(anthropic.Anthropic())\n\n\nclass Properties(BaseModel):\n    key: str\n    value: str\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    properties: list[Properties]\n\n\nuser = client.messages.create(\n    model=\"claude-3-haiku-20240307\",\n    max_tokens=1024,\n    max_retries=0,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Create a user for a model with a name, age, and properties.\",\n        }\n    ],\n    response_model=User,\n)\n\nprint(user.model_dump_json(indent=2))\n"
  },
  {
    "path": "examples/anthropic-web-tool/run.py",
    "content": "import instructor\nfrom pydantic import BaseModel\n\n\n# Noticed thhat we use JSON not TOOLS mode\nclient = instructor.from_provider(\n    \"anthropic/claude-3-7-sonnet-latest\",\n    mode=instructor.Mode.JSON,\n    async_client=False,\n)\n\n\nclass Citation(BaseModel):\n    id: int\n    url: str\n\n\nclass Response(BaseModel):\n    citations: list[Citation]\n    response: str\n\n\nresponse_data, completion_details = client.messages.create_with_completion(\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a helpful assistant that summarizes news articles. Your final response should be only contain a single JSON object returned in your final message to the user. Make sure to provide the exact ids for the citations that support the information you provide in the form of inline citations as [1] [2] [3] which correspond to a unique id you generate for a url that you find in the web search tool which is relevant to your final response.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": \"What are the latest results for the UFC and who won? Answer this in a concise response that's under 3 sentences.\",\n        },\n    ],\n    tools=[{\"type\": \"web_search_20250305\", \"name\": \"web_search\", \"max_uses\": 3}],\n    response_model=Response,\n)\n\nprint(\"Response:\")\nprint(response_data.response)\nprint(\"\\nCitations:\")\nfor citation in response_data.citations:\n    print(f\"{citation.id}: {citation.url}\")\n"
  },
  {
    "path": "examples/asyncio-benchmarks/run.py",
    "content": "\"\"\"\nAsyncio Benchmarks with Instructor\n\nThis script demonstrates and benchmarks different asyncio patterns for LLM processing:\n- Sequential processing (baseline)\n- asyncio.gather (concurrent, ordered results)\n- asyncio.as_completed (concurrent, streaming results)\n- Rate-limited processing with semaphores\n- Error handling patterns\n- Progress tracking\n- Batch processing with chunking\n\nRun this script to see performance comparisons and verify all code examples work.\n\"\"\"\n\nimport asyncio\nimport time\nimport logging\nimport instructor\nfrom pydantic import BaseModel, field_validator\nfrom openai import AsyncOpenAI, OpenAI\nimport os\n\n# Set up logging\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\n# Set up the async client with Instructor\nclient = instructor.from_openai(AsyncOpenAI())\nsync_client = instructor.from_openai(OpenAI())\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    occupation: str\n\n    @field_validator(\"age\")\n    @classmethod\n    def validate_age(cls, v):\n        if v < 0 or v > 150:\n            raise ValueError(f\"Age {v} is invalid\")\n        return v\n\n\n# Sample dataset\ndataset = [\n    \"John Smith is a 30-year-old software engineer\",\n    \"Sarah Johnson is a 25-year-old data scientist\",\n    \"Mike Davis is a 35-year-old product manager\",\n    \"Lisa Wilson is a 28-year-old UX designer\",\n    \"Tom Brown is a 32-year-old DevOps engineer\",\n    \"Emma Garcia is a 27-year-old frontend developer\",\n    \"David Lee is a 33-year-old backend developer\",\n]\n\n\nasync def extract_person(text: str) -> Person:\n    \"\"\"Extract person information from text using LLM.\"\"\"\n    return await client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=Person,\n        messages=[{\"role\": \"user\", \"content\": f\"Extract person info: {text}\"}],\n    )\n\n\n# Method 1: Sequential Processing (Baseline)\nasync def sequential_processing() -> tuple[list[Person], float]:\n    \"\"\"Process items one by one - slowest method.\"\"\"\n    start_time = time.time()\n    persons = []\n\n    for text in dataset:\n        person = await extract_person(text)\n        persons.append(person)\n        print(f\"Processed: {person.name}\")\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"Sequential processing took: {duration:.2f} seconds\")\n    return persons, duration\n\n\n# Method 2: asyncio.gather - Concurrent Processing\nasync def gather_processing() -> tuple[list[Person], float]:\n    \"\"\"Process all items concurrently and return in order.\"\"\"\n    start_time = time.time()\n\n    # Create tasks for all items\n    tasks = [extract_person(text) for text in dataset]\n\n    # Execute all tasks concurrently\n    persons = await asyncio.gather(*tasks)\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"asyncio.gather took: {duration:.2f} seconds\")\n\n    # Results maintain original order\n    for person in persons:\n        print(f\"Processed: {person.name}\")\n\n    return persons, duration\n\n\n# Method 3: asyncio.as_completed - Streaming Results\nasync def as_completed_processing() -> tuple[list[Person], float]:\n    \"\"\"Process items concurrently and handle results as they complete.\"\"\"\n    start_time = time.time()\n    persons = []\n\n    # Create tasks for all items\n    tasks = [extract_person(text) for text in dataset]\n\n    # Process results as they complete\n    for task in asyncio.as_completed(tasks):\n        person = await task\n        persons.append(person)\n        print(f\"Completed: {person.name}\")\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"asyncio.as_completed took: {duration:.2f} seconds\")\n    return persons, duration\n\n\n# Method 4: Rate-Limited Processing with Semaphores\nasync def rate_limited_extract_person(\n    text: str, semaphore: asyncio.Semaphore\n) -> Person:\n    \"\"\"Extract person info with rate limiting.\"\"\"\n    async with semaphore:\n        return await extract_person(text)\n\n\nasync def rate_limited_gather(concurrency_limit: int = 3) -> tuple[list[Person], float]:\n    \"\"\"Process items with controlled concurrency using asyncio.gather.\"\"\"\n    start_time = time.time()\n\n    # Create semaphore to limit concurrent requests\n    semaphore = asyncio.Semaphore(concurrency_limit)\n\n    # Create rate-limited tasks\n    tasks = [rate_limited_extract_person(text, semaphore) for text in dataset]\n\n    # Execute with rate limiting\n    persons = await asyncio.gather(*tasks)\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(\n        f\"Rate-limited gather (limit={concurrency_limit}) took: {duration:.2f} seconds\"\n    )\n    return persons, duration\n\n\nasync def rate_limited_as_completed(\n    concurrency_limit: int = 3,\n) -> tuple[list[Person], float]:\n    \"\"\"Process items with controlled concurrency using asyncio.as_completed.\"\"\"\n    start_time = time.time()\n    persons = []\n\n    # Create semaphore to limit concurrent requests\n    semaphore = asyncio.Semaphore(concurrency_limit)\n\n    # Create rate-limited tasks\n    tasks = [rate_limited_extract_person(text, semaphore) for text in dataset]\n\n    # Process results as they complete\n    for task in asyncio.as_completed(tasks):\n        person = await task\n        persons.append(person)\n        print(f\"Rate-limited completed: {person.name}\")\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(\n        f\"Rate-limited as_completed (limit={concurrency_limit}) took: {duration:.2f} seconds\"\n    )\n    return persons, duration\n\n\n# Advanced Patterns\nasync def robust_gather_processing() -> tuple[list[Person], float]:\n    \"\"\"Process items with error handling.\"\"\"\n    start_time = time.time()\n    tasks = [extract_person(text) for text in dataset]\n\n    # Execute with error handling\n    results = await asyncio.gather(*tasks, return_exceptions=True)\n\n    persons = []\n    for i, result in enumerate(results):\n        if isinstance(result, Exception):\n            print(f\"Error processing item {i}: {result}\")\n        else:\n            persons.append(result)\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"Robust gather processing took: {duration:.2f} seconds\")\n    return persons, duration\n\n\nasync def timeout_gather_processing(\n    timeout_seconds: float = 30.0,\n) -> tuple[list[Person], float]:\n    \"\"\"Process items with timeout.\"\"\"\n    start_time = time.time()\n    tasks = [extract_person(text) for text in dataset]\n\n    try:\n        persons = await asyncio.wait_for(\n            asyncio.gather(*tasks), timeout=timeout_seconds\n        )\n        end_time = time.time()\n        duration = end_time - start_time\n        print(f\"Timeout gather processing took: {duration:.2f} seconds\")\n        return persons, duration\n    except asyncio.TimeoutError:\n        end_time = time.time()\n        duration = end_time - start_time\n        print(\n            f\"Processing timed out after {timeout_seconds} seconds (took {duration:.2f}s)\"\n        )\n        return [], duration\n\n\nasync def progress_tracking_processing() -> tuple[list[Person], float]:\n    \"\"\"Process items with progress tracking.\"\"\"\n    start_time = time.time()\n    persons = []\n    total_items = len(dataset)\n    completed = 0\n\n    tasks = [extract_person(text) for text in dataset]\n\n    for task in asyncio.as_completed(tasks):\n        person = await task\n        persons.append(person)\n        completed += 1\n        print(\n            f\"Progress: {completed}/{total_items} ({completed / total_items * 100:.1f}%)\"\n        )\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"Progress tracking processing took: {duration:.2f} seconds\")\n    return persons, duration\n\n\nasync def chunked_processing(chunk_size: int = 3) -> tuple[list[Person], float]:\n    \"\"\"Process items in chunks to manage memory and rate limits.\"\"\"\n    start_time = time.time()\n    all_persons = []\n\n    # Process in chunks\n    for i in range(0, len(dataset), chunk_size):\n        chunk = dataset[i : i + chunk_size]\n        print(f\"Processing chunk {i // chunk_size + 1}\")\n\n        tasks = [extract_person(text) for text in chunk]\n        chunk_results = await asyncio.gather(*tasks)\n        all_persons.extend(chunk_results)\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"Chunked processing took: {duration:.2f} seconds\")\n    return all_persons, duration\n\n\nasync def benchmark_all_methods():\n    \"\"\"Run all processing methods and compare performance.\"\"\"\n    print(\"=== Python asyncio.gather and asyncio.as_completed Performance Test ===\\n\")\n\n    # Check if OpenAI API key is set\n    if not os.getenv(\"OPENAI_API_KEY\"):\n        print(\"⚠️  OPENAI_API_KEY not set. Using mock responses for demonstration.\")\n        return\n\n    # Test different methods\n    methods = [\n        (\"Sequential\", sequential_processing),\n        (\"asyncio.gather\", gather_processing),\n        (\"asyncio.as_completed\", as_completed_processing),\n        (\"Rate-limited gather (3)\", lambda: rate_limited_gather(3)),\n        (\"Rate-limited as_completed (3)\", lambda: rate_limited_as_completed(3)),\n        (\"Robust gather\", robust_gather_processing),\n        (\"Timeout gather\", timeout_gather_processing),\n        (\"Progress tracking\", progress_tracking_processing),\n        (\"Chunked processing\", chunked_processing),\n    ]\n\n    results = {}\n\n    for name, method in methods:\n        print(f\"\\n{'=' * 50}\")\n        print(f\"Testing: {name}\")\n        print(\"=\" * 50)\n\n        try:\n            persons, duration = await method()\n            results[name] = {\n                \"count\": len(persons),\n                \"duration\": duration,\n                \"success\": True,\n            }\n            print(f\"✓ Success: {len(persons)} items processed in {duration:.2f}s\")\n\n            # Show first few results\n            for person in persons[:3]:\n                print(f\"  - {person.name}, {person.age}, {person.occupation}\")\n            if len(persons) > 3:\n                print(f\"  ... and {len(persons) - 3} more\")\n\n        except Exception as e:\n            results[name] = {\n                \"count\": 0,\n                \"duration\": 0,\n                \"success\": False,\n                \"error\": str(e),\n            }\n            print(f\"✗ Failed: {e}\")\n\n    # Print summary table\n    print(f\"\\n{'=' * 80}\")\n    print(\"PERFORMANCE SUMMARY\")\n    print(\"=\" * 80)\n    print(f\"{'Method':<25} {'Items':<6} {'Time (s)':<10} {'Speed':<15} {'Status'}\")\n    print(\"-\" * 80)\n\n    for name, result in results.items():\n        if result[\"success\"]:\n            speed = (\n                f\"{result['count'] / result['duration']:.1f} items/s\"\n                if result[\"duration\"] > 0\n                else \"N/A\"\n            )\n            status = \"✓ Success\"\n        else:\n            speed = \"N/A\"\n            status = \"✗ Failed\"\n\n        print(\n            f\"{name:<25} {result['count']:<6} {result['duration']:<10.2f} {speed:<15} {status}\"\n        )\n\n    # Calculate speedup compared to sequential\n    if \"Sequential\" in results and results[\"Sequential\"][\"success\"]:\n        baseline = results[\"Sequential\"][\"duration\"]\n        print(f\"\\nSpeedup compared to sequential processing:\")\n        for name, result in results.items():\n            if name != \"Sequential\" and result[\"success\"] and result[\"duration\"] > 0:\n                speedup = baseline / result[\"duration\"]\n                print(f\"  {name}: {speedup:.1f}x faster\")\n\n\ndef sync_example():\n    \"\"\"Show sync version for comparison.\"\"\"\n    print(\"\\n\" + \"=\" * 50)\n    print(\"Sync Example (for comparison)\")\n    print(\"=\" * 50)\n\n    start_time = time.time()\n    persons = []\n\n    for text in dataset[:3]:  # Just first 3 for demo\n        person = sync_client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=Person,\n            messages=[{\"role\": \"user\", \"content\": f\"Extract person info: {text}\"}],\n        )\n        persons.append(person)\n        print(f\"Sync processed: {person.name}\")\n\n    end_time = time.time()\n    duration = end_time - start_time\n    print(f\"Sync processing (3 items) took: {duration:.2f} seconds\")\n\n\nasync def main():\n    \"\"\"Main function to run all examples.\"\"\"\n    try:\n        await benchmark_all_methods()\n\n        # Run sync example if API key is available\n        if os.getenv(\"OPENAI_API_KEY\"):\n            sync_example()\n\n    except KeyboardInterrupt:\n        print(\"\\n⚠️  Interrupted by user\")\n    except Exception as e:\n        print(f\"❌ Error: {e}\")\n        logger.exception(\"Unexpected error occurred\")\n\n\nif __name__ == \"__main__\":\n    print(\"🚀 Starting asyncio benchmarks with Instructor...\")\n    print(\"💡 Make sure to set OPENAI_API_KEY environment variable\")\n    print(\"⏱️  This will take a few minutes to complete all benchmarks\\n\")\n\n    asyncio.run(main())\n"
  },
  {
    "path": "examples/auto-ticketer/run.py",
    "content": "import instructor\nfrom openai import OpenAI\n\nfrom typing import Optional\nfrom pydantic import BaseModel, Field\nfrom enum import Enum\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass PriorityEnum(str, Enum):\n    high = \"High\"\n    medium = \"Medium\"\n    low = \"Low\"\n\n\nclass Subtask(BaseModel):\n    \"\"\"\n    Correctly resolved subtask from the given transcript\n    \"\"\"\n\n    id: int = Field(..., description=\"Unique identifier for the subtask\")\n    name: str = Field(..., description=\"Informative title of the subtask\")\n\n\nclass Ticket(BaseModel):\n    \"\"\"\n    Correctly resolved ticket from the given transcript\n    \"\"\"\n\n    id: int = Field(..., description=\"Unique identifier for the ticket\")\n    name: str = Field(..., description=\"Title of the task\")\n    description: str = Field(..., description=\"Detailed description of the task\")\n    priority: PriorityEnum = Field(..., description=\"Priority level\")\n    assignees: list[str] = Field(..., description=\"List of users assigned to the task\")\n    subtasks: Optional[list[Subtask]] = Field(\n        None, description=\"List of subtasks associated with the main task\"\n    )\n    dependencies: Optional[list[int]] = Field(\n        None, description=\"List of ticket IDs that this ticket depends on\"\n    )\n\n\nclass ActionItems(BaseModel):\n    \"\"\"\n    Correctly resolved set of action items from the given transcript\n    \"\"\"\n\n    items: list[Ticket]\n\n\ndef generate(data: str):\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=ActionItems,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"The following is a transcript of a meeting between a manager and their team. The manager is assigning tasks to their team members and creating action items for them to complete.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Create the action items for the following transcript: {data}\",\n            },\n        ],\n    )\n\n\nprediction = generate(\n    \"\"\"\nAlice: Hey team, we have several critical tasks we need to tackle for the upcoming release. First, we need to work on improving the authentication system. It's a top priority.\n\nBob: Got it, Alice. I can take the lead on the authentication improvements. Are there any specific areas you want me to focus on?\n\nAlice: Good question, Bob. We need both a front-end revamp and back-end optimization. So basically, two sub-tasks.\n\nCarol: I can help with the front-end part of the authentication system.\n\nBob: Great, Carol. I'll handle the back-end optimization then.\n\nAlice: Perfect. Now, after the authentication system is improved, we have to integrate it with our new billing system. That's a medium priority task.\n\nCarol: Is the new billing system already in place?\n\nAlice: No, it's actually another task. So it's a dependency for the integration task. Bob, can you also handle the billing system?\n\nBob: Sure, but I'll need to complete the back-end optimization of the authentication system first, so it's dependent on that.\n\nAlice: Understood. Lastly, we also need to update our user documentation to reflect all these changes. It's a low-priority task but still important.\n\nCarol: I can take that on once the front-end changes for the authentication system are done. So, it would be dependent on that.\n\nAlice: Sounds like a plan. Let's get these tasks modeled out and get started.\"\"\"\n)\n\nprint(prediction.model_dump_json(indent=2))\n\"\"\"\n{\n  \"items\": [\n    {\n      \"id\": 1,\n      \"name\": \"Improve Authentication System\",\n      \"description\": \"Revamp the front-end and optimize the back-end of the authentication system\",\n      \"priority\": \"High\",\n      \"assignees\": [\n        \"Bob\",\n        \"Carol\"\n      ],\n      \"subtasks\": [\n        {\n          \"id\": 2,\n          \"name\": \"Front-end Revamp\"\n        },\n        {\n          \"id\": 3,\n          \"name\": \"Back-end Optimization\"\n        }\n      ],\n      \"dependencies\": []\n    },\n    {\n      \"id\": 4,\n      \"name\": \"Integrate Authentication System with Billing System\",\n      \"description\": \"Integrate the improved authentication system with the new billing system\",\n      \"priority\": \"Medium\",\n      \"assignees\": [\n        \"Bob\"\n      ],\n      \"subtasks\": [],\n      \"dependencies\": [\n        1\n      ]\n    },\n    {\n      \"id\": 5,\n      \"name\": \"Update User Documentation\",\n      \"description\": \"Update the user documentation to reflect the changes in the authentication system\",\n      \"priority\": \"Low\",\n      \"assignees\": [\n        \"Carol\"\n      ],\n      \"subtasks\": [],\n      \"dependencies\": [\n        2\n      ]\n    }\n  ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/automodel/run.py",
    "content": "#!/usr/bin/env python\n\"\"\"\nExample demonstrating the unified provider interface with string-based initialization.\nCreates clients for multiple providers with both sync and async interfaces.\n\"\"\"\n\nimport os\nimport asyncio\nfrom typing import Any\nimport instructor\nfrom pydantic import BaseModel, Field\n\n\nclass UserInfo(BaseModel):\n    \"\"\"Simple model to extract user information from text.\"\"\"\n\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n    occupation: str = Field(description=\"The user's job or profession\")\n\n\nasync def test_async_client(\n    client_name: str, client: instructor.AsyncInstructor\n) -> dict[str, Any]:\n    \"\"\"Test an async client and return the results.\"\"\"\n    print(f\"Testing async client: {client_name}\")\n    try:\n        result = await client.chat.completions.create(\n            response_model=UserInfo,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"John Smith is a 35-year-old software engineer.\",\n                }\n            ],\n        )\n        print(f\"✅ Async {client_name} result: {result.model_dump()}\")\n        return {\"provider\": client_name, \"success\": True, \"result\": result.model_dump()}\n    except Exception as e:\n        print(f\"❌ Async {client_name} error: {str(e)}\")\n        return {\"provider\": client_name, \"success\": False, \"error\": str(e)}\n\n\ndef test_sync_client(client_name: str, client: instructor.Instructor) -> dict[str, Any]:\n    \"\"\"Test a sync client and return the results.\"\"\"\n    print(f\"Testing sync client: {client_name}\")\n    try:\n        result = client.chat.completions.create(\n            response_model=UserInfo,\n            messages=[\n                {\"role\": \"user\", \"content\": \"Jane Doe is a 28-year-old data scientist.\"}\n            ],\n        )\n        print(f\"✅ Sync {client_name} result: {result.model_dump()}\")\n        return {\"provider\": client_name, \"success\": True, \"result\": result.model_dump()}\n    except Exception as e:\n        print(f\"❌ Sync {client_name} error: {str(e)}\")\n        return {\"provider\": client_name, \"success\": False, \"error\": str(e)}\n\n\nasync def main():\n    \"\"\"Create and test multiple clients using the unified provider interface.\"\"\"\n    # Collect the test results\n    sync_results = []\n    async_results = []\n\n    # Test OpenAI clients\n    if os.environ.get(\"OPENAI_API_KEY\"):\n        # Sync client\n        openai_client = instructor.from_provider(\"openai/gpt-3.5-turbo\")\n        sync_results.append(test_sync_client(\"OpenAI\", openai_client))\n\n        # Async client\n        openai_async = instructor.from_provider(\n            \"openai/gpt-3.5-turbo\", async_client=True\n        )\n        async_results.append(\n            asyncio.create_task(test_async_client(\"OpenAI\", openai_async))\n        )\n    else:\n        print(\"⚠️ OPENAI_API_KEY not set, skipping OpenAI tests\")\n\n    # Test Anthropic clients\n    if os.environ.get(\"ANTHROPIC_API_KEY\"):\n        # Sync client\n        anthropic_client = instructor.from_provider(\n            model=\"anthropic/claude-3-haiku-20240307\", max_tokens=400\n        )\n        sync_results.append(test_sync_client(\"Anthropic\", anthropic_client))\n\n        # Async client\n        anthropic_async = instructor.from_provider(\n            model=\"anthropic/claude-3-haiku-20240307\", async_client=True, max_tokens=400\n        )\n        async_results.append(\n            asyncio.create_task(test_async_client(\"Anthropic\", anthropic_async))\n        )\n    else:\n        print(\"⚠️ ANTHROPIC_API_KEY not set, skipping Anthropic tests\")\n\n    # Test Cohere clients\n    if os.environ.get(\"COHERE_API_KEY\"):\n        # Sync client\n        cohere_client = instructor.from_provider(\"cohere/command\")\n        sync_results.append(test_sync_client(\"Cohere\", cohere_client))\n\n        # Async client\n        cohere_async = instructor.from_provider(\"cohere/command\", async_client=True)\n        async_results.append(\n            asyncio.create_task(test_async_client(\"Cohere\", cohere_async))\n        )\n    else:\n        print(\"⚠️ COHERE_API_KEY not set, skipping Cohere tests\")\n\n    # Test Mistral clients\n    if os.environ.get(\"MISTRAL_API_KEY\"):\n        # Sync client\n        mistral_client = instructor.from_provider(\"mistral/mistral-small\")\n        sync_results.append(test_sync_client(\"Mistral\", mistral_client))\n\n        # Async client\n        mistral_async = instructor.from_provider(\n            \"mistral/mistral-small\", async_client=True\n        )\n        async_results.append(\n            asyncio.create_task(test_async_client(\"Mistral\", mistral_async))\n        )\n    else:\n        print(\"⚠️ MISTRAL_API_KEY not set, skipping Mistral tests\")\n\n    # Process async results\n    if async_results:\n        completed_tasks = await asyncio.gather(*async_results)\n        async_results = completed_tasks\n\n    # Print summary\n    print(\"\\n----- Test Results Summary -----\")\n\n    print(\"\\nSync Clients:\")\n    for result in sync_results:\n        if result.get(\"success\", False):\n            print(f\"✅ {result['provider']} - Success\")\n        else:\n            print(\n                f\"❌ {result['provider']} - Failed: {result.get('error', 'Unknown error')}\"\n            )\n\n    print(\"\\nAsync Clients:\")\n    for result in async_results:\n        if result.get(\"success\", False):\n            print(f\"✅ {result['provider']} - Success\")\n        else:\n            print(\n                f\"❌ {result['provider']} - Failed: {result.get('error', 'Unknown error')}\"\n            )\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n"
  },
  {
    "path": "examples/avail/run.py",
    "content": "from pydantic import BaseModel, Field\nfrom typing import Literal\nfrom collections.abc import Iterable\nfrom datetime import datetime, timedelta\n\nfrom openai import OpenAI\nimport instructor\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass DateRange(BaseModel):\n    explain: str = Field(\n        ...,\n        description=\"Explain the date range in the context of the text before generating the date range and the repeat pattern.\",\n    )\n    repeats: Literal[\"daily\", \"weekly\", \"monthly\", None] = Field(\n        default=None,\n        description=\"If the date range repeats, and how often, this way we can generalize the date range to the future., if its special, then we can assume it is a one time event.\",\n    )\n    days_of_week: list[\n        Literal[\n            \"monday\",\n            \"tuesday\",\n            \"wednesday\",\n            \"thursday\",\n            \"friday\",\n            \"saturday\",\n            \"sunday\",\n            None,\n        ]\n    ] = Field(\n        ...,\n        description=\"If the date range repeats, which days of the week does it repeat on.\",\n    )\n    time_start: datetime = Field(\n        description=\"The start of the first time range in the day.\"\n    )\n    time_end: datetime = Field(\n        description=\"The end of the first time range in the day.\"\n    )\n\n\nclass AvailabilityResponse(BaseModel):\n    availability: list[DateRange]\n\n\ndef prepare_dates(n=7) -> str:\n    # Current date and time\n    now = datetime.now()\n\n    acc = \"\"\n    # Loop for the next 7 days\n    for i in range(n):\n        # Calculate the date for each day\n        day = now + timedelta(days=i)\n        # Print the day of the week, date, and time\n        acc += \"\\n\" + day.strftime(\"%A, %Y-%m-%d %H:%M:%S\")\n\n    return acc.strip()\n\n\ndef parse_availability(text: str) -> Iterable[AvailabilityResponse]:\n    return client.chat.completions.create(\n        model=\"gpt-4-1106-preview\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a state of the art date range parse designed to correctly extract availabilities.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": text,\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"To help you understand the dates, here are the next 7 days: {prepare_dates()}\",\n            },\n        ],\n        response_model=Iterable[AvailabilityResponse],\n    )\n\n\nif __name__ == \"__main__\":\n    text = \"\"\"\n    #1\n    \n    12/8-12/24\n    9am - 5pm Monday - Saturday\n    10am - 5pm Sunday\n\n    #2\n    We are open Friday, after Thanksgiving, and then Saturdays and Sundays 9 a.m. till dusk.``\n    \"\"\"\n    schedules = parse_availability(text)\n    for schedule in schedules:\n        print(schedule.model_dump_json(indent=2))\n        {\n            \"availability\": [\n                {\n                    \"explain\": \"For the first date range, the availability is from December 8 to December 24, from 9 am to 5 pm on Mondays through Saturdays\",\n                    \"repeats\": \"weekly\",\n                    \"days_of_week\": [\n                        \"monday\",\n                        \"tuesday\",\n                        \"wednesday\",\n                        \"thursday\",\n                        \"friday\",\n                        \"saturday\",\n                    ],\n                    \"time_start\": \"2023-12-08T09:00:00\",\n                    \"time_end\": \"2023-12-08T17:00:00\",\n                },\n                {\n                    \"explain\": \"For the same date range, the availability on Sundays is from 10 am to 5 pm\",\n                    \"repeats\": \"weekly\",\n                    \"days_of_week\": [\"sunday\"],\n                    \"time_start\": \"2023-12-10T10:00:00\",\n                    \"time_end\": \"2023-12-10T17:00:00\",\n                },\n            ]\n        }\n    {\n        \"availability\": [\n            {\n                \"explain\": \"The second date range starting from the Friday after Thanksgiving, which is November 24, 2023, and then on Saturdays and Sundays from 9 am until dusk. Assuming 'dusk' means approximately 5 pm, similar to the previous timings.\",\n                \"repeats\": \"weekly\",\n                \"days_of_week\": [\"friday\", \"saturday\", \"sunday\"],\n                \"time_start\": \"2023-11-24T09:00:00\",\n                \"time_end\": \"2023-11-24T17:00:00\",\n            }\n        ]\n    }\n"
  },
  {
    "path": "examples/avail/run_mixtral.py",
    "content": "import os\nfrom pydantic import BaseModel, Field\nfrom typing import Literal\nfrom datetime import datetime, timedelta\n\nfrom openai import OpenAI\nimport instructor\n\nclient = instructor.from_openai(\n    OpenAI(\n        base_url=\"https://api.endpoints.anyscale.com/v1\",\n        api_key=os.environ[\"ANYSCALE_API_KEY\"],\n    ),\n    mode=instructor.Mode.JSON_SCHEMA,\n    model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n)\n\n\nclass DateRange(BaseModel):\n    explain: str = Field(\n        ...,\n        description=\"Explain the date range in the context of the text before generating the date range and the repeat pattern.\",\n    )\n    repeats: Literal[\"daily\", \"weekly\", \"monthly\", None] = Field(\n        default=None,\n        description=\"If the date range repeats, and how often, this way we can generalize the date range to the future., if its special, then we can assume it is a one time event.\",\n    )\n    days_of_week: list[\n        Literal[\n            \"monday\",\n            \"tuesday\",\n            \"wednesday\",\n            \"thursday\",\n            \"friday\",\n            \"saturday\",\n            \"sunday\",\n            None,\n        ]\n    ] = Field(\n        ...,\n        description=\"If the date range repeats, which days of the week does it repeat on.\",\n    )\n    time_start: datetime = Field(\n        description=\"The start of the first time range in the day.\"\n    )\n    time_end: datetime = Field(\n        description=\"The end of the first time range in the day.\"\n    )\n\n\nclass AvailabilityResponse(BaseModel):\n    availability: list[DateRange]\n\n\ndef prepare_dates(n=7) -> str:\n    # Current date and time\n    now = datetime.now()\n\n    acc = \"\"\n    # Loop for the next 7 days\n    for i in range(n):\n        # Calculate the date for each day\n        day = now + timedelta(days=i)\n        # Print the day of the week, date, and time\n        acc += \"\\n\" + day.strftime(\"%A, %Y-%m-%d %H:%M:%S\")\n\n    return acc.strip()\n\n\ndef parse_availability(text: str):\n    return client.chat.completions.create_iterable(\n        max_tokens=10000,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a state of the art date range parse designed to correctly extract availabilities.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": text,\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"To help you understand the dates, here are the next 7 days: {prepare_dates()}\",\n            },\n        ],\n        response_model=AvailabilityResponse,\n        max_retries=3,\n    )\n\n\nif __name__ == \"__main__\":\n    text = \"\"\"\n    #1\n    \n    12/8-12/24\n    9am - 5pm Monday - Saturday\n    10am - 5pm Sunday\n\n    #2\n    We are open Friday, after Thanksgiving, and then Saturdays and Sundays 9 a.m. till dusk.``\n    \"\"\"\n    schedules = parse_availability(text)\n    for schedule in schedules:\n        print(schedule.model_dump_json(indent=2))\n        {\n            \"availability\": [\n                {\n                    \"explain\": \"For the first date range, the availability is from December 8 to December 24, from 9 am to 5 pm on Mondays through Saturdays\",\n                    \"repeats\": \"weekly\",\n                    \"days_of_week\": [\n                        \"monday\",\n                        \"tuesday\",\n                        \"wednesday\",\n                        \"thursday\",\n                        \"friday\",\n                        \"saturday\",\n                    ],\n                    \"time_start\": \"2023-12-08T09:00:00\",\n                    \"time_end\": \"2023-12-08T17:00:00\",\n                },\n                {\n                    \"explain\": \"For the same date range, the availability on Sundays is from 10 am to 5 pm\",\n                    \"repeats\": \"weekly\",\n                    \"days_of_week\": [\"sunday\"],\n                    \"time_start\": \"2023-12-10T10:00:00\",\n                    \"time_end\": \"2023-12-10T17:00:00\",\n                },\n            ]\n        }\n    {\n        \"availability\": [\n            {\n                \"explain\": \"The second date range starting from the Friday after Thanksgiving, which is November 24, 2023, and then on Saturdays and Sundays from 9 am until dusk. Assuming 'dusk' means approximately 5 pm, similar to the previous timings.\",\n                \"repeats\": \"weekly\",\n                \"days_of_week\": [\"friday\", \"saturday\", \"sunday\"],\n                \"time_start\": \"2023-11-24T09:00:00\",\n                \"time_end\": \"2023-11-24T17:00:00\",\n            }\n        ]\n    }\n"
  },
  {
    "path": "examples/batch-classification/run-cache.py",
    "content": "import instructor\nimport asyncio\n\nfrom openai import AsyncOpenAI\nfrom pydantic import BaseModel, Field, field_validator\nfrom enum import Enum\n\nclient = instructor.from_openai(AsyncOpenAI(), mode=instructor.Mode.TOOLS)\nsem = asyncio.Semaphore(5)\n\n\nclass QuestionType(Enum):\n    CONTACT = \"CONTACT\"\n    TIMELINE_QUERY = \"TIMELINE_QUERY\"\n    DOCUMENT_SEARCH = \"DOCUMENT_SEARCH\"\n    COMPARE_CONTRAST = \"COMPARE_CONTRAST\"\n    EMAIL = \"EMAIL\"\n    PHOTOS = \"PHOTOS\"\n    SUMMARY = \"SUMMARY\"\n\n\n# You can add more instructions and examples in the description\n# or you can put it in the prompt in `messages=[...]`\nclass QuestionClassification(BaseModel):\n    \"\"\"\n    Predict the type of question that is being asked.\n    Here are some tips on how to predict the question type:\n    CONTACT: Searches for some contact information.\n    TIMELINE_QUERY: \"When did something happen?\n    DOCUMENT_SEARCH: \"Find me a document\"\n    COMPARE_CONTRAST: \"Compare and contrast two things\"\n    EMAIL: \"Find me an email, search for an email\"\n    PHOTOS: \"Find me a photo, search for a photo\"\n    SUMMARY: \"Summarize a large amount of data\"\n    \"\"\"\n\n    # If you want only one classification, just change it to\n    #   `classification: QuestionType` rather than `classifications: List[QuestionType]``\n    chain_of_thought: str = Field(\n        ..., description=\"The chain of thought that led to the classification\"\n    )\n    classification: list[QuestionType] = Field(\n        description=f\"An accuracy and correct prediction predicted class of question. Only allowed types: {[t.value for t in QuestionType]}, should be used\",\n    )\n\n    @field_validator(\"classification\", mode=\"before\")\n    def validate_classification(cls, v):\n        # sometimes the API returns a single value, just make sure it's a list\n        if not isinstance(v, list):\n            v = [v]\n        return v\n\n\n# Modify the classify function\nasync def classify(data: str):\n    async with sem:  # some simple rate limiting\n        return data, await client.chat.completions.create(\n            model=\"gpt-4\",\n            response_model=QuestionClassification,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Classify the following question: {data}\",\n                },\n            ],\n        )\n\n\nasync def main(questions: list[str]):\n    tasks = [classify(question) for question in questions]\n    resps = []\n    for task in asyncio.as_completed(tasks):\n        question, label = await task\n        resp = {\n            \"question\": question,\n            \"classification\": [c.value for c in label.classification],\n            \"chain_of_thought\": label.chain_of_thought,\n        }\n        resps.append(resp)\n    return resps\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    questions = [\n        \"What was that ai app that i saw on the news the other day?\",\n        \"Can you find the trainline booking email?\",\n        \"What was the book I saw on amazon yesturday?\",\n        \"Can you speak german?\",\n        \"Do you have access to the meeting transcripts?\",\n        \"what are the recent sites I visited?\",\n        \"what did I do on Monday?\",\n        \"Tell me about todays meeting and how it relates to the email on Monday\",\n    ]\n\n    asyncio.run(main(questions))\n"
  },
  {
    "path": "examples/batch-classification/run.py",
    "content": "import json\nimport instructor\nimport asyncio\n\nfrom openai import AsyncOpenAI\nfrom pydantic import BaseModel, Field, field_validator\nfrom enum import Enum\n\nclient = AsyncOpenAI()\nclient = instructor.from_openai(client, mode=instructor.Mode.TOOLS)\nsem = asyncio.Semaphore(5)\n\n\nclass QuestionType(Enum):\n    CONTACT = \"CONTACT\"\n    TIMELINE_QUERY = \"TIMELINE_QUERY\"\n    DOCUMENT_SEARCH = \"DOCUMENT_SEARCH\"\n    COMPARE_CONTRAST = \"COMPARE_CONTRAST\"\n    EMAIL = \"EMAIL\"\n    PHOTOS = \"PHOTOS\"\n    SUMMARY = \"SUMMARY\"\n\n\n# You can add more instructions and examples in the description\n# or you can put it in the prompt in `messages=[...]`\nclass QuestionClassification(BaseModel):\n    \"\"\"\n    Predict the type of question that is being asked.\n    Here are some tips on how to predict the question type:\n    CONTACT: Searches for some contact information.\n    TIMELINE_QUERY: \"When did something happen?\n    DOCUMENT_SEARCH: \"Find me a document\"\n    COMPARE_CONTRAST: \"Compare and contrast two things\"\n    EMAIL: \"Find me an email, search for an email\"\n    PHOTOS: \"Find me a photo, search for a photo\"\n    SUMMARY: \"Summarize a large amount of data\"\n    \"\"\"\n\n    # If you want only one classification, just change it to\n    #   `classification: QuestionType` rather than `classifications: List[QuestionType]``\n    chain_of_thought: str = Field(\n        ..., description=\"The chain of thought that led to the classification\"\n    )\n    classification: list[QuestionType] = Field(\n        description=f\"An accuracy and correct prediction predicted class of question. Only allowed types: {[t.value for t in QuestionType]}, should be used\",\n    )\n\n    @field_validator(\"classification\", mode=\"before\")\n    def validate_classification(cls, v):\n        # sometimes the API returns a single value, just make sure it's a list\n        if not isinstance(v, list):\n            v = [v]\n        return v\n\n\nasync def classify(data: str):\n    async with sem:  # some simple rate limiting\n        return data, await client.chat.completions.create(\n            model=\"gpt-4\",\n            response_model=QuestionClassification,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Classify the following question: {data}\",\n                },\n            ],\n        )\n\n\nasync def main(questions: list[str], *, path_to_jsonl: str = None):\n    tasks = [classify(question) for question in questions]\n    for task in asyncio.as_completed(tasks):\n        question, label = await task\n        resp = {\n            \"question\": question,\n            \"classification\": [c.value for c in label.classification],\n        }\n        print(resp)\n        if path_to_jsonl:\n            with open(path_to_jsonl, \"a\") as f:\n                json_dump = json.dumps(resp)\n                f.write(json_dump + \"\\n\")\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    questions = [\n        \"What was that ai app that i saw on the news the other day?\",\n        \"Can you find the trainline booking email?\",\n        \"What was the book I saw on amazon yesturday?\",\n        \"Can you speak german?\",\n        \"Do you have access to the meeting transcripts?\",\n        \"what are the recent sites I visited?\",\n        \"what did I do on Monday?\",\n        \"Tell me about todays meeting and how it relates to the email on Monday\",\n    ]\n\n    asyncio.run(main(questions))\n"
  },
  {
    "path": "examples/batch-classification/run_langsmith.py",
    "content": "import instructor\nimport asyncio\n\nfrom langsmith import traceable\nfrom langsmith.wrappers import wrap_openai\n\nfrom openai import AsyncOpenAI\nfrom pydantic import BaseModel, Field, field_validator\nfrom enum import Enum\n\nclient = wrap_openai(AsyncOpenAI())\nclient = instructor.from_openai(client, mode=instructor.Mode.TOOLS)\nsem = asyncio.Semaphore(5)\n\n\nclass QuestionType(Enum):\n    CONTACT = \"CONTACT\"\n    TIMELINE_QUERY = \"TIMELINE_QUERY\"\n    DOCUMENT_SEARCH = \"DOCUMENT_SEARCH\"\n    COMPARE_CONTRAST = \"COMPARE_CONTRAST\"\n    EMAIL = \"EMAIL\"\n    PHOTOS = \"PHOTOS\"\n    SUMMARY = \"SUMMARY\"\n\n\n# You can add more instructions and examples in the description\n# or you can put it in the prompt in `messages=[...]`\nclass QuestionClassification(BaseModel):\n    \"\"\"\n    Predict the type of question that is being asked.\n    Here are some tips on how to predict the question type:\n    CONTACT: Searches for some contact information.\n    TIMELINE_QUERY: \"When did something happen?\n    DOCUMENT_SEARCH: \"Find me a document\"\n    COMPARE_CONTRAST: \"Compare and contrast two things\"\n    EMAIL: \"Find me an email, search for an email\"\n    PHOTOS: \"Find me a photo, search for a photo\"\n    SUMMARY: \"Summarize a large amount of data\"\n    \"\"\"\n\n    # If you want only one classification, just change it to\n    #   `classification: QuestionType` rather than `classifications: List[QuestionType]``\n    chain_of_thought: str = Field(\n        ..., description=\"The chain of thought that led to the classification\"\n    )\n    classification: list[QuestionType] = Field(\n        description=f\"An accuracy and correct prediction predicted class of question. Only allowed types: {[t.value for t in QuestionType]}, should be used\",\n    )\n\n    @field_validator(\"classification\", mode=\"before\")\n    def validate_classification(cls, v):\n        # sometimes the API returns a single value, just make sure it's a list\n        if not isinstance(v, list):\n            v = [v]\n        return v\n\n\n# Modify the classify function\n@traceable(name=\"classify-question\")\nasync def classify(data: str):\n    async with sem:  # some simple rate limiting\n        return data, await client.chat.completions.create(\n            model=\"gpt-4\",\n            response_model=QuestionClassification,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Classify the following question: {data}\",\n                },\n            ],\n        )\n\n\nasync def main(questions: list[str]):\n    tasks = [classify(question) for question in questions]\n    resps = []\n    for task in asyncio.as_completed(tasks):\n        question, label = await task\n        resp = {\n            \"question\": question,\n            \"classification\": [c.value for c in label.classification],\n            \"chain_of_thought\": label.chain_of_thought,\n        }\n        resps.append(resp)\n    return resps\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    questions = [\n        \"What was that ai app that i saw on the news the other day?\",\n        \"Can you find the trainline booking email?\",\n        \"What was the book I saw on amazon yesturday?\",\n        \"Can you speak german?\",\n        \"Do you have access to the meeting transcripts?\",\n        \"what are the recent sites I visited?\",\n        \"what did I do on Monday?\",\n        \"Tell me about todays meeting and how it relates to the email on Monday\",\n    ]\n\n    asyncio.run(main(questions))\n"
  },
  {
    "path": "examples/batch_api/README.md",
    "content": "# Batch API Examples\n\nThis directory contains examples and test scripts for Instructor's batch processing capabilities, including both traditional file-based and new in-memory processing.\n\n## Examples\n\n### 1. In-Memory Batch Processing (`in_memory_batch_example.py`)\n\nDemonstrates the new in-memory batch processing feature, perfect for serverless deployments:\n\n```bash\npython in_memory_batch_example.py\n```\n\n**Key Features:**\n- No disk I/O required - ideal for serverless environments\n- BytesIO buffers instead of temporary files  \n- Automatic cleanup - no file management needed\n- Security benefits - no temporary files on disk\n\n### 2. Unified Test Script (`run_batch_test.py`)\n\nTests the unified BatchProcessor with all supported providers: OpenAI, Anthropic, and Google Gemini.\n\nThe script creates a batch job to extract structured `User(name: str, age: int)` data from 10 text examples and saves the batch ID for later checking. Since batch jobs can take time to complete, the script returns immediately after creation.\n\n## Unified Test Script (`run_batch_test.py`)\n\nTests the unified BatchProcessor with any supported provider/model combination.\n\n### Usage\n\n```bash\n# Test OpenAI\nexport OPENAI_API_KEY=\"your-openai-api-key\"\npython run_batch_test.py create --model \"openai/gpt-4o-mini\"\n\n# Test Anthropic  \nexport ANTHROPIC_API_KEY=\"your-anthropic-api-key\"\npython run_batch_test.py create --model \"anthropic/claude-3-5-sonnet-20241022\"\n\n# Test Google (simulation mode)\npython run_batch_test.py create --model \"google/gemini-2.0-flash-001\"\n```\n\n### Supported Models\n\nUse the `list-models` command to see all supported models:\n\n```bash\npython run_batch_test.py list-models\n```\n\n**OpenAI Models:**\n- `openai/gpt-4o-mini`\n- `openai/gpt-4o`\n- `openai/gpt-4-turbo`\n\n**Anthropic Models:**\n- `anthropic/claude-3-5-sonnet-20241022`\n- `anthropic/claude-3-opus-20240229`\n- `anthropic/claude-3-haiku-20240307`\n\n**Google Models:**\n- `google/gemini-2.0-flash-001`\n- `google/gemini-pro`\n- `google/gemini-pro-vision`\n\n### What the Script Does\n\n1. **Creates test messages**: 10 prompts containing user information\n2. **Uses BatchProcessor**: Leverages the unified API with provider detection\n3. **Generates batch file**: Provider-specific format with JSON schema\n4. **Submits batch job**: Actual API call to create the batch\n5. **Saves batch ID**: Stores ID in `{provider}_batch_id.txt`\n6. **Returns immediately**: No waiting for completion\n\n### API Keys Required\n\n| Provider | Environment Variable | Required |\n|----------|---------------------|----------|\n| OpenAI | `OPENAI_API_KEY` | Yes |\n| Anthropic | `ANTHROPIC_API_KEY` | Yes |\n| Google | `GOOGLE_API_KEY` | No (simulation mode) |\n\n### Output Files\n\nEach run creates:\n- `{provider}_batch_id.txt` - Contains the batch ID for status checking\n- Temporary batch files (automatically cleaned up)\n\n### Test Data\n\nAll providers use the same 10 test prompts:\n\n1. \"Hi there! My name is Alice and I'm 28 years old. I work as a software engineer.\"\n2. \"Hello, I'm Bob, 35 years old, and I love hiking and photography.\"\n3. \"This is Sarah speaking. I'm 42 and I'm a graphic designer.\"\n4. \"Hey! John here, I'm 29 years old and I teach high school math.\"\n5. \"I'm Emma, 33 years old, currently working as a marketing manager.\"\n6. \"My name is Michael and I'm 45 years old. I'm a chef at a downtown restaurant.\"\n7. \"I'm Lisa, 31 years old, working as a nurse at the local hospital.\"\n8. \"This is David, 38 years old, I'm a freelance photographer.\"\n9. \"Hello, I'm Jessica, 26 years old, and I'm a data scientist.\"\n10. \"I'm Ryan, 41 years old, working in software development for a tech startup.\"\n\n### Expected Results\n\nEach batch job should extract `User` objects:\n\n```python\nclass User(BaseModel):\n    name: str\n    age: int\n```\n\nExpected extractions:\n- Alice, 28 | Bob, 35 | Sarah, 42 | John, 29 | Emma, 33\n- Michael, 45 | Lisa, 31 | David, 38 | Jessica, 26 | Ryan, 41\n\n## Checking Batch Status\n\nAfter creating batch jobs, use the CLI to check their status:\n\n```bash\n# List all batch jobs for a provider\ninstructor batch list --model \"openai/gpt-4o-mini\"\ninstructor batch list --model \"anthropic/claude-3-5-sonnet-20241022\"\n\n# Check specific batch status\ninstructor batch status --batch-id \"batch_123\" --model \"openai/gpt-4o-mini\"\n\n# Get results when completed\ninstructor batch results \\\n  --batch-id \"batch_123\" \\\n  --output-file \"results.jsonl\" \\\n  --model \"openai/gpt-4o-mini\"\n```\n\n## Processing Times\n\n- **OpenAI**: Usually completes within a few hours, guaranteed within 24h\n- **Anthropic**: Most batches complete in under 1 hour\n- **Google**: Varies (simulation only in this test)\n\n## Running Tests for All Providers\n\n```bash\n# Test all providers (requires API keys)\npython run_batch_test.py create --model \"openai/gpt-4o-mini\"\npython run_batch_test.py create --model \"anthropic/claude-3-5-sonnet-20241022\" \npython run_batch_test.py create --model \"google/gemini-2.0-flash-001\"\n\n# Check what was created\nls *_batch_id.txt\n```\n\n## Troubleshooting\n\n### Common Issues\n\n1. **API Key Not Set**\n   ```\n   ❌ Error: OPENAI_API_KEY environment variable is not set\n   ```\n   Solution: Set the appropriate environment variable.\n\n2. **Invalid Model Format**\n   ```\n   ❌ Error: Model must be in format 'provider/model-name'\n   ```\n   Solution: Use the format `provider/model-name`, e.g., `openai/gpt-4o-mini`.\n\n3. **Unsupported Provider**\n   ```\n   ❌ Unsupported provider: xyz\n   ```\n   Solution: Use `openai`, `anthropic`, or `google` as the provider.\n\n### Provider-Specific Notes\n\n**OpenAI:**\n- Requires valid API key with sufficient credits\n- Supports both individual and organization accounts\n- Rate limits are separate for batch vs regular API\n\n**Anthropic:**\n- Uses beta API endpoints (`client.beta.messages.batches`)\n- Requires Anthropic API access\n- May have different availability by region\n\n**Google:**\n- Runs in simulation mode by default\n- Full implementation requires Google Cloud Storage setup\n- Would need proper GCS authentication for real batch jobs\n\n## Integration with CLI\n\nThis test validates that the unified BatchProcessor works correctly, which powers the CLI commands:\n\n```bash\n# Create batch using CLI directly\ninstructor batch create \\\n  --messages-file messages.jsonl \\\n  --model \"openai/gpt-4o-mini\" \\\n  --response-model \"examples.User\" \\\n  --output-file batch_requests.jsonl\n\n# Submit the batch\ninstructor batch create-from-file \\\n  --file-path batch_requests.jsonl \\\n  --model \"openai/gpt-4o-mini\"\n```\n\n## Development\n\nTo modify the test:\n1. Update `create_test_messages()` to change test data\n2. Modify the `User` model if needed\n3. Add new providers in the provider detection logic\n4. Adjust batch creation functions for new provider-specific behavior\n\nThe test demonstrates that the same code works across all providers thanks to the unified BatchProcessor abstraction!"
  },
  {
    "path": "examples/batch_api/in_memory_batch_example.py",
    "content": "#!/usr/bin/env python3\n\"\"\"Example of using in-memory batching for serverless deployments.\n\nThis example shows how to create and submit batch requests without writing to disk\n\"\"\"\n\nimport time\nfrom pydantic import BaseModel\nfrom instructor.batch.processor import BatchProcessor\n\n\nclass User(BaseModel):\n    \"\"\"User model for extraction.\"\"\"\n\n    name: str\n    age: int\n    email: str\n\n\ndef main():\n    \"\"\"Demonstrate in-memory batch processing.\"\"\"\n    print(\"In-Memory Batch Processing Example\")\n    print(\"===================================\\n\")\n\n    # Initialize batch processor\n    # Note: Use gpt-4o-mini for JSON schema support in batch API\n    processor = BatchProcessor(\"openai/gpt-4o-mini\", User)\n\n    # Sample messages for batch processing\n    messages_list = [\n        [\n            {\"role\": \"system\", \"content\": \"Extract user information from the text.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"John Doe is 25 years old and his email is john@example.com\",\n            },\n        ],\n        [\n            {\"role\": \"system\", \"content\": \"Extract user information from the text.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Jane Smith, age 30, can be reached at jane.smith@company.com\",\n            },\n        ],\n        [\n            {\"role\": \"system\", \"content\": \"Extract user information from the text.\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Bob Wilson (bob.wilson@email.com) is 28 years old\",\n            },\n        ],\n    ]\n\n    print(\"Creating batch requests in memory...\")\n\n    # Create batch in memory (no file_path specified)\n    batch_buffer = processor.create_batch_from_messages(\n        messages_list,\n        file_path=None,  # This triggers in-memory mode\n        max_tokens=150,\n        temperature=0.1,\n    )\n\n    print(f\"Created batch buffer: {type(batch_buffer)}\")\n    print(f\"Buffer size: {len(batch_buffer.getvalue())} bytes\\n\")\n\n    # Show the content of the buffer (first 200 chars)\n    batch_buffer.seek(0)\n    content_preview = batch_buffer.read(200).decode(\"utf-8\")\n    print(\"Buffer content preview:\")\n    print(f\"{content_preview}...\\n\")\n\n    # Reset buffer position for submission\n    batch_buffer.seek(0)\n\n    print(\"Submitting batch job...\")\n\n    try:\n        # Submit the batch using the in-memory buffer\n        batch_id = processor.submit_batch(\n            batch_buffer, metadata={\"description\": \"In-memory batch example\"}\n        )\n\n        print(f\"Batch submitted successfully!\")\n        print(f\"Batch ID: {batch_id}\")\n\n        # Poll for completion\n        print(\"\\nWaiting for batch to complete...\")\n        max_wait_time = 300  # 5 minutes max\n        start_time = time.time()\n        status = {}\n\n        while time.time() - start_time < max_wait_time:\n            status = processor.get_batch_status(batch_id)\n            current_status = status.get(\"status\", \"unknown\")\n\n            # Update status on the same line\n            print(f\"\\rCurrent status: {current_status.ljust(20)}\", end=\"\")\n\n            if current_status in [\"completed\", \"failed\", \"cancelled\", \"expired\"]:\n                break\n\n            time.sleep(10)\n\n        print()  # Newline after polling is done\n\n        # Use the last fetched status\n        final_status = status\n        print(f\"\\nFinal status: {final_status.get('status', 'unknown')}\")\n\n        if final_status.get(\"status\") == \"completed\":\n            print(\"\\nBatch completed! Retrieving results...\")\n\n            # Retrieve and process results\n            results = processor.get_results(batch_id)\n\n            print(f\"\\nResults Summary:\")\n            print(f\"   Total results: {len(results)}\")\n\n            successful_results = [r for r in results if hasattr(r, \"result\")]\n            error_results = [r for r in results if hasattr(r, \"error_message\")]\n\n            print(f\"   Successful: {len(successful_results)}\")\n            print(f\"   Errors: {len(error_results)}\")\n\n            # Show successful extractions\n            if successful_results:\n                print(\"\\nExtracted Users:\")\n                for result in successful_results:\n                    user = result.result\n                    print(f\"   - {user.name}, {user.age} years old, {user.email}\")\n\n            # Show any errors\n            if error_results:\n                print(\"\\nErrors encountered:\")\n                for error in error_results:\n                    print(f\"   - {error.custom_id}: {error.error_message}\")\n\n        elif final_status.get(\"status\") == \"failed\":\n            print(\"\\nBatch failed to complete\")\n            print(\"   Check your API usage and batch format\")\n\n        else:\n            print(f\"\\nBatch did not complete within {max_wait_time} seconds\")\n            print(f\"   Current status: {final_status.get('status', 'unknown')}\")\n            print(\n                \"   You can check status later with processor.get_batch_status(batch_id)\"\n            )\n\n    except Exception as e:\n        print(f\"Error during batch processing: {e}\")\n        print(\"\\nThis is expected if you don't have OpenAI API credentials set up.\")\n        print(\n            \"   The important part is that the in-memory buffer was created successfully!\"\n        )\n\n    print(\"\\nIn-memory batch processing demo complete!\")\n    print(\"\\nKey benefits of in-memory batching:\")\n    print(\"   - No disk I/O required - perfect for serverless\")\n    print(\"   - Faster processing - no file system overhead\")\n    print(\"   - Better security - no temporary files on disk\")\n    print(\"   - Cleaner code - no file cleanup required\")\n\n\ndef compare_file_vs_memory():\n    \"\"\"Compare file-based vs in-memory batch creation.\"\"\"\n    print(\"\\nComparing File-based vs In-Memory Batching\")\n    print(\"===========================================\\n\")\n\n    processor = BatchProcessor(\"openai/gpt-4o-mini\", User)\n\n    messages_list = [\n        [{\"role\": \"user\", \"content\": \"Extract: John, 25, john@example.com\"}],\n        [{\"role\": \"user\", \"content\": \"Extract: Jane, 30, jane@example.com\"}],\n    ]\n\n    # File-based approach (traditional)\n    print(\"File-based approach:\")\n    file_path = processor.create_batch_from_messages(\n        messages_list,\n        file_path=\"temp_batch.jsonl\",  # Specify file path\n    )\n    print(f\"   Created file: {file_path}\")\n\n    # Clean up the file\n    import os\n\n    if os.path.exists(file_path):\n        os.remove(file_path)\n        print(\"   File cleaned up\")\n\n    # In-memory approach (new)\n    print(\"\\nIn-memory approach:\")\n    buffer = processor.create_batch_from_messages(\n        messages_list,\n        file_path=None,  # No file path = in-memory\n    )\n    print(f\"   Created buffer: {type(buffer).__name__}\")\n    print(f\"   Buffer size: {len(buffer.getvalue())} bytes\")\n    print(\"   No cleanup required!\")\n\n\ndef demo_polling_logic():\n    \"\"\"Demonstrate how to properly poll for batch completion.\"\"\"\n    print(\"\\nBatch Polling Best Practices\")\n    print(\"============================\\n\")\n\n    print(\"When working with real batches, follow this pattern:\")\n    print(\"\")\n    print(\"```python\")\n    print(\"import time\")\n    print(\"\")\n    print(\"# Submit your batch\")\n    print(\"batch_id = processor.submit_batch(buffer)\")\n    print(\"\")\n    print(\"# Poll for completion\")\n    print(\"while True:\")\n    print(\"    status = processor.get_batch_status(batch_id)\")\n    print(\"    current_status = status.get('status')\")\n    print(\"    \")\n    print(\"    if current_status == 'completed':\")\n    print(\"        results = processor.get_results(batch_id)\")\n    print(\"        break\")\n    print(\"    elif current_status in ['failed', 'cancelled', 'expired']:\")\n    print(\"        print(f'Batch failed with status: {current_status}')\")\n    print(\"        break\")\n    print(\"    else:\")\n    print(\"        print(f'Status: {current_status}, waiting...')\")\n    print(\"        time.sleep(10)  # Wait 10 seconds before checking again\")\n    print(\"```\")\n    print(\"\")\n    print(\"Typical batch statuses:\")\n    print(\"   - validating - Checking request format\")\n    print(\"   - in_progress - Processing requests\")\n    print(\"   - finalizing - Preparing results\")\n    print(\"   - completed - Ready for download\")\n    print(\"   - failed - Something went wrong\")\n    print(\"   - cancelled - Manually cancelled\")\n    print(\"   - expired - Took too long to process\")\n\n\nif __name__ == \"__main__\":\n    main()\n    compare_file_vs_memory()\n"
  },
  {
    "path": "examples/batch_api/run_batch_test.py",
    "content": "#!/usr/bin/env python3\n\"\"\"Unified Batch API Test Script\n\nTest script to verify the unified BatchProcessor works correctly with all supported providers.\nCreates a batch job to extract User(name: str, age: int) data from text examples.\n\nSupports:\n- OpenAI: openai/gpt-4o-mini, openai/gpt-4o, etc.\n- Anthropic: anthropic/claude-3-5-sonnet-20241022, anthropic/claude-3-opus-20240229, etc.\n- Google: google/gemini-2.5-flash, google/gemini-pro, etc.\n\nUsage:\n    # Default (Google Gemini 2.5 Flash)\n    export GOOGLE_API_KEY=\"your-key\"\n    python run_batch_test.py\n\n    # OpenAI\n    export OPENAI_API_KEY=\"your-key\"\n    python run_batch_test.py --model \"openai/gpt-4o-mini\"\n\n    # Anthropic\n    export ANTHROPIC_API_KEY=\"your-key\"\n    python run_batch_test.py --model \"anthropic/claude-3-5-sonnet-20241022\"\n\n    # Google with specific model\n    export GOOGLE_API_KEY=\"your-key\"\n    python run_batch_test.py --model \"google/gemini-2.5-flash\"\n\"\"\"\n\nimport os\nimport sys\nfrom typing import Optional\nimport typer\nfrom pydantic import BaseModel\n\n# Add parent directory to path for imports\nsys.path.append(os.path.join(os.path.dirname(__file__), \"..\", \"..\"))\nfrom instructor.batch import (\n    BatchProcessor,\n    BatchStatus,\n    filter_successful,\n    filter_errors,\n    extract_results,\n)\n\napp = typer.Typer(help=\"Unified Batch API Test for all providers\")\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\ndef create_test_messages() -> list[list[dict]]:\n    \"\"\"Create test message conversations for user extraction\"\"\"\n    test_prompts = [\n        \"Hi there! My name is Alice and I'm 28 years old. I work as a software engineer.\",\n    ]\n\n    messages_list = []\n    for prompt in test_prompts:\n        messages = [\n            {\n                \"role\": \"system\",\n                \"content\": \"You are an expert at extracting structured user information from text. Extract the person's name and age.\",\n            },\n            {\"role\": \"user\", \"content\": prompt},\n        ]\n        messages_list.append(messages)\n\n    return messages_list\n\n\ndef get_expected_results() -> list[User]:\n    \"\"\"Get the expected User objects for validation\"\"\"\n    return [\n        User(name=\"Alice\", age=28),\n    ]\n\n\ndef check_api_key(provider: str) -> bool:\n    \"\"\"Check if the required API key is set for the provider\"\"\"\n    key_map = {\n        \"openai\": \"OPENAI_API_KEY\",\n        \"anthropic\": \"ANTHROPIC_API_KEY\",\n        \"google\": \"GOOGLE_API_KEY\",\n    }\n\n    required_key = key_map.get(provider)\n    if not required_key:\n        return True  # Unknown provider, let it fail later\n\n    if provider == \"google\":\n        # Google is optional since we simulate\n        if not os.getenv(required_key):\n            typer.echo(f\"Warning: {required_key} not set - will run in simulation mode\")\n        return True\n\n    if not os.getenv(required_key):\n        typer.echo(f\"Error: {required_key} environment variable is not set\", err=True)\n        typer.echo(\n            f\"Please set your API key: export {required_key}='your-api-key-here'\",\n            err=True,\n        )\n        return False\n\n    return True\n\n\ndef create_openai_batch(model: str, messages_list: list[list[dict]]) -> Optional[str]:\n    \"\"\"Create OpenAI batch job using BatchProcessor\"\"\"\n    processor = BatchProcessor(model, User)\n\n    # Create batch file\n    batch_filename = \"test_batch.jsonl\"\n    processor.create_batch_from_messages(\n        file_path=batch_filename,\n        messages_list=messages_list,\n        max_tokens=200,\n        temperature=0.1,\n    )\n\n    try:\n        typer.echo(\"Submitting batch job...\")\n        batch_id = processor.submit_batch(\n            file_path=batch_filename,\n            metadata={\"description\": \"Unified BatchProcessor test\"},\n        )\n        return batch_id\n\n    finally:\n        if os.path.exists(batch_filename):\n            os.remove(batch_filename)\n\n\ndef create_anthropic_batch(\n    model: str, messages_list: list[list[dict]]\n) -> Optional[str]:\n    \"\"\"Create Anthropic batch job using BatchProcessor\"\"\"\n    processor = BatchProcessor(model, User)\n\n    # Create batch file\n    batch_filename = \"test_batch.jsonl\"\n    processor.create_batch_from_messages(\n        file_path=batch_filename,\n        messages_list=messages_list,\n        max_tokens=200,\n        temperature=0.1,\n    )\n\n    try:\n        typer.echo(\"Submitting batch job...\")\n        batch_id = processor.submit_batch(file_path=batch_filename)\n        return batch_id\n\n    finally:\n        if os.path.exists(batch_filename):\n            os.remove(batch_filename)\n\n\ndef create_google_batch(model: str, messages_list: list[list[dict]]) -> Optional[str]:\n    \"\"\"Create Google batch job using BatchProcessor (inline only)\"\"\"\n    processor = BatchProcessor(model, User)\n\n    typer.echo(\"Submitting Google inline batch...\")\n    batch_id = processor.submit_batch(\n        messages_list=messages_list,\n        metadata={\"description\": \"Unified BatchProcessor test\"},\n        use_inline=True,\n        max_tokens=200,\n        temperature=0.1,\n    )\n\n    typer.echo(f\"Inline batch job created: {batch_id}\")\n    return batch_id\n\n\n@app.command()\ndef create(\n    model: str = typer.Option(\n        \"openai/gpt-4o-mini\",\n        help=\"Model in format 'provider/model-name' (e.g., 'google/gemini-2.5-flash', 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20241022')\",\n    ),\n    save_id: bool = typer.Option(True, help=\"Save batch ID to file\"),\n):\n    \"\"\"Create a batch job for the specified model\"\"\"\n\n    typer.echo(f\"Creating Batch Job for {model}\")\n    typer.echo(\"=\" * 50)\n\n    # Parse provider from model\n    try:\n        provider, model_name = model.split(\"/\", 1)\n    except ValueError:\n        typer.echo(\"Error: Model must be in format 'provider/model-name'\", err=True)\n        typer.echo(\n            \"Examples: 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20241022'\",\n            err=True,\n        )\n        raise typer.Exit(1) from None\n\n    # Check API key\n    if not check_api_key(provider):\n        raise typer.Exit(1)\n\n    # Create test messages\n    messages_list = create_test_messages()\n    typer.echo(f\"Created {len(messages_list)} test message conversations\")\n\n    try:\n        # Create batch job based on provider\n        batch_id = None\n\n        if provider == \"openai\":\n            batch_id = create_openai_batch(model, messages_list)\n        elif provider == \"anthropic\":\n            batch_id = create_anthropic_batch(model, messages_list)\n        else:\n            typer.echo(f\"Unsupported provider: {provider}\", err=True)\n            raise typer.Exit(1)\n\n        if batch_id:\n            typer.echo(f\"Batch job created with ID: {batch_id}\")\n\n            if save_id:\n                filename = f\"{provider}_batch_id.txt\"\n                with open(filename, \"w\") as f:\n                    f.write(batch_id)\n                typer.echo(f\"Batch ID saved to {filename}\")\n\n            # Validate expected results\n            expected_results = get_expected_results()\n            typer.echo(f\"Expected results validated: {len(expected_results)} users\")\n            for i, user in enumerate(expected_results):\n                typer.echo(f\"   {i + 1}. {user.name}, age {user.age}\")\n\n            # Show how to check status\n            typer.echo(f\"Check status with:\")\n            typer.echo(f\"   instructor batch list --model {model}\")\n\n            typer.echo(f\"Cost savings: 50% vs regular API\")\n            typer.echo(f\"\\nSuccess! Batch ID: {batch_id}\")\n\n        else:\n            typer.echo(\"Failed to create batch job\", err=True)\n            raise typer.Exit(1)\n\n    except Exception as e:\n        typer.echo(f\"Error creating batch: {e}\", err=True)\n        raise typer.Exit(1) from e\n\n\n@app.command()\ndef list_batches():\n    \"\"\"List saved batch IDs for all providers\"\"\"\n    typer.echo(\"Saved Batch IDs:\")\n    typer.echo(\"=\" * 30)\n\n    providers = [\"openai\", \"anthropic\"]\n    found_any = False\n\n    for provider in providers:\n        filename = f\"{provider}_batch_id.txt\"\n        if os.path.exists(filename):\n            with open(filename) as f:\n                batch_id = f.read().strip()\n\n            typer.echo(f\"{provider.upper()}: {batch_id}\")\n            found_any = True\n\n    if not found_any:\n        typer.echo(\"No batch IDs found. Run 'create' command first.\")\n        typer.echo(\n            \"Usage: python run_batch_test.py create --model 'provider/model-name'\"\n        )\n    else:\n        typer.echo()\n        typer.echo(\n            \"To fetch results: python run_batch_test.py fetch --provider <provider>\"\n        )\n\n\n@app.command()\ndef fetch(\n    provider: str = typer.Option(\n        help=\"Provider to fetch results from (openai, anthropic, google)\"\n    ),\n    validate: bool = typer.Option(\n        True, help=\"Validate extracted data against expected results\"\n    ),\n    poll: bool = typer.Option(\n        False, help=\"Poll every 30 seconds until batch completes\"\n    ),\n    max_wait: int = typer.Option(\n        600, help=\"Maximum time to wait in seconds (default: 10 minutes)\"\n    ),\n):\n    \"\"\"Fetch and validate batch results from a provider\"\"\"\n\n    if provider not in [\"openai\", \"anthropic\"]:\n        typer.echo(\"Error: Provider must be one of: openai, anthropic\", err=True)\n        raise typer.Exit(1)\n\n    # Check if batch ID file exists\n    filename = f\"{provider}_batch_id.txt\"\n    if not os.path.exists(filename):\n        typer.echo(\n            f\"Error: No batch ID found for {provider}. Run 'create' command first.\",\n            err=True,\n        )\n        raise typer.Exit(1)\n\n    # Read batch ID\n    with open(filename) as f:\n        batch_id = f.read().strip()\n\n    typer.echo(f\"Fetching results for {provider.upper()} batch: {batch_id}\")\n    typer.echo(\"=\" * 60)\n\n    # Check API key\n    if not check_api_key(provider):\n        raise typer.Exit(1)\n\n    try:\n        if poll:\n            results = poll_for_results(provider, batch_id, validate, max_wait)\n        else:\n            if provider == \"openai\":\n                results = fetch_openai_results(batch_id, validate)\n            elif provider == \"anthropic\":\n                results = fetch_anthropic_results(batch_id, validate)\n\n        if results:\n            typer.echo(f\"Successfully fetched and validated {len(results)} results!\")\n            if validate:\n                # Assert that the results match the expected results\n                assert validate_results(results, provider.capitalize()), (\n                    f\"Test failed: {provider} results do not match expected results.\"\n                )\n        else:\n            typer.echo(\"No results available yet or batch still processing\")\n            if not poll:\n                typer.echo(\"Use --poll to automatically wait for completion\")\n\n    except AssertionError as ae:\n        typer.echo(f\"AssertionError: {ae}\", err=True)\n        raise typer.Exit(1) from ae\n    except Exception as e:\n        typer.echo(f\"Error fetching results: {e}\", err=True)\n        raise typer.Exit(1) from e\n\n\n@app.command()\ndef show_results(\n    provider: str = typer.Option(\n        help=\"Provider to show detailed results from (openai, anthropic, google)\"\n    ),\n):\n    \"\"\"Show detailed parsed Pydantic objects from batch results\"\"\"\n\n    if provider not in [\"openai\", \"anthropic\"]:\n        typer.echo(\"Error: Provider must be one of: openai, anthropic\", err=True)\n        raise typer.Exit(1)\n\n    # Check if batch ID file exists\n    filename = f\"{provider}_batch_id.txt\"\n    if not os.path.exists(filename):\n        typer.echo(\n            f\"Error: No batch ID found for {provider}. Run 'create' command first.\",\n            err=True,\n        )\n        raise typer.Exit(1)\n\n    # Read batch ID\n    with open(filename) as f:\n        batch_id = f.read().strip()\n\n    typer.echo(f\"{provider.upper()} BATCH RESULTS\")\n    typer.echo(\"=\" * 50)\n    typer.echo(f\"Batch ID: {batch_id}\")\n\n    # Check API key\n    if not check_api_key(provider):\n        raise typer.Exit(1)\n\n    try:\n        # Get results using BatchProcessor\n        if provider == \"openai\":\n            processor = BatchProcessor(\"openai/gpt-4o-mini\", User)\n        elif provider == \"anthropic\":\n            processor = BatchProcessor(\"anthropic/claude-3-5-sonnet-20241022\", User)\n\n        # Get batch info using list_batches to find our batch\n        all_batches = processor.list_batches(limit=100)\n        batch_info = None\n        for batch in all_batches:\n            if batch.id == batch_id:\n                batch_info = batch\n                break\n\n        if not batch_info:\n            typer.echo(f\"Batch {batch_id} not found\")\n            return\n\n        typer.echo(f\"Status: {batch_info.status.value}\")\n        typer.echo(f\"Raw Status: {batch_info.raw_status}\")\n\n        if batch_info.status != BatchStatus.COMPLETED:\n            typer.echo(f\"Batch not completed yet: {batch_info.status.value}\")\n            return\n\n        # Get all results using the new get_results method\n        all_results = processor.get_results(batch_id)\n        typer.echo(f\"Total results: {len(all_results)}\")\n\n        # Show each result with detailed info\n        for i, result in enumerate(all_results):\n            typer.echo(f\"\\n--- Result {i + 1} ---\")\n            typer.echo(f\"Custom ID: {result.custom_id}\")\n            typer.echo(f\"Success: {result.success}\")\n\n            if result.success:\n                user = result.result\n                typer.echo(f\"PARSED USER OBJECT:\")\n                typer.echo(f\"   Type: {type(user)}\")\n                typer.echo(f\"   Name: {user.name}\")\n                typer.echo(f\"   Age: {user.age}\")\n                typer.echo(f\"   JSON: {user.model_dump_json()}\")\n                typer.echo(f\"   Dict: {user.model_dump()}\")\n\n                # Test that it's a real Pydantic object\n                typer.echo(f\"   Is BaseModel: {isinstance(user, BaseModel)}\")\n                typer.echo(f\"   Is User: {isinstance(user, User)}\")\n\n                # Test Pydantic methods\n                try:\n                    validated = User.model_validate(user.model_dump())\n                    typer.echo(f\"   Re-validation: Works\")\n                    typer.echo(f\"   Re-validated: {validated}\")\n                except Exception as e:\n                    typer.echo(f\"   Re-validation: Failed - {e}\")\n            else:\n                typer.echo(f\"ERROR:\")\n                typer.echo(f\"   Type: {result.error_type}\")\n                typer.echo(f\"   Message: {result.error_message}\")\n\n        # Test the utility functions\n        successful_results = filter_successful(all_results)\n        error_results = filter_errors(all_results)\n        extracted_users = extract_results(all_results)\n\n        typer.echo(f\"\\nUTILITY FUNCTIONS:\")\n        typer.echo(f\"Successful results: {len(successful_results)}\")\n        typer.echo(f\"Error results: {len(error_results)}\")\n        typer.echo(f\"Extracted users: {len(extracted_users)}\")\n\n        if extracted_users:\n            typer.echo(f\"\\nEXTRACTED USER OBJECTS:\")\n            for user in extracted_users:\n                typer.echo(\n                    f\"  • {user.name}, age {user.age} (type: {type(user).__name__})\"\n                )\n\n    except Exception as e:\n        typer.echo(f\"Error showing results: {e}\", err=True)\n        raise typer.Exit(1) from e\n\n\ndef poll_for_results(\n    provider: str, batch_id: str, validate: bool, max_wait: int\n) -> list[User]:\n    \"\"\"Poll for batch results until completion or timeout\"\"\"\n    import time\n\n    typer.echo(f\"Polling {provider.upper()} batch every 30 seconds...\")\n    typer.echo(f\"Max wait time: {max_wait} seconds ({max_wait // 60} minutes)\")\n    typer.echo(f\"Batch ID: {batch_id}\")\n    typer.echo()\n\n    start_time = time.time()\n    attempt = 1\n\n    while time.time() - start_time < max_wait:\n        typer.echo(f\"Attempt {attempt} - Checking batch status...\")\n\n        try:\n            if provider == \"openai\":\n                status, results = fetch_openai_results_with_status(batch_id, validate)\n            elif provider == \"anthropic\":\n                status, results = fetch_anthropic_results_with_status(\n                    batch_id, validate\n                )\n\n            if status == \"completed\" or status == \"ended\":\n                typer.echo(\n                    f\"Batch completed after {int(time.time() - start_time)} seconds!\"\n                )\n                return results\n            elif status in [\"failed\", \"expired\", \"cancelled\"]:\n                typer.echo(f\"Batch {status}\")\n                return []\n            else:\n                elapsed = int(time.time() - start_time)\n                remaining = max_wait - elapsed\n                typer.echo(\n                    f\"Status: {status} | Elapsed: {elapsed}s | Remaining: {remaining}s\"\n                )\n\n                if remaining > 30:\n                    typer.echo(\"Waiting 30 seconds before next check...\")\n                    time.sleep(30)\n                else:\n                    typer.echo(f\"Waiting {remaining} seconds...\")\n                    time.sleep(remaining)\n                    break\n\n        except Exception as e:\n            typer.echo(f\"Error during polling: {e}\")\n            time.sleep(30)\n\n        attempt += 1\n\n    typer.echo(f\"Timeout reached after {max_wait} seconds\")\n    return []\n\n\ndef fetch_openai_results_with_status(\n    batch_id: str, validate: bool\n) -> tuple[str, list[User]]:\n    \"\"\"Fetch OpenAI batch results and return status\"\"\"\n    processor = BatchProcessor(\"openai/gpt-4o-mini\", User)\n\n    # Get batch info\n    all_batches = processor.list_batches(limit=100)\n    batch_info = None\n    for batch in all_batches:\n        if batch.id == batch_id:\n            batch_info = batch\n            break\n\n    if not batch_info:\n        return \"not_found\", []\n\n    if batch_info.status != BatchStatus.COMPLETED:\n        return batch_info.raw_status, []\n\n    # Get results using the new get_results method\n    all_results = processor.get_results(batch_id)\n\n    successful_results = filter_successful(all_results)\n    error_results = filter_errors(all_results)\n    extracted_results = extract_results(all_results)\n\n    typer.echo(f\"Successful extractions: {len(successful_results)}\")\n    if error_results:\n        typer.echo(f\"Failed extractions: {len(error_results)}\")\n        # Show first few errors for debugging\n        for error in error_results[:3]:\n            typer.echo(f\"   Error ({error.custom_id}): {error.error_message}\")\n\n    if validate and extracted_results:\n        validate_results(extracted_results, \"OpenAI\")\n\n    return \"completed\", extracted_results\n\n\ndef fetch_anthropic_results_with_status(\n    batch_id: str, validate: bool\n) -> tuple[str, list[User]]:\n    \"\"\"Fetch Anthropic batch results and return status\"\"\"\n    processor = BatchProcessor(\"anthropic/claude-3-5-sonnet-20241022\", User)\n\n    # Get batch info\n    all_batches = processor.list_batches(limit=100)\n    batch_info = None\n    for batch in all_batches:\n        if batch.id == batch_id:\n            batch_info = batch\n            break\n\n    if not batch_info:\n        return \"not_found\", []\n\n    # Check for various terminal states\n    if batch_info.status in [\n        BatchStatus.FAILED,\n        BatchStatus.CANCELLED,\n        BatchStatus.EXPIRED,\n    ]:\n        return batch_info.raw_status, []\n\n    if batch_info.status != BatchStatus.COMPLETED:\n        return batch_info.raw_status, []\n\n    # Get results using the new get_results method\n    all_results = processor.get_results(batch_id)\n\n    successful_results = filter_successful(all_results)\n    error_results = filter_errors(all_results)\n    extracted_results = extract_results(all_results)\n\n    typer.echo(f\"Successful extractions: {len(successful_results)}\")\n    if error_results:\n        typer.echo(f\"Failed extractions: {len(error_results)}\")\n        # Show first few errors for debugging\n        for error in error_results[:3]:\n            typer.echo(f\"   Error ({error.custom_id}): {error.error_message}\")\n\n    if validate and extracted_results:\n        validate_results(extracted_results, \"Anthropic\")\n\n    return \"ended\", extracted_results\n\n\ndef fetch_openai_results(batch_id: str, validate: bool) -> list[User]:\n    \"\"\"Fetch OpenAI batch results using BatchProcessor\"\"\"\n    processor = BatchProcessor(\"openai/gpt-4o-mini\", User)\n\n    # Get batch info\n    all_batches = processor.list_batches(limit=100)\n    batch_info = None\n    for batch in all_batches:\n        if batch.id == batch_id:\n            batch_info = batch\n            break\n\n    if not batch_info:\n        typer.echo(f\"Batch {batch_id} not found\")\n        return []\n\n    typer.echo(f\"Batch Status: {batch_info.status.value}\")\n\n    if batch_info.status != BatchStatus.COMPLETED:\n        typer.echo(\n            f\"Batch is still {batch_info.status.value}. Please wait and try again.\"\n        )\n        return []\n\n    # Get results using the new get_results method\n    all_results = processor.get_results(batch_id)\n\n    successful_results = filter_successful(all_results)\n    error_results = filter_errors(all_results)\n    extracted_results = extract_results(all_results)\n\n    typer.echo(f\"Successful extractions: {len(successful_results)}\")\n    if error_results:\n        typer.echo(f\"Failed extractions: {len(error_results)}\")\n        # Show first few errors for debugging\n        for error in error_results[:3]:\n            typer.echo(f\"   Error ({error.custom_id}): {error.error_message}\")\n\n    if validate and extracted_results:\n        validate_results(extracted_results, \"OpenAI\")\n\n    return extracted_results\n\n\ndef fetch_anthropic_results(batch_id: str, validate: bool) -> list[User]:\n    \"\"\"Fetch Anthropic batch results using BatchProcessor\"\"\"\n    processor = BatchProcessor(\"anthropic/claude-3-5-sonnet-20241022\", User)\n\n    # Get batch info\n    all_batches = processor.list_batches(limit=100)\n    batch_info = None\n    for batch in all_batches:\n        if batch.id == batch_id:\n            batch_info = batch\n            break\n\n    if not batch_info:\n        typer.echo(f\"Batch {batch_id} not found\")\n        return []\n\n    typer.echo(f\"Batch Status: {batch_info.status.value}\")\n\n    if batch_info.status != BatchStatus.COMPLETED:\n        typer.echo(\n            f\"Batch is still {batch_info.status.value}. Please wait and try again.\"\n        )\n        return []\n\n    # Get results using the new get_results method\n    all_results = processor.get_results(batch_id)\n\n    successful_results = filter_successful(all_results)\n    error_results = filter_errors(all_results)\n    extracted_results = extract_results(all_results)\n\n    typer.echo(f\"Successful extractions: {len(successful_results)}\")\n    if error_results:\n        typer.echo(f\"Failed extractions: {len(error_results)}\")\n        # Show first few errors for debugging\n        for error in error_results[:3]:\n            typer.echo(f\"   Error ({error.custom_id}): {error.error_message}\")\n\n    if validate and extracted_results:\n        validate_results(extracted_results, \"Anthropic\")\n\n    return extracted_results\n\n\ndef fetch_google_results(batch_job_name: str, validate: bool) -> list[User]:\n    \"\"\"Fetch Google batch results using BatchProcessor\"\"\"\n    try:\n        processor = BatchProcessor(\"google/gemini-2.5-flash\", User)\n\n        # Get batch info\n        all_batches = processor.list_batches(limit=100)\n        batch_info = None\n        for batch in all_batches:\n            if batch.id == batch_job_name:\n                batch_info = batch\n                break\n\n        if not batch_info:\n            typer.echo(f\"Batch {batch_job_name} not found\")\n            return []\n\n        typer.echo(f\"Batch Status: {batch_info.status.value}\")\n\n        if batch_info.status != BatchStatus.COMPLETED:\n            typer.echo(\n                f\"Batch is still {batch_info.status.value}. Please wait and try again.\"\n            )\n            return []\n\n        # Get results using the new get_results method\n        all_results = processor.get_results(batch_job_name)\n\n        successful_results = filter_successful(all_results)\n        error_results = filter_errors(all_results)\n        extracted_results = extract_results(all_results)\n\n        typer.echo(f\"Successful extractions: {len(successful_results)}\")\n        if error_results:\n            typer.echo(f\"Failed extractions: {len(error_results)}\")\n\n        if validate and extracted_results:\n            validate_results(extracted_results, \"Google GenAI\")\n\n        return extracted_results\n\n    except Exception as e:\n        typer.echo(f\"Error fetching Google batch results: {e}\")\n        return []\n\n\ndef validate_results(results: list[User], provider_name: str) -> bool:\n    \"\"\"Validate extracted results against expected results\"\"\"\n    expected_results = get_expected_results()\n\n    typer.echo(f\"\\nValidating {provider_name} Results:\")\n    typer.echo(\"-\" * 40)\n\n    if len(results) != len(expected_results):\n        typer.echo(f\"Expected {len(expected_results)} results, got {len(results)}\")\n        return False\n\n    # Sort both lists by name for comparison\n    results_sorted = sorted(results, key=lambda x: x.name)\n    expected_sorted = sorted(expected_results, key=lambda x: x.name)\n\n    all_correct = True\n    for i, (actual, expected) in enumerate(zip(results_sorted, expected_sorted)):\n        if actual.name == expected.name and actual.age == expected.age:\n            typer.echo(f\"{i + 1}. {actual.name}, age {actual.age} - CORRECT\")\n        else:\n            typer.echo(f\"{i + 1}. Expected: {expected.name}, age {expected.age}\")\n            typer.echo(f\"    Got: {actual.name}, age {actual.age}\")\n            all_correct = False\n\n    if all_correct:\n        typer.echo(f\"\\nAll {provider_name} extractions are correct!\")\n    else:\n        typer.echo(f\"\\nSome {provider_name} extractions have errors\")\n\n    return all_correct\n\n\n@app.command()\ndef help():\n    \"\"\"Show all available commands and usage examples\"\"\"\n    typer.echo(\"Unified Batch API Test Commands\")\n    typer.echo(\"=\" * 40)\n    typer.echo()\n\n    typer.echo(\"Available Commands:\")\n    typer.echo(\"  • create         - Create a new batch job\")\n    typer.echo(\"  • list-batches   - List all saved batch IDs\")\n    typer.echo(\"  • fetch          - Fetch and validate batch results\")\n    typer.echo(\"  • show-results   - Show detailed parsed Pydantic objects\")\n    typer.echo(\"  • list-models    - Show supported models\")\n    typer.echo(\"  • help           - Show this help message\")\n    typer.echo()\n\n    typer.echo(\"Usage Examples:\")\n    typer.echo(\"  # Create batch job (default: Google Gemini 2.5 Flash)\")\n    typer.echo(\"  python run_batch_test.py create\")\n    typer.echo()\n    typer.echo(\"  # Create batch job with specific model\")\n    typer.echo(\"  python run_batch_test.py create --model 'openai/gpt-4o-mini'\")\n    typer.echo()\n    typer.echo(\"  # List saved batch IDs\")\n    typer.echo(\"  python run_batch_test.py list-batches\")\n    typer.echo()\n    typer.echo(\"  # Fetch results with validation\")\n    typer.echo(\"  python run_batch_test.py fetch --provider openai\")\n    typer.echo()\n    typer.echo(\"  # Show detailed parsed objects\")\n    typer.echo(\"  python run_batch_test.py show-results --provider anthropic\")\n    typer.echo()\n    typer.echo(\"  # Poll every 30 seconds until batch completes (max 10 minutes)\")\n    typer.echo(\"  python run_batch_test.py fetch --provider openai --poll\")\n    typer.echo()\n    typer.echo(\"  # Poll with custom timeout (20 minutes)\")\n    typer.echo(\n        \"  python run_batch_test.py fetch --provider openai --poll --max-wait 1200\"\n    )\n    typer.echo()\n\n\n@app.command()\ndef list_models():\n    \"\"\"List example models for each provider\"\"\"\n    typer.echo(\"Supported Models by Provider:\")\n    typer.echo()\n\n    typer.echo(\"OpenAI:\")\n    typer.echo(\"  • openai/gpt-4o-mini\")\n    typer.echo(\"  • openai/gpt-4o\")\n    typer.echo(\"  • openai/gpt-4-turbo\")\n    typer.echo()\n\n    typer.echo(\"Anthropic:\")\n    typer.echo(\"  • anthropic/claude-3-5-sonnet-20241022\")\n    typer.echo(\"  • anthropic/claude-3-opus-20240229\")\n    typer.echo(\"  • anthropic/claude-3-haiku-20240307\")\n    typer.echo()\n\n    typer.echo(\"Google:\")\n    typer.echo(\"  • google/gemini-2.5-flash\")\n    typer.echo(\"  • google/gemini-2.0-flash-001\")\n    typer.echo(\"  • google/gemini-pro\")\n    typer.echo()\n\n    typer.echo(\"Usage: python run_batch_test.py create --model 'provider/model-name'\")\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "examples/caching/example_diskcache.py",
    "content": "import functools\nimport inspect\nimport instructor\nimport diskcache\n\nfrom openai import OpenAI, AsyncOpenAI\nfrom pydantic import BaseModel\n\nclient = instructor.from_openai(OpenAI())\naclient = instructor.from_openai(AsyncOpenAI())\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\ncache = diskcache.Cache(\"./my_cache_directory\")\n\n\ndef instructor_cache(func):\n    \"\"\"Cache a function that returns a Pydantic model\"\"\"\n    return_type = inspect.signature(func).return_annotation\n    if not issubclass(return_type, BaseModel):\n        raise ValueError(\"The return type must be a Pydantic model\")\n\n    is_async = inspect.iscoroutinefunction(func)\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type\n            if issubclass(return_type, BaseModel):\n                return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    @functools.wraps(func)\n    async def awrapper(*args, **kwargs):\n        key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type\n            if issubclass(return_type, BaseModel):\n                return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = await func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    return wrapper if not is_async else awrapper\n\n\n@instructor_cache\ndef extract(data) -> UserDetail:\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )  # type: ignore\n\n\n@instructor_cache\nasync def aextract(data) -> UserDetail:\n    return await aclient.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )  # type: ignore\n\n\ndef test_extract():\n    import time\n\n    start = time.perf_counter()\n    model = extract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n    start = time.perf_counter()\n    model = extract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n\nasync def atest_extract():\n    import time\n\n    start = time.perf_counter()\n    model = await aextract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n    start = time.perf_counter()\n    model = await aextract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n\nif __name__ == \"__main__\":\n    test_extract()\n    # Time taken: 0.7285366660216823\n    # Time taken: 9.841693099588156e-05\n\n    import asyncio\n\n    asyncio.run(atest_extract())\n"
  },
  {
    "path": "examples/caching/example_redis.py",
    "content": "import redis\nimport functools\nimport inspect\nimport instructor\n\nfrom pydantic import BaseModel\nfrom openai import OpenAI\n\nclient = instructor.from_openai(OpenAI())\ncache = redis.Redis(\"localhost\")\n\n\ndef instructor_cache(func):\n    \"\"\"Cache a function that returns a Pydantic model\"\"\"\n    return_type = inspect.signature(func).return_annotation\n    if not issubclass(return_type, BaseModel):\n        raise ValueError(\"The return type must be a Pydantic model\")\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        key = f\"{func.__name__}-{functools._make_key(args, kwargs, typed=False)}\"\n        # Check if the result is already cached\n        if (cached := cache.get(key)) is not None:\n            # Deserialize from JSON based on the return type\n            if issubclass(return_type, BaseModel):\n                return return_type.model_validate_json(cached)\n\n        # Call the function and cache its result\n        result = func(*args, **kwargs)\n        serialized_result = result.model_dump_json()\n        cache.set(key, serialized_result)\n\n        return result\n\n    return wrapper\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@instructor_cache\ndef extract(data) -> UserDetail:\n    # Assuming client.chat.completions.create returns a UserDetail instance\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\ndef test_extract():\n    import time\n\n    start = time.perf_counter()\n    model = extract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n    start = time.perf_counter()\n    model = extract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n\nif __name__ == \"__main__\":\n    test_extract()\n    # Time taken: 0.798335583996959\n    # Time taken: 0.00017016706988215446\n"
  },
  {
    "path": "examples/caching/lru.py",
    "content": "import instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport functools\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\n@functools.lru_cache\ndef extract(data):\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\ndef test_extract():\n    import time\n\n    start = time.perf_counter()\n    model = extract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n    start = time.perf_counter()\n    model = extract(\"Extract jason is 25 years old\")\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    print(f\"Time taken: {time.perf_counter() - start}\")\n\n\nif __name__ == \"__main__\":\n    test_extract()\n    # Time taken: 0.9267581660533324\n    # Time taken: 1.2080417945981026e-06\n"
  },
  {
    "path": "examples/caching/run.py",
    "content": "\"\"\"\nComprehensive Caching Example for Instructor\n===========================================\n\nThis example demonstrates various caching strategies for LLM applications:\n1. functools.cache - Simple in-memory caching\n2. diskcache - Persistent disk-based caching\n3. Redis - Distributed caching\n4. Performance benchmarks and cost analysis\n5. Advanced patterns: hierarchical caching, monitoring, schema invalidation\n\nRun this example to see real-world performance improvements and cost savings.\n\"\"\"\n\nimport asyncio\nimport functools\nimport hashlib\nimport inspect\nimport json\nimport logging\nimport time\nfrom collections import defaultdict\nfrom typing import Any, Callable, Optional, TypeVar\n\nimport instructor\nfrom openai import AsyncOpenAI, OpenAI\nfrom pydantic import BaseModel, Field\n\n# Configure logging\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\n# Initialize clients\nclient = instructor.from_openai(OpenAI())\naclient = instructor.from_openai(AsyncOpenAI())\n\n# Test data\nTEST_QUERIES = [\n    \"Extract: Jason is 25 years old and works as a software engineer\",\n    \"Extract: Sarah is 30 years old and is a data scientist\",\n    \"Extract: Mike is 28 years old and works in marketing\",\n    \"Extract: Lisa is 32 years old and is a product manager\",\n    \"Extract: Jason is 25 years old and works as a software engineer\",  # Duplicate for cache hit\n]\n\nF = TypeVar(\"F\", bound=Callable[..., Any])\n\n\nclass UserDetail(BaseModel):\n    \"\"\"Enhanced user model with more fields for testing\"\"\"\n\n    name: str = Field(description=\"User's full name\")\n    age: int = Field(description=\"User's age\", ge=0, le=150)\n    occupation: Optional[str] = Field(None, description=\"User's job title\")\n\n\nclass CacheMetrics:\n    \"\"\"Production-ready cache monitoring\"\"\"\n\n    def __init__(self):\n        self.hits = 0\n        self.misses = 0\n        self.total_time_saved = 0.0\n        self.error_count = 0\n        self.hit_rate_by_function: dict[str, dict[str, int]] = defaultdict(\n            lambda: {\"hits\": 0, \"misses\": 0}\n        )\n\n    def record_hit(self, func_name: str, time_saved: float):\n        self.hits += 1\n        self.total_time_saved += time_saved\n        self.hit_rate_by_function[func_name][\"hits\"] += 1\n        logger.debug(f\"Cache HIT for {func_name}, saved {time_saved:.3f}s\")\n\n    def record_miss(self, func_name: str):\n        self.misses += 1\n        self.hit_rate_by_function[func_name][\"misses\"] += 1\n        logger.debug(f\"Cache MISS for {func_name}\")\n\n    def record_error(self, func_name: str, error: str):\n        self.error_count += 1\n        logger.warning(f\"Cache ERROR in {func_name}: {error}\")\n\n    @property\n    def hit_rate(self) -> float:\n        total = self.hits + self.misses\n        return self.hits / total if total > 0 else 0.0\n\n    def get_stats(self) -> dict[str, Any]:\n        return {\n            \"hit_rate\": f\"{self.hit_rate:.2%}\",\n            \"total_hits\": self.hits,\n            \"total_misses\": self.misses,\n            \"error_count\": self.error_count,\n            \"time_saved_seconds\": f\"{self.total_time_saved:.3f}\",\n            \"function_stats\": dict(self.hit_rate_by_function),\n        }\n\n    def reset(self):\n        \"\"\"Reset all metrics for new test runs\"\"\"\n        self.hits = 0\n        self.misses = 0\n        self.total_time_saved = 0.0\n        self.error_count = 0\n        self.hit_rate_by_function.clear()\n\n\n# Global metrics instance\nmetrics = CacheMetrics()\n\n\ndef smart_cache_key(\n    func_name: str, args: tuple, kwargs: dict, model_class: type\n) -> str:\n    \"\"\"Generate cache key with schema versioning for automatic invalidation\"\"\"\n    # Include model schema in cache key for automatic invalidation\n    schema_hash = hashlib.md5(\n        json.dumps(model_class.model_json_schema(), sort_keys=True).encode()\n    ).hexdigest()[:8]\n\n    args_hash = hashlib.md5(str((args, kwargs)).encode()).hexdigest()[:8]\n\n    return f\"{func_name}:{schema_hash}:{args_hash}\"\n\n\n# 1. Simple functools.cache implementation\n@functools.lru_cache(maxsize=1000)\ndef extract_functools(data: str) -> UserDetail:\n    \"\"\"Simple in-memory caching with functools.lru_cache\"\"\"\n    start_time = time.perf_counter()\n\n    result = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n    # This won't be called on cache hits, so we track metrics differently\n    return result\n\n\ndef monitored_functools_cache(func: F) -> F:\n    \"\"\"functools.cache with monitoring\"\"\"\n    cached_func = functools.lru_cache(maxsize=1000)(func)\n\n    @functools.wraps(func)\n    def wrapper(*args, **kwargs):\n        # Check if we'll get a cache hit by calling cache_info\n        info_before = cached_func.cache_info()\n\n        start_time = time.perf_counter()\n        result = cached_func(*args, **kwargs)\n        execution_time = time.perf_counter() - start_time\n\n        info_after = cached_func.cache_info()\n\n        if info_after.hits > info_before.hits:\n            # We got a cache hit\n            metrics.record_hit(func.__name__, 0.8)  # Assume 800ms saved\n        else:\n            # Cache miss\n            metrics.record_miss(func.__name__)\n\n        return result\n\n    # Preserve cache_info method\n    wrapper.cache_info = cached_func.cache_info\n    wrapper.cache_clear = cached_func.cache_clear\n\n    return wrapper\n\n\n@monitored_functools_cache\ndef extract_functools_monitored(data: str) -> UserDetail:\n    \"\"\"functools.cache with monitoring\"\"\"\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\n# 2. Enhanced diskcache implementation\ndef create_diskcache_decorator(\n    cache_dir: str = \"./cache_directory\", ttl: Optional[int] = None\n):\n    \"\"\"Factory for diskcache decorator with enhanced features\"\"\"\n    try:\n        import diskcache\n\n        cache = diskcache.Cache(cache_dir)\n    except ImportError:\n        logger.warning(\"diskcache not available, skipping disk cache example\")\n        return lambda func: func\n\n    def decorator(func: F) -> F:\n        return_type = inspect.signature(func).return_annotation\n        if not (inspect.isclass(return_type) and issubclass(return_type, BaseModel)):\n            raise ValueError(\"The return type must be a Pydantic model\")\n\n        @functools.wraps(func)\n        def wrapper(*args, **kwargs):\n            # Generate smart cache key with schema versioning\n            key = smart_cache_key(func.__name__, args, kwargs, return_type)\n\n            try:\n                # Check if the result is already cached\n                if (cached := cache.get(key)) is not None:\n                    metrics.record_hit(func.__name__, 0.8)  # Assume 800ms saved\n                    return return_type.model_validate_json(cached)\n\n                metrics.record_miss(func.__name__)\n            except Exception as e:\n                metrics.record_error(func.__name__, str(e))\n                logger.warning(f\"Cache read error: {e}\")\n\n            # Call the function and cache its result\n            result = func(*args, **kwargs)\n\n            try:\n                serialized_result = result.model_dump_json()\n                if ttl:\n                    cache.set(key, serialized_result, expire=ttl)\n                else:\n                    cache.set(key, serialized_result)\n            except Exception as e:\n                metrics.record_error(func.__name__, str(e))\n                logger.warning(f\"Cache write error: {e}\")\n\n            return result\n\n        return wrapper\n\n    return decorator\n\n\n@create_diskcache_decorator(ttl=3600)  # 1 hour TTL\ndef extract_diskcache(data: str) -> UserDetail:\n    \"\"\"Persistent disk-based caching with TTL\"\"\"\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\n# 3. Enhanced Redis implementation (with fallback)\ndef create_redis_decorator(\n    redis_url: str = \"redis://localhost:6379\",\n    ttl: int = 3600,\n    prefix: str = \"instructor\",\n):\n    \"\"\"Factory for Redis decorator with production features\"\"\"\n    try:\n        import redis\n\n        cache = redis.from_url(redis_url, decode_responses=True)\n        # Test connection\n        cache.ping()\n        logger.info(\"Connected to Redis successfully\")\n    except ImportError as e:\n        logger.warning(f\"Redis not available (ImportError: {e}), using fallback\")\n        return lambda func: func\n    except Exception as e:  # Covers redis.RedisError and other connection issues\n        logger.warning(f\"Redis not available ({e}), using fallback\")\n        return lambda func: func\n\n    def decorator(func: F) -> F:\n        return_type = inspect.signature(func).return_annotation\n        if not (inspect.isclass(return_type) and issubclass(return_type, BaseModel)):\n            raise ValueError(\"The return type must be a Pydantic model\")\n\n        @functools.wraps(func)\n        def wrapper(*args, **kwargs):\n            # Generate cache key with schema versioning\n            schema_hash = hashlib.md5(\n                json.dumps(return_type.model_json_schema(), sort_keys=True).encode()\n            ).hexdigest()[:8]\n            key = f\"{prefix}:{func.__name__}:{schema_hash}:{functools._make_key(args, kwargs, typed=False)}\"\n\n            try:\n                # Check if the result is already cached\n                if (cached := cache.get(key)) is not None:\n                    metrics.record_hit(func.__name__, 0.8)  # Assume 800ms saved\n                    logger.debug(f\"Cache hit for key: {key}\")\n                    return return_type.model_validate_json(cached)\n\n                metrics.record_miss(func.__name__)\n                logger.debug(f\"Cache miss for key: {key}\")\n            except redis.RedisError as e:\n                metrics.record_error(func.__name__, str(e))\n                logger.warning(f\"Redis read error: {e}\")\n\n            # Call the function and cache its result\n            result = func(*args, **kwargs)\n\n            try:\n                serialized_result = result.model_dump_json()\n                cache.setex(key, ttl, serialized_result)\n                logger.debug(f\"Cached result for key: {key}\")\n            except redis.RedisError as e:\n                metrics.record_error(func.__name__, str(e))\n                logger.warning(f\"Redis write error: {e}\")\n\n            return result\n\n        return wrapper\n\n    return decorator\n\n\n@create_redis_decorator(ttl=3600)\ndef extract_redis(data: str) -> UserDetail:\n    \"\"\"Distributed Redis caching with error handling\"\"\"\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\n# 4. No cache baseline for comparison\ndef extract_no_cache(data: str) -> UserDetail:\n    \"\"\"Baseline function without caching\"\"\"\n    metrics.record_miss(\"extract_no_cache\")\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\n# 5. Hierarchical caching example\n@functools.lru_cache(maxsize=50)  # L1: Fast in-memory\ndef extract_l1(data: str) -> UserDetail:\n    return extract_l2(data)\n\n\n@create_diskcache_decorator()  # L2: Persistent disk\ndef extract_l2(data: str) -> UserDetail:\n    return extract_l3(data)\n\n\n@create_redis_decorator()  # L3: Shared distributed\ndef extract_l3(data: str) -> UserDetail:\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": data},\n        ],\n    )\n\n\ndef benchmark_caching_strategy(\n    func: Callable, name: str, queries: list[str]\n) -> dict[str, Any]:\n    \"\"\"Benchmark a specific caching strategy\"\"\"\n    logger.info(f\"\\n=== Benchmarking {name} ===\")\n\n    # Reset metrics for this test\n    metrics.reset()\n\n    times = []\n    results = []\n\n    for i, query in enumerate(queries):\n        start_time = time.perf_counter()\n        try:\n            result = func(query)\n            execution_time = time.perf_counter() - start_time\n            times.append(execution_time)\n            results.append(result)\n            logger.info(\n                f\"Query {i + 1}: {execution_time:.3f}s - {result.name}, {result.age}, {result.occupation}\"\n            )\n        except Exception as e:\n            logger.error(f\"Error in {name}: {e}\")\n            times.append(float(\"inf\"))\n            results.append(None)\n\n    # Calculate statistics\n    valid_times = [t for t in times if t != float(\"inf\")]\n    if valid_times:\n        avg_time = sum(valid_times) / len(valid_times)\n        total_time = sum(valid_times)\n        fastest_time = min(valid_times)\n        slowest_time = max(valid_times)\n    else:\n        avg_time = total_time = fastest_time = slowest_time = 0\n\n    stats = {\n        \"name\": name,\n        \"total_time\": total_time,\n        \"avg_time\": avg_time,\n        \"fastest_time\": fastest_time,\n        \"slowest_time\": slowest_time,\n        \"cache_metrics\": metrics.get_stats(),\n        \"success_rate\": len(valid_times) / len(queries),\n    }\n\n    logger.info(f\"Total time: {total_time:.3f}s\")\n    logger.info(f\"Average time: {avg_time:.3f}s\")\n    logger.info(f\"Cache hit rate: {metrics.hit_rate:.2%}\")\n\n    return stats\n\n\ndef calculate_cost_savings(baseline_stats: dict, cached_stats: dict) -> dict[str, Any]:\n    \"\"\"Calculate cost savings from caching\"\"\"\n    baseline_time = baseline_stats[\"total_time\"]\n    cached_time = cached_stats[\"total_time\"]\n\n    # Assume $0.002 per API call (rough average)\n    cost_per_call = 0.002\n    num_queries = len(TEST_QUERIES)\n\n    # Without caching: every call costs money\n    cost_without_cache = num_queries * cost_per_call\n\n    # With caching: only cache misses cost money\n    cache_misses = cached_stats[\"cache_metrics\"][\"total_misses\"]\n    cost_with_cache = cache_misses * cost_per_call\n\n    savings = cost_without_cache - cost_with_cache\n    savings_percent = (\n        (savings / cost_without_cache) * 100 if cost_without_cache > 0 else 0\n    )\n\n    time_saved = baseline_time - cached_time\n    time_savings_percent = (\n        (time_saved / baseline_time) * 100 if baseline_time > 0 else 0\n    )\n\n    return {\n        \"cost_without_cache\": cost_without_cache,\n        \"cost_with_cache\": cost_with_cache,\n        \"cost_savings\": savings,\n        \"cost_savings_percent\": savings_percent,\n        \"time_saved\": time_saved,\n        \"time_savings_percent\": time_savings_percent,\n        \"speed_improvement\": (\n            baseline_time / cached_time if cached_time > 0 else float(\"inf\")\n        ),\n    }\n\n\nasync def run_async_example():\n    \"\"\"Demonstrate async caching patterns\"\"\"\n    logger.info(\"\\n=== Async Caching Example ===\")\n\n    # Simple async function with metrics\n    async def extract_async(data: str) -> UserDetail:\n        metrics.record_miss(\"extract_async\")\n        return await aclient.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=UserDetail,\n            messages=[\n                {\"role\": \"user\", \"content\": data},\n            ],\n        )\n\n    # Run concurrent requests\n    start_time = time.perf_counter()\n    tasks = [\n        extract_async(query) for query in TEST_QUERIES[:3]\n    ]  # First 3 to save costs\n    results = await asyncio.gather(*tasks)\n    total_time = time.perf_counter() - start_time\n\n    logger.info(f\"Async processing time: {total_time:.3f}s\")\n    for i, result in enumerate(results):\n        logger.info(f\"Result {i + 1}: {result.name}, {result.age}, {result.occupation}\")\n\n\ndef demonstrate_schema_invalidation():\n    \"\"\"Show how cache keys change when model schema changes\"\"\"\n    logger.info(\"\\n=== Schema-Based Cache Invalidation ===\")\n\n    # Original model\n    class OriginalUser(BaseModel):\n        name: str\n        age: int\n\n    # Modified model (different schema)\n    class ModifiedUser(BaseModel):\n        name: str\n        age: int\n        email: Optional[str] = None  # New field\n\n    # Generate cache keys for same function args but different models\n    args = (\"test data\",)\n    kwargs = {}\n\n    key1 = smart_cache_key(\"test_func\", args, kwargs, OriginalUser)\n    key2 = smart_cache_key(\"test_func\", args, kwargs, ModifiedUser)\n\n    logger.info(f\"Original model cache key: {key1}\")\n    logger.info(f\"Modified model cache key: {key2}\")\n    logger.info(f\"Keys are different: {key1 != key2}\")\n    logger.info(\"This ensures cache invalidation when model schemas change!\")\n\n\ndef main():\n    \"\"\"Run comprehensive caching demonstration\"\"\"\n    logger.info(\"🚀 Starting Comprehensive Caching Demonstration\")\n    logger.info(\"=\" * 60)\n\n    # Run benchmarks for each strategy\n    strategies = [\n        (extract_no_cache, \"No Cache (Baseline)\"),\n        (extract_functools_monitored, \"functools.lru_cache\"),\n        (extract_diskcache, \"diskcache\"),\n        (extract_redis, \"Redis\"),\n        (extract_l1, \"Hierarchical (L1→L2→L3)\"),\n    ]\n\n    all_stats = {}\n\n    for func, name in strategies:\n        try:\n            stats = benchmark_caching_strategy(func, name, TEST_QUERIES)\n            all_stats[name] = stats\n        except Exception as e:\n            logger.error(f\"Failed to benchmark {name}: {e}\")\n            continue\n\n    # Print summary comparison\n    logger.info(\"\\n\" + \"=\" * 60)\n    logger.info(\"📊 PERFORMANCE COMPARISON SUMMARY\")\n    logger.info(\"=\" * 60)\n\n    baseline_stats = all_stats.get(\"No Cache (Baseline)\")\n\n    if baseline_stats:\n        for name, stats in all_stats.items():\n            if name == \"No Cache (Baseline)\":\n                continue\n\n            logger.info(f\"\\n{name}:\")\n            logger.info(f\"  Total time: {stats['total_time']:.3f}s\")\n            logger.info(f\"  Cache hit rate: {stats['cache_metrics']['hit_rate']}\")\n\n            # Calculate savings\n            savings = calculate_cost_savings(baseline_stats, stats)\n            logger.info(f\"  Speed improvement: {savings['speed_improvement']:.1f}x\")\n            logger.info(\n                f\"  Time saved: {savings['time_saved']:.3f}s ({savings['time_savings_percent']:.1f}%)\"\n            )\n            logger.info(\n                f\"  Cost savings: ${savings['cost_savings']:.4f} ({savings['cost_savings_percent']:.1f}%)\"\n            )\n\n    # Additional demonstrations\n    demonstrate_schema_invalidation()\n\n    # Run async example\n    asyncio.run(run_async_example())\n\n    # Print cache info for functools\n    logger.info(\n        f\"\\nfunctools.lru_cache info: {extract_functools_monitored.cache_info()}\"\n    )\n\n    logger.info(\"\\n\" + \"=\" * 60)\n    logger.info(\"✅ Caching demonstration completed!\")\n    logger.info(\"💡 Key takeaways:\")\n    logger.info(\"  - Caching can provide 10x-1000x speed improvements\")\n    logger.info(\"  - Choose the right strategy based on your needs:\")\n    logger.info(\"    • functools.cache: Development, single process\")\n    logger.info(\"    • diskcache: Persistence, moderate performance\")\n    logger.info(\"    • Redis: Distributed systems, high performance\")\n    logger.info(\"    • Hierarchical: Best of all worlds\")\n    logger.info(\"  - Smart cache keys prevent stale data\")\n    logger.info(\"  - Monitoring helps optimize cache performance\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "examples/caching_prototype/README.md",
    "content": "# Instructor Caching Prototype\n\nThis example demonstrates the new built-in caching functionality in Instructor.\n\n## Files\n\n- `run.py` - Main example showing all caching features (with mock calls for quick testing)\n- `run_real.py` - Complete demo with real API calls\n- `test_simple.py` - Unit tests for cache components without API calls\n- `test_anthropic.py` - Tests with Anthropic provider to verify caching works across providers\n\n## Features Demonstrated\n\n### 1. AutoCache (In-Memory LRU)\n```python\nfrom instructor.cache import AutoCache\n\ncache = AutoCache(maxsize=100)\nclient = instructor.from_openai(OpenAI(), cache=cache)\n```\n\n### 2. DiskCache (Persistent)\n```python\nfrom instructor.cache import DiskCache\n\ncache = DiskCache(directory=\".instructor_cache\")\nclient = instructor.from_openai(OpenAI(), cache=cache)\n```\n\n### 3. Cache TTL (Time-to-Live)\n```python\nclient.create(\n    model=\"gpt-3.5-turbo\",\n    messages=messages,\n    response_model=User,\n    cache_ttl=3600,  # 1 hour\n)\n```\n\n### 4. create_with_completion Support\nBoth the parsed model and raw completion objects are cached and restored.\n\n## Performance Results\n\nFrom our tests:\n- **156x faster** cache hits vs API calls\n- **Identical results** from cache and API\n- **Persistent storage** across client instances\n- **Automatic cache invalidation** based on:\n  - Different prompts\n  - Different models\n  - Different response schemas\n  - TTL expiration\n\n## Running the Examples\n\n```bash\n# Run the complete demo (requires OpenAI API key)\nuv run python run_real.py\n\n# Run unit tests (no API required)\nuv run python test_simple.py\n\n# Run pytest tests\nuv run pytest tests/test_cache*.py\n```\n\n## Key Features\n\n1. **Deterministic caching** - same inputs always produce same cache key\n2. **Schema-aware** - changing field descriptions invalidates cache\n3. **Multiple backends** - AutoCache (LRU), DiskCache (persistent)\n4. **TTL support** - automatic expiration (where supported)\n5. **Raw response preservation** - `create_with_completion` works seamlessly\n6. **Thread-safe** - all cache implementations are thread-safe"
  },
  {
    "path": "examples/caching_prototype/run_real.py",
    "content": "\"\"\"Demonstrate real caching functionality with actual API calls.\"\"\"\n\nimport time\nimport instructor\nfrom instructor.cache import AutoCache, DiskCache\nfrom pydantic import BaseModel, Field\nfrom openai import OpenAI\n\n\nclass User(BaseModel):\n    name: str = Field(description=\"The user's name\")\n    age: int = Field(description=\"The user's age\")\n\n\ndef test_autocache():\n    \"\"\"Test basic in-memory caching.\"\"\"\n    print(\"\\n=== Testing AutoCache (in-memory) ===\")\n\n    cache = AutoCache(maxsize=100)\n    client = instructor.from_openai(OpenAI(), cache=cache)\n\n    messages = [\n        {\"role\": \"user\", \"content\": \"Generate a user named Alice who is 25 years old\"}\n    ]\n\n    # First call - hits API\n    print(\"First call (hits API)...\")\n    start = time.time()\n    user1 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n    )\n    api_time = time.time() - start\n    print(f\"Result: {user1}\")\n    print(f\"Time: {api_time:.2f}s\")\n\n    # Second call - from cache\n    print(\"\\nSecond call (from cache)...\")\n    start = time.time()\n    user2 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n    )\n    cache_time = time.time() - start\n    print(f\"Result: {user2}\")\n    print(f\"Time: {cache_time:.4f}s\")\n    print(f\"Speedup: {api_time / cache_time:.0f}x faster\")\n\n    assert user1.name == user2.name\n    assert user1.age == user2.age\n    print(\"✓ Cache working - identical results\")\n\n\ndef test_create_with_completion():\n    \"\"\"Test create_with_completion caching.\"\"\"\n    print(\"\\n=== Testing create_with_completion ===\")\n\n    cache = AutoCache(maxsize=100)\n    client = instructor.from_openai(OpenAI(), cache=cache)\n\n    messages = [\n        {\"role\": \"user\", \"content\": \"What's the weather? Say it's 22C and sunny.\"}\n    ]\n\n    class Weather(BaseModel):\n        temperature: float\n        condition: str\n\n    # First call\n    print(\"First call with completion...\")\n    weather1, completion1 = client.create_with_completion(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=Weather,\n    )\n    print(f\"Weather: {weather1}\")\n    print(f\"Completion ID: {completion1.id}\")\n    print(f\"Tokens used: {completion1.usage.total_tokens}\")\n\n    # Second call - cached\n    print(\"\\nSecond call (cached)...\")\n    start = time.time()\n    weather2, completion2 = client.create_with_completion(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=Weather,\n    )\n    cache_time = time.time() - start\n    print(f\"Weather: {weather2}\")\n    print(f\"Completion ID: {completion2.id}\")\n    print(f\"Cache time: {cache_time:.4f}s\")\n\n    assert weather1.temperature == weather2.temperature\n    assert completion1.id == completion2.id\n    print(\"✓ Completion object cached correctly\")\n\n\ndef test_diskcache():\n    \"\"\"Test persistent disk caching.\"\"\"\n    print(\"\\n=== Testing DiskCache (persistent) ===\")\n\n    # First client\n    cache1 = DiskCache(directory=\".instructor_cache_demo\")\n    client1 = instructor.from_openai(OpenAI(), cache=cache1)\n\n    messages = [{\"role\": \"user\", \"content\": \"Create a user named Bob who is 30\"}]\n\n    print(\"First client creates user...\")\n    user1 = client1.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n    )\n    print(f\"Result: {user1}\")\n\n    # New client, same cache directory\n    print(\"\\nNew client with same cache dir...\")\n    cache2 = DiskCache(directory=\".instructor_cache_demo\")\n    client2 = instructor.from_openai(OpenAI(), cache=cache2)\n\n    start = time.time()\n    user2 = client2.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n    )\n    cache_time = time.time() - start\n    print(f\"Result: {user2}\")\n    print(f\"Time: {cache_time:.4f}s (from disk cache)\")\n\n    assert user1.name == user2.name\n    print(\"✓ Cache persisted across clients\")\n\n    # Test create_with_completion persistence\n    print(\"\\nTesting create_with_completion persistence...\")\n    weather_messages = [{\"role\": \"user\", \"content\": \"Weather is 25C and cloudy\"}]\n\n    class Weather(BaseModel):\n        temperature: float\n        condition: str\n\n    # First call with completion\n    weather1, completion1 = client1.create_with_completion(\n        model=\"gpt-3.5-turbo\",\n        messages=weather_messages,\n        response_model=Weather,\n    )\n    print(f\"Weather: {weather1}, Completion ID: {completion1.id}\")\n\n    # Second call from different client - should get cached completion\n    weather2, completion2 = client2.create_with_completion(\n        model=\"gpt-3.5-turbo\",\n        messages=weather_messages,\n        response_model=Weather,\n    )\n    print(f\"Cached: {weather2}, Completion ID: {completion2.id}\")\n\n    assert weather1.temperature == weather2.temperature\n    assert completion1.id == completion2.id\n    print(\"✓ Raw completion persisted to disk\")\n\n    # Cleanup\n    import shutil\n\n    shutil.rmtree(\".instructor_cache_demo\", ignore_errors=True)\n\n\ndef test_cache_ttl():\n    \"\"\"Test cache TTL with DiskCache.\"\"\"\n    print(\"\\n=== Testing Cache TTL ===\")\n\n    cache = DiskCache(directory=\".instructor_cache_ttl\")\n    client = instructor.from_openai(OpenAI(), cache=cache)\n\n    messages = [{\"role\": \"user\", \"content\": \"Create user Charlie age 35\"}]\n\n    # Set with 2 second TTL\n    print(\"Setting cache with 2s TTL...\")\n    user1 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n        cache_ttl=2,\n    )\n    print(f\"Result: {user1}\")\n\n    # Immediate call - cached\n    print(\"\\nImmediate call (cached)...\")\n    start = time.time()\n    user2 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n    )\n    print(f\"Time: {time.time() - start:.4f}s\")\n\n    # Wait for expiry\n    print(\"\\nWaiting 3s for TTL expiry...\")\n    time.sleep(3)\n\n    # Should hit API again\n    print(\"After TTL (hits API)...\")\n    start = time.time()\n    user3 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=messages,\n        response_model=User,\n    )\n    api_time = time.time() - start\n    print(f\"Time: {api_time:.2f}s\")\n    print(\"✓ TTL working correctly\")\n\n    # Cleanup\n    import shutil\n\n    shutil.rmtree(\".instructor_cache_ttl\", ignore_errors=True)\n\n\ndef test_different_inputs():\n    \"\"\"Show that different inputs use different cache keys.\"\"\"\n    print(\"\\n=== Testing Different Cache Keys ===\")\n\n    cache = AutoCache(maxsize=100)\n    client = instructor.from_openai(OpenAI(), cache=cache)\n\n    # Different prompts\n    user1 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[{\"role\": \"user\", \"content\": \"Create user David age 40\"}],\n        response_model=User,\n    )\n    print(f\"User 1: {user1}\")\n\n    user2 = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[{\"role\": \"user\", \"content\": \"Create user Eve age 45\"}],\n        response_model=User,\n    )\n    print(f\"User 2: {user2}\")\n\n    assert user1.name != user2.name or user1.age != user2.age\n    print(\"✓ Different prompts = different results\")\n\n    # Different models\n    class SimpleUser(BaseModel):\n        name: str\n\n    simple = client.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[{\"role\": \"user\", \"content\": \"Create user David age 40\"}],\n        response_model=SimpleUser,\n    )\n    print(f\"Simple user: {simple}\")\n    print(\"✓ Different models = different cache keys\")\n\n\nif __name__ == \"__main__\":\n    print(\"Instructor Caching Demo - Real API Calls\")\n    print(\"=\" * 50)\n\n    test_autocache()\n    test_create_with_completion()\n    test_diskcache()\n    test_cache_ttl()\n    test_different_inputs()\n\n    print(\"\\n\" + \"=\" * 50)\n    print(\"All tests completed! ✨\")\n"
  },
  {
    "path": "examples/chain-of-density/Readme.md",
    "content": "# Introduction\n\nThis is a simple example which shows how to perform Chain Of Density summarization using GPT-3.5 and utilise the generated output to fine-tune a 3.5 model for production usage. All of our data referenced in this file is located [here](https://huggingface.co/datasets/ivanleomk/gpt4-chain-of-density) on hugging face\n\nCheck out our blog post [here](https://jxnl.github.io/instructor/blog/2023/11/05/implementing-chain-of-density/) where we have a detailed explanation of the code and a [colab notebook](https://colab.research.google.com/drive/1iBkrEh2G5U8yh8RmI8EkWxjLq6zIIuVm?usp=sharing) walking you through how we perform our calculations.\n\n## Instructions\n\n1. First, install all of the required dependencies by running the command below. We recommend using a virtual environment to install these so that it does not affect your system installation.\n\n> We use NLTK to ensure that our summaries are of a certain token length. In order to do so, you'll need to download the `punkt` package to compute the token metrics. You can do so by running the command `nltk.download('punkt')`\n\n```\npip3 install -r requirements.txt\n```\n\n2. Download the `test.csv` file and the `summarization.jsonl` file that you want to use for finetuning. We provide one with `20` examples, `50` examples and `100` examples to be used for testing. Let's now run a simple finetuning job with the following command.\n\n> Don't forget to set your `OPENAI_API_KEY` as an environment variable in your shell before running these commands\n\n```\ninstructor jobs create-from-file summarization.jsonl \n```\n\n3. Once the job is complete, you'll end up with a new GPT 3.5 model that's capable of producing high quality summaries with a high entity density. You can run it by simply changing our `finetune.py` file's `instructions.distil` annotator as\n\n```\n@instructions.distil(model=<your finetuned model >,mode=\"dispatch\")\ndef distil_summarization(text: str) -> GeneratedSummary:\n// rest of code goes here\n```"
  },
  {
    "path": "examples/chain-of-density/chain_of_density.py",
    "content": "from pydantic import BaseModel, Field, field_validator\nimport instructor\nimport nltk\nfrom openai import OpenAI\nimport spacy\n\nclient = instructor.from_openai(OpenAI())\nnlp = spacy.load(\"en_core_web_sm\")\n\n\nclass InitialSummary(BaseModel):\n    \"\"\"\n    This is an initial summary which should be long ( 4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose languages and fillers (Eg. This article discusses) to reach ~80 words.\n    \"\"\"\n\n    summary: str = Field(\n        ...,\n        description=\"This is a summary of the article provided which is overly verbose and uses fillers. It should be roughly 80 words in length\",\n    )\n\n\nclass RewrittenSummary(BaseModel):\n    \"\"\"\n    This is a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities.\n\n    Guidelines\n    - Make every word count : Rewrite the previous summary to improve flow and make space for additional entities\n    - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\n    - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\n    - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\"\n    - Missing entities can appear anywhere in the new summary\n\n    An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\n    \"\"\"\n\n    summary: str = Field(\n        ...,\n        description=\"This is a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities. It should have the same length ( ~ 80 words ) as the previous summary and should be easily understood without the Article\",\n    )\n    absent: list[str] = Field(\n        ...,\n        default_factory=list,\n        description=\"this is a list of Entities found absent from the new summary that were present in the previous summary\",\n    )\n    missing: list[str] = Field(\n        default_factory=list,\n        description=\"This is a list of 1-3 informative Entities from the Article that are missing from the new summary which should be included in the next generated summary.\",\n    )\n\n    @field_validator(\"summary\")\n    def min_entity_density(cls, v: str):\n        # We want to make sure we have a minimum density of 0.12 whenever we do a rewrite. This ensures that the summary quality is always going up\n        tokens = nltk.word_tokenize(v)\n        num_tokens = len(tokens)\n\n        # Extract Entities\n        doc = nlp(v)\n        num_entities = len(doc.ents)\n\n        density = num_entities / num_tokens\n        if density < 0.08:\n            raise ValueError(\n                f\"The summary of {v} has too few entities. Please regenerate a new summary with more new entities added to it. Remember that new entities can be added at any point of the summary.\"\n            )\n\n        return v\n\n    @field_validator(\"summary\")\n    def min_length(cls, v: str):\n        tokens = nltk.word_tokenize(v)\n        num_tokens = len(tokens)\n        if num_tokens < 60:\n            raise ValueError(\n                \"The current summary is too short. Please make sure that you generate a new summary that is around 80 words long.\"\n            )\n        return v\n\n    @field_validator(\"missing\")\n    def has_missing_entities(cls, missing_entities: list[str]):\n        if len(missing_entities) == 0:\n            raise ValueError(\n                \"You must identify 1-3 informative Entities from the Article which are missing from the previously generated summary to be used in a new summary\"\n            )\n        return missing_entities\n\n    @field_validator(\"absent\")\n    def has_no_absent_entities(cls, absent_entities: list[str]):\n        absent_entity_string = \",\".join(absent_entities)\n        if len(absent_entities) > 0:\n            print(f\"Detected absent entities of {absent_entity_string}\")\n            raise ValueError(\n                f\"Do not omit the following Entities {absent_entity_string} from the new summary\"\n            )\n        return absent_entities\n\n\ndef summarize_article(article: str, summary_steps: int = 3):\n    summary_chain = []\n    # We first generate an initial summary\n    summary: InitialSummary = client.chat.completions.create(\n        model=\"gpt-4-0613\",\n        response_model=InitialSummary,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Write a summary about the article that is long (4-5 sentences) yet highly non-specific. Use overly, verbose language and fillers(eg.,'this article discusses') to reach ~80 words. \",\n            },\n            {\"role\": \"user\", \"content\": f\"Here is the Article: {article}\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"The generated summary should be about 80 words.\",\n            },\n        ],\n        max_retries=2,\n    )\n    summary_chain.append(summary.summary)\n    for _i in range(summary_steps):\n        new_summary: RewrittenSummary = client.chat.completions.create(\n            model=\"gpt-4-0613\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": f\"\"\"\n                Article: {article}\n                You are going to generate an increasingly concise,entity-dense summary of the following article.\n\n                Perform the following two tasks\n                - Identify 1-3 informative entities from the following article which is missing from the previous summary\n                - Write a new denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities\n\n                Guidelines\n                - Make every word count: re-write the previous summary to improve flow and make space for additional entities\n                - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\".\n                - The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Article.\n                - Missing entities can appear anywhere in the new summary\n                - Never drop entities from the previous summary. If space cannot be made, add fewer new entities.\n                \"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Here is the previous summary: {summary_chain[-1]}\",\n                },\n            ],\n            max_retries=5,\n            max_tokens=1000,\n            response_model=RewrittenSummary,\n        )\n        summary_chain.append(new_summary.summary)\n\n    return summary_chain\n"
  },
  {
    "path": "examples/chain-of-density/finetune.py",
    "content": "from openai import OpenAI\nfrom chain_of_density import summarize_article\nimport csv\nimport logging\nimport instructor\nfrom pydantic import BaseModel, Field\n\nlogging.basicConfig(level=logging.INFO)\n\nclient = instructor.from_openai(OpenAI())\n\ninstructions = instructor.Instructions(\n    name=\"Chain Of Density\",\n    finetune_format=\"messages\",\n    # log handler is used to save the data to a file\n    # you can imagine saving it to a database or other storage\n    # based on your needs!\n    log_handlers=[logging.FileHandler(\"generated.jsonl\")],\n    openai_client=client,\n)\n\n\nclass GeneratedSummary(BaseModel):\n    \"\"\"\n    This represents a highly concise summary that includes as many entities as possible from the original source article.\n\n    An Entity is a real-world object that's assigned a name - for example, a person, country a product or a book title.\n\n    Guidelines\n    - Make every word count\n    - The new summary should be highly dense and concise yet self-contained, eg., easily understood without the Article.\n    - Make space with fusion, compression, and removal of uninformative phrases like \"the article discusses\"\n    \"\"\"\n\n    summary: str = Field(\n        ...,\n        description=\"This represents the final summary generated that captures the meaning of the original article which is as concise as possible. \",\n    )\n\n\n@instructions.distil\ndef distil_summarization(text: str) -> GeneratedSummary:\n    summary_chain: list[str] = summarize_article(text)\n    return GeneratedSummary(summary=summary_chain[-1])\n\n\nwith open(\"test.csv\") as file:\n    reader = csv.reader(file)\n    next(reader)  # Skip the header\n    for article, _summary in reader:\n        distil_summarization(article)\n"
  },
  {
    "path": "examples/chain-of-density/requirements.txt",
    "content": "openai\npydantic\ninstructor\nnltk\nrich"
  },
  {
    "path": "examples/citation_with_extraction/Dockerfile",
    "content": "# https://hub.docker.com/_/python\nFROM python:3.10-slim-bullseye\n\nENV PYTHONUNBUFFERED True\nENV APP_HOME /app\nWORKDIR $APP_HOME\nCOPY requirements.txt ./\nRUN pip install -r requirements.txt\n\n\nCOPY . ./\n\n\nCMD [\"uvicorn\", \"main:app\", \"--host\", \"0.0.0.0\", \"--port\", \"8080\"]"
  },
  {
    "path": "examples/citation_with_extraction/README.md",
    "content": "# Citation with Extraction\n\nThis repository contains a FastAPI application that uses GPT-4 to answer questions based on a given context and extract relevant facts with correct and exact citations. The extracted facts are returned as JSON events using Server-Sent Events (SSE).\n\n## How it Works\n\nThe FastAPI app defines an endpoint `/extract` that accepts a POST request with JSON data containing a `context` and a `query`. The `context` represents the text from which the question is being asked, and the `query` is the question itself.\n\nThe app leverages GPT-4, an advanced language model, to generate answers to the questions and extract relevant facts. It ensures that the extracted facts include direct quotes from the given context.\n\n## Example Usage\n\nTo use the `/extract` endpoint, send a POST request with `curl` or any HTTP client with the following format:\n\n```bash\ncurl -X POST -H \"Content-Type: application/json\" -d '{\n  \"context\": \"My name is Jason Liu, and I grew up in Toronto Canada but I was born in China.I went to an arts highschool but in university I studied Computational Mathematics and physics.  As part of coop I worked at many companies including Stitchfix, Facebook.  I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.\",\n  \"query\": \"What did the author do in school?\"\n}' -N http://localhost:8000/extract\n```\n\n```sh\ndata: {'body': 'In school, the author went to an arts high school.', 'spans': [(91, 106)], 'citation': ['arts highschool']}\ndata: {'body': 'In university, the author studied Computational Mathematics and physics.', 'spans': [(135, 172)], 'citation': ['Computational Mathematics and physics']}\n```\n\nReplace `http://localhost:8000` with the actual URL of your FastAPI app if it's running on a different host and port. The API will respond with Server-Sent Events (SSE) containing the extracted facts in real-time.\n\n## Bring your own API key\n\nIf you have your own api key but dont want to try deploying it yourself you're welcome to use my \nmodal isntance here, this code is public and I do not store your key.\n\n```bash\ncurl -X 'POST' \\\n  'https://jxnl--rag-citation-fastapi-app.modal.run/extract' \\\n  -H 'accept: */*' \\\n  -H 'Content-Type: application/json' \\\n  -H 'Authorization: Bearer <OPENAI_API_KEY>' \\\n  -d '{\n  \"context\": \"My name is Jason Liu, and I grew up in Toronto Canada but I was born in China.I went to an arts highschool but in university I studied Computational Mathematics and physics.  As part of coop I worked at many companies including Stitchfix, Facebook.  I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.\",\n  \"query\": \"What did the author do in school?\"\n}'\n```\n\n\n## Requirements\n\nTo run this application, ensure you have the following Python packages installed:\n\n```bash\npip install -r requirements.txt\n```\n\n## Running the App\n\nTo run the FastAPI app, execute the following command:\n\n```bash\nuvicorn main:app --reload\n```\n\nThis will start the server, and the `/extract` endpoint will be available at `http://localhost:8000/extract`.\n\n## Note\n\nEnsure that you have a valid API key for GPT-4 from OpenAI. If you don't have one, you can obtain it from the OpenAI website.\n\nPlease use this application responsibly and be mindful of any usage limits or restrictions from OpenAI's API usage policy.\n\n## License\n\nThis project is licensed under the [MIT License](LICENSE). Feel free to use, modify, and distribute it as you see fit."
  },
  {
    "path": "examples/citation_with_extraction/citation_fuzzy_match.py",
    "content": "import instructor\n\nfrom loguru import logger\nfrom openai import OpenAI\nfrom pydantic import Field, BaseModel, FieldValidationInfo, model_validator\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Fact(BaseModel):\n    statement: str = Field(\n        ..., description=\"Body of the sentence, as part of a response\"\n    )\n    substring_phrase: list[str] = Field(\n        ...,\n        description=\"String quote long enough to evaluate the truthfulness of the fact\",\n    )\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self, info: FieldValidationInfo) -> \"Fact\":\n        \"\"\"\n        For each substring_phrase, find the span of the substring_phrase in the context.\n        If the span is not found, remove the substring_phrase from the list.\n        \"\"\"\n        if info.context is None:\n            logger.info(\"No context found, skipping validation\")\n            return self\n\n        # Get the context from the info\n        text_chunks = info.context.get(\"text_chunk\", None)\n\n        # Get the spans of the substring_phrase in the context\n        spans = list(self.get_spans(text_chunks))\n        logger.info(\n            f\"Found {len(spans)} span(s) for from {len(self.substring_phrase)} citation(s).\"\n        )\n        # Replace the substring_phrase with the actual substring\n        self.substring_phrase = [text_chunks[span[0] : span[1]] for span in spans]\n        return self\n\n    def _get_span(self, quote, context, errs=5):\n        import regex\n\n        minor = quote\n        major = context\n\n        errs_ = 0\n        s = regex.search(f\"({minor}){{e<={errs_}}}\", major)\n        while s is None and errs_ <= errs:\n            errs_ += 1\n            s = regex.search(f\"({minor}){{e<={errs_}}}\", major)\n\n        if s is not None:\n            yield from s.spans()\n\n    def get_spans(self, context):\n        for quote in self.substring_phrase:\n            yield from self._get_span(quote, context)\n\n\nclass QuestionAnswer(instructor.ResponseSchema):\n    \"\"\"\n    Class representing a question and its answer as a list of facts each one should have a soruce.\n    each sentence contains a body and a list of sources.\"\"\"\n\n    question: str = Field(..., description=\"Question that was asked\")\n    answer: list[Fact] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be its separate object with a body and a list of sources\",\n    )\n\n    @model_validator(mode=\"after\")\n    def validate_sources(self) -> \"QuestionAnswer\":\n        \"\"\"\n        Checks that each fact has some sources, and removes those that do not.\n        \"\"\"\n        logger.info(f\"Validating {len(self.answer)} facts\")\n        self.answer = [fact for fact in self.answer if len(fact.substring_phrase) > 0]\n        logger.info(f\"Found {len(self.answer)} facts with sources\")\n        return self\n\n\ndef ask_ai(question: str, context: str) -> QuestionAnswer:\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        temperature=0,\n        response_model=QuestionAnswer,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world class algorithm to answer questions with correct and exact citations.\",\n            },\n            {\"role\": \"user\", \"content\": f\"{context}\"},\n            {\"role\": \"user\", \"content\": f\"Question: {question}\"},\n        ],\n        validation_context={\"text_chunk\": context},\n    )\n\n\nquestion = \"where did he go to school?\"\ncontext = \"\"\"\nMy name is Jason Liu, and I grew up in Toronto Canada but I was born in China.I went to an arts highschool but in university I studied Computational Mathematics and physics.  As part of coop I worked at many companies including Stitchfix, Facebook. I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.\n\"\"\"\n\nanswer = ask_ai(question, context)\nprint(answer.model_dump_json(indent=2))\n\"\"\"\n2023-09-09 15:48:11.022 | INFO     | __main__:validate_sources:35 - Found 1 span(s) for from 1 citation(s).\n2023-09-09 15:48:11.023 | INFO     | __main__:validate_sources:35 - Found 1 span(s) for from 1 citation(s).\n2023-09-09 15:48:11.023 | INFO     | __main__:validate_sources:78 - Validating 2 facts\n2023-09-09 15:48:11.023 | INFO     | __main__:validate_sources:80 - Found 2 facts with sources\n{\n  \"question\": \"where did he go to school?\",\n  \"answer\": [\n    {\n      \"statement\": \"Jason Liu went to an arts highschool.\",\n      \"substring_phrase\": [\n        \"arts highschool\"\n      ]\n    },\n    {\n      \"statement\": \"Jason Liu studied Computational Mathematics and physics in university.\",\n      \"substring_phrase\": [\n        \"university\"\n      ]\n    }\n  ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/citation_with_extraction/diagram.py",
    "content": "import erdantic as erd\n\nfrom citation_fuzzy_match import QuestionAnswer\n\ndiagram = erd.create(QuestionAnswer)\ndiagram.draw(\"examples/citation_fuzzy_match/schema.png\")\n"
  },
  {
    "path": "examples/citation_with_extraction/main.py",
    "content": "import json\nfrom collections.abc import Iterable\nfrom fastapi import FastAPI, Request, HTTPException\nfrom fastapi.params import Depends\nfrom instructor import ResponseSchema\nfrom pydantic import BaseModel, Field\nfrom starlette.responses import StreamingResponse\n\nimport os\nimport instructor\nimport logging\n\nfrom openai import OpenAI\nfrom instructor.dsl.multitask import MultiTaskBase\n\nclient = instructor.from_openai(OpenAI())\nlogger = logging.getLogger(__name__)\n\n# FastAPI app\napp = FastAPI(\n    title=\"Citation with Extraction\",\n)\n\n\nclass Fact(BaseModel):\n    \"\"\"\n    Class representing single statement.\n    Each fact has a body and a list of sources.\n    If there are multiple facts make sure to break them apart such that each one only uses a set of sources that are relevant to it.\n    \"\"\"\n\n    fact: str = Field(\n        ...,\n        description=\"Body of the sentences, as part of a response, it should read like a sentence that answers the question\",\n    )\n    substring_quotes: list[str] = Field(\n        ...,\n        description=\"Each source should be a direct quote from the context, as a substring of the original content\",\n    )\n\n    def _get_span(self, quote, context):\n        import regex\n\n        minor = quote\n        major = context\n\n        errs_ = 0\n        s = regex.search(f\"({minor}){{e<={errs_}}}\", major)\n        while s is None and errs_ <= len(context) * 0.05:\n            errs_ += 1\n            s = regex.search(f\"({minor}){{e<={errs_}}}\", major)\n\n        if s is not None:\n            yield from s.spans()\n\n    def get_spans(self, context):\n        if self.substring_quotes:\n            for quote in self.substring_quotes:\n                yield from self._get_span(quote, context)\n\n\nclass QuestionAnswer(ResponseSchema, MultiTaskBase):\n    \"\"\"\n    Class representing a question and its answer as a list of facts each one should have a source.\n    each sentence contains a body and a list of sources.\"\"\"\n\n    question: str = Field(..., description=\"Question that was asked\")\n    tasks: list[Fact] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be its separate object with a body and a list of sources\",\n    )\n\n\nQuestionAnswer.task_type = Fact\n\n\nclass Question(BaseModel):\n    context: str = Field(..., description=\"Context to extract answers from\")\n    query: str = Field(..., description=\"Question to answer\")\n\n\n# Function to extract entities from input text using GPT-3.5\ndef stream_extract(question: Question) -> Iterable[Fact]:\n    completion = client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        temperature=0,\n        stream=True,\n        functions=[QuestionAnswer.openai_schema],\n        function_call={\"name\": QuestionAnswer.openai_schema[\"name\"]},\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world class algorithm to answer questions with correct and exact citations. \",\n            },\n            {\"role\": \"user\", \"content\": \"Answer question using the following context\"},\n            {\"role\": \"user\", \"content\": f\"{question.context}\"},\n            {\"role\": \"user\", \"content\": f\"Question: {question.query}\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Tips: Make sure to cite your sources, and use the exact words from the context.\",\n            },\n        ],\n        max_tokens=2000,\n    )\n    return QuestionAnswer.from_streaming_response(completion)\n\n\ndef get_api_key(request: Request):\n    \"\"\"\n    This just gets the API key from the request headers.\n    but tries to read from the environment variable OPENAI_API_KEY first.\n    \"\"\"\n    if \"OPENAI_API_KEY\" in os.environ:\n        return os.environ[\"OPENAI_API_KEY\"]\n\n    auth = request.headers.get(\"Authorization\")\n    if auth is None:\n        raise HTTPException(status_code=401, detail=\"Missing Authorization header\")\n\n    if auth.startswith(\"Bearer \"):\n        return auth.replace(\"Bearer \", \"\")\n\n    return None\n\n\n# Route to handle SSE events and return users\n@app.post(\"/extract\", response_class=StreamingResponse)\nasync def extract(question: Question, openai_key: str = Depends(get_api_key)):\n    raise Exception(\n        \"The 'openai.api_key' option isn't read in the client API. You will need to pass it when you instantiate the client, e.g. 'OpenAI(api_key=openai_key)'\"\n    )\n    facts = stream_extract(question)\n\n    async def generate():\n        for fact in facts:\n            logger.info(f\"Fact: {fact}\")\n            spans = list(fact.get_spans(question.context))\n            resp = {\n                \"body\": fact.fact,\n                \"spans\": spans,\n                \"citation\": [question.context[a:b] for (a, b) in spans],\n            }\n            resp_json = json.dumps(resp)\n            yield f\"data: {resp_json}\"\n        yield \"data: [DONE]\"\n\n    return StreamingResponse(generate(), media_type=\"text/event-stream\")\n"
  },
  {
    "path": "examples/citation_with_extraction/modal_main.py",
    "content": "from main import app\nimport modal\n\nstub = modal.Stub(\"rag-citation\")\n\nimage = modal.Image.debian_slim().pip_install(\"fastapi\", \"instructor>=0.2.1\", \"regex\")\n\n\n@stub.function(image=image)\n@modal.asgi_app()\ndef fastapi_app():\n    return app\n"
  },
  {
    "path": "examples/citation_with_extraction/requirements.txt",
    "content": "fastapi\nuvicorn\nopenai>=1.0.0\npydantic\ninstructor\nregex"
  },
  {
    "path": "examples/citations/run.py",
    "content": "from typing import Optional\nfrom openai import OpenAI\nfrom pydantic import (\n    BaseModel,\n    Field,\n    ValidationError,\n    ValidationInfo,\n    field_validator,\n    model_validator,\n)\n\nimport instructor\n\nclient = instructor.from_openai(OpenAI())\n\n\"\"\" \nExample 1) Simple Substring check that compares a citation to a text chunk\n\"\"\"\n\n\nclass Statements(BaseModel):\n    body: str\n    substring_quote: str\n\n    @field_validator(\"substring_quote\")\n    @classmethod\n    def substring_quote_exists(cls, v: str, info: ValidationInfo):\n        context = info.context.get(\"text_chunks\", None)\n\n        # Check if the substring_quote is in the text_chunk\n        # if not, raise an error\n        for text_chunk in context.values():\n            if v in text_chunk:\n                return v\n        raise ValueError(\n            f\"Could not find substring_quote `{v}` in contexts\",\n        )\n\n\nclass AnswerWithCitaton(BaseModel):\n    question: str\n    answer: list[Statements]\n\n\ntry:\n    AnswerWithCitaton.model_validate(\n        {\n            \"question\": \"What is the capital of France?\",\n            \"answer\": [\n                {\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"},\n            ],\n        },\n        context={\n            \"text_chunks\": {\n                1: \"Jason is a pirate\",\n                2: \"Paris is not the capital of France\",\n                3: \"Irrelevant data\",\n            }\n        },\n    )\nexcept ValidationError as e:\n    print(e)\n\"\"\"\nanswer.0.substring_quote\n  Value error, Could not find substring_quote `Paris is the capital of France` in contexts [type=value_error, input_value='Paris is the capital of France', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\n\"\"\"\n\n\n\"\"\" \nExample 2) Using an LLM to verify if a \n\"\"\"\n\n\nclass Validation(BaseModel):\n    \"\"\"\n    Verification response from the LLM,\n    the error message should be detailed if the is_valid is False\n    but keep it to less than 100 characters, reference specific\n    attributes that you are comparing, use `...` is the string is too long\n    \"\"\"\n\n    is_valid: bool\n    error_messages: Optional[str] = Field(None, description=\"Error messages if any\")\n\n\nclass Statements(BaseModel):\n    body: str\n    substring_quote: str\n\n    @model_validator(mode=\"after\")\n    def substring_quote_exists(self, info: ValidationInfo):\n        context = info.context.get(\"text_chunks\", None)\n\n        resp: Validation = client.chat.completions.create(\n            response_model=Validation,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Does the following citation exist in the following context?\\n\\nCitation: {self.substring_quote}\\n\\nContext: {context}\",\n                }\n            ],\n            model=\"gpt-3.5-turbo\",\n        )\n\n        if resp.is_valid:\n            return self\n\n        raise ValueError(resp.error_messages)\n\n\nclass AnswerWithCitaton(BaseModel):\n    question: str\n    answer: list[Statements]\n\n\nresp = AnswerWithCitaton.model_validate(\n    {\n        \"question\": \"What is the capital of France?\",\n        \"answer\": [\n            {\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"},\n        ],\n    },\n    context={\n        \"text_chunks\": {\n            1: \"Jason is a pirate\",\n            2: \"Paris is the capital of France\",\n            3: \"Irrelevant data\",\n        }\n    },\n)\n# output: notice that there are no errors\nprint(resp.model_dump_json(indent=2))\n{\n    \"question\": \"What is the capital of France?\",\n    \"answer\": [{\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"}],\n}\n\n# Now we change the text chunk to something else, and we get an error\ntry:\n    AnswerWithCitaton.model_validate(\n        {\n            \"question\": \"What is the capital of France?\",\n            \"answer\": [\n                {\"body\": \"Paris\", \"substring_quote\": \"Paris is the capital of France\"},\n            ],\n        },\n        context={\n            \"text_chunks\": {\n                1: \"Jason is a pirate\",\n                2: \"Paris is not the capital of France\",\n                3: \"Irrelevant data\",\n            }\n        },\n    )\nexcept ValidationError as e:\n    print(e)\n\"\"\" \n1 validation error for AnswerWithCitaton\nanswer.0\n  Value error, Citation not found in context [type=value_error, input_value={'body': 'Paris', 'substr... the capital of France'}, input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\n\"\"\"\n\n# Example 3) Using an LLM to verify if the citations and the answers are all aligned\n\n\n# we keep the same model as above for Statements, but we add a new model for the answer\n# that also verifies that the citations are aligned with the answers\nclass AnswerWithCitaton(BaseModel):\n    question: str\n    answer: list[Statements]\n\n    @model_validator(mode=\"after\")\n    def validate_answer(self, info: ValidationInfo):\n        context = info.context.get(\"text_chunks\", None)\n\n        resp: Validation = client.chat.completions.create(\n            response_model=Validation,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Does the following answers match the question and the context?\\n\\nQuestion: {self.question}\\n\\nAnswer: {self.answer}\\n\\nContext: {context}\",\n                }\n            ],\n            model=\"gpt-3.5-turbo\",\n        )\n\n        if resp.is_valid:\n            return self\n\n        raise ValueError(resp.error_messages)\n\n\n\"\"\" \nUsing LLMs for citation verification is inefficient during runtime. \nHowever, we can utilize them to create a dataset consisting only of accurate responses \nwhere citations must be valid (as determined by LLM, fuzzy text search, etc.). \n\nThis approach would require an initial investment during data generation to obtain \na finely-tuned model for improved citation.\n\"\"\"\ntry:\n    AnswerWithCitaton.model_validate(\n        {\n            \"question\": \"What is the capital of France?\",\n            \"answer\": [\n                {\"body\": \"Texas\", \"substring_quote\": \"Paris is the capital of France\"},\n            ],\n        },\n        context={\n            \"text_chunks\": {\n                1: \"Jason is a pirate\",\n                2: \"Paris is the capital of France\",\n                3: \"Irrelevant data\",\n            }\n        },\n    )\nexcept ValidationError as e:\n    print(e)\n\"\"\" \n1 validation error for AnswerWithCitaton\n  Value error, The answer does not match the question and context [type=value_error, input_value={'question': 'What is the...he capital of France'}]}, input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.4/v/value_error\n\"\"\"\n"
  },
  {
    "path": "examples/classification/classifiy_with_validation.py",
    "content": "# pip install openai instructor\nfrom pydantic import BaseModel, field_validator, Field\nimport openai\nimport instructor\nfrom tqdm import tqdm\n\nclient = instructor.from_openai(openai.OpenAI())\n\nclasses = {\n    \"11-0000\": \"Management\",\n    \"13-0000\": \"Business and Financial Operations\",\n    \"15-0000\": \"Computer and Mathematical\",\n    \"17-0000\": \"Architecture and Engineering\",\n    \"19-0000\": \"Life, Physical, and Social Science\",\n    \"21-0000\": \"Community and Social Service\",\n    \"23-0000\": \"Legal\",\n    \"25-0000\": \"Education Instruction and Library\",\n    \"27-0000\": \"Arts, Design, Entertainment, Sports and Media\",\n    \"29-0000\": \"Healthcare Practitioners and Technical\",\n    \"31-0000\": \"Healthcare Support\",\n    \"33-0000\": \"Protective Service\",\n    \"35-0000\": \"Food Preparation and Serving\",\n    \"37-0000\": \"Building and Grounds Cleaning and Maintenance\",\n    \"39-0000\": \"Personal Care and Service\",\n    \"41-0000\": \"Sales and Related\",\n    \"43-0000\": \"Office and Administrative Support\",\n    \"45-0000\": \"Farming, Fishing and Forestry\",\n    \"47-0000\": \"Construction and Extraction\",\n    \"49-0000\": \"Installation, Maintenance, and Repair\",\n    \"51-0000\": \"Production Occupations\",\n    \"53-0000\": \"Transportation and Material Moving\",\n    \"55-0000\": \"Military Specific\",\n    \"99-0000\": \"Other\",\n}\n\n\nclass SOCCode(BaseModel):\n    reasoning: str = Field(\n        default=None,\n        description=\"Step-by-step reasoning to get the correct classification\",\n    )\n    code: str\n\n    @field_validator(\"code\")\n    def validate_code(cls, v):\n        if v not in classes:\n            raise ValueError(f\"Invalid SOC code, {v}\")\n        return v\n\n\ndef classify_job(description: str) -> SOCCode:\n    response = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=SOCCode,\n        max_retries=3,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"You are an expert at classifying job descriptions into Standard Occupational Classification (SOC) codes. from the following list: {classes}\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify this job description into the most appropriate SOC code: {description}\",\n            },\n        ],\n    )\n    return response\n\n\nif __name__ == \"__main__\":\n    # gpt-3.5-turbo: 16/20\n    # gpt-3.5-turbo (COT): 18/20\n    # gpt-4-turbo: 20/20\n\n    job_descriptions = [\n        (\n            \"Develop and design complex software applications for various industries, including finance, healthcare, and e-commerce\",\n            \"15-0000\",  # Computer and Mathematical Occupations\n        ),\n        (\n            \"Provide comprehensive technical support and troubleshooting for enterprise-level software products, ensuring seamless user experience\",\n            \"15-0000\",  # Computer and Mathematical Occupations\n        ),\n        (\n            \"Teach a diverse range of subjects to elementary school students, fostering their intellectual and social development\",\n            \"25-0000\",  # Education, Training, and Library Occupations\n        ),\n        (\n            \"Conduct cutting-edge research in various academic fields at a renowned university, contributing to the advancement of knowledge\",\n            \"25-0000\",  # Education, Training, and Library Occupations\n        ),\n        (\n            \"Design visually appealing and strategically effective logos, branding, and marketing materials for clients across different industries\",\n            \"27-0000\",  # Arts, Design, Entertainment, Sports, and Media Occupations\n        ),\n        (\n            \"Perform as part of a professional musical group, entertaining audiences and showcasing artistic talent\",\n            \"27-0000\",  # Arts, Design, Entertainment, Sports, and Media Occupations\n        ),\n        (\n            \"Diagnose and treat a wide range of injuries and medical conditions, providing comprehensive healthcare services to patients\",\n            \"29-0000\",  # Healthcare Practitioners and Technical Occupations\n        ),\n        (\n            \"Assist doctors and nurses in delivering high-quality patient care, ensuring the smooth operation of healthcare facilities\",\n            \"31-0000\",  # Healthcare Support Occupations\n        ),\n        (\n            \"Patrol assigned areas to enforce laws and ordinances, maintaining public safety and order in the community\",\n            \"33-0000\",  # Protective Service Occupations\n        ),\n        (\n            \"Prepare and serve a diverse menu of delectable meals in a fast-paced restaurant environment\",\n            \"35-0000\",  # Food Preparation and Serving Related Occupations\n        ),\n        (\n            \"Maintain the cleanliness and upkeep of various buildings and facilities, ensuring a safe and presentable environment\",\n            \"37-0000\",  # Building and Grounds Cleaning and Maintenance Occupations\n        ),\n        (\n            \"Provide a range of beauty services, such as haircuts, styling, and manicures, to help clients look and feel their best\",\n            \"39-0000\",  # Personal Care and Service Occupations\n        ),\n        (\n            \"Engage with customers in a retail setting, providing excellent service and assisting them in finding the products they need\",\n            \"41-0000\",  # Sales and Related Occupations\n        ),\n        (\n            \"Perform a variety of clerical duties in an office environment, supporting the overall operations of the organization\",\n            \"43-0000\",  # Office and Administrative Support Occupations\n        ),\n        (\n            \"Cultivate and harvest a wide range of crops, contributing to the production of food and other agricultural products\",\n            \"45-0000\",  # Farming, Fishing, and Forestry Occupations\n        ),\n        (\n            \"Construct and build various structures, including residential, commercial, and infrastructure projects\",\n            \"47-0000\",  # Construction and Extraction Occupations\n        ),\n        (\n            \"Repair and maintain a diverse range of mechanical equipment, ensuring their proper functioning and longevity\",\n            \"49-0000\",  # Installation, Maintenance, and Repair Occupations\n        ),\n        (\n            \"Operate specialized machinery and equipment in a manufacturing setting to produce high-quality goods\",\n            \"51-0000\",  # Production Occupations\n        ),\n        (\n            \"Transport freight and goods across different regions, ensuring timely and efficient delivery\",\n            \"53-0000\",  # Transportation and Material Moving Occupations\n        ),\n        (\n            \"Serve in the armed forces, protecting the nation and its citizens through various military operations and duties\",\n            \"55-0000\",  # Military Specific Occupations\n        ),\n    ]\n\n    correct = 0\n    errors = []\n    for description, expected_code in tqdm(job_descriptions):\n        try:\n            predicted_code = None\n            result = classify_job(description)\n            predicted_code = result.code\n            assert result.code == expected_code, (\n                f\"Expected {expected_code}, got {result.code} for description: {description}\"\n            )\n            correct += 1\n        except Exception as e:\n            errors.append(\n                f\"Got {classes.get(predicted_code, 'Unknown')} expected {classes.get(expected_code, 'Unknown')}\"\n            )\n\n    print(f\"{correct} out of {len(job_descriptions)} tests passed!\")\n    for error in errors:\n        print(error)\n"
  },
  {
    "path": "examples/classification/multi_prediction.py",
    "content": "import enum\nimport instructor\n\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\nclient = instructor.from_openai(OpenAI())\n\n\n# Define new Enum class for multiple labels\nclass MultiLabels(str, enum.Enum):\n    BILLING = \"billing\"\n    GENERAL_QUERY = \"general_query\"\n    HARDWARE = \"hardware\"\n\n\n# Adjust the prediction model to accommodate a list of labels\nclass MultiClassPrediction(BaseModel):\n    predicted_labels: list[MultiLabels]\n\n\n# Modify the classify function\ndef multi_classify(data: str) -> MultiClassPrediction:\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=MultiClassPrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following support ticket: {data}\",\n            },\n        ],\n    )  # type: ignore\n\n\n# Example using a support ticket\nticket = (\n    \"My account is locked and I can't access my billing info. Phone is also broken.\"\n)\nprediction = multi_classify(ticket)\nprint(prediction)\n"
  },
  {
    "path": "examples/classification/simple_prediction.py",
    "content": "import enum\nimport instructor\nfrom openai import OpenAI\n\nfrom pydantic import BaseModel\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Labels(str, enum.Enum):\n    SPAM = \"spam\"\n    NOT_SPAM = \"not_spam\"\n\n\nclass SinglePrediction(BaseModel):\n    \"\"\"\n    Correct class label for the given text\n    \"\"\"\n\n    class_label: Labels\n\n\ndef classify(data: str) -> SinglePrediction:\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=SinglePrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: {data}\",\n            },\n        ],\n    )  # type: ignore\n\n\nprediction = classify(\"Hello there I'm a nigerian prince and I want to give you money\")\nassert prediction.class_label == Labels.SPAM\n"
  },
  {
    "path": "examples/codegen-from-schema/create_fastapi_app.py",
    "content": "import json\nimport datetime\nfrom pathlib import Path\nfrom jinja2 import Template\nimport re\nfrom datamodel_code_generator import InputFileType, generate\nfrom pydantic import BaseModel\n\nAPP_TEMPLATE_STR = '''# generated by instructor-codegen:\n#   timestamp: {{timestamp}}\n#   task_name: {{task_name}}\n#   api_path: {{api_path}}\n#   json_schema_path: {{json_schema_path}}\n\nfrom fastapi import FastAPI\nfrom pydantic import BaseModel\nfrom jinja2 import Template\nfrom models import {{title}}\n\nimport openai\nimport instructor\n\ninstructor.from_openai()\n\napp = FastAPI()\n\nclass TemplateVariables(BaseModel):\n{% for var in jinja_vars %}\n    {{var.strip()}}: str\n{% endfor %}\n\nclass RequestSchema(BaseModel):\n    template_variables: TemplateVariables\n    model: str\n    temperature: int\n\nPROMPT_TEMPLATE = Template(\"\"\"{{prompt_template}}\"\"\".strip())\n\n@app.post(\"{{api_path}}\", response_model={{title}})\nasync def {{task_name}}(input: RequestSchema) -> {{title}}:\n    rendered_prompt = PROMPT_TEMPLATE.render(**input.template_variables.model_dump())\n    return await openai.ChatCompletion.acreate(\n        model=input.model,\n        temperature=input.temperature,\n        response_model={{title}},\n        messages=[\n            {\"role\": \"user\", \"content\": rendered_prompt}\n        ]\n    ) # type: ignore\n'''\n\n\nclass TemplateVariables(BaseModel):\n    biography: str\n\n\ndef load_json_schema(json_schema_path: str) -> dict:\n    try:\n        with open(json_schema_path) as f:\n            return json.load(f)\n    except Exception as e:\n        raise ValueError(f\"Failed to load JSON schema: {e}\") from e\n\n\ndef generate_pydantic_model(json_schema_path: str):\n    input_path = Path(json_schema_path)\n    output_path = Path(\"./models.py\")\n    generate(\n        input_=input_path, input_file_type=InputFileType.JsonSchema, output=output_path\n    )\n\n\ndef extract_jinja_vars(prompt_template: str) -> list:\n    return re.findall(r\"\\{\\{(.*?)\\}\\}\", prompt_template)\n\n\ndef render_app_template(template_str: str, **kwargs) -> str:\n    app_template = Template(template_str)\n    return app_template.render(**kwargs)\n\n\ndef create_app(\n    api_path: str, task_name: str, json_schema_path: str, prompt_template: str\n) -> str:\n    if not api_path.startswith(\"/\"):\n        api_path = \"/\" + api_path\n\n    schema = load_json_schema(json_schema_path)\n    title = schema[\"title\"]\n    generate_pydantic_model(json_schema_path)\n\n    jinja_vars = extract_jinja_vars(prompt_template)\n\n    return render_app_template(\n        APP_TEMPLATE_STR,\n        timestamp=datetime.datetime.now().isoformat(),\n        task_name=task_name,\n        api_path=api_path,\n        json_schema_path=json_schema_path,\n        title=title,\n        jinja_vars=jinja_vars,\n        prompt_template=prompt_template,\n    )\n\n\nif __name__ == \"__main__\":\n    try:\n        fastapi_code = create_app(\n            api_path=\"/api/v1/extract_person\",\n            task_name=\"extract_person\",\n            json_schema_path=\"./input.json\",\n            prompt_template=\"Extract the person from the following: {{biography}}\",\n        )\n\n        with open(\"./run.py\", \"w\") as f:\n            f.write(fastapi_code)\n\n        print(\"FastAPI application generated and saved to './run.py'\")\n\n    except Exception as e:\n        print(f\"An error occurred: {e}\")\n"
  },
  {
    "path": "examples/codegen-from-schema/input.json",
    "content": "{\n  \"$schema\": \"http://json-schema.org/draft-07/schema#\",\n  \"type\": \"object\",\n  \"title\": \"ExtractPerson\",\n  \"properties\": {\n    \"name\": {\n      \"type\": \"string\"\n    },\n    \"age\": {\n      \"type\": \"integer\"\n    },\n    \"phoneNumbers\": {\n      \"type\": \"array\",\n      \"items\": {\n        \"type\": \"object\",\n        \"properties\": {\n          \"type\": {\n            \"type\": \"string\",\n            \"enum\": [\"home\", \"work\", \"mobile\"]\n          },\n          \"number\": {\n            \"type\": \"string\"\n          }\n        },\n        \"required\": [\"type\", \"number\"]\n      }\n    }\n  },\n  \"required\": [\"name\", \"age\", \"phoneNumbers\"]\n}\n"
  },
  {
    "path": "examples/codegen-from-schema/models.py",
    "content": "# generated by datamodel-codegen:\n#   filename:  input.json\n#   timestamp: 2023-09-10T00:33:42+00:00\n\nfrom __future__ import annotations\n\nfrom enum import Enum\n\nfrom pydantic import BaseModel\n\n\nclass Type(Enum):\n    home = \"home\"\n    work = \"work\"\n    mobile = \"mobile\"\n\n\nclass PhoneNumber(BaseModel):\n    type: Type\n    number: str\n\n\nclass ExtractPerson(BaseModel):\n    name: str\n    age: int\n    phoneNumbers: list[PhoneNumber]\n"
  },
  {
    "path": "examples/codegen-from-schema/readme.md",
    "content": "# FastAPI Code Generator\n\n## Overview\n\nGenerates FastAPI application code from API path, task name, JSON schema path, and Jinja2 prompt template. Also creates a `models.py` file for Pydantic models.\n\n## Dependencies\n\n- FastAPI\n- Pydantic\n- Jinja2\n- datamodel-code-generator\n\n## Functions\n\n### `create_app(api_path: str, task_name: str, json_schema_path: str, prompt_template: str) -> str`\n\nMain function to generate FastAPI application code.\n\n## Usage\n\nRun the script with required parameters.\n\nExample:\n\n```python\nfastapi_code = create_app(\n    api_path=\"/api/v1/extract_person\",\n    task_name=\"extract_person\",\n    json_schema_path=\"./input.json\",\n    prompt_template=\"Extract the person from the following: {{biography}}\",\n)\n```\n\nOutputs FastAPI application code to `./run.py` and a Pydantic model to `./models.py`."
  },
  {
    "path": "examples/codegen-from-schema/run.py",
    "content": "# This file was generated by instructor\n#   timestamp: 2023-09-09T20:33:42.572627\n#   task_name: extract_person\n#   api_path: /api/v1/extract_person\n#   json_schema_path: ./input.json\n\nimport instructor\n\nfrom fastapi import FastAPI\nfrom pydantic import BaseModel\nfrom jinja2 import Template\nfrom models import ExtractPerson\nfrom openai import AsyncOpenAI\n\naclient = instructor.apatch(AsyncOpenAI())\n\napp = FastAPI()\n\n\nclass TemplateVariables(BaseModel):\n    biography: str\n\n\nclass RequestSchema(BaseModel):\n    template_variables: TemplateVariables\n    model: str\n    temperature: int\n\n\nPROMPT_TEMPLATE = Template(\n    \"\"\"Extract the person from the following: {{biography}}\"\"\".strip()\n)\n\n\n@app.post(\"/api/v1/extract_person\", response_model=ExtractPerson)\nasync def extract_person(input: RequestSchema) -> ExtractPerson:\n    rendered_prompt = PROMPT_TEMPLATE.render(**input.template_variables.model_dump())\n    return await aclient.chat.completions.create(\n        model=input.model,\n        temperature=input.temperature,\n        response_model=ExtractPerson,\n        messages=[{\"role\": \"user\", \"content\": rendered_prompt}],\n    )  # type: ignore\n"
  },
  {
    "path": "examples/cohere/cohere.py",
    "content": "import cohere\nimport instructor\nfrom pydantic import BaseModel, Field\n\n\n# Patching the Cohere client with the instructor for enhanced capabilities\nclient = instructor.from_cohere(\n    cohere.ClientV2(),\n    max_tokens=1000,\n    model=\"command-a-03-2025\",\n)\n\n\nclass Person(BaseModel):\n    name: str = Field(description=\"name of the person\")\n    country_of_origin: str = Field(description=\"country of origin of the person\")\n\n\nclass Group(BaseModel):\n    group_name: str = Field(description=\"name of the group\")\n    members: list[Person] = Field(description=\"list of members in the group\")\n\n\ntask = \"\"\"\\\nGiven the following text, create a Group object for 'The Beatles' band\n\nText:\nThe Beatles were an English rock band formed in Liverpool in 1960. With a line-up comprising John Lennon, Paul McCartney, George Harrison and Ringo Starr, they are regarded as the most influential band of all time. The group were integral to the development of 1960s counterculture and popular music's recognition as an art form.\n\"\"\"\ngroup = client.messages.create(\n    response_model=Group,\n    messages=[{\"role\": \"user\", \"content\": task}],\n    temperature=0,\n)\n\nprint(group.model_dump_json(indent=2))\n\"\"\"\n{\n  \"group_name\": \"The Beatles\",\n  \"members\": [\n    {\n      \"name\": \"John Lennon\",\n      \"country_of_origin\": \"England\"\n    },\n    {\n      \"name\": \"Paul McCartney\",\n      \"country_of_origin\": \"England\"\n    },\n    {\n      \"name\": \"George Harrison\",\n      \"country_of_origin\": \"England\"\n    },\n    {\n      \"name\": \"Ringo Starr\",\n      \"country_of_origin\": \"England\"\n    }\n  ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/crm/run.py",
    "content": "from enum import Enum\nfrom pydantic import BaseModel, Field\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass CRMSource(Enum):\n    personal = \"personal\"\n    business = \"business\"\n    work_contacts = \"work_contacts\"\n    all = \"all\"\n\n\nclass CRMSearch(BaseModel):\n    \"\"\"A CRM search query\n\n    The search description is a natural language description of the search query\n    the backend will use semantic search so use a range of phrases to describe the search\n    \"\"\"\n\n    source: CRMSource\n    city_location: str = Field(\n        ..., description=\"City location used to match the desired customer profile\"\n    )\n    search_description: str = Field(\n        ..., description=\"Search query used to match the desired customer profile\"\n    )\n\n\nclass CRMSearchQuery(BaseModel):\n    \"\"\"\n    A set of CRM queries to be executed against a CRM system,\n    for large locations decompose into multiple queries of smaller locations\n    \"\"\"\n\n    queries: list[CRMSearch]\n\n\ndef query_crm(query: str) -> CRMSearchQuery:\n    queries = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=CRMSearchQuery,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n            You are a world class CRM search career generator. \n            You will take the user query and decompose it into a set of CRM queries queries.\n            \"\"\",\n            },\n            {\"role\": \"user\", \"content\": query},\n        ],\n    )\n    return queries\n\n\nif __name__ == \"__main__\":\n    query = \"find me all the pottery businesses in San Francisco and my friends in the east coast big cities\"\n    print(query_crm(query).model_dump_json(indent=2))\n    \"\"\"\n    {\n    \"queries\": [\n        {\n            \"source\": \"business\",\n            \"city_location\": \"San Francisco\",\n            \"search_description\": \"pottery businesses\"\n        },\n        {\n            \"source\": \"personal\",\n            \"city_location\": \"New York\",\n            \"search_description\": \"friends in New York\"\n        },\n        {\n            \"source\": \"personal\",\n            \"city_location\": \"Boston\",\n            \"search_description\": \"friends in Boston\"\n        },\n        {\n            \"source\": \"personal\",\n            \"city_location\": \"Philadelphia\",\n            \"search_description\": \"friends in Philadelphia\"\n        }\n    ]\n    }\n    \"\"\"\n"
  },
  {
    "path": "examples/decimals/run.py",
    "content": "#!/usr/bin/env python3\n\nfrom decimal import Decimal\nfrom pydantic import BaseModel, field_validator\nimport instructor\n\n\nclass Receipt(BaseModel):\n    item: str\n    price: Decimal\n\n    @field_validator(\"price\", mode=\"before\")\n    @classmethod\n    def parse_price(cls, v):\n        if isinstance(v, str):\n            return Decimal(v)\n        return v\n\n\nif __name__ == \"__main__\":\n    client = instructor.from_provider(\"openai/gpt-4.1-mini\")\n\n    receipt = client.chat.completions.create(\n        messages=[{\"role\": \"user\", \"content\": \"Coffee costs $4.99\"}],\n        response_model=Receipt,\n    )\n\n    print(f\"Item: {receipt.item}\")\n    print(f\"Price: {receipt.price}\")  # Decimal('4.99')\n    print(f\"Type: {type(receipt.price)}\")  # <class 'decimal.Decimal'>\n\n    # Test precision\n    total = receipt.price * 2\n    print(f\"Total for 2 items: {total}\")  # Decimal('9.98')\n"
  },
  {
    "path": "examples/distilations/math_finetunes_val.jsonl",
    "content": "{\"messages\": [{\"role\": \"system\", \"content\": \"Predict the results of this function:\\n\\ndef fn(a: int, b: int, c: str) -> __main__.Multiply\\n\\\"\\\"\\\"\\n_summary_\\n\\nArgs:\\n    a (int): _description_\\n    b (int): _description_\\n    c (str): _description_\\n\\nReturns:\\n    Response: _description_\\n\\\"\\\"\\\"\"}, {\"role\": \"user\", \"content\": \"Return `fn(540, b=677, c=\\\"hello\\\")`\"}, {\"role\": \"assistant\", \"function_call\": {\"name\": \"Multiply\", \"arguments\": \"{\\n  \\\"a\\\": 540,\\n  \\\"b\\\": 677,\\n  \\\"result\\\": 1217\\n}\"}}], \"functions\": [{\"name\": \"Multiply\", \"description\": \"Correctly extracted `Multiply` with all the required parameters with correct types\", \"parameters\": {\"properties\": {\"a\": {\"type\": \"integer\"}, \"b\": {\"type\": \"integer\"}, \"result\": {\"description\": \"The result of the multiplication\", \"type\": \"integer\"}}, \"required\": [\"a\", \"b\", \"result\"], \"type\": \"object\"}}]}\n{\"messages\": [{\"role\": \"system\", \"content\": \"Predict the results of this function:\\n\\ndef fn(a: int, b: int, c: str) -> __main__.Multiply\\n\\\"\\\"\\\"\\n_summary_\\n\\nArgs:\\n    a (int): _description_\\n    b (int): _description_\\n    c (str): _description_\\n\\nReturns:\\n    Response: _description_\\n\\\"\\\"\\\"\"}, {\"role\": \"user\", \"content\": \"Return `fn(798, b=534, c=\\\"hello\\\")`\"}, {\"role\": \"assistant\", \"function_call\": {\"name\": \"Multiply\", \"arguments\": \"{\\n  \\\"a\\\": 798,\\n  \\\"b\\\": 534,\\n  \\\"result\\\": 1332\\n}\"}}], \"functions\": [{\"name\": \"Multiply\", \"description\": \"Correctly extracted `Multiply` with all the required parameters with correct types\", \"parameters\": {\"properties\": {\"a\": {\"type\": \"integer\"}, \"b\": {\"type\": \"integer\"}, \"result\": {\"description\": \"The result of the multiplication\", \"type\": \"integer\"}}, \"required\": [\"a\", \"b\", \"result\"], \"type\": \"object\"}}]}\n{\"messages\": [{\"role\": \"system\", \"content\": \"Predict the results of this function:\\n\\ndef fn(a: int, b: int, c: str) -> __main__.Multiply\\n\\\"\\\"\\\"\\n_summary_\\n\\nArgs:\\n    a (int): _description_\\n    b (int): _description_\\n    c (str): _description_\\n\\nReturns:\\n    Response: _description_\\n\\\"\\\"\\\"\"}, {\"role\": \"user\", \"content\": \"Return `fn(608, b=669, c=\\\"hello\\\")`\"}, {\"role\": \"assistant\", \"function_call\": {\"name\": \"Multiply\", \"arguments\": \"{\\n  \\\"a\\\": 608,\\n  \\\"b\\\": 669,\\n  \\\"result\\\": 1277\\n}\"}}], \"functions\": [{\"name\": \"Multiply\", \"description\": \"Correctly extracted `Multiply` with all the required parameters with correct types\", \"parameters\": {\"properties\": {\"a\": {\"type\": \"integer\"}, \"b\": {\"type\": \"integer\"}, \"result\": {\"description\": \"The result of the multiplication\", \"type\": \"integer\"}}, \"required\": [\"a\", \"b\", \"result\"], \"type\": \"object\"}}]}\n{\"messages\": [{\"role\": \"system\", \"content\": \"Predict the results of this function:\\n\\ndef fn(a: int, b: int, c: str) -> __main__.Multiply\\n\\\"\\\"\\\"\\n_summary_\\n\\nArgs:\\n    a (int): _description_\\n    b (int): _description_\\n    c (str): _description_\\n\\nReturns:\\n    Response: _description_\\n\\\"\\\"\\\"\"}, {\"role\": \"user\", \"content\": \"Return `fn(982, b=768, c=\\\"hello\\\")`\"}, {\"role\": \"assistant\", \"function_call\": {\"name\": \"Multiply\", \"arguments\": \"{\\n  \\\"a\\\": 982,\\n  \\\"b\\\": 768,\\n  \\\"result\\\": 1750\\n}\"}}], \"functions\": [{\"name\": \"Multiply\", \"description\": \"Correctly extracted `Multiply` with all the required parameters with correct types\", \"parameters\": {\"properties\": {\"a\": {\"type\": \"integer\"}, \"b\": {\"type\": \"integer\"}, \"result\": {\"description\": \"The result of the multiplication\", \"type\": \"integer\"}}, \"required\": [\"a\", \"b\", \"result\"], \"type\": \"object\"}}]}"
  },
  {
    "path": "examples/distilations/readme.md",
    "content": "# What to Expect\nThis script demonstrates how to use the `Instructor` library for fine-tuning a Python function that performs three-digit multiplication. It uses Pydantic for type validation and logging features to generate a fine-tuning dataset.\n\n## How to Run\n\n### Prerequisites\n- Python 3.9\n- `Instructor` library\n\n### Steps\n1. **Install Dependencies**  \n   If you haven't already installed the required libraries, you can do so using pip:\n    ```\n    pip install instructor pydantic\n    ```\n\n2. **Set Up Logging**  \n   The script uses Python's built-in `logging` module to log the fine-tuning process. Ensure you have write permissions in the directory where the log file `math_finetunes.jsonl` will be saved.\n\n3. **Run the Script**  \n    Navigate to the directory containing `script.py` and run it:\n    ```\n    python three_digit_mul.py\n    ```\n\n    This will execute the script, running the function ten times with random three-digit numbers for multiplication. The function outputs and logs are saved in `math_finetunes.jsonl`.\n\n4. **Fine-Tuning**  \n    Once you have the log file, you can run a fine-tuning job using the following `Instructor` CLI command:\n    ```\n    instructor jobs create-from-file math_finetunes.jsonl\n    ```\n    Wait for the fine-tuning job to complete.\n\n    If you have validation date you can run:\n\n    ```\n    instructor jobs create-from-file math_finetunes.jsonl --n-epochs 4 --validation-file math_finetunes_val.jsonl \n    ```\n\n### Output\n\nThat's it! You've successfully run the script and can now proceed to fine-tune your model.\n\n### Dispatch \n\nOnce you have the model you can replace the model in `three_digit_mul_dispatch.py` with the model you just fine-tuned and run the script again. This time, the script will use the fine-tuned model to predict the output of the function."
  },
  {
    "path": "examples/distilations/three_digit_mul.py",
    "content": "import logging\n\nfrom pydantic import BaseModel, Field\nfrom instructor import Instructions\n\nlogging.basicConfig(level=logging.INFO)\n\n# Usage\ninstructions = Instructions(\n    name=\"three_digit_multiply\",\n    finetune_format=\"messages\",\n    log_handlers=[\n        logging.FileHandler(\"math_finetunes.jsonl\"),\n    ],\n)\n\n\nclass Multiply(BaseModel):\n    a: int\n    b: int\n    result: int = Field(..., description=\"The result of the multiplication\")\n\n\n@instructions.distil\ndef fn(a: int, b: int) -> Multiply:\n    \"\"\"Return the result of multiplying a and b together\"\"\"\n    resp = a * b\n    return Multiply(a=a, b=b, result=resp)\n\n\nif __name__ == \"__main__\":\n    import random\n\n    log_lines = {\n        \"messages\": [\n            {\n                \"role\": \"system\",\n                \"content\": 'Predict the results of this function:\\n\\ndef fn(a: int, b: int) -> __main__.Multiply\\n\"\"\"\\nReturn the result of multiplying a and b together\\n\"\"\"',\n            },\n            {\"role\": \"user\", \"content\": \"Return `fn(169, b=166)`\"},\n            {\n                \"role\": \"assistant\",\n                \"function_call\": {\n                    \"name\": \"Multiply\",\n                    \"arguments\": '{\\n  \"a\": 169,\\n  \"b\": 166,\\n  \"result\": 28054\\n}',\n                },\n            },\n        ],\n        \"functions\": [\n            {\n                \"name\": \"Multiply\",\n                \"description\": \"Correctly extracted `Multiply` with all the required parameters with correct types\",\n                \"parameters\": {\n                    \"properties\": {\n                        \"a\": {\"title\": \"A\", \"type\": \"integer\"},\n                        \"b\": {\"title\": \"B\", \"type\": \"integer\"},\n                        \"result\": {\n                            \"description\": \"The result of the multiplication\",\n                            \"title\": \"Result\",\n                            \"type\": \"integer\",\n                        },\n                    },\n                    \"required\": [\"a\", \"b\", \"result\"],\n                    \"type\": \"object\",\n                },\n            }\n        ],\n    }\n    for _ in range(10):\n        a = random.randint(100, 999)\n        b = random.randint(100, 999)\n        print(\"returning\", fn(a, b=b))\n"
  },
  {
    "path": "examples/distilations/three_digit_mul_dispatch.py",
    "content": "import logging\n\nfrom pydantic import BaseModel, Field\nfrom instructor import Instructions\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_openai(OpenAI())\n\nlogging.basicConfig(level=logging.INFO)\n\n# Usage\ninstructions = Instructions(\n    name=\"three_digit_multiply\",\n    finetune_format=\"messages\",\n    include_code_body=True,\n    log_handlers=[\n        logging.FileHandler(\"math_finetunes.jsonl\"),\n    ],\n    openai_client=client,\n)\n\n\nclass Multiply(BaseModel):\n    a: int\n    b: int\n    result: int = Field(..., description=\"The result of the multiplication\")\n\n\n@instructions.distil(mode=\"dispatch\", model=\"ft:gpt-3.5-turbo-0125:personal::9i1JeuxJ\")\ndef fn(a: int, b: int) -> Multiply:\n    \"\"\"Return the result of the multiplication as an integer\"\"\"\n    resp = a * b\n    return Multiply(a=a, b=b, result=resp)\n\n\nif __name__ == \"__main__\":\n    import random\n\n    for _ in range(5):\n        a = random.randint(100, 999)\n        b = random.randint(100, 999)\n        result = fn(a, b)\n        print(f\"{a} * {b} = {result.result}, expected {a * b}\")\n    \"\"\"\n    972 * 508 = 493056, expected 493776\n    145 * 369 = 53505, expected 53505\n    940 * 440 = 413600, expected 413600\n    114 * 213 = 24282, expected 24282\n    259 * 650 = 168350, expected 168350\n    \"\"\"\n"
  },
  {
    "path": "examples/evals/eval.py",
    "content": "from collections import Counter, defaultdict\nfrom enum import Enum\nfrom typing import Any, Union\nimport numpy as np\nimport json\nfrom pydantic import ValidationError\nfrom pprint import pprint\nimport models as m\n\n\nclass Status(Enum):\n    IS_JSON = \"_is_json_\"\n    IS_VALID = \"_is_valid_\"\n    VALIDATION_ERROR = \"_validation_error_\"\n\n\nclass StreamingAccumulatorManager:\n    def __init__(self):\n        self.accumulator = defaultdict(StreamingAccumulator)\n\n    def validate_string(self, json_string: str, index: int) -> None:\n        try:\n            obj = json.loads(json_string)\n            self.accumulator[Status.IS_JSON.value].update(index, True)\n            try:\n                # Replace this line with your validation logic\n                obj = m.MultiSearch.model_validate(obj)\n                self.update(index, obj.model_dump())\n                self.accumulator[Status.IS_VALID.value].update(index, True)\n            except ValidationError as e:\n                self.accumulator[Status.IS_VALID.value].update(index, False)\n                self.process_validation_error(e, index)\n        except json.JSONDecodeError:\n            self.accumulator[Status.IS_JSON.value].update(index, False)\n\n    def process_validation_error(self, error, index):\n        for err in error.errors():\n            path = (\n                \"$.\"\n                + \".\".join(\n                    [str(x) if not isinstance(x, int) else \"[*]\" for x in err[\"loc\"]]\n                )\n                + \".\"\n                + err[\"type\"]\n            )\n            self.accumulator[Status.VALIDATION_ERROR.value].update(index, path)\n\n    def update(self, index, data: Any, path: str = \"$\") -> None:\n        if isinstance(data, dict):\n            for key, value in data.items():\n                new_path = f\"{path}.{key}\"\n                self.update(index, value, new_path)\n        elif isinstance(data, list):\n            new_path = f\"{path}[*]\"\n            for value in data:\n                self.update(index, value, new_path)\n            length_path = f\"{path}.length\"\n            self.accumulator[length_path].update(index, len(data))\n        elif isinstance(data, Enum):\n            enum_path = f\"{path}.enum\"\n            self.accumulator[enum_path].update(index, data.value)\n        else:\n            self.accumulator[path].update(index, data)\n\n    def summarize(self) -> dict[str, dict]:\n        return {k: v.summarize(key_name=k) for k, v in self.accumulator.items()}\n\n\nclass StreamingAccumulator:\n    def __init__(self):\n        self.counter = Counter()\n        self.min = float(\"inf\")\n        self.max = float(\"-inf\")\n        self.sum = 0\n        self.squared_sum = 0\n        self.unique_values = set()\n        self.missing_values = 0\n        self.str_min_length = float(\"inf\")\n        self.str_max_length = float(\"-inf\")\n        self.str_sum_length = 0\n        self.str_squared_sum_length = 0\n        self.value = []\n        self.str_length = []\n        self.reverse_lookup = defaultdict(list)\n\n    def update(self, index: Any, value: Any) -> None:\n        if isinstance(value, (int, str, bool)):\n            self.counter[value] += 1\n            self.unique_values.add(value)\n            self.value.append(value)\n            self.reverse_lookup[value].append(index)\n        if value is None or value == \"\":\n            self.missing_values += 1\n            return\n        if isinstance(value, (int, float)):\n            self.min = min(self.min, value)\n            self.max = max(self.max, value)\n            self.sum += value\n            self.squared_sum += value**2\n        if isinstance(value, str):\n            str_len = len(value)\n            self.str_length.append(str_len)\n            self.str_min_length = min(self.str_min_length, str_len)\n            self.str_max_length = max(self.str_max_length, str_len)\n            self.str_sum_length += str_len\n            self.str_squared_sum_length += str_len**2\n\n    def summarize(self, key_name=None) -> dict[str, Union[int, float, dict]]:\n        if key_name is None:\n            key_name = \"\"\n        n = sum(self.counter.values())\n        summaries = {}\n        summaries[\"counter\"] = self.counter\n        summaries[\"unique_count\"] = len(self.unique_values)\n        summaries[\"missing_values\"] = self.missing_values\n        summaries[\"_reverse_lookup\"] = dict(self.reverse_lookup)\n        if n > 0:\n            if all(isinstance(value, (bool)) for value in self.unique_values):\n                summaries[\"mean\"] = self.sum / n\n                return summaries\n            if all(isinstance(value, (int, float)) for value in self.unique_values):\n                summaries[\"min\"] = self.min\n                summaries[\"max\"] = self.max\n                summaries[\"mean\"] = self.sum / n\n                summaries[\"std\"] = np.sqrt(self.squared_sum / n - (self.sum / n) ** 2)\n                return summaries\n            if all(isinstance(value, str) for value in self.unique_values):\n                summaries[\"str_min_length\"] = self.str_min_length\n                summaries[\"str_max_length\"] = self.str_max_length\n                summaries[\"str_mean_length\"] = self.str_sum_length / n\n                summaries[\"str_std_length\"] = np.sqrt(\n                    self.str_squared_sum_length / n - (self.str_sum_length / n) ** 2\n                )\n                return summaries\n        return summaries\n\n\nif __name__ == \"__main__\":\n    eval_manager = StreamingAccumulatorManager()\n\n    with open(\"test.jsonl\") as f:\n        lines = f.readlines()\n        for ii, line in enumerate(lines):\n            eval_manager.validate_string(line, ii)\n\n    pprint(eval_manager.summarize())\n"
  },
  {
    "path": "examples/evals/models.py",
    "content": "from typing import Optional\nfrom pydantic import BaseModel, Field\nfrom enum import Enum\n\n\nclass SourceType(str, Enum):\n    CRM = \"CRM\"\n    WEB = \"WEB\"\n    EMAIL = \"EMAIL\"\n    SOCIAL_MEDIA = \"SOCIAL_MEDIA\"\n    OTHER = \"OTHER\"\n\n\nclass Search(BaseModel):\n    query: str\n    source_type: SourceType\n    results_limit: Optional[int] = Field(10)\n    is_priority: Optional[bool] = None\n    tags: Optional[list[str]] = None\n\n\nclass MultiSearch(BaseModel):\n    queries: list[Search]\n    user_id: Optional[str]\n"
  },
  {
    "path": "examples/evals/stats_dict.py",
    "content": "from collections import Counter\n\nstats_dict = {\n    \"$.queries.length\": {\n        \"_reverse_lookup\": {\n            1: [0, 1, 8, 9, 10, 13, 14, 15],\n            2: [7, 11, 16],\n            3: [12, 17],\n        },\n        \"counter\": Counter({1: 8, 2: 3, 3: 2}),\n        \"max\": 3,\n        \"mean\": 1.5384615384615385,\n        \"min\": 1,\n        \"missing_values\": 0,\n        \"std\": 0.7457969011409735,\n        \"unique_count\": 3,\n    },\n    \"$.queries[*].is_priority\": {\n        \"_reverse_lookup\": {False: [13], True: [1, 9, 14, 17]},\n        \"counter\": Counter({True: 4, False: 1}),\n        \"mean\": 0.8,\n        \"missing_values\": 15,\n        \"unique_count\": 2,\n    },\n    \"$.queries[*].query\": {\n        \"_reverse_lookup\": {\n            \"customer churn\": [1],\n            \"customer feedback\": [15],\n            \"customer satisfaction\": [11],\n            \"email campaigns\": [12],\n            \"email open rates\": [17],\n            \"email outreach\": [10],\n            \"marketing strategies\": [14],\n            \"new products\": [16],\n            \"product sales\": [11],\n            \"revenue 2022\": [9],\n            \"revenue streams\": [16],\n            \"sales Q1\": [0, 7, 8, 13],\n            \"sales Q2\": [7],\n            \"social impact\": [12],\n            \"social trends\": [17],\n            \"web traffic\": [12],\n            \"website analytics\": [17],\n        },\n        \"counter\": Counter(\n            {\n                \"sales Q1\": 4,\n                \"customer churn\": 1,\n                \"sales Q2\": 1,\n                \"revenue 2022\": 1,\n                \"email outreach\": 1,\n                \"product sales\": 1,\n                \"customer satisfaction\": 1,\n                \"social impact\": 1,\n                \"email campaigns\": 1,\n                \"web traffic\": 1,\n                \"marketing strategies\": 1,\n                \"customer feedback\": 1,\n                \"revenue streams\": 1,\n                \"new products\": 1,\n                \"social trends\": 1,\n                \"email open rates\": 1,\n                \"website analytics\": 1,\n            }\n        ),\n        \"missing_values\": 0,\n        \"str_max_length\": 21,\n        \"str_mean_length\": 13.15,\n        \"str_min_length\": 8,\n        \"str_std_length\": 3.8376425054973518,\n        \"unique_count\": 17,\n    },\n    \"$.queries[*].results_limit\": {\n        \"_reverse_lookup\": {\n            5: [17],\n            10: [0, 1, 7, 7, 8, 9, 10, 11, 11, 12, 12, 12, 13, 15, 16, 16, 17, 17],\n            15: [14],\n        },\n        \"counter\": Counter({10: 18, 15: 1, 5: 1}),\n        \"max\": 15,\n        \"mean\": 10.0,\n        \"min\": 5,\n        \"missing_values\": 0,\n        \"std\": 1.5811388300841898,\n        \"unique_count\": 3,\n    },\n    \"$.queries[*].source_type.enum\": {\n        \"_reverse_lookup\": {\n            \"CRM\": [0, 7, 8, 11, 13, 16],\n            \"EMAIL\": [10, 11, 12, 15, 17],\n            \"SOCIAL_MEDIA\": [12, 17],\n            \"WEB\": [1, 7, 9, 12, 14, 16, 17],\n        },\n        \"counter\": Counter({\"WEB\": 7, \"CRM\": 6, \"EMAIL\": 5, \"SOCIAL_MEDIA\": 2}),\n        \"missing_values\": 0,\n        \"str_max_length\": 12,\n        \"str_mean_length\": 4.4,\n        \"str_min_length\": 3,\n        \"str_std_length\": 2.672077843177477,\n        \"unique_count\": 4,\n    },\n    \"$.queries[*].tags\": {\n        \"_reverse_lookup\": {},\n        \"counter\": Counter(),\n        \"missing_values\": 16,\n        \"unique_count\": 0,\n    },\n    \"$.queries[*].tags.length\": {\n        \"_reverse_lookup\": {1: [15, 17], 2: [10, 14]},\n        \"counter\": Counter({2: 2, 1: 2}),\n        \"max\": 2,\n        \"mean\": 1.5,\n        \"min\": 1,\n        \"missing_values\": 0,\n        \"std\": 0.5,\n        \"unique_count\": 2,\n    },\n    \"$.queries[*].tags[*]\": {\n        \"_reverse_lookup\": {\n            \"2022\": [10],\n            \"2023\": [14],\n            \"analytics\": [17],\n            \"feedback\": [15],\n            \"outreach\": [10],\n            \"strategy\": [14],\n        },\n        \"counter\": Counter(\n            {\n                \"outreach\": 1,\n                \"2022\": 1,\n                \"strategy\": 1,\n                \"2023\": 1,\n                \"feedback\": 1,\n                \"analytics\": 1,\n            }\n        ),\n        \"missing_values\": 0,\n        \"str_max_length\": 9,\n        \"str_mean_length\": 6.833333333333333,\n        \"str_min_length\": 4,\n        \"str_std_length\": 2.034425935955618,\n        \"unique_count\": 6,\n    },\n    \"$.user_id\": {\n        \"_reverse_lookup\": {\n            \"user_1\": [0],\n            \"user_10\": [10],\n            \"user_11\": [11],\n            \"user_12\": [12],\n            \"user_13\": [13],\n            \"user_14\": [14],\n            \"user_15\": [15],\n            \"user_16\": [16],\n            \"user_17\": [17],\n            \"user_2\": [1],\n            \"user_7\": [7],\n            \"user_8\": [8],\n            \"user_9\": [9],\n        },\n        \"counter\": Counter(\n            {\n                \"user_1\": 1,\n                \"user_2\": 1,\n                \"user_7\": 1,\n                \"user_8\": 1,\n                \"user_9\": 1,\n                \"user_10\": 1,\n                \"user_11\": 1,\n                \"user_12\": 1,\n                \"user_13\": 1,\n                \"user_14\": 1,\n                \"user_15\": 1,\n                \"user_16\": 1,\n                \"user_17\": 1,\n            }\n        ),\n        \"missing_values\": 0,\n        \"str_max_length\": 7,\n        \"str_mean_length\": 6.615384615384615,\n        \"str_min_length\": 6,\n        \"str_std_length\": 0.48650425541052295,\n        \"unique_count\": 13,\n    },\n    \"_is_json_\": {\n        \"_reverse_lookup\": {\n            False: [2, 4],\n            True: [0, 1, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],\n        },\n        \"counter\": Counter({True: 16, False: 2}),\n        \"mean\": 0.8888888888888888,\n        \"missing_values\": 0,\n        \"unique_count\": 2,\n    },\n    \"_is_valid_\": {\n        \"_reverse_lookup\": {\n            False: [3, 5, 6],\n            True: [0, 1, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],\n        },\n        \"counter\": Counter({True: 13, False: 3}),\n        \"mean\": 0.8125,\n        \"missing_values\": 0,\n        \"unique_count\": 2,\n    },\n    \"_validation_error_\": {\n        \"_reverse_lookup\": {\n            \"$.queries.[*].is_priority.bool_parsing\": [6],\n            \"$.queries.[*].source_type.enum\": [3],\n            \"$.user_id.missing\": [5],\n        },\n        \"counter\": Counter(\n            {\n                \"$.queries.[*].source_type.enum\": 1,\n                \"$.user_id.missing\": 1,\n                \"$.queries.[*].is_priority.bool_parsing\": 1,\n            }\n        ),\n        \"missing_values\": 0,\n        \"str_max_length\": 38,\n        \"str_mean_length\": 28.333333333333332,\n        \"str_min_length\": 17,\n        \"str_std_length\": 8.653836657164781,\n        \"unique_count\": 3,\n    },\n}\n"
  },
  {
    "path": "examples/evals/streamlit.py",
    "content": "import streamlit as st\nfrom stats_dict import stats_dict\n\n# Sample data\nquery_data = {i: line.strip() for i, line in enumerate(open(\"test.jsonl\"))}\n\n# Initialize selected keys\nselected_keys = {}\n\n\n# Function to get lines\ndef get_lines(stats_key, keys):\n    indices = []\n    for key in keys:\n        indices.extend(stats_dict[stats_key][\"_reverse_lookup\"][key])\n    return \"\\n\".join([query_data[i] for i in indices])\n\n\n# Function to render dropdown and button\ndef render_dropdown_and_button(stats_key):\n    st.subheader(f\"Stats for `{stats_key}`\")\n    st.json(stats_dict[stats_key][\"counter\"])\n    st.json(\n        {k: v for k, v in stats_dict[stats_key].items() if isinstance(v, (int, float))}\n    )\n    st.subheader(\"Histogram\")\n    st.bar_chart(stats_dict[stats_key][\"counter\"], use_container_width=True)\n\n    options = list(stats_dict[stats_key][\"counter\"].keys())\n    selected_keys[stats_key] = st.multiselect(\n        f\"View samples with {stats_key}\",\n        options,\n        default=selected_keys.get(stats_key, []),\n    )\n    st.code(get_lines(stats_key, selected_keys[stats_key]))\n\n\n# Sidebar for navigation\nst.sidebar.title(\"Navigation\")\npage = st.sidebar.selectbox(\n    \"Select a page:\",\n    [\"Validation Stats\", \"Individual Path Views\"],\n)\n\n# Main Streamlit App\nst.title(\"Structured Output Evaluation\")\n\n# Validation Stats\nif page == \"Validation Stats\":\n    st.header(\"Validation Stats\")\n    for key in [k for k in stats_dict.keys() if k.startswith(\"_\")]:\n        render_dropdown_and_button(key)\n\n# Individual Path Views\nelif page == \"Individual Path Views\":\n    st.header(\"Individual Path Views\")\n    path = st.selectbox(\n        \"Choose a path:\",\n        [key for key in stats_dict.keys() if not key.startswith(\"_\")],\n    )\n    if \"counter\" in stats_dict[path]:\n        render_dropdown_and_button(path)\n"
  },
  {
    "path": "examples/evals/test.jsonl",
    "content": "{\"queries\": [{\"query\": \"sales Q1\", \"source_type\": \"CRM\"}], \"user_id\": \"user_1\"}\n{\"queries\": [{\"query\": \"customer churn\", \"source_type\": \"WEB\", \"is_priority\": true}], \"user_id\": \"user_2\", \"total_queries\": 1}\n{\"queries\": [\"query\": \"email campaigns\", \"source_type\": \"EMAIL\"}, {\"query\": \"social ads\", \"source_type\": \"SOCIAL_MEDIA\"}], \"user_id\": \"user_3\", \"total_queries\": 2}\n{\"queries\": [{\"query\": \"sales Q2\", \"source_type\": \"INVALID_ENUM\"}], \"user_id\": \"user_4\"}\n{queries: [{\"query\": \"sales Q3\", \"source_type\": \"CRM\"}], \"user_id\": \"user_5\"}\n{\"queries\": [{\"query\": \"sales Q4\", \"source_type\": \"CRM\", \"timestamp\": \"2023-09-10T12:00:00Z\"}], \"total_queries\": 1}\n{\"queries\": [{\"query\": \"customer retention\", \"source_type\": \"EMAIL\", \"is_priority\": \"should_be_bool\"}], \"user_id\": \"user_6\"}\n{\"queries\": [{\"query\": \"sales Q1\", \"source_type\": \"CRM\"}, {\"query\": \"sales Q2\", \"source_type\": \"WEB\"}], \"user_id\": \"user_7\", \"total_queries\": 2}\n{\"queries\": [{\"query\": \"sales Q1\", \"source_type\": \"CRM\", \"timestamp\": \"2023-09-10T12:00:00Z\"}], \"user_id\": \"user_8\", \"total_queries\": 1}\n{\"queries\": [{\"query\": \"revenue 2022\", \"source_type\": \"WEB\", \"results_limit\": 10, \"is_priority\": true}], \"user_id\": \"user_9\", \"total_queries\": 1}\n{\"queries\": [{\"query\": \"email outreach\", \"source_type\": \"EMAIL\", \"tags\": [\"outreach\", \"2022\"]}], \"user_id\": \"user_10\", \"total_queries\": 1}\n{\"queries\": [{\"query\": \"product sales\", \"source_type\": \"CRM\"}, {\"query\": \"customer satisfaction\", \"source_type\": \"EMAIL\"}], \"user_id\": \"user_11\", \"total_queries\": 2}\n{\"queries\": [{\"query\": \"social impact\", \"source_type\": \"SOCIAL_MEDIA\"}, {\"query\": \"email campaigns\", \"source_type\": \"EMAIL\"}, {\"query\": \"web traffic\", \"source_type\": \"WEB\"}], \"user_id\": \"user_12\", \"total_queries\": 3}\n{\"queries\": [{\"query\": \"sales Q1\", \"source_type\": \"CRM\", \"is_priority\": false}], \"user_id\": \"user_13\", \"total_queries\": 1}\n{\"queries\": [{\"query\": \"marketing strategies\", \"source_type\": \"WEB\", \"results_limit\": 15, \"is_priority\": true, \"tags\": [\"strategy\", \"2023\"]}], \"user_id\": \"user_14\", \"total_queries\": 1}\n{\"queries\": [{\"query\": \"customer feedback\", \"source_type\": \"EMAIL\", \"tags\": [\"feedback\"]}], \"user_id\": \"user_15\", \"total_queries\": 1}\n{\"queries\": [{\"query\": \"revenue streams\", \"source_type\": \"CRM\"}, {\"query\": \"new products\", \"source_type\": \"WEB\"}], \"user_id\": \"user_16\", \"total_queries\": 2}\n{\"queries\": [{\"query\": \"social trends\", \"source_type\": \"SOCIAL_MEDIA\", \"is_priority\": true}, {\"query\": \"email open rates\", \"source_type\": \"EMAIL\", \"results_limit\": 5}, {\"query\": \"website analytics\", \"source_type\": \"WEB\", \"tags\": [\"analytics\"]}], \"user_id\": \"user_17\", \"total_queries\": 3}"
  },
  {
    "path": "examples/extract-table/run_vision.py",
    "content": "from openai import OpenAI\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n    PlainSerializer,\n    InstanceOf,\n    WithJsonSchema,\n)\nimport instructor\nimport pandas as pd\nfrom rich.console import Console\n\nconsole = Console()\nclient = instructor.from_openai(\n    client=OpenAI(),\n    mode=instructor.Mode.TOOLS,\n)\n\n\ndef md_to_df(data: Any) -> Any:\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Get rid of whitespaces\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .map(lambda x: x.strip())\n        )  # type: ignore\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda x: x.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n                The markdown representation of the table, \n                each one should be tidy, do not try to join tables\n                that should be separate\"\"\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\nclass MultipleTables(BaseModel):\n    tables: list[Table]\n\n\nexample = MultipleTables(\n    tables=[\n        Table(\n            caption=\"This is a caption\",\n            dataframe=pd.DataFrame(\n                {\n                    \"Chart A\": [10, 40],\n                    \"Chart B\": [20, 50],\n                    \"Chart C\": [30, 60],\n                }\n            ),\n        )\n    ]\n)\n\n\ndef extract(url: str) -> MultipleTables:\n    return client.chat.completions.create(\n        model=\"gpt-4-turbo\",\n        max_tokens=4000,\n        response_model=MultipleTables,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                            First, analyze the image to determine the most appropriate headers for the tables.\n                            Generate a descriptive h1 for the overall image, followed by a brief summary of the data it contains. \n                            For each identified table, create an informative h2 title and a concise description of its contents.\n                            Finally, output the markdown representation of each table.\n\n\n                            Make sure to escape the markdown table properly, and make sure to include the caption and the dataframe.\n                            including escaping all the newlines and quotes. Only return a markdown table in dataframe, nothing else.\n                        \"\"\",\n                    },\n                ],\n            }\n        ],\n    )\n\n\nurls = [\n    \"https://a.storyblok.com/f/47007/2400x1260/f816b031cb/uk-ireland-in-three-charts_chart_a.png/m/2880x0\",\n    \"https://a.storyblok.com/f/47007/2400x2000/bf383abc3c/231031_uk-ireland-in-three-charts_table_v01_b.png/m/2880x0\",\n]\n\nfor url in urls:\n    for table in extract(url).tables:\n        console.print(table.caption, \"\\n\", table.dataframe)\n\"\"\"\nGrowth in app installations and sessions across different app categories in Q3 2022 compared to Q2 2022 for Ireland and U.K. \n              Install Growth (%)  Session Growth (%) \n Category                                           \nEducation                      7                   6\nGames                         13                   3\nSocial                         4                  -3\nUtilities                      6                -0.4\nTop 10 Grossing Android Apps in Ireland, October 2023 \n                              App Name           Category \n Rank                                                    \n1                           Google One       Productivity\n2                              Disney+      Entertainment\n3        TikTok - Videos, Music & LIVE      Entertainment\n4                     Candy Crush Saga              Games\n5       Tinder: Dating, Chat & Friends  Social networking\n6                          Coin Master              Games\n7                               Roblox              Games\n8       Bumble - Dating & Make Friends             Dating\n9                          Royal Match              Games\n10         Spotify: Music and Podcasts      Music & Audio\nTop 10 Grossing iOS Apps in Ireland, October 2023 \n                              App Name           Category \n Rank                                                    \n1       Tinder: Dating, Chat & Friends  Social networking\n2                              Disney+      Entertainment\n3       YouTube: Watch, Listen, Stream      Entertainment\n4         Audible: Audio Entertainment      Entertainment\n5                     Candy Crush Saga              Games\n6        TikTok - Videos, Music & LIVE      Entertainment\n7       Bumble - Dating & Make Friends             Dating\n8                               Roblox              Games\n9          LinkedIn: Job Search & News           Business\n10         Duolingo - Language Lessons          Education\n\"\"\"\n"
  },
  {
    "path": "examples/extract-table/run_vision_langsmith.py",
    "content": "from openai import OpenAI\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n    PlainSerializer,\n    InstanceOf,\n    WithJsonSchema,\n)\nimport instructor\nimport pandas as pd\nfrom langsmith.wrappers import wrap_openai\nfrom langsmith import traceable\n\n\nclient = wrap_openai(OpenAI())\nclient = instructor.from_openai(\n    client, mode=instructor.processing.function_calls.Mode.MD_JSON\n)\n\n\ndef md_to_df(data: Any) -> Any:\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Get rid of whitespaces\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .map(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda x: x.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n                The markdown representation of the table, \n                each one should be tidy, do not try to join tables\n                that should be separate\"\"\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\nclass MultipleTables(BaseModel):\n    tables: list[Table]\n\n\nexample = MultipleTables(\n    tables=[\n        Table(\n            caption=\"This is a caption\",\n            dataframe=pd.DataFrame(\n                {\n                    \"Chart A\": [10, 40],\n                    \"Chart B\": [20, 50],\n                    \"Chart C\": [30, 60],\n                }\n            ),\n        )\n    ]\n)\n\n\n@traceable(name=\"extract-table\")\ndef extract(url: str) -> MultipleTables:\n    tables = client.chat.completions.create(\n        model=\"gpt-4-vision-preview\",\n        max_tokens=4000,\n        response_model=MultipleTables,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": f\"Describe this data accurately as a table in markdown format. {example.model_dump_json(indent=2)}\",\n                    },\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                            First take a moment to reason about the best set of headers for the tables. \n                            Write a good h1 for the image above. Then follow up with a short description of the what the data is about.\n                            Then for each table you identified, write a h2 tag that is a descriptive title of the table. \n                            Then follow up with a short description of the what the data is about. \n                            Lastly, produce the markdown table for each table you identified.\n\n\n                            Make sure to escape the markdown table properly, and make sure to include the caption and the dataframe.\n                            including escaping all the newlines and quotes. Only return a markdown table in dataframe, nothing else.\n                        \"\"\",\n                    },\n                ],\n            }\n        ],\n    )\n    return tables.model_dump()\n\n\nurls = [\n    \"https://a.storyblok.com/f/47007/2400x1260/f816b031cb/uk-ireland-in-three-charts_chart_a.png/m/2880x0\",\n    \"https://a.storyblok.com/f/47007/2400x2000/bf383abc3c/231031_uk-ireland-in-three-charts_table_v01_b.png/m/2880x0\",\n]\n\n\nfor url in urls:\n    tables = extract(url)\n    print(tables)\n"
  },
  {
    "path": "examples/extract-table/run_vision_org.py",
    "content": "from openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom rich.console import Console\n\nimport instructor\n\nconsole = Console()\nclient = instructor.from_openai(\n    client=OpenAI(),\n    mode=instructor.Mode.TOOLS,\n)\n\n\nclass People(BaseModel):\n    id: str\n    name: str\n    role: str\n    reports: list[str] = Field(\n        default_factory=list, description=\"People who report to this person\"\n    )\n    manages: list[str] = Field(\n        default_factory=list, description=\"People who this person manages\"\n    )\n\n\nclass Organization(BaseModel):\n    people: list[People]\n\n\ndef extract(url: str):\n    return client.chat.completions.create_partial(\n        model=\"gpt-4-turbo\",\n        max_tokens=4000,\n        response_model=Organization,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                            Analyze the organizational chart image and extract the relevant information to reconstruct the hierarchy.\n                            \n                            Create a list of People objects, where each person has the following attributes:\n                            - id: A unique identifier for the person\n                            - name: The person's name\n                            - role: The person's role or position in the organization\n                            - reports: A list of IDs of people who report directly to this person\n                            - manages: A list of IDs of people who this person manages\n                            \n                            Ensure that the relationships between people are accurately captured in the reports and manages attributes.\n                            \n                            Return the list of People objects as the people attribute of an Organization object.\n                        \"\"\",\n                    },\n                ],\n            }\n        ],\n    )\n\n\nconsole.print(\n    extract(\n        \"https://www.mindmanager.com/static/mm/images/features/org-chart/hierarchical-chart.png\"\n    )\n)\n\"\"\"\nOrganization(\n    people=[\n        People(id='A1', name='Adele Morana', role='Founder, Chairman & CEO', reports=[], manages=['B1', 'C1', 'D1']),\n        People(id='B1', name='Winston Cole', role='COO', reports=['A1'], manages=['E1']),\n        People(id='C1', name='Marcus Kim', role='CFO', reports=['A1'], manages=['F1']),\n        People(id='D1', name='Karin Ludovicicus', role='CPO', reports=['A1'], manages=['G1']),\n        People(id='E1', name='Lea Erastos', role='Chief Business Officer', reports=['B1'], manages=['H1', 'I1']),\n        People(id='F1', name='John McKinley', role='Chief Accounting Officer', reports=['C1'], manages=[]),\n        People(id='G1', name='Ayda Williams', role='VP, Global Customer & Business Marketing', reports=['D1'], manages=['J1', 'K1']),\n        People(id='H1', name='Zahida Mahtab', role='VP, Global Affairs & Communication', reports=['E1'], manages=[]),\n        People(id='I1', name='Adelaide Zhu', role='VP, Central Services', reports=['E1'], manages=[]),\n        People(id='J1', name='Gabriel Drummond', role='VP, Investor Relations', reports=['G1'], manages=[]),\n        People(id='K1', name='Nicholas Brambilla', role='VP, Company Brand', reports=['G1'], manages=[]),\n        People(id='L1', name='Felice Vasili', role='VP Finance', reports=['C1'], manages=[]),\n        People(id='M1', name='Sandra Herminius', role='VP, Product Marketing', reports=['D1'], manages=[])\n    ]\n)\n\"\"\"\n"
  },
  {
    "path": "examples/extract-table/run_vision_org_table.py",
    "content": "from openai import OpenAI\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n    PlainSerializer,\n    InstanceOf,\n    WithJsonSchema,\n)\nimport instructor\nimport pandas as pd\nfrom rich.console import Console\n\nconsole = Console()\nclient = instructor.from_openai(\n    client=OpenAI(),\n    mode=instructor.Mode.TOOLS,\n)\n\n\ndef md_to_df(data: Any) -> Any:\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Get rid of whitespaces\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .map(lambda x: x.strip())\n        )  # type: ignore\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(lambda x: x.to_markdown()),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n                The markdown representation of the table, \n                each one should be tidy, do not try to join tables\n                that should be separate\"\"\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\ndef extract(url: str):\n    return client.chat.completions.create(\n        model=\"gpt-4-turbo\",\n        max_tokens=4000,\n        response_model=Table,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                            Analyze the organizational chart image and extract the relevant information to reconstruct the hierarchy.\n                            \n                            Create a list of People objects, where each person has the following attributes:\n                            - id: A unique identifier for the person\n                            - name: The person's name\n                            - role: The person's role or position in the organization\n                            - manager_name: The name of the person who manages this person\n                            - manager_role: The role of the person who manages this person\n                            \n                            Ensure that the relationships between people are accurately captured in the reports and manages attributes.\n                            \n                            Return the list of People objects as the people attribute of an Organization object.\n                        \"\"\",\n                    },\n                ],\n            }\n        ],\n    )\n\n\nprint(\n    extract(\n        \"https://www.mindmanager.com/static/mm/images/features/org-chart/hierarchical-chart.png\"\n    ).model_dump()[\"dataframe\"]\n)\n\"\"\"\n|    id  |  name              |  role                                    |  manager_name     |  manager_role                |\n|-------:|:-------------------|:-----------------------------------------|:------------------|:-----------------------------|\n|    1   | Adele Morana       | Founder, Chairman & CEO                  |                   |                              |\n|    2   | Winston Cole       | COO                                      | Adele Morana      | Founder, Chairman & CEO      |\n|    3   | Marcus Kim         | CFO                                      | Adele Morana      | Founder, Chairman & CEO      |\n|    4   | Karin Ludovicus    | CPO                                      | Adele Morana      | Founder, Chairman & CEO      |\n|    5   | Lea Erastos        | Chief Business Officer                   | Winston Cole      | COO                          |\n|    6   | John McKinley      | Chief Accounting Officer                 | Winston Cole      | COO                          |\n|    7   | Zahida Mahtab      | VP, Global Affairs & Communication       | Winston Cole      | COO                          |\n|    8   | Adelaide Zhu       | VP, Central Services                     | Winston Cole      | COO                          |\n|    9   | Gabriel Drummond   | VP, Investor Relations                   | Marcus Kim        | CFO                          |\n|    10  | Felicie Vasili     | VP, Finance                              | Marcus Kim        | CFO                          |\n|    11  | Ayda Williams      | VP, Global Customer & Business Marketing | Karin Ludovicius  | CPO                          |\n|    12  | Nicholas Brambilla | VP, Company Brand                        | Karin Ludovicius  | CPO                          |\n|    13  | Sandra Herminius   | VP, Product Marketing                    | Karin Ludovicius  | CPO                          |\n\"\"\"\n"
  },
  {
    "path": "examples/extract-table/run_vision_receipt.py",
    "content": "from pydantic import BaseModel, model_validator\nfrom openai import OpenAI\nimport instructor\n\n\nclient = instructor.from_openai(\n    client=OpenAI(),\n    mode=instructor.Mode.TOOLS,\n)\n\n\nclass Item(BaseModel):\n    name: str\n    price: float\n    quantity: int\n\n\nclass Receipt(BaseModel):\n    items: list[Item]\n    total: float\n\n    @model_validator(mode=\"after\")\n    def check_total(cls, values: \"Receipt\"):\n        items = values.items\n        total = values.total\n        calculated_total = sum(item.price * item.quantity for item in items)\n        if calculated_total != total:\n            raise ValueError(\n                f\"Total {total} does not match the sum of item prices {calculated_total}\"\n            )\n        return values\n\n\ndef extract(url: str) -> Receipt:\n    return client.chat.completions.create(\n        model=\"gpt-4o\",\n        max_tokens=4000,\n        response_model=Receipt,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Analyze the image and return the items in the receipt and the total amount.\",\n                    },\n                ],\n            }\n        ],\n    )\n\n\n# URLs of images containing receipts. Exhibits the use of the model validator to check the total amount.\nurls = [\n    \"https://templates.mediamodifier.com/645124ff36ed2f5227cbf871/supermarket-receipt-template.jpg\",\n    \"https://ocr.space/Content/Images/receipt-ocr-original.jpg\",\n]\n\nfor url in urls:\n    receipt = extract(url)\n    print(receipt)\n"
  },
  {
    "path": "examples/extract-table/test.py",
    "content": "from pydantic import BaseModel\n\nfrom openai import OpenAI\nimport instructor\n\nclient = OpenAI()\n\nclient = instructor.from_openai(client)\n\n\nclass User(BaseModel):\n    name: str\n    email: str\n\n\nclass MeetingInfo(BaseModel):\n    user: User\n    date: str\n    location: str\n    budget: int\n    deadline: str\n\n\ndata = \"\"\"\nJason Liu jason@gmail.com\nMeeting Date: 2024-01-01\nMeeting Location: 1234 Main St\nMeeting Budget: $1000\nMeeting Deadline: 2024-01-31\n\"\"\"\nstream1 = client.chat.completions.create_partial(\n    model=\"gpt-4\",\n    response_model=MeetingInfo,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": f\"Get the information about the meeting and the users {data}\",\n        },\n    ],\n    stream=True,\n)  # type: ignore\n\nfor message in stream1:\n    print(message)\n\"\"\"\nser={} date=None location=None budget=None deadline=None\nuser={} date=None location=None budget=None deadline=None\nuser={} date=None location=None budget=None deadline=None\nuser={} date=None location=None budget=None deadline=None\nuser=PartialUser(name=None, email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name=None, email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name=None, email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name=None, email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name=None, email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name=None, email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email=None) date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date=None location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location=None budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=None deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=100 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline=None\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline='2024-01-31'\nuser=PartialUser(name='Jason Liu', email='jason@gmail.com') date='2024-01-01' location='1234 Main St' budget=1000 deadline='2024-01-31'\n\"\"\"\n"
  },
  {
    "path": "examples/extracting-pii/run.py",
    "content": "from pydantic import BaseModel\n\nimport instructor\nfrom openai import OpenAI\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Data(BaseModel):\n    index: int\n    data_type: str\n    pii_value: str\n\n\nclass PIIDataExtraction(BaseModel):\n    \"\"\"\n    Extracted PII data from a document, all data_types should try to have consistent property names\n    \"\"\"\n\n    private_data: list[Data]\n\n    def scrub_data(self, content):\n        \"\"\"\n        Iterates over the private data and replaces the value with a placeholder in the form of\n        <{data_type}_{i}>\n        \"\"\"\n\n        for i, data in enumerate(self.private_data):\n            content = content.replace(data.pii_value, f\"<{data.data_type}_{i}>\")\n\n        return content\n\n\nEXAMPLE_DOCUMENT = \"\"\"\n# Fake Document with PII for Testing PII Scrubbing Model\n\n## Personal Story\n\nJohn Doe was born on 01/02/1980. His social security number is 123-45-6789. He has been using the email address john.doe@email.com for years, and he can always be reached at 555-123-4567.\n\n## Residence\n\nJohn currently resides at 123 Main St, Springfield, IL, 62704. He's been living there for about 5 years now.\n\n## Career\n\nAt the moment, John is employed at Company A. He started his role as a Software Engineer in January 2015 and has been with the company since then.\n\"\"\"\n\n# Define the PII Scrubbing Model\npii_data: PIIDataExtraction = client.chat.completions.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=PIIDataExtraction,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a world class PII scrubbing model, Extract the PII data from the following document\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": EXAMPLE_DOCUMENT,\n        },\n    ],\n)  # type: ignore\n\n\nprint(\"Extracted PII Data:\")\nprint(pii_data.model_dump_json(indent=2))\n\"\"\"\n{\n  \"private_data\": [\n    {\n      \"index\": 0,\n      \"data_type\": \"date\",\n      \"pii_value\": \"01/02/1980\"\n    },\n    {\n      \"index\": 1,\n      \"data_type\": \"ssn\",\n      \"pii_value\": \"123-45-6789\"\n    },\n    {\n      \"index\": 2,\n      \"data_type\": \"email\",\n      \"pii_value\": \"john.doe@email.com\"\n    },\n    {\n      \"index\": 3,\n      \"data_type\": \"phone\",\n      \"pii_value\": \"555-123-4567\"\n    },\n    {\n      \"index\": 4,\n      \"data_type\": \"address\",\n      \"pii_value\": \"123 Main St, Springfield, IL, 62704\"\n    }\n  ]\n}\n\"\"\"\n\n# Scrub the PII Data from the document\nprint(\"Scrubbed Document:\")\nprint(pii_data.scrub_data(EXAMPLE_DOCUMENT))\n\"\"\"\n# Fake Document with PII for Testing PII Scrubbing Model\n\n## Personal Story\n\nJohn Doe was born on <date_of_birth_0>. His social security number is <social_security_number_1>. He has been using the email address <email_address_2> for years, and he can always be reached at <phone_number_3>.\n\n## Residence\n\nJohn currently resides at <address_4>. He's been living there for about 5 years now.\n\n## Career\n\nAt the moment, John is employed at <employment_5>. He started his role as a <job_title_6> in <employment_start_date_7> and has been with the company since then.\n\"\"\"\n"
  },
  {
    "path": "examples/fastapi_app/__init__.py",
    "content": ""
  },
  {
    "path": "examples/fastapi_app/main.py",
    "content": "from fastapi import FastAPI\nfrom instructor import ResponseSchema\nimport instructor.dsl as dsl\nfrom pydantic import BaseModel, Field\n\napp = FastAPI(title=\"Example Application using instructor\")\n\n\nclass SearchRequest(BaseModel):\n    body: str\n\n\nclass SearchQuery(ResponseSchema):\n    title: str = Field(..., description=\"Question that the query answers\")\n    query: str = Field(\n        ...,\n        description=\"Detailed, comprehensive, and specific query to be used for semantic search\",\n    )\n\n\nSearchResponse = dsl.MultiTask(\n    subtask_class=SearchQuery,\n    description=\"Correctly segmented set of search queries\",\n)\n\n\n@app.post(\"/search\", response_model=SearchResponse)\nasync def search(request: SearchRequest):\n    task = (\n        dsl.ChatCompletion(name=\"Segmenting Search requests example\")\n        | dsl.SystemTask(task=\"Segment search results\")\n        | dsl.TaggedMessage(content=request.body, tag=\"query\")\n        | dsl.TipsMessage(\n            tips=[\n                \"Expand query to contain multiple forms of the same word (SSO -> Single Sign On)\",\n                \"Use the title to explain what the query should return, but use the query to complete the search\",\n                \"The query should be detailed, specific, and cast a wide net when possible\",\n            ]\n        )\n        | SearchRequest\n    )\n    return await task.acreate()\n"
  },
  {
    "path": "examples/fastapi_app/script.py",
    "content": "from instructor import ResponseSchema, dsl\nfrom pydantic import Field\nimport json\n\n\nclass SearchQuery(ResponseSchema):\n    query: str = Field(\n        ...,\n        description=\"Detailed, comprehensive, and specific query to be used for semantic search\",\n    )\n\n\nSearchResponse = dsl.MultiTask(\n    subtask_class=SearchQuery,\n    description=\"Correctly segmented set of search queries\",\n)\n\n\ntask = (\n    dsl.ChatCompletion(name=\"Segmenting Search requests example\")\n    | dsl.SystemTask(task=\"Segment search results\")\n    | dsl.TaggedMessage(\n        content=\"can you send me the data about the video investment and the one about spot the dog?\",\n        tag=\"query\",\n    )\n    | dsl.TipsMessage(\n        tips=[\n            \"Expand query to contain multiple forms of the same word (SSO -> Single Sign On)\",\n            \"Use the title to explain what the query should return, but use the query to complete the search\",\n            \"The query should be detailed, specific, and cast a wide net when possible\",\n        ]\n    )\n    | SearchResponse\n)\n\n\nprint(json.dumps(task.kwargs, indent=1))\n\"\"\"\n{\n  \"tasks\": [\n    {\n      \"query\": \"data about video investment\"\n    },\n    {\n      \"query\": \"data about spot the dog\"\n    }\n  ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/fizzbuzz/run.py",
    "content": "from __future__ import annotations\n\nfrom openai import OpenAI\nimport instructor\n\nclient = instructor.from_openai(OpenAI())\n\n\ndef fizzbuzz_gpt(n) -> list[int | str]:\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=list[int | str],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Return the first {n} numbers in fizzbuzz\",\n            },\n        ],\n    )  # type: ignore\n\n\nif __name__ == \"__main__\":\n    print(fizzbuzz_gpt(n=15))\n    # > [1, 2, 'Fizz', 4, 'Buzz', 'Fizz', 7, 8, 'Fizz', 'Buzz', 11, 'Fizz', 13, 14, 'FizzBuzz']\n"
  },
  {
    "path": "examples/gpt-engineer/changes.diff",
    "content": "--- readme.md\n+++ readme.md\n@@ -1,9 +1,9 @@\n # FastAPI App\n \n-This is a FastAPI app that provides some basic math functions.\n+This is a Flask app that provides some basic math functions.\n \n ## Usage\n \n To use this app, follow the instructions below:\n \n 1. Install the required dependencies by running `pip install -r requirements.txt`.\n-2. Start the app by running `uvicorn main:app --reload`.\n+2. Start the app by running `flask run`.\n 3. Open your browser and navigate to `http://localhost:5000/docs` to access the Swagger UI documentation.\n \n ## Example\n \n To perform a basic math operation, you can use the following curl command:\n \n ```bash\n-curl -X POST -H \"Content-Type: application/json\" -d '{\"operation\": \"add\", \"operands\": [2, 3]}' http://localhost:8000/calculate\n+curl -X POST -H \"Content-Type: application/json\" -d '{\"operation\": \"add\", \"operands\": [2, 3]}' http://localhost:5000/calculate\n ```\n\n--- main.py\n+++ main.py\n@@ -1,29 +1,29 @@\n-from fastapi import FastAPI\n-from pydantic import BaseModel\n+from flask import Flask, request, jsonify\n \n-app = FastAPI()\n+app = Flask(__name__)\n \n \n-class Operation(BaseModel):\n-    operation: str\n-    operands: list\n+@app.route('/calculate', methods=['POST'])\n+def calculate():\n+    data = request.get_json()\n+    operation = data.get('operation')\n+    operands = data.get('operands')\n \n \n-@app.post('/calculate')\n-async def calculate(operation: Operation):\n-    if operation.operation == 'add':\n-        result = sum(operation.operands)\n-    elif operation.operation == 'subtract':\n-        result = operation.operands[0] - sum(operation.operands[1:])\n-    elif operation.operation == 'multiply':\n+    if operation == 'add':\n+        result = sum(operands)\n+    elif operation == 'subtract':\n+        result = operands[0] - sum(operands[1:])\n+    elif operation == 'multiply':\n         result = 1\n-        for operand in operation.operands:\n+        for operand in operands:\n             result *= operand\n-    elif operation.operation == 'divide':\n-        result = operation.operands[0]\n-        for operand in operation.operands[1:]:\n+    elif operation == 'divide':\n+        result = operands[0]\n+        for operand in operands[1:]:\n             result /= operand\n     else:\n         result = None\n-    return {'result': result}\n+    return jsonify({'result': result})\n\n--- requirements.txt\n+++ requirements.txt\n@@ -1,3 +1,2 @@\n-fastapi\n-uvicorn\n-pydantic\n+flask\n+flask-cors"
  },
  {
    "path": "examples/gpt-engineer/generate.py",
    "content": "import instructor\n\nfrom openai import OpenAI\nfrom pydantic import Field\nfrom instructor import ResponseSchema\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass File(ResponseSchema):\n    \"\"\"\n    Correctly named file with contents.\n    \"\"\"\n\n    file_name: str = Field(\n        ..., description=\"The name of the file including the extension\"\n    )\n    body: str = Field(..., description=\"Correct contents of a file\")\n\n    def save(self):\n        with open(self.file_name, \"w\") as f:\n            f.write(self.body)\n\n\nclass Program(ResponseSchema):\n    \"\"\"\n    Set of files that represent a complete and correct program\n    \"\"\"\n\n    files: list[File] = Field(..., description=\"List of files\")\n\n\ndef develop(data: str) -> Program:\n    completion = client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        temperature=0.1,\n        functions=[Program.openai_schema],\n        function_call={\"name\": Program.openai_schema[\"name\"]},\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world class programming AI capable of writing correct python scripts and modules. You will name files correct, include __init__.py files and write correct python code. with correct imports.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": data,\n            },\n        ],\n        max_tokens=1000,\n    )\n    return Program.from_response(completion)\n\n\nif __name__ == \"__main__\":\n    program = develop(\n        \"\"\"\n        Create a fastapi app with a readme.md file and a main.py file with\n        some basic math functions. the datamodels should use pydantic and\n        the main.py should use fastapi. the readme.md should have a title\n        and a description. The readme should contain some helpful infromation\n        and a curl example\"\"\"\n    )\n\n    for file in program.files:\n        print(file.file_name)\n        print(\"-\")\n        print(file.body)\n        print(\"\\n\\n\\n\")\n    \"\"\"\n    readme.md\n    -\n    # FastAPI App\n\n    This is a FastAPI app that provides some basic math functions.\n\n    ## Usage\n\n    To use this app, follow the instructions below:\n\n    1. Install the required dependencies by running `pip install -r requirements.txt`.\n    2. Start the app by running `uvicorn main:app --reload`.\n    3. Open your browser and navigate to `http://localhost:8000/docs` to access the Swagger UI documentation.\n\n    ## Example\n\n    To perform a basic math operation, you can use the following curl command:\n\n    ```bash\n    curl -X POST -H \"Content-Type: application/json\" -d '{\"operation\": \"add\", \"operands\": [2, 3]}' http://localhost:8000/calculate\n    ```\n\n\n\n\n\n    main.py\n    -\n    from fastapi import FastAPI\n    from pydantic import BaseModel\n\n    app = FastAPI()\n\n\n    class Operation(BaseModel):\n        operation: str\n        operands: list\n\n\n    @app.post('/calculate')\n    async def calculate(operation: Operation):\n        if operation.operation == 'add':\n            result = sum(operation.operands)\n        elif operation.operation == 'subtract':\n            result = operation.operands[0] - sum(operation.operands[1:])\n        elif operation.operation == 'multiply':\n            result = 1\n            for operand in operation.operands:\n                result *= operand\n        elif operation.operation == 'divide':\n            result = operation.operands[0]\n            for operand in operation.operands[1:]:\n                result /= operand\n        else:\n            result = None\n        return {'result': result}\n\n\n\n\n\n    requirements.txt\n    -\n    fastapi\n    uvicorn\n    pydantic\n    \"\"\"\n\n    with open(\"program.json\", \"w\") as f:\n        f.write(Program.parse_obj(program).json())\n"
  },
  {
    "path": "examples/gpt-engineer/program.json",
    "content": "{\"files\": [{\"file_name\": \"readme.md\", \"body\": \"# FastAPI App\\n\\nThis is a FastAPI app that provides some basic math functions.\\n\\n## Usage\\n\\nTo use this app, follow the instructions below:\\n\\n1. Install the required dependencies by running `pip install -r requirements.txt`.\\n2. Start the app by running `uvicorn main:app --reload`.\\n3. Open your browser and navigate to `http://localhost:8000/docs` to access the Swagger UI documentation.\\n\\n## Example\\n\\nTo perform a basic math operation, you can use the following curl command:\\n\\n```bash\\ncurl -X POST -H \\\"Content-Type: application/json\\\" -d '{\\\"operation\\\": \\\"add\\\", \\\"operands\\\": [2, 3]}' http://localhost:8000/calculate\\n```\\n\"}, {\"file_name\": \"main.py\", \"body\": \"from fastapi import FastAPI\\nfrom pydantic import BaseModel\\n\\napp = FastAPI()\\n\\n\\nclass Operation(BaseModel):\\n    operation: str\\n    operands: list\\n\\n\\n@app.post('/calculate')\\nasync def calculate(operation: Operation):\\n    if operation.operation == 'add':\\n        result = sum(operation.operands)\\n    elif operation.operation == 'subtract':\\n        result = operation.operands[0] - sum(operation.operands[1:])\\n    elif operation.operation == 'multiply':\\n        result = 1\\n        for operand in operation.operands:\\n            result *= operand\\n    elif operation.operation == 'divide':\\n        result = operation.operands[0]\\n        for operand in operation.operands[1:]:\\n            result /= operand\\n    else:\\n        result = None\\n    return {'result': result}\\n\"}, {\"file_name\": \"requirements.txt\", \"body\": \"fastapi\\nuvicorn\\npydantic\"}]}"
  },
  {
    "path": "examples/gpt-engineer/refactor.py",
    "content": "import instructor\n\nfrom openai import OpenAI\nfrom pydantic import Field, parse_file_as\nfrom instructor import ResponseSchema\nfrom generate import Program\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Diff(ResponseSchema):\n    \"\"\"\n    Changes that must be correctly made in a program's code repository defined as a\n    complete diff (Unified Format) file which will be used to `patch` the repository.\n\n    Example:\n      --- /path/to/original\ttimestamp\n      +++ /path/to/new\ttimestamp\n      @@ -1,3 +1,9 @@\n      +This is an important\n      +notice! It should\n      +therefore be located at\n      +the beginning of this\n      +document!\n      +\n       This part of the\n       document has stayed the\n       same from version to\n      @@ -8,13 +14,8 @@\n       compress the size of the\n       changes.\n      -This paragraph contains\n      -text that is outdated.\n      -It will be deleted in the\n      -near future.\n      -\n       It is important to spell\n      -check this dokument. On\n      +check this document. On\n       the other hand, a\n       misspelled word isn't\n       the end of the world.\n      @@ -22,3 +23,7 @@\n       this paragraph needs to\n       be changed. Things can\n       be added after it.\n      +\n      +This paragraph contains\n      +important new additions\n      +to this document.\n    \"\"\"\n\n    diff: str = Field(\n        ...,\n        description=(\n            \"Changes in a code repository correctly represented in 'diff' format, \"\n            \"correctly escaped so it could be used in a JSON\"\n        ),\n    )\n\n\ndef refactor(new_requirements: str, program: Program) -> Diff:\n    program_description = \"\\n\".join(\n        [f\"{code.file_name}\\n[[[\\n{code.body}\\n]]]\\n\" for code in program.files]\n    )\n    completion = client.chat.completions.create(\n        model=\"gpt-4\",\n        temperature=0,\n        functions=[Diff.openai_schema],\n        function_call={\"name\": Diff.openai_schema[\"name\"]},\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world class programming AI capable of refactor \"\n                \"existing python repositories. You will name files correct, include \"\n                \"__init__.py files and write correct python code, with correct imports. \"\n                \"You'll deliver your changes in valid 'diff' format so that they could \"\n                \"be applied using the 'patch' command. \"\n                \"Make sure you put the correct line numbers, \"\n                \"and that all lines that must be changed are correctly marked.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": new_requirements,\n            },\n            {\n                \"role\": \"user\",\n                \"content\": program_description,\n            },\n        ],\n        max_tokens=1000,\n    )\n    return Diff.from_response(completion)\n\n\nif __name__ == \"__main__\":\n    program = parse_file_as(path=\"program.json\", type_=Program)\n\n    changes = refactor(\n        new_requirements=\"Refactor this code to use flask instead.\",\n        program=program,\n    )\n    print(changes.diff)\n    \"\"\"\n    --- readme.md\n    +++ readme.md\n    @@ -1,9 +1,9 @@\n     # FastAPI App\n\n    -This is a FastAPI app that provides some basic math functions.\n    +This is a Flask app that provides some basic math functions.\n\n     ## Usage\n\n     To use this app, follow the instructions below:\n\n     1. Install the required dependencies by running `pip install -r requirements.txt`.\n    -2. Start the app by running `uvicorn main:app --reload`.\n    +2. Start the app by running `flask run`.\n     3. Open your browser and navigate to `http://localhost:5000/docs` to access the Swagger UI documentation.\n\n     ## Example\n\n     To perform a basic math operation, you can use the following curl command:\n\n     ```bash\n    -curl -X POST -H \"Content-Type: application/json\" -d '{\"operation\": \"add\", \"operands\": [2, 3]}' http://localhost:8000/calculate\n    +curl -X POST -H \"Content-Type: application/json\" -d '{\"operation\": \"add\", \"operands\": [2, 3]}' http://localhost:5000/calculate\n     ```\n\n    --- main.py\n    +++ main.py\n    @@ -1,29 +1,29 @@\n    -from fastapi import FastAPI\n    -from pydantic import BaseModel\n    +from flask import Flask, request, jsonify\n\n    -app = FastAPI()\n    +app = Flask(__name__)\n\n\n    -class Operation(BaseModel):\n    -    operation: str\n    -    operands: list\n    +@app.route('/calculate', methods=['POST'])\n    +def calculate():\n    +    data = request.get_json()\n    +    operation = data.get('operation')\n    +    operands = data.get('operands')\n\n\n    -@app.post('/calculate')\n    -async def calculate(operation: Operation):\n    -    if operation.operation == 'add':\n    -        result = sum(operation.operands)\n    -    elif operation.operation == 'subtract':\n    -        result = operation.operands[0] - sum(operation.operands[1:])\n    -    elif operation.operation == 'multiply':\n    +    if operation == 'add':\n    +        result = sum(operands)\n    +    elif operation == 'subtract':\n    +        result = operands[0] - sum(operands[1:])\n    +    elif operation == 'multiply':\n             result = 1\n    -        for operand in operation.operands:\n    +        for operand in operands:\n                 result *= operand\n    -    elif operation.operation == 'divide':\n    -        result = operation.operands[0]\n    -        for operand in operation.operands[1:]:\n    +    elif operation == 'divide':\n    +        result = operands[0]\n    +        for operand in operands[1:]:\n                 result /= operand\n         else:\n             result = None\n    -    return {'result': result}\n    +    return jsonify({'result': result})\n\n    --- requirements.txt\n    +++ requirements.txt\n    @@ -1,3 +1,2 @@\n    -fastapi\n    -uvicorn\n    -pydantic\n    +flask\n    +flask-cors\n    \"\"\"\n\n    with open(\"changes.diff\", \"w\") as f:\n        f.write(changes.diff)\n"
  },
  {
    "path": "examples/groq/groq_example.py",
    "content": "import os\nfrom pydantic import BaseModel, Field\nfrom groq import Groq\nimport instructor\n\n\nclass Character(BaseModel):\n    name: str\n    fact: list[str] = Field(..., description=\"A list of facts about the subject\")\n\n\nclient = Groq(\n    api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nclient = instructor.from_groq(client, mode=instructor.Mode.TOOLS)\n\nresp = client.chat.completions.create(\n    model=\"mixtral-8x7b-32768\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Tell me about the company Tesla\",\n        }\n    ],\n    response_model=Character,\n)\nprint(resp.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"Tesla\",\n  \"fact\": [\n    \"An American electric vehicle and clean energy company.\",\n    \"Co-founded by Elon Musk, JB Straubel, Martin Eberhard, Marc Tarpenning, and Ian Wright in 2003.\",\n    \"Headquartered in Austin, Texas.\",\n    \"Produces electric vehicles, energy storage solutions, and more recently, solar energy products.\",\n    \"Known for its premium electric vehicles, such as the Model S, Model 3, Model X, and Model Y.\",\n    \"One of the world's most valuable car manufacturers by market capitalization.\",\n    \"Tesla's CEO, Elon Musk, is also the CEO of SpaceX, Neuralink, and The Boring Company.\",\n    \"Tesla operates the world's largest global network of electric vehicle supercharging stations.\",\n    \"The company aims to accelerate the world's transition to sustainable transport and energy through innovative technologies and products.\"\n  ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/groq/groq_example2.py",
    "content": "import os\nfrom pydantic import BaseModel\nfrom groq import Groq\nimport instructor\n\nclient = Groq(\n    api_key=os.environ.get(\"GROQ_API_KEY\"),\n)\n\nclient = instructor.from_groq(client, mode=instructor.Mode.TOOLS)\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.chat.completions.create(\n    model=\"mixtral-8x7b-32768\",\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)\n\nassert isinstance(user, UserExtract), \"Should be instance of UserExtract\"\nassert user.name.lower() == \"jason\"\nassert user.age == 25\n\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"jason\",\n  \"age\": 25\n}\n\"\"\"\n"
  },
  {
    "path": "examples/hooks/README.md",
    "content": "# Instructor Hooks Example\n\nThis example demonstrates how to use the Hooks system in the Instructor library to monitor, log, and debug your LLM interactions.\n\n## What are Hooks?\n\nHooks provide a powerful mechanism for intercepting and handling events during the completion and parsing process. They allow you to add custom behavior, logging, or error handling at various stages of the API interaction.\n\nThe Instructor library supports several predefined hooks:\n\n- `completion:kwargs`: Emitted when completion arguments are provided\n- `completion:response`: Emitted when a completion response is received\n- `completion:error`: Emitted when an error occurs during completion\n- `completion:last_attempt`: Emitted when the last retry attempt is made\n- `parse:error`: Emitted when an error occurs during response parsing\n\n## What This Example Shows\n\nThis example demonstrates:\n\n1. **Basic Hook Registration**: How to register handlers for different hook events\n2. **Multiple Handlers**: How to register multiple handlers for the same event\n3. **Statistics Collection**: How to collect and track API usage statistics\n4. **Error Handling**: How to catch and process different types of errors\n5. **Hook Cleanup**: How to remove hooks when they're no longer needed\n\n## Usage Examples\n\nThe code demonstrates three scenarios:\n\n1. **Successful Extraction**: A basic example that works correctly\n2. **Parse Error**: An example that triggers a validation error\n3. **Multiple Hooks**: Shows how to attach multiple handlers to the same event\n\n## How to Run the Example\n\n```bash\n# Navigate to the hooks example directory\ncd examples/hooks\n\n# Run the example\npython run.py\n```\n\n## Expected Output\n\nThe example will print detailed information about each request, including:\n\n- 🔍 Request details (model, prompt)\n- 📏 Approximate input token count\n- 📊 Token usage statistics\n- ✅ Successful responses\n- ⚠️ Parse errors\n- ❌ Completion errors\n- 🔄 Retry attempt notifications\n\nAt the end, it will print a summary of the statistics collected.\n\n## Learn More\n\nFor more information about hooks in Instructor, see the [hooks documentation](https://instructor-ai.github.io/instructor/concepts/hooks/). "
  },
  {
    "path": "examples/hooks/run.py",
    "content": "\"\"\"\nThis example demonstrates how to use hooks in Instructor for monitoring,\nlogging, and debugging your LLM interactions.\n\nHooks allow you to attach handlers to events that occur during the completion\nand parsing process. This can be useful for:\n- Logging API requests and responses\n- Debugging parsing errors\n- Collecting statistics about API usage\n- Adding custom error handling\n\"\"\"\n\nimport instructor\nimport openai\nimport pydantic\n\n\nclass User(pydantic.BaseModel):\n    \"\"\"A simple user model with validation.\"\"\"\n\n    name: str\n    age: int\n\n    @pydantic.field_validator(\"age\")\n    def validate_age(cls, v: int) -> int:\n        if v < 0:\n            raise ValueError(\"Age must be non-negative\")\n        return v\n\n\nclass CompletionStats:\n    \"\"\"A simple class to collect statistics about completions.\"\"\"\n\n    def __init__(self):\n        self.total_completions = 0\n        self.errors = 0\n        self.successful = 0\n        self.tokens_used = 0\n\n    def report(self):\n        \"\"\"Print a report of the statistics.\"\"\"\n        print(\"\\n--- Completion Statistics ---\")\n        print(f\"Total completions: {self.total_completions}\")\n        print(f\"Successful: {self.successful}\")\n        print(f\"Errors: {self.errors}\")\n        print(f\"Total tokens used: {self.tokens_used}\")\n\n\ndef main():\n    # Initialize the OpenAI client with Instructor\n    client = instructor.from_openai(openai.OpenAI())\n\n    # Create a statistics collector\n    stats = CompletionStats()\n\n    # Define hook handlers\n    def log_completion_kwargs(_, **kwargs):\n        \"\"\"Handler for completion:kwargs hook.\"\"\"\n        stats.total_completions += 1\n        print(\n            f\"\\n🔍 Sending completion request using model: {kwargs.get('model', 'unknown')}\"\n        )\n        if \"messages\" in kwargs:\n            for msg in kwargs[\"messages\"]:\n                if msg.get(\"role\") == \"user\":\n                    print(f\"📝 User prompt: {msg.get('content')}\")\n\n    def log_completion_response(response):\n        \"\"\"Handler for completion:response hook.\"\"\"\n        stats.successful += 1\n\n        # Extract token usage if available\n        if hasattr(response, \"usage\") and response.usage:\n            token_usage = response.usage.total_tokens\n            stats.tokens_used += token_usage\n            print(f\"📊 Token usage: {token_usage}\")\n\n        print(f\"✅ Received completion response\")\n\n    def log_completion_error(error):\n        \"\"\"Handler for completion:error hook.\"\"\"\n        stats.errors += 1\n        print(f\"❌ Completion error: {type(error).__name__}: {str(error)}\")\n\n    def log_parse_error(error):\n        \"\"\"Handler for parse:error hook.\"\"\"\n        stats.errors += 1\n        print(f\"⚠️ Parse error: {type(error).__name__}: {str(error)}\")\n\n    # Register the hooks\n    client.on(\"completion:kwargs\", log_completion_kwargs)\n    client.on(\"completion:response\", log_completion_response)\n    client.on(\"completion:error\", log_completion_error)\n    client.on(\n        \"completion:last_attempt\", lambda _: print(f\"🔄 Last retry attempt failed\")\n    )\n    client.on(\"parse:error\", log_parse_error)\n\n    # Example 1: Successful extraction\n    try:\n        print(\"\\n--- Example 1: Successful Extraction ---\")\n        user = client.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            messages=[{\"role\": \"user\", \"content\": \"Extract: John is 30 years old.\"}],\n            response_model=User,\n        )\n        print(f\"Result: {user}\")\n    except Exception as e:\n        print(f\"Main exception: {e}\")\n\n    # Example 2: Parse error (validation fails)\n    try:\n        print(\"\\n--- Example 2: Parse Error (Age Validation) ---\")\n        user = client.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            messages=[{\"role\": \"user\", \"content\": \"Extract: Alice is -5 years old.\"}],\n            response_model=User,\n        )\n        print(f\"Result: {user}\")\n    except Exception as e:\n        print(f\"Main exception: {e}\")\n\n    # Example 3: Multiple hooks for the same event\n    print(\"\\n--- Example 3: Multiple Hooks ---\")\n\n    # Add another hook for completion:kwargs that counts message tokens\n    def count_input_tokens(_, **kwargs):\n        \"\"\"Handler for counting approximate tokens in input messages.\"\"\"\n        if \"messages\" in kwargs:\n            total_chars = sum(len(msg.get(\"content\", \"\")) for msg in kwargs[\"messages\"])\n            # Rough approximation of tokens (not accurate)\n            approx_tokens = total_chars / 4\n            print(f\"📏 Approximate input tokens: {approx_tokens:.0f}\")\n\n    # Register the additional hook\n    client.on(\"completion:kwargs\", count_input_tokens)\n\n    try:\n        user = client.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            messages=[{\"role\": \"user\", \"content\": \"Extract: Bob is 25 years old.\"}],\n            response_model=User,\n        )\n        print(f\"Result: {user}\")\n    except Exception as e:\n        print(f\"Main exception: {e}\")\n\n    # Print the final statistics\n    stats.report()\n\n    # Clean up hooks\n    print(\"\\n--- Cleaning Up Hooks ---\")\n    client.clear()\n    print(\"All hooks cleared\")\n\n\nif __name__ == \"__main__\":\n    main()\n\n\"\"\"\n\n--- Example 1: Successful Extraction ---\n\n🔍 Sending completion request using model: gpt-3.5-turbo\n📝 User prompt: Extract: John is 30 years old.\n📊 Token usage: 82\n✅ Received completion response\nResult: name='John' age=30\n\n--- Example 2: Parse Error (Age Validation) ---\n\n🔍 Sending completion request using model: gpt-3.5-turbo\n📝 User prompt: Extract: Alice is -5 years old.\n📊 Token usage: 82\n✅ Received completion response\n⚠️ Parse error: ValidationError: 1 validation error for User\nage\n  Value error, Age must be non-negative [type=value_error, input_value=-5, input_type=int]\n    For further information visit https://errors.pydantic.dev/2.9/v/value_error\n\n🔍 Sending completion request using model: gpt-3.5-turbo\n📝 User prompt: Extract: Alice is -5 years old.\n📊 Token usage: 170\n✅ Received completion response\nResult: name='Alice' age=5\n\n--- Example 3: Multiple Hooks ---\n\n🔍 Sending completion request using model: gpt-3.5-turbo\n📝 User prompt: Extract: Bob is 25 years old.\n📏 Approximate input tokens: 7\n📊 Token usage: 82\n✅ Received completion response\nResult: name='Bob' age=25\n\n--- Completion Statistics ---\nTotal completions: 4\nSuccessful: 4\nErrors: 1\nTotal tokens used: 416\n\n--- Cleaning Up Hooks ---\nAll hooks cleared\n\"\"\"\n"
  },
  {
    "path": "examples/iterables/run.py",
    "content": "import time\n\nfrom collections.abc import Iterable\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\nimport instructor\n\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass User(BaseModel):\n    name: str\n    job: str\n    age: int\n\n\ndef stream_extract(input: str) -> Iterable[User]:\n    return client.chat.completions.create_iterable(\n        model=\"gpt-4o\",\n        temperature=0.1,\n        stream=True,\n        response_model=User,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a perfect entity extraction system\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Consider the data below:\\n{input}\"\n                    \"Correctly segment it into entitites\"\n                    \"Make sure the JSON is correct\"\n                ),\n            },\n        ],\n        max_tokens=1000,\n    )\n\n\nstart = time.time()\nfor user in stream_extract(\n    input=\"Create 5 characters from the book Three Body Problem\"\n):\n    delay = round(time.time() - start, 1)\n    print(f\"{delay} s: User({user})\")\n    \"\"\"\n    0.8 s: User(name='Ye Wenjie' job='Astrophysicist' age=60)\n    1.1 s: User(name='Wang Miao' job='Nanomaterials Researcher' age=40)\n    1.7 s: User(name='Shi Qiang' job='Detective' age=50)\n    1.9 s: User(name='Ding Yi' job='Theoretical Physicist' age=45)\n    1.9 s: User(name='Chang Weisi' job='Military Strategist' age=55)\n    \"\"\"\n    # Notice that the first one would return at 5s bu the last one returned in 10s!\n"
  },
  {
    "path": "examples/knowledge-graph/run.py",
    "content": "import instructor\n\nfrom graphviz import Digraph\nfrom pydantic import BaseModel, Field\nfrom openai import OpenAI\n\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Node(BaseModel):\n    id: int\n    label: str\n    color: str\n\n\nclass Edge(BaseModel):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: list[Node] = Field(..., default_factory=list)\n    edges: list[Edge] = Field(..., default_factory=list)\n\n\ndef generate_graph(input) -> KnowledgeGraph:\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo-16k\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Help me understand following by describing as a detailed knowledge graph: {input}\",\n            }\n        ],\n        response_model=KnowledgeGraph,\n    )  # type: ignore\n\n\ndef visualize_knowledge_graph(kg: KnowledgeGraph):\n    dot = Digraph(comment=\"Knowledge Graph\")\n\n    # Add nodes\n    for node in kg.nodes:\n        dot.node(str(node.id), node.label, color=node.color)\n\n    # Add edges\n    for edge in kg.edges:\n        dot.edge(str(edge.source), str(edge.target), label=edge.label, color=edge.color)\n\n    # Render the graph\n    dot.render(\"knowledge_graph.gv\", view=True)\n\n\ngraph: KnowledgeGraph = generate_graph(\"Teach me about quantum mechanics\")\nvisualize_knowledge_graph(graph)\n"
  },
  {
    "path": "examples/knowledge-graph/run_stream.py",
    "content": "from openai import OpenAI\nimport instructor\n\nfrom graphviz import Digraph\nfrom typing import Optional\n\nfrom pydantic import BaseModel, Field\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Node(BaseModel):\n    id: int\n    label: str\n    color: str\n\n    def __hash__(self) -> int:\n        return hash((id, self.label))\n\n\nclass Edge(BaseModel):\n    source: int\n    target: int\n    label: str\n    color: str = \"black\"\n\n    def __hash__(self) -> int:\n        return hash((self.source, self.target, self.label))\n\n\nclass KnowledgeGraph(BaseModel):\n    nodes: Optional[list[Node]] = Field(..., default_factory=list)\n    edges: Optional[list[Edge]] = Field(..., default_factory=list)\n\n    def update(self, other: \"KnowledgeGraph\") -> \"KnowledgeGraph\":\n        \"\"\"Updates the current graph with the other graph, deduplicating nodes and edges.\"\"\"\n        return KnowledgeGraph(\n            nodes=list(set(self.nodes + other.nodes)),\n            edges=list(set(self.edges + other.edges)),\n        )\n\n    def draw(self, prefix: str = None):\n        dot = Digraph(comment=\"Knowledge Graph\")\n\n        # Add nodes\n        for node in self.nodes:\n            dot.node(str(node.id), node.label, color=node.color)\n\n        # Add edges\n        for edge in self.edges:\n            dot.edge(\n                str(edge.source), str(edge.target), label=edge.label, color=edge.color\n            )\n        dot.render(prefix, format=\"png\", view=True)\n\n\ndef generate_graph(input: list[str]) -> KnowledgeGraph:\n    cur_state = KnowledgeGraph()\n    num_iterations = len(input)\n    for i, inp in enumerate(input):\n        new_updates = client.chat.completions.create(\n            model=\"gpt-3.5-turbo-16k\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"You are an iterative knowledge graph builder.\n                    You are given the current state of the graph, and you must append the nodes and edges \n                    to it Do not procide any duplcates and try to reuse nodes as much as possible.\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Extract any new nodes and edges from the following:\n                    # Part {i}/{num_iterations} of the input:\n\n                    {inp}\"\"\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"\"\"Here is the current state of the graph:\n                    {cur_state.model_dump_json(indent=2)}\"\"\",\n                },\n            ],\n            response_model=KnowledgeGraph,\n        )  # type: ignore\n\n        # Update the current state\n        cur_state = cur_state.update(new_updates)\n        cur_state.draw(prefix=f\"iteration_{i}\")\n    return cur_state\n\n\n# here we assume that we have to process the text in chunks\n# one at a time since they may not fit in the prompt otherwise\ntext_chunks = [\n    \"Jason knows a lot about quantum mechanics. He is a physicist. He is a professor\",\n    \"Professors are smart.\",\n    \"Sarah knows Jason and is a student of his.\",\n    \"Sarah is a student at the University of Toronto. and UofT is in Canada.\",\n]\n\ngraph: KnowledgeGraph = generate_graph(text_chunks)\n\ngraph.draw(prefix=\"final\")\n"
  },
  {
    "path": "examples/learn-async/run.py",
    "content": "import time\nimport asyncio\n\nimport instructor\nfrom pydantic import BaseModel\nfrom openai import AsyncOpenAI\n\n\nclient = instructor.apatch(AsyncOpenAI())\n\n\nclass Timer:\n    def __init__(self, name):\n        self.name = name\n        self.start = None\n        self.end = None\n\n    async def __aenter__(self):\n        self.start = time.time()\n\n    async def __aexit__(self, *args, **kwargs):\n        self.end = time.time()\n        print(f\"{self.name} took {(self.end - self.start):.2f} seconds\")\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nasync def extract_person(text: str) -> Person:\n    return await client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\"role\": \"user\", \"content\": text},\n        ],\n        response_model=Person,\n    )\n\n\nasync def main():\n    \"\"\"We'll use this to run the example. and time how long each one takes!\n\n    0. for loop\n    1. asyncio.gather\n    2. asyncio.as_completed\n    \"\"\"\n    dataset = [\n        \"My name is John and I am 20 years old\",\n        \"My name is Mary and I am 21 years old\",\n        \"My name is Bob and I am 22 years old\",\n        \"My name is Alice and I am 23 years old\",\n        \"My name is Jane and I am 24 years old\",\n        \"My name is Joe and I am 25 years old\",\n        \"My name is Jill and I am 26 years old\",\n    ]\n\n    \"\"\"\n    This is the simplest way to run multiple async functions in series.\n    It will wait for each function to complete before continuing.\n    \"\"\"\n    async with Timer(\"for loop\"):\n        persons = []\n        for text in dataset:\n            person = await extract_person(text)\n            persons.append(person)\n        print(\"for loop:\", persons)\n\n    \"\"\"\n    This is the simplest way to run multiple async functions in parallel.\n    It will wait for all of the functions to complete before continuing.\n    \"\"\"\n    async with Timer(\"asyncio.gather\"):\n        tasks_get_persons = [extract_person(text) for text in dataset]\n        all_person = await asyncio.gather(*tasks_get_persons)\n        print(\"asyncio.gather:\", all_person)\n\n    \"\"\"\n    This is a bit more complicated, but it allows us to process each\n    person as soon as they are ready. This is useful if you have a\n    large dataset and want to start processing the results as soon\n    as they are ready.\n    \"\"\"\n    async with Timer(\"asyncio.as_completed\"):\n        all_persons = []\n        tasks_get_persons = [extract_person(text) for text in dataset]\n        for person in asyncio.as_completed(tasks_get_persons):\n            all_persons.append(await person)\n        print(\"asyncio.as_copmleted:\", all_persons)\n\n    \"\"\"\n    If we want to rate limit our requests, we can use the\n    semaphore to limit the number of concurrent requests.\n    \"\"\"\n\n    # Create a semaphore that will only allow 2 concurrent requests\n    sem = asyncio.Semaphore(2)\n\n    async def rate_limited_extract_person(text: str) -> Person:\n        async with sem:\n            return await extract_person(text)\n\n    async with Timer(\"asyncio.gather (rate limited)\"):\n        tasks_get_persons = [rate_limited_extract_person(text) for text in dataset]\n        resp = await asyncio.gather(*tasks_get_persons)\n        print(\"asyncio.gather (rate limited):\", resp)\n\n    async with Timer(\"asyncio.as_completed (rate limited)\"):\n        all_persons = []\n        tasks_get_persons = [rate_limited_extract_person(text) for text in dataset]\n        for person in asyncio.as_completed(tasks_get_persons):\n            all_persons.append(await person)\n        print(\"asyncio.as_completed (rate limited):\", all_persons)\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n    \"\"\"\n    for loop took 6.17 seconds\n\n    asyncio.gather took 1.11 seconds\n    asyncio.as_completed took 0.87 seconds\n\n    asyncio.gather (rate limited) took 3.04 seconds\n    asyncio.as_completed (rate limited) took 3.26 seconds\n    \"\"\"\n"
  },
  {
    "path": "examples/llm-judge-relevance/run.py",
    "content": "import instructor\nimport openai\nfrom pydantic import BaseModel, Field\n\nclient = instructor.from_openai(openai.OpenAI())\n\n\nclass Judgment(BaseModel):\n    thought: str = Field(\n        description=\"The step-by-step reasoning process used to analyze the question and text\"\n    )\n    justification: str = Field(\n        description=\"Explanation for the similarity judgment, detailing key factors that led to the conclusion\"\n    )\n    similarity: bool = Field(\n        description=\"Boolean judgment indicating whether the question and text are similar or relevant (True) or not (False)\"\n    )\n\n\nprompt = \"\"\"\nYou are tasked with comparing a question and a piece of text to determine if they are relevant to each other or similar in some way. Your goal is to analyze the content, context, and potential connections between the two.\n\n\nTo determine if the question and text are relevant or similar, please follow these steps:\n\n1. Carefully read and understand both the question and the text.\n2. Identify the main topic, keywords, and concepts in the question.\n3. Analyze the text for any mention of these topics, keywords, or concepts.\n4. Consider any potential indirect connections or implications that might link the question and text.\n5. Evaluate the overall context and purpose of both the question and the text.\n\nAs you go through this process, please use a chain of thought approach. Write out your reasoning for each step inside <thought> tags.\n\nAfter your analysis, provide a boolean judgment on whether the question and text are similar or relevant to each other. Use \"true\" if they are similar or relevant, and \"false\" if they are not.\n\nBefore giving your final judgment, provide a justification for your decision. Explain the key factors that led to your conclusion.\n\"\"\"\n\n\ndef judge_relevance(question: str, text: str) -> Judgment:\n    return client.chat.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\"role\": \"system\", \"content\": prompt},\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n             Here is the question:\n\n             <question>\n             {{question}}\n             </question>\n\n             Here is the text:\n             <text>\n             {{text}}\n             </text>\n                \"\"\",\n            },\n        ],\n        response_model=Judgment,\n        context={\"question\": question, \"text\": text},\n    )\n\n\nif __name__ == \"__main__\":\n    test_pairs = [\n        {\n            \"question\": \"What are the main causes of climate change?\",\n            \"text\": \"Global warming is primarily caused by human activities, such as burning fossil fuels, deforestation, and industrial processes. These activities release greenhouse gases into the atmosphere, trapping heat and leading to a rise in global temperatures.\",\n            \"is_similar\": True,\n        },\n        {\n            \"question\": \"How does photosynthesis work?\",\n            \"text\": \"Photosynthesis is the process by which plants use sunlight, water, and carbon dioxide to produce oxygen and energy in the form of sugar. It occurs in the chloroplasts of plant cells and is essential for life on Earth.\",\n            \"is_similar\": True,\n        },\n        {\n            \"question\": \"What are the benefits of regular exercise?\",\n            \"text\": \"The Eiffel Tower, located in Paris, France, was completed in 1889. It stands 324 meters tall and was originally built as the entrance arch for the 1889 World's Fair.\",\n            \"is_similar\": False,\n        },\n        {\n            \"question\": \"How do vaccines work?\",\n            \"text\": \"The process of baking bread involves mixing flour, water, yeast, and salt to form a dough. The dough is then kneaded, left to rise, shaped, and finally baked in an oven.\",\n            \"is_similar\": False,\n        },\n    ]\n\n    score = 0\n    for pair in test_pairs:\n        result = judge_relevance(pair[\"question\"], pair[\"text\"])\n        if result.similarity == pair[\"is_similar\"]:\n            score += 1\n\n    print(f\"Score: {score}/{len(test_pairs)}\")\n"
  },
  {
    "path": "examples/logfire/classify.py",
    "content": "import enum\nfrom pydantic import BaseModel\nfrom openai import OpenAI\nimport instructor\nimport logfire\n\n\nclass Labels(str, enum.Enum):\n    \"\"\"Enumeration for single-label text classification.\"\"\"\n\n    SPAM = \"spam\"\n    NOT_SPAM = \"not_spam\"\n\n\nclass SinglePrediction(BaseModel):\n    \"\"\"\n    Class for a single class label prediction.\n    \"\"\"\n\n    class_label: Labels\n\n\nopenai_client = OpenAI()\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_openai(openai_client)\nclient = instructor.from_openai(openai_client)\n\n\n@logfire.instrument(\"classification\", extract_args=True)\ndef classify(data: str) -> SinglePrediction:\n    \"\"\"Perform single-label classification on the input text.\"\"\"\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=SinglePrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: {data}\",\n            },\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    emails = [\n        \"Hello there I'm a Nigerian prince and I want to give you money\",\n        \"Meeting with Thomas has been set at Friday next week\",\n        \"Here are some weekly product updates from our marketing team\",\n    ]\n\n    for email in emails:\n        classify(email)\n"
  },
  {
    "path": "examples/logfire/image.py",
    "content": "import instructor\nfrom io import StringIO\nfrom typing import Annotated, Any\nfrom collections.abc import Iterable\nfrom pydantic import (\n    BeforeValidator,\n    InstanceOf,\n    WithJsonSchema,\n    BaseModel,\n)\nimport pandas as pd\nfrom openai import OpenAI\nimport logfire\n\nopenai_client = OpenAI()\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_openai(openai_client)\nclient = instructor.from_openai(openai_client, mode=instructor.Mode.MD_JSON)\n\n\ndef md_to_df(data: Any) -> Any:\n    # Convert markdown to DataFrame\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Process data\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .applymap(lambda x: x.strip())\n        )\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"The markdown representation of the table, each one should be tidy, do not try to join tables that should be separate\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\n@logfire.instrument(\"extract-table\", extract_args=True)\ndef extract_table_from_image(url: str) -> Iterable[Table]:\n    return client.chat.completions.create(\n        model=\"gpt-4-vision-preview\",\n        response_model=Iterable[Table],\n        max_tokens=1800,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract out a table from the image. Only extract out the total number of skiiers.\",\n                    },\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": url}},\n                ],\n            }\n        ],\n    )\n\n\nurl = \"https://cdn.statcdn.com/Infographic/images/normal/16330.jpeg\"\ntables = extract_table_from_image(url)\nfor table in tables:\n    print(table.caption, end=\"\\n\")\n    print(table.dataframe.to_markdown())\n"
  },
  {
    "path": "examples/logfire/requirements.txt",
    "content": "pydantic==2.7.1\nopenai==1.24.1\ninstructor==1.0.3\nlogfire==0.28.0"
  },
  {
    "path": "examples/logfire/validate.py",
    "content": "from typing import Annotated\nfrom pydantic import BaseModel, ValidationError\nfrom pydantic.functional_validators import AfterValidator\nfrom instructor import llm_validator\nimport logfire\nimport instructor\nfrom openai import OpenAI\n\nopenai_client = OpenAI()\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_openai(openai_client)\nclient = instructor.from_openai(openai_client)\n\n\nclass Statement(BaseModel):\n    message: Annotated[\n        str,\n        AfterValidator(\n            llm_validator(\"Don't allow any objectionable content\", client=client)\n        ),\n    ]\n\n\nmessages = [\n    \"I think we should always treat violence as the best solution\",\n    \"There are some great pastries down the road at this bakery I know\",\n]\n\nfor message in messages:\n    try:\n        Statement(message=message)\n    except ValidationError as e:\n        print(e)\n"
  },
  {
    "path": "examples/logfire-fastapi/Readme.md",
    "content": "# Instructions\n\n1. Create a virtual environment and install all of the packages inside `requirements.txt`\n\n2. Run the server using\n\n```\nuvicorn server:app --reload\n```\n\n3. Open up the documentation at `http://127.0.0.1:8000/docs` to start experimenting with fastapi! You can print out the streaming example using `test.py`.\n"
  },
  {
    "path": "examples/logfire-fastapi/requirements.txt",
    "content": "pydantic==2.7.1\nopenai==1.24.1\ninstructor==1.0.3\nlogfire==0.28.0\nfastapi==0.110.3\nuvicorn[standard]\nlogfire[fastapi]"
  },
  {
    "path": "examples/logfire-fastapi/server.py",
    "content": "from pydantic import BaseModel\nfrom fastapi import FastAPI\nfrom openai import AsyncOpenAI\nimport instructor\nimport logfire\nimport asyncio\nfrom collections.abc import Iterable\nfrom fastapi.responses import StreamingResponse\n\n\nclass UserData(BaseModel):\n    query: str\n\n\nclass MultipleUserData(BaseModel):\n    queries: list[str]\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\napp = FastAPI()\nopenai_client = AsyncOpenAI()\nlogfire.configure(pydantic_plugin=logfire.PydanticPlugin(record=\"all\"))\nlogfire.instrument_fastapi(app)\nlogfire.instrument_openai(openai_client)\nclient = instructor.from_openai(openai_client)\n\n\n@app.post(\"/user\", response_model=UserDetail)\nasync def endpoint_function(data: UserData) -> UserDetail:\n    user_detail = await client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=UserDetail,\n        messages=[\n            {\"role\": \"user\", \"content\": f\"Extract: `{data.query}`\"},\n        ],\n    )\n    logfire.info(\"/User returning\", value=user_detail)\n    return user_detail\n\n\n@app.post(\"/many-users\", response_model=list[UserDetail])\nasync def extract_many_users(data: MultipleUserData):\n    async def extract_user(query: str):\n        user_detail = await client.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=UserDetail,\n            messages=[\n                {\"role\": \"user\", \"content\": f\"Extract: `{query}`\"},\n            ],\n        )\n        logfire.info(\"/User returning\", value=user_detail)\n        return user_detail\n\n    coros = [extract_user(query) for query in data.queries]\n    return await asyncio.gather(*coros)\n\n\n@app.post(\"/extract\", response_class=StreamingResponse)\nasync def extract(data: UserData):\n    supressed_client = AsyncOpenAI()\n    logfire.instrument_openai(supressed_client, suppress_other_instrumentation=False)\n    client = instructor.from_openai(supressed_client)\n    users = await client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=Iterable[UserDetail],\n        stream=True,\n        messages=[\n            {\"role\": \"user\", \"content\": data.query},\n        ],\n    )\n\n    async def generate():\n        with logfire.span(\"Generating User Response Objects\"):\n            async for user in users:\n                resp_json = user.model_dump_json()\n                logfire.info(\"Returning user object\", value=resp_json)\n\n                yield resp_json\n\n    return StreamingResponse(generate(), media_type=\"text/event-stream\")\n"
  },
  {
    "path": "examples/logfire-fastapi/test.py",
    "content": "import requests\n\nresponse = requests.post(\n    \"http://127.0.0.1:3000/extract\",\n    json={\n        \"query\": \"Alice and Bob are best friends. They are currently 32 and 43 respectively. \"\n    },\n    stream=True,\n)\n\nfor chunk in response.iter_content(chunk_size=1024):\n    if chunk:\n        print(str(chunk, encoding=\"utf-8\"), end=\"\\n\")\n"
  },
  {
    "path": "examples/logging/run.py",
    "content": "import instructor\nimport openai\nimport logging\n\nfrom pydantic import BaseModel\n\n\n# Set logging to DEBUG\nlogging.basicConfig(level=logging.DEBUG)\n\nclient = instructor.from_openai(openai.OpenAI())\n\n\nclass UserDetail(BaseModel):\n    name: str\n    age: int\n\n\nuser = client.chat.completions.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserDetail,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract Jason is 25 years old\"},\n    ],\n)  # type: ignore\n\n\"\"\" \nDEBUG:httpx:load_ssl_context verify=True cert=None trust_env=True http2=False\nDEBUG:httpx:load_verify_locations cafile='/Users/jasonliu/dev/instructor/.venv/lib/python3.11/site-packages/certifi/cacert.pem'\nDEBUG:instructor:Patching `client.chat.completions.create` with mode=<Mode.TOOLS: 'tool_call'>\nDEBUG:instructor:max_retries: 1\nDEBUG:openai._base_client:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'user', 'content': 'Extract Jason is 25 years old'}], 'model': 'gpt-3.5-turbo', 'function_call': {'name': 'UserDetail'}, 'functions': [{'name': 'UserDetail', 'description': 'Correctly extracted `UserDetail` with all the required parameters with correct types', 'parameters': {'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'integer'}}, 'required': ['age', 'name'], 'type': 'object'}}]}}\nDEBUG:httpcore.connection:connect_tcp.started host='api.openai.com' port=443 local_address=None timeout=5.0 socket_options=None\nDEBUG:httpcore.connection:connect_tcp.complete return_value=<httpcore._backends.sync.SyncStream object at 0x105062c90>\nDEBUG:httpcore.connection:start_tls.started ssl_context=<ssl.SSLContext object at 0x100748680> server_hostname='api.openai.com' timeout=5.0\nDEBUG:httpcore.connection:start_tls.complete return_value=<httpcore._backends.sync.SyncStream object at 0x101caa150>\nDEBUG:httpcore.http11:send_request_headers.started request=<Request [b'POST']>\nDEBUG:httpcore.http11:send_request_headers.complete\nDEBUG:httpcore.http11:send_request_body.started request=<Request [b'POST']>\nDEBUG:httpcore.http11:send_request_body.complete\nDEBUG:httpcore.http11:receive_response_headers.started request=<Request [b'POST']>\nDEBUG:httpcore.http11:receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Mon, 12 Feb 2024 14:55:45 GMT'), (b'Content-Type', b'application/json'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'access-control-allow-origin', b'*'), (b'Cache-Control', b'no-cache, must-revalidate'), (b'openai-model', b'gpt-3.5-turbo-0613'), (b'openai-organization', b'scribe-ai'), (b'openai-processing-ms', b'483'), (b'openai-version', b'2020-10-01'), (b'strict-transport-security', b'max-age=15724800; includeSubDomains'), (b'x-ratelimit-limit-requests', b'10000'), (b'x-ratelimit-limit-tokens', b'2000000'), (b'x-ratelimit-remaining-requests', b'9999'), (b'x-ratelimit-remaining-tokens', b'1999975'), (b'x-ratelimit-reset-requests', b'6ms'), (b'x-ratelimit-reset-tokens', b'0s'), (b'x-request-id', b'req_f0fa476897ae165fc50fa90b7968595b'), (b'CF-Cache-Status', b'DYNAMIC'), (b'Set-Cookie', b'__cf_bm=e2_yCrwo4frh6Oq4ZufCEhNJ4lSGJ2.MMtk45X8lrMM-1707749745-1-AfWk8CyACc7aZo6GpCI82FBfI/wmPEFZLNO/Cr3eavTW3xKVFCS7G9jvwYTFLXjJr0cttYsXeLAnjwipw18R0Vo=; path=/; expires=Mon, 12-Feb-24 15:25:45 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None'), (b'Set-Cookie', b'_cfuvid=PyVVCGSMxTg1p.woYvHVVC9E3n69faOs5FOxaDdjXOM-1707749745711-0-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None'), (b'Server', b'cloudflare'), (b'CF-RAY', b'8545aca30c1fa22f-YYZ'), (b'Content-Encoding', b'gzip'), (b'alt-svc', b'h3=\":443\"; ma=86400')])\nINFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\nDEBUG:httpcore.http11:receive_response_body.started request=<Request [b'POST']>\nDEBUG:httpcore.http11:receive_response_body.complete\nDEBUG:httpcore.http11:response_closed.started\nDEBUG:httpcore.http11:response_closed.complete\nDEBUG:openai._base_client:HTTP Request: POST https://api.openai.com/v1/chat/completions \"200 OK\"\nDEBUG:httpcore.connection:close.started\nDEBUG:httpcore.connection:close.complete\n\"\"\"\n"
  },
  {
    "path": "examples/match_language/run_v1.py",
    "content": "from pydantic import BaseModel\nfrom instructor import patch\nfrom openai import AsyncOpenAI\nfrom langdetect import detect\n\ndocs = map(\n    lambda x: x.strip(),\n    \"\"\"\nԼեզվական մոդելները վերջին տարիներին դարձել են ավելի հարուստ և կատարյալ, հնարավորություն ընձեռելով ստեղծել սահուն և բնական տեքստեր, ինչպես նաև գերազանց արդյունքներ ցուցաբերել մեքենայական թարգմանության, հարցերի պատասխանման և ստեղծագործ տեքստերի ստեղծման նման տարբեր առաջադրանքներում։ Այս մոդելները մշակվում են հսկայական տեքստային տվյալների հիման վրա և կարող են բռնել բնական լեզվի կառուցվածքն ու նրբությունները՝ հեղափոխություն առաջացնելով համակարգիչների և մարդկանց միջև հաղորդակցության ոլորտում։\n\n---\n\nMga modelo ng wika ay naging mas sopistikado sa nagdaang mga taon, na nagbibigay-daan sa pagbuo ng mga natural at madaling basahing teksto, at nagpapakita ng mahusay na pagganap sa iba't ibang gawain tulad ng awtomatikong pagsasalin, pagsagot sa mga tanong, at pagbuo ng malikhain na teksto. Ang mga modelo na ito ay sinanay sa napakalaking mga dataset ng teksto at kayang hulihin ang istruktura at mga nuances ng natural na wika. Ang mga pagpapabuti sa mga modelo ng wika ay maaaring magdulot ng rebolusyon sa komunikasyon sa pagitan ng mga computer at tao, at inaasahan ang higit pang pag-unlad sa hinaharap.\n\n---\n\nNgaahi motuʻa lea kuo nau hoko ʻo fakaʻofoʻofa ange ʻi he ngaahi taʻu fakamuimui ni, ʻo fakafaingofuaʻi e fakatupu ʻo e ngaahi konga tohi ʻoku lelei mo fakanatula pea ʻoku nau fakahaaʻi ʻa e ngaahi ola lelei ʻi he ngaahi ngāue kehekehe ʻo hangē ko e liliu fakaʻētita, tali fehuʻi, mo e fakatupu ʻo e konga tohi fakaʻatamai. Ko e ako ʻa e ngaahi motuʻa ni ʻi he ngaahi seti ʻo e fakamatala tohi lahi pea ʻoku nau malava ʻo puke ʻa e fakafuofua mo e ngaahi meʻa iiki ʻo e lea fakanatula. ʻE lava ke fakatupu ʻe he ngaahi fakaleleiʻi ki he ngaahi motuʻa lea ha liliu lahi ʻi he fetu'utaki ʻi he vahaʻa ʻo e ngaahi komipiuta mo e kakai, pea ʻoku ʻamanaki ʻe toe fakalakalaka ange ia ʻi he kahaʻu.\n\n---\n\nDil modelleri son yıllarda daha da gelişti, akıcı ve doğal metinler üretmeyi mümkün kılıyor ve makine çevirisi, soru cevaplama ve yaratıcı metin oluşturma gibi çeşitli görevlerde mükemmel performans gösteriyor. Bu modeller, devasa metin veri setlerinde eğitilir ve doğal dilin yapısını ve nüanslarını yakalayabilir. Dil modellerindeki iyileştirmeler, bilgisayarlar ve insanlar arasındaki iletişimde devrim yaratabilir ve gelecekte daha da ilerleme bekleniyor.\n\n---\n\nMô hình ngôn ngữ đã trở nên tinh vi hơn trong những năm gần đây, cho phép tạo ra các văn bản trôi chảy và tự nhiên, đồng thời thể hiện hiệu suất xuất sắc trong các nhiệm vụ khác nhau như dịch máy, trả lời câu hỏi và tạo văn bản sáng tạo. Các mô hình này được huấn luyện trên các tập dữ liệu văn bản khổng lồ và có thể nắm bắt cấu trúc và sắc thái của ngôn ngữ tự nhiên. Những cải tiến trong mô hình ngôn ngữ có thể mang lại cuộc cách mạng trong giao tiếp giữa máy tính và con người, và người ta kỳ vọng sẽ có những tiến bộ hơn nữa trong tương lai.\n\n---\n\nLes modèles de langage sont devenus de plus en plus sophistiqués ces dernières années, permettant de générer des textes fluides et naturels, et de performer dans une variété de tâches telles que la traduction automatique, la réponse aux questions et la génération de texte créatif. Entraînés sur d'immenses ensembles de données textuelles, ces modèles sont capables de capturer la structure et les nuances du langage naturel, ouvrant la voie à une révolution dans la communication entre les ordinateurs et les humains.\n\n---\n\n近年来,语言模型变得越来越复杂,能够生成流畅自然的文本,并在机器翻译、问答和创意文本生成等各种任务中表现出色。这些模型在海量文本数据集上训练,可以捕捉自然语言的结构和细微差别。语言模型的改进有望彻底改变计算机和人类之间的交流方式,未来有望实现更大的突破。\n\n---\n\nIn den letzten Jahren sind Sprachmodelle immer ausgefeilter geworden und können flüssige, natürlich klingende Texte generieren und in verschiedenen Aufgaben wie maschineller Übersetzung, Beantwortung von Fragen und Generierung kreativer Texte hervorragende Leistungen erbringen. Diese Modelle werden auf riesigen Textdatensätzen trainiert und können die Struktur und Nuancen natürlicher Sprache erfassen, was zu einer Revolution in der Kommunikation zwischen Computern und Menschen führen könnte.\n\n---\n\nपिछले कुछ वर्षों में भाषा मॉडल बहुत अधिक परिष्कृत हो गए हैं, जो प्राकृतिक और प्रवाहमय पाठ उत्पन्न कर सकते हैं, और मशीन अनुवाद, प्रश्नोत्तर, और रचनात्मक पाठ उत्पादन जैसे विभिन्न कार्यों में उत्कृष्ट प्रदर्शन कर सकते हैं। ये मॉडल विशाल पाठ डेटासेट पर प्रशिक्षित होते हैं और प्राकृतिक भाषा की संरचना और बारीकियों को समझ सकते हैं। भाषा मॉडल में सुधार कंप्यूटर और मानव के बीच संवाद में क्रांति ला सकता है, और भविष्य में और प्रगति की उम्मीद है।\n\n---\n\n近年、言語モデルは非常に洗練され、自然で流暢なテキストを生成できるようになり、機械翻訳、質問応答、クリエイティブなテキスト生成など、様々なタスクで優れたパフォーマンスを発揮しています。これらのモデルは膨大なテキストデータセットで学習され、自然言語の構造とニュアンスを捉えることができます。言語モデルの改善により、コンピューターと人間のコミュニケーションに革命が起こる可能性があり、将来のさらなる進歩が期待されています。\n\"\"\".split(\"---\"),\n)\n\n# Patch the OpenAI client to enable response_model\nclient = patch(AsyncOpenAI())\n\n\nclass GeneratedSummary(BaseModel):\n    summary: str\n\n\nasync def summarize_text(text: str):\n    response = await client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=GeneratedSummary,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Generate a concise summary in the language of the article. \",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Summarize the following text in a concise way:\\n{text}\",\n            },\n        ],\n    )  # type: ignore\n    return response.summary, text\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    async def main():\n        results = await asyncio.gather(*[summarize_text(doc) for doc in docs])\n        for summary, doc in results:\n            source_lang = detect(doc)\n            target_lang = detect(summary)\n            print(\n                f\"Source: {source_lang}, Summary: {target_lang}, Match: {source_lang == target_lang}\"\n            )\n\n    asyncio.run(main())\n    \"\"\"\n    Source: et, Summary: en, Match: False\n    Source: tl, Summary: tl, Match: True\n    Source: sw, Summary: en, Match: False\n    Source: tr, Summary: tr, Match: True\n    Source: vi, Summary: en, Match: False\n    Source: fr, Summary: fr, Match: True\n    Source: zh-cn, Summary: en, Match: False\n    Source: de, Summary: de, Match: True\n    Source: hi, Summary: en, Match: False\n    Source: ja, Summary: en, Match: False\n    \"\"\"\n"
  },
  {
    "path": "examples/match_language/run_v2.py",
    "content": "from pydantic import BaseModel, Field\nfrom instructor import patch\nfrom openai import AsyncOpenAI\nfrom langdetect import detect\n\ndocs = map(\n    lambda x: x.strip(),\n    \"\"\"\nԼեզվական մոդելները վերջին տարիներին դարձել են ավելի հարուստ և կատարյալ, հնարավորություն ընձեռելով ստեղծել սահուն և բնական տեքստեր, ինչպես նաև գերազանց արդյունքներ ցուցաբերել մեքենայական թարգմանության, հարցերի պատասխանման և ստեղծագործ տեքստերի ստեղծման նման տարբեր առաջադրանքներում։ Այս մոդելները մշակվում են հսկայական տեքստային տվյալների հիման վրա և կարող են բռնել բնական լեզվի կառուցվածքն ու նրբությունները՝ հեղափոխություն առաջացնելով համակարգիչների և մարդկանց միջև հաղորդակցության ոլորտում։\n\n---\n\nMga modelo ng wika ay naging mas sopistikado sa nagdaang mga taon, na nagbibigay-daan sa pagbuo ng mga natural at madaling basahing teksto, at nagpapakita ng mahusay na pagganap sa iba't ibang gawain tulad ng awtomatikong pagsasalin, pagsagot sa mga tanong, at pagbuo ng malikhain na teksto. Ang mga modelo na ito ay sinanay sa napakalaking mga dataset ng teksto at kayang hulihin ang istruktura at mga nuances ng natural na wika. Ang mga pagpapabuti sa mga modelo ng wika ay maaaring magdulot ng rebolusyon sa komunikasyon sa pagitan ng mga computer at tao, at inaasahan ang higit pang pag-unlad sa hinaharap.\n\n---\n\nNgaahi motuʻa lea kuo nau hoko ʻo fakaʻofoʻofa ange ʻi he ngaahi taʻu fakamuimui ni, ʻo fakafaingofuaʻi e fakatupu ʻo e ngaahi konga tohi ʻoku lelei mo fakanatula pea ʻoku nau fakahaaʻi ʻa e ngaahi ola lelei ʻi he ngaahi ngāue kehekehe ʻo hangē ko e liliu fakaʻētita, tali fehuʻi, mo e fakatupu ʻo e konga tohi fakaʻatamai. Ko e ako ʻa e ngaahi motuʻa ni ʻi he ngaahi seti ʻo e fakamatala tohi lahi pea ʻoku nau malava ʻo puke ʻa e fakafuofua mo e ngaahi meʻa iiki ʻo e lea fakanatula. ʻE lava ke fakatupu ʻe he ngaahi fakaleleiʻi ki he ngaahi motuʻa lea ha liliu lahi ʻi he fetu'utaki ʻi he vahaʻa ʻo e ngaahi komipiuta mo e kakai, pea ʻoku ʻamanaki ʻe toe fakalakalaka ange ia ʻi he kahaʻu.\n\n---\n\nDil modelleri son yıllarda daha da gelişti, akıcı ve doğal metinler üretmeyi mümkün kılıyor ve makine çevirisi, soru cevaplama ve yaratıcı metin oluşturma gibi çeşitli görevlerde mükemmel performans gösteriyor. Bu modeller, devasa metin veri setlerinde eğitilir ve doğal dilin yapısını ve nüanslarını yakalayabilir. Dil modellerindeki iyileştirmeler, bilgisayarlar ve insanlar arasındaki iletişimde devrim yaratabilir ve gelecekte daha da ilerleme bekleniyor.\n\n---\n\nMô hình ngôn ngữ đã trở nên tinh vi hơn trong những năm gần đây, cho phép tạo ra các văn bản trôi chảy và tự nhiên, đồng thời thể hiện hiệu suất xuất sắc trong các nhiệm vụ khác nhau như dịch máy, trả lời câu hỏi và tạo văn bản sáng tạo. Các mô hình này được huấn luyện trên các tập dữ liệu văn bản khổng lồ và có thể nắm bắt cấu trúc và sắc thái của ngôn ngữ tự nhiên. Những cải tiến trong mô hình ngôn ngữ có thể mang lại cuộc cách mạng trong giao tiếp giữa máy tính và con người, và người ta kỳ vọng sẽ có những tiến bộ hơn nữa trong tương lai.\n\n---\n\nLes modèles de langage sont devenus de plus en plus sophistiqués ces dernières années, permettant de générer des textes fluides et naturels, et de performer dans une variété de tâches telles que la traduction automatique, la réponse aux questions et la génération de texte créatif. Entraînés sur d'immenses ensembles de données textuelles, ces modèles sont capables de capturer la structure et les nuances du langage naturel, ouvrant la voie à une révolution dans la communication entre les ordinateurs et les humains.\n\n---\n\n近年来,语言模型变得越来越复杂,能够生成流畅自然的文本,并在机器翻译、问答和创意文本生成等各种任务中表现出色。这些模型在海量文本数据集上训练,可以捕捉自然语言的结构和细微差别。语言模型的改进有望彻底改变计算机和人类之间的交流方式,未来有望实现更大的突破。\n\n---\n\nIn den letzten Jahren sind Sprachmodelle immer ausgefeilter geworden und können flüssige, natürlich klingende Texte generieren und in verschiedenen Aufgaben wie maschineller Übersetzung, Beantwortung von Fragen und Generierung kreativer Texte hervorragende Leistungen erbringen. Diese Modelle werden auf riesigen Textdatensätzen trainiert und können die Struktur und Nuancen natürlicher Sprache erfassen, was zu einer Revolution in der Kommunikation zwischen Computern und Menschen führen könnte.\n\n---\n\nपिछले कुछ वर्षों में भाषा मॉडल बहुत अधिक परिष्कृत हो गए हैं, जो प्राकृतिक और प्रवाहमय पाठ उत्पन्न कर सकते हैं, और मशीन अनुवाद, प्रश्नोत्तर, और रचनात्मक पाठ उत्पादन जैसे विभिन्न कार्यों में उत्कृष्ट प्रदर्शन कर सकते हैं। ये मॉडल विशाल पाठ डेटासेट पर प्रशिक्षित होते हैं और प्राकृतिक भाषा की संरचना और बारीकियों को समझ सकते हैं। भाषा मॉडल में सुधार कंप्यूटर और मानव के बीच संवाद में क्रांति ला सकता है, और भविष्य में और प्रगति की उम्मीद है।\n\n---\n\n近年、言語モデルは非常に洗練され、自然で流暢なテキストを生成できるようになり、機械翻訳、質問応答、クリエイティブなテキスト生成など、様々なタスクで優れたパフォーマンスを発揮しています。これらのモデルは膨大なテキストデータセットで学習され、自然言語の構造とニュアンスを捉えることができます。言語モデルの改善により、コンピューターと人間のコミュニケーションに革命が起こる可能性があり、将来のさらなる進歩が期待されています。\n\"\"\".split(\"---\"),\n)\n\n# Patch the OpenAI client to enable response_model\nclient = patch(AsyncOpenAI())\n\n\nclass GeneratedSummary(BaseModel):\n    detected_language: str = Field(\n        description=\"The language code of the original article. The summary must be generated in this same language.\",\n    )\n    summary: str\n\n\nasync def summarize_text(text: str):\n    response = await client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=GeneratedSummary,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Generate a concise summary in the language of the article. \",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Summarize the following text in a concise way:\\n{text}\",\n            },\n        ],\n    )  # type: ignore\n    return response.detected_language, response.summary, text\n\n\nif __name__ == \"__main__\":\n    import asyncio\n\n    async def main():\n        results = await asyncio.gather(*[summarize_text(doc) for doc in docs])\n        for lang, summary, doc in results:\n            source_lang = detect(doc)\n            target_lang = detect(summary)\n            print(\n                f\"Source: {source_lang}, Summary: {target_lang}, Match: {source_lang == target_lang}, Detected: {lang}\"\n            )\n\n    asyncio.run(main())\n    \"\"\"\n    Source: et, Summary: et, Match: True, Detected: hy\n    Source: tl, Summary: tl, Match: True, Detected: tl\n    Source: sw, Summary: sw, Match: True, Detected: to\n    Source: tr, Summary: tr, Match: True, Detected: tr\n    Source: vi, Summary: vi, Match: True, Detected: vi\n    Source: fr, Summary: fr, Match: True, Detected: fr\n    Source: zh-cn, Summary: zh-cn, Match: True, Detected: zh\n    Source: de, Summary: de, Match: True, Detected: de\n    Source: hi, Summary: hi, Match: True, Detected: hi\n    Source: ja, Summary: ja, Match: True, Detected: ja\n    \"\"\"\n"
  },
  {
    "path": "examples/mistral/mistral.py",
    "content": "from pydantic import BaseModel\nfrom mistralai.client import MistralClient\nfrom instructor import from_mistral\nfrom instructor.mode import Mode\nimport os\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\n# enables `response_model` in chat call\nclient = MistralClient(api_key=os.environ.get(\"MISTRAL_API_KEY\"))\ninstructor_client = from_mistral(\n    client=client,\n    model=\"mistral-large-latest\",\n    mode=Mode.TOOLS,\n    max_tokens=1000,\n)\n\nresp = instructor_client.messages.create(\n    response_model=UserDetails,\n    messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n    temperature=0,\n)\n\nprint(resp)\n"
  },
  {
    "path": "examples/multi-actions/run.py",
    "content": "import instructor\nimport enum\n\nfrom typing import Optional\nfrom pydantic import BaseModel, Field\nfrom openai import OpenAI\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Action(enum.Enum):\n    CREATE = \"create_task\"\n    DELETE = \"close_task\"\n    UPDATE = \"update_task\"\n\n\nclass Projects(enum.Enum):\n    FRONTLINE_QA_AI = \"frontline_qa_ai\"\n    FUTURE_OF_PROGRAMMING = \"future_of_programming\"\n    PERSONAL_SITE = \"personal_site\"\n    NORDIC_HAMSTRING_CURLS = \"nordic_hamstring_curls\"\n\n\nclass Buckets(enum.Enum):\n    FINANCE = \"finance\"\n    PURVIEW_OPERATIONS = \"purview_operations\"\n    TASKBOT = \"taskbot\"\n    CHECKBOT = \"checkbot\"\n    NIGHT_HACKING = \"night_hacking\"\n    TICKLER = \"tickler\"\n\n\nclass TaskAction(BaseModel):\n    id: int\n    method: Action = Field(\n        description=\"Method of creating and closing a task: to close a task, only an ID is required\"\n    )\n    waiting_on: Optional[list[int]] = Field(\n        None, description=\"IDs of tasks that this task is waiting on\"\n    )\n    name: Optional[str] = Field(None, description=\"Name of the task\")\n    notes: Optional[str] = Field(None, description=\"Notes about the task\")\n    bucket: Optional[Buckets] = Field(\n        None, description=\"Bucket of the task, to set, or update\"\n    )\n    project: Optional[Projects] = Field(\n        None, description=\"Project of the task, to set, or update\"\n    )\n\n\nclass Response(BaseModel):\n    text: str = Field(description=\"The text of the response\")\n    task_action: Optional[list[TaskAction]] = Field(\n        description=\"The action to take on the task\"\n    )\n\n\ninitial_messages = [\n    {\n        \"role\": \"system\",\n        \"content\": \"You are an AI assistant. have the ability to create, update, and close tasks.\",\n    },\n    {\n        \"role\": \"assistant\",\n        \"content\": \"\"\"\n        The task is below. When assisting the user, reference the details from this task.\n\n        [BEGIN TASK]\n            id: 23\n            Name: Create 10 new GIFs\n            Description: Create 10 new GIFs for the Taskbot page on the user's personal site. They should be similar to the existing GIFs, but with different use cases.\n            Projects: Personal site\n            Buckets: Taskbot\n            Updates:\n        [BEGIN UPDATE]\n            **User Update - September 01, 2023 03:58:00 PM EDT**\n            The user plans to create the GIFs in the background as they work through their daily tasks. They aim to produce about one to two GIFs per day. If this plan doesn't work, they will reconsider their strategy.\n        [END UPDATE]\n        [END TASK]\n    \"\"\",\n    },\n    {\"role\": \"assistant\", \"content\": \"What's up with this task?\"},\n    {\n        \"role\": \"user\",\n        \"content\": \"Change it to 20, then make a new task for when its done make 20 more that moves.\",\n    },\n]\n\nresponse: Response = client.chat.completions.create(\n    messages=initial_messages, response_model=Response, model=\"gpt-4\"\n)  # type: ignore\n\nprint(response.model_dump_json(indent=2))\n{\n    \"text\": \"Updating task to create 20 GIFs and creating a new task to create an additional 20 animated GIFs after the initial task is done.\",\n    \"task_action\": [\n        {\n            \"id\": 23,\n            \"method\": \"update_task\",\n            \"waiting_on\": None,\n            \"name\": \"Create 20 new GIFs\",\n            \"notes\": \"The user increased the number of GIFs from 10 to 20. They plan to create these as they work through their daily tasks, creating about one to two GIFs per day. If this plan doesn't work, they will reconsider their strategy.\",\n            \"bucket\": \"taskbot\",\n            \"project\": \"personal_site\",\n        },\n        {\n            \"id\": 24,\n            \"method\": \"create_task\",\n            \"waiting_on\": [23],\n            \"name\": \"Create 20 new animated GIFs\",\n            \"notes\": \"The task will be initiated once the task with id 23 is completed.\",\n            \"bucket\": \"taskbot\",\n            \"project\": \"personal_site\",\n        },\n    ],\n}\n"
  },
  {
    "path": "examples/multiple_search_queries/diagram.py",
    "content": "import erdantic as erd\n\nfrom segment_search_queries import MultiSearch\n\ndiagram = erd.create(MultiSearch)\ndiagram.draw(\"examples/segment_search_queries/schema.png\")\n"
  },
  {
    "path": "examples/multiple_search_queries/segment_search_queries.py",
    "content": "import enum\nimport instructor\n\nfrom openai import OpenAI\nfrom pydantic import Field, BaseModel\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass SearchType(str, enum.Enum):\n    \"\"\"Enumeration representing the types of searches that can be performed.\"\"\"\n\n    VIDEO = \"video\"\n    EMAIL = \"email\"\n\n\nclass Search(BaseModel):\n    \"\"\"\n    Class representing a single search query which contains title, query and the search type\n    \"\"\"\n\n    search_title: str = Field(..., description=\"Title of the request\")\n    query: str = Field(..., description=\"Query to search for relevant content\")\n    type: SearchType = Field(..., description=\"Type of search\")\n\n    async def execute(self):\n        import asyncio\n\n        await asyncio.sleep(1)\n        print(\n            f\"Searching for `{self.search_title}` with query `{self.query}` using `{self.type}`\"\n        )\n\n\nclass MultiSearch(BaseModel):\n    \"\"\"\n    Class representing multiple search queries.\n    Make sure they contain all the required attributes\n\n    Args:\n        searches (List[Search]): The list of searches to perform.\n    \"\"\"\n\n    searches: list[Search] = Field(..., description=\"List of searches\")\n\n    def execute(self):\n        import asyncio\n\n        loop = asyncio.get_event_loop()\n\n        tasks = asyncio.gather(*[search.execute() for search in self.searches])\n        return loop.run_until_complete(tasks)\n\n\ndef segment(data: str) -> MultiSearch:\n    \"\"\"\n    Convert a string into multiple search queries using OpenAI's GPT-3 model.\n\n    Args:\n        data (str): The string to convert into search queries.\n\n    Returns:\n        MultiSearch: An object representing the multiple search queries.\n    \"\"\"\n\n    completion = client.chat.completions.create(\n        model=\"gpt-4-0613\",\n        temperature=0.1,\n        response_model=MultiSearch,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Consider the data below:\\n{data} and segment it into multiple search queries\",\n            },\n        ],\n        max_tokens=1000,\n    )\n    return MultiSearch.from_response(completion)\n\n\nif __name__ == \"__main__\":\n    queries = segment(\n        \"Please send me the video from last week about the investment case study and also documents about your GPDR policy?\"\n    )\n\n    queries.execute()\n    # >>> Searching for `Video` with query `investment case study` using `SearchType.VIDEO`\n    # >>> Searching for `Documents` with query `GPDR policy` using `SearchType.EMAIL`\n"
  },
  {
    "path": "examples/open_source_examples/README.md",
    "content": "# Read first to correctly work with the provided examples\n\n\n## Open Router\n1. Sign up for an Openrouter Account - https://accounts.openrouter.ai/sign-up\n2. Create an API key - https://openrouter.ai/keys\n3. Add API key to environment - `export OPENROUTER_API_KEY=your key here`\n4. Add Openrouter API endpoint to environment - `export OPENROUTER_BASE_URL=https://openrouter.ai/api/v1` [See https://openrouter.ai/docs#format for potential updates]\n\n## Perplexity\n1. Sign up for an Openrouter Account - https://www.perplexity.ai/\n2. Create an API key - https://www.perplexity.ai/pplx-api\n3. Add API key to environment - `export PERPLEXITY_API_KEY=your key here`\n4. Add Openrouter API endpoint to environment - `export PERPLEXITY_BASE_URL=https://api.perplexity.ai` [See https://docs.perplexity.ai/reference/post_chat_completions for potential updates]\n\n## Runpod\n1. Sign up for a Runpod account - https://www.runpod.io/console/signup\n2. Add credits, unfortunately no free tier. - https://www.runpod.io/console/user/billing\n3. Navigate to templates page[Left selection menu], under `Official` click deploy on `RunPod TheBloke LLMs` template. - https://www.runpod.io/console/templates\n4. Navigate to Community Cloud page [Left Selection menu], Click `Deploy` on a GPU with >=16 GB, 1x RTX 4000 Ada SFF works. - https://www.runpod.io/console/gpu-cloud\n5. Click `Customize Deployment`, click the `Environment Variables` drop down, Enter the following Key/Values, then click `Set Overrides`, then click `Continue`, and finally `Deploy`.\n    - key=MODEL value=TheBloke/OpenHermes-2.5-Mistral-7B-GPTQ\n    - key=UI_ARGS value=--n-gpu-layers 100 --threads 1\n6. Navigate to Pods[Left selection menu], wait until you see `Connect` button on the Pod you just deployed, click it. Right click `HTTP Service[Port 5000]` and copy the link address. - https://www.runpod.io/console/pods\n    - Add Runpod API endpoint to environment - `export RUNPOD_BASE_URL=your-runpod-link/v1` <-- Make sure to add v1 as well\n    - Add Runpod API key to environment -  `export RUNPOD_API_KEY=\"None\"` <-- This should be none.\n7. When done running, stop instance by clicking the stop icon on the Pod page. - https://www.runpod.io/console/pods"
  },
  {
    "path": "examples/open_source_examples/openrouter.py",
    "content": "import os\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\nfrom instructor import Maybe, Mode\n\n# Extract API key from environment\nopenrouter_api_key = os.environ.get(\"OPENROUTER_API_KEY\")\nassert openrouter_api_key, \"OPENROUTER_API_KEY is not set in environment variables\"\n\n# Base URL for OpenAI client\nopenrouter_base_url = os.environ.get(\"OPENROUTER_BASE_URL\")\nassert openrouter_base_url, \"OPENROUTER_BASE_URL is not set in environment variables\"\n\n# Initialize OpenAI client\nclient = instructor.from_openai(\n    OpenAI(api_key=openrouter_api_key, base_url=openrouter_base_url),\n    mode=Mode.JSON,\n)\n\ndata = [\n    \"Brandon is 33 years old. He works as a solution architect.\",\n    \"Jason is 25 years old. He is the GOAT.\",\n    \"Dominic is 45 years old. He is retired.\",\n    \"Jenny is 72. She is a wife and a CEO.\",\n    \"Holly is 22. She is an explorer.\",\n    \"There onces was a prince, named Benny. He ruled for 10 years, which just ended. He started at 22.\",\n    \"Simon says, why are you 22 years old marvin?\",\n]\n\n\nif __name__ == \"__main__\":\n\n    class UserDetail(BaseModel):\n        name: str = Field(description=\"Name extracted from the text\")\n        age: int = Field(description=\"Age extracted from the text\")\n        occupation: Optional[str] = Field(\n            default=None, description=\"Occupation extracted from the text\"\n        )\n\n    for content in data:\n        MaybeUser = Maybe(UserDetail)\n        user = client.chat.completions.create(\n            response_model=MaybeUser,\n            model=\"teknium/openhermes-2.5-mistral-7b\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": f\"You are an expert at outputting json. You always output valid json based on this schema: {MaybeUser.model_json_schema()}\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Extract the user details from the following text: {content}. Match your response the correct schema\",\n                },\n            ],\n        )\n        # Output the error or the result.\n        if user.error:\n            print(f\"Error: {user.error}\")\n        if user.result:\n            print(f\"Result: {user.result}\")\n"
  },
  {
    "path": "examples/open_source_examples/perplexity.py",
    "content": "import os\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\nfrom instructor import Maybe, Mode\n\n# Extract API key from environment\nperplexity_api_key = os.environ.get(\"PERPLEXITY_API_KEY\")\nassert perplexity_api_key, \"PERPLEXITY_API_KEY is not set in environment variables\"\n\n# Base URL for OpenAI\nperplexity_base_url = os.environ.get(\"PERPLEXITY_BASE_URL\")\nassert perplexity_base_url, \"PERPLEXITY_BASE_URL is not set in environment variables\"\n\n# Initialize OpenAI client\nclient = instructor.from_openai(\n    OpenAI(api_key=perplexity_api_key, base_url=perplexity_base_url),\n    mode=Mode.JSON,\n)\n\n# For direct reference here. See https://docs.perplexity.ai/docs/model-cards for updates\n# Recommended is pplx-70b-chat\nmodels = [\n    \"codellama-34b-instruct\",\n    \"llama-2-70b-chat\",\n    \"mistral-7b-instruct\",\n    \"pplx-7b-chat\",\n    \"pplx-70b-chat\",\n    \"pplx-7b-online\",\n    \"pplx-70b-online\",\n]\n\ndata = [\n    \"Brandon is 33 years old. He works as a solution architect.\",\n    \"Jason is 25 years old. He is the GOAT.\",\n    \"Dominic is 45 years old. He is retired.\",\n    \"Jenny is 72. She is a wife and a CEO.\",\n    \"Holly is 22. She is an explorer.\",\n    \"There onces was a prince, named Benny. He ruled for 10 years, which just ended. He started at 22.\",\n    \"Simon says, why are you 22 years old marvin?\",\n]\n\n\nif __name__ == \"__main__\":\n\n    class UserDetail(BaseModel):\n        name: str = Field(description=\"Name extracted from the text\")\n        age: int = Field(description=\"Age extracted from the text\")\n        occupation: Optional[str] = Field(\n            default=None, description=\"Occupation extracted from the text\"\n        )\n\n    for content in data:\n        MaybeUser = Maybe(UserDetail)\n        user = client.chat.completions.create(\n            response_model=MaybeUser,\n            model=\"pplx-70b-chat\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are an expert at outputting json. You always output valid JSON based on the pydantic schema given to you.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Extract the user details from the following text: {content}. Match your response to the following schema: {MaybeUser.model_json_schema()}\",\n                },\n            ],\n            max_retries=3,\n        )\n        # Output the error or the result.\n        if user.error:\n            print(f\"Error: {user.error}\")\n        if user.result:\n            print(f\"Result: {user.result}\")\n"
  },
  {
    "path": "examples/open_source_examples/runpod.py",
    "content": "import os\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\nfrom instructor import Mode\n\n# Extract API key from environment\nrunpod_api_key = os.environ.get(\"RUNPOD_API_KEY\")\nassert runpod_api_key, \"RUNPOD_API_KEY is not set in environment variables\"\n\n# Base URL for OpenAI client\nrunpod_base_url = os.environ.get(\"RUNPOD_BASE_URL\")\nassert runpod_base_url, \"RUNPOD_BASE_URL is not set in environment variables\"\n\n# Initialize OpenAI client\nclient = instructor.from_openai(\n    OpenAI(api_key=runpod_api_key, base_url=runpod_base_url),\n    mode=Mode.JSON,\n)\n\n\ndata = [\n    \"Brandon is 33 years old. He works as a solution architect.\",\n    \"Jason is 25 years old. He is the GOAT.\",\n    \"Dominic is 45 years old. He is retired.\",\n    \"Jenny is 72. She is a wife and a CEO.\",\n    \"Holly is 22. She is an explorer.\",\n    \"There onces was a prince, named Benny. He ruled for 10 years, which just ended. He started at 22.\",\n    \"Simon says, why are you 22 years old marvin?\",\n]\n\n\nif __name__ == \"__main__\":\n\n    class UserDetail(BaseModel):\n        name: str = Field(description=\"Name extracted from the text\")\n        age: int = Field(description=\"Age extracted from the text\")\n        occupation: Optional[str] = Field(\n            default=None, description=\"Occupation extracted from the text\"\n        )\n\n    for content in data:\n        try:\n            user = client.chat.completions.create(\n                response_model=UserDetail,\n                model=\"TheBloke_OpenHermes-2.5-Mistral-7B-GPTQ\",\n                messages=[\n                    {\n                        \"role\": \"system\",\n                        \"content\": \"You are an expert at outputting json. You output valid JSON.\",\n                    },\n                    {\n                        \"role\": \"user\",\n                        \"content\": f\"Extract the user details from the following text: {content}. Match your response to the following schema: {UserDetail.model_json_schema()}\",\n                    },\n                ],\n            )\n            print(f\"Result: {user}\")\n        except Exception as e:\n            print(f\"Error: {e}\")\n            continue\n"
  },
  {
    "path": "examples/openai/__init__.py",
    "content": ""
  },
  {
    "path": "examples/openai/run.py",
    "content": "\"\"\"\nCanonical OpenAI starter example for the instructor library.\n\nDemonstrates how to use `instructor.from_provider()` with OpenAI to extract\nstructured data from natural language into a Pydantic model.\n\nUsage:\n    export OPENAI_API_KEY=your-api-key\n    python examples/openai/run.py\n\"\"\"\n\nimport instructor\nfrom pydantic import BaseModel, Field\n\n\nclass UserInfo(BaseModel):\n    \"\"\"Extracted user information.\"\"\"\n\n    name: str = Field(description=\"The user's full name\")\n    age: int = Field(description=\"The user's age in years\")\n\n\nclient = instructor.from_provider(\"openai/gpt-4o-mini\")\n\nuser = client.chat.completions.create(\n    response_model=UserInfo,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract: Jason is 25 years old.\",\n        }\n    ],\n)\n\nprint(user.model_dump_json(indent=2))\n"
  },
  {
    "path": "examples/openai-audio/run.py",
    "content": "from openai import OpenAI\nfrom pydantic import BaseModel\nimport instructor\nfrom instructor.processing.multimodal import Audio\nimport base64\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n\n\nwith open(\"./output.wav\", \"rb\") as f:\n    encoded_string = base64.b64encode(f.read()).decode(\"utf-8\")\n\nresp = client.chat.completions.create(\n    model=\"gpt-4o-audio-preview\",\n    response_model=Person,\n    modalities=[\"text\"],\n    audio={\"voice\": \"alloy\", \"format\": \"wav\"},\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Extract the following information from the audio\",\n                Audio.from_path(\"./output.wav\"),\n            ],\n        },\n    ],\n)  # type: ignore\n\nprint(resp)\n# > Person(name='Jason', age=20)\n"
  },
  {
    "path": "examples/parallel/run.py",
    "content": "from __future__ import annotations\n\nimport openai\nimport instructor\n\nfrom typing import Literal\nfrom collections.abc import Iterable\nfrom pydantic import BaseModel\n\n\nclass Weather(BaseModel):\n    location: str\n    units: Literal[\"imperial\", \"metric\"]\n\n\nclass GoogleSearch(BaseModel):\n    query: str\n\n\nclient = openai.OpenAI()\n\nclient = instructor.from_openai(client, mode=instructor.Mode.PARALLEL_TOOLS)\n\nresp = client.chat.completions.create(\n    model=\"gpt-4-turbo-preview\",\n    messages=[\n        {\"role\": \"system\", \"content\": \"You must always use tools\"},\n        {\n            \"role\": \"user\",\n            \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n        },\n    ],\n    response_model=Iterable[Weather | GoogleSearch],\n)\n\nfor r in resp:\n    print(r)\n"
  },
  {
    "path": "examples/partial_streaming/benchmark.py",
    "content": "# Part of this code is adapted from the following examples from OpenAI Cookbook:\n# https://cookbook.openai.com/examples/how_to_stream_completions\n# https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\nimport time\nimport tiktoken\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\nclient = instructor.from_openai(OpenAI(), mode=instructor.Mode.MD_JSON)\n\n\ndef num_tokens_from_string(string: str, model_name: str) -> int:\n    \"\"\"Returns the number of tokens in a text string.\"\"\"\n    encoding = tiktoken.encoding_for_model(model_name)\n\n    num_tokens = len(encoding.encode(string))\n    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>\n\n    return num_tokens\n\n\nclass User(BaseModel):\n    name: str\n    role: str\n    age: int\n\n\ndef benchmark_raw_stream(model=\"gpt-4\"):\n    content = f\"\"\"Respond only in JSON that would validate to this schema and include nothing extra.\n    Otherwise something bad will happen:\\n {User.model_json_schema()}\"\"\"\n\n    start_time = time.time()\n    extraction_stream = client.chat.completions.create_fn(\n        model=model,\n        messages=[\n            {\"role\": \"system\", \"content\": content},\n            {\n                \"role\": \"user\",\n                \"content\": \"give me a harry pottery character in json, name, role, age\",\n            },\n        ],\n        stream=True,\n    )\n\n    collected_messages = [chunk.choices[0].delta.content for chunk in extraction_stream]\n    collected_messages = [m for m in collected_messages if m is not None]\n    collected_messages = \"\".join(collected_messages)\n    User.model_validate_json(collected_messages)\n    end_time = time.time() - start_time\n\n    output_tokens = num_tokens_from_string(collected_messages, model)\n    char_per_sec = output_tokens / end_time\n    return char_per_sec\n\n\ndef benchmark_partial_streaming(model=\"gpt-4\"):\n    start_time = time.time()\n    extraction_stream = client.chat.completions.create_partial(\n        model=model,\n        response_model=User,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"give me a harry pottery character in json, name, role, age\",\n            }\n        ],\n        stream=True,\n    )\n\n    for chunk in extraction_stream:  # noqa: B007\n        pass\n    end_time = time.time() - start_time\n\n    output_tokens = num_tokens_from_string(chunk.model_dump_json(), model)\n    char_per_sec = output_tokens / end_time\n    return char_per_sec\n\n\nif __name__ == \"__main__\":\n    partial_times = [\n        benchmark_partial_streaming(model=\"gpt-3.5-turbo-1106\") for _ in range(10)\n    ]\n    avg_partial_time = sum(partial_times) / len(partial_times)\n\n    raw_times = [benchmark_raw_stream(model=\"gpt-3.5-turbo\") for _ in range(10)]\n    avg_raw_time = sum(raw_times) / len(raw_times)\n    print(f\"Raw streaming: {avg_raw_time:.2f} tokens/sec\")\n\n    print(f\"Partial streaming: {avg_partial_time:.2f} token/sec\")\n    print(f\"Overhead: {avg_partial_time / avg_raw_time:.2f}x\")\n\n    \"\"\"OLD IMPLEMENTATION\n    Raw streaming: 35.73 tokens/sec\n    Partial streaming: 24.42 token/sec\n    Overhead: 0.68x\n    \"\"\"\n\n    \"\"\"NEW IMPLEMENTATION\n    Raw streaming: 35.77 tokens/sec\n    Partial streaming: 31.58 token/sec\n    Overhead: 0.88x\n    \"\"\"\n"
  },
  {
    "path": "examples/partial_streaming/run.py",
    "content": "# Part of this code is adapted from the following examples from OpenAI Cookbook:\n# https://cookbook.openai.com/examples/how_to_stream_completions\n# https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb\nimport instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\nclient = instructor.from_openai(OpenAI(), mode=instructor.Mode.TOOLS)\n\n\nclass User(BaseModel):\n    name: str\n    role: str\n\n\nextraction_stream = client.chat.completions.create_partial(\n    model=\"gpt-4\",\n    response_model=User,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"give me a harry pottery character in json, name, role, age\",\n        }\n    ],\n)\n\nfor chunk in extraction_stream:\n    print(chunk)\n"
  },
  {
    "path": "examples/patching/anyscale.py",
    "content": "import os\nimport instructor\n\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\n\n# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.acreate methods. to support response_model parameter\nclient = instructor.from_openai(\n    OpenAI(\n        base_url=\"https://api.endpoints.anyscale.com/v1\",\n        api_key=os.environ[\"ANYSCALE_API_KEY\"],\n    ),\n    mode=instructor.Mode.JSON_SCHEMA,\n)\n\n\n# Now, we can use the response_model parameter using only a base model\n# rather than having to use the ResponseSchema class\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.chat.completions.create(\n    model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)  # type: ignore\n\nprint(user)\n{\n    \"name\": \"Jason\",\n    \"age\": 25,\n}\n"
  },
  {
    "path": "examples/patching/oai.py",
    "content": "import instructor\n\nfrom openai import OpenAI\nfrom pydantic import BaseModel\n\n\n# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.acreate methods. to support response_model parameter\nclient = instructor.from_openai(\n    OpenAI(),\n    mode=instructor.Mode.TOOLS,\n)\n\n\n# Now, we can use the response_model parameter using only a base model\n# rather than having to use the ResponseSchema class\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.chat.completions.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)  # type: ignore\n\nprint(user)\n{\n    \"name\": \"Jason\",\n    \"age\": 25,\n}\n"
  },
  {
    "path": "examples/patching/pcalls.py",
    "content": "from typing import Literal, Union\nfrom collections.abc import Iterable\nfrom pydantic import BaseModel\nfrom instructor import ResponseSchema\n\nimport time\nimport openai\nimport instructor\n\n\nclient = openai.OpenAI()\n\n\nclass Weather(ResponseSchema):\n    location: str\n    units: Literal[\"imperial\", \"metric\"]\n\n\nclass GoogleSearch(ResponseSchema):\n    query: str\n\n\nif __name__ == \"__main__\":\n\n    class Query(BaseModel):\n        query: list[Union[Weather, GoogleSearch]]\n\n    client = instructor.from_openai(client, mode=instructor.Mode.PARALLEL_TOOLS)\n\n    start = time.perf_counter()\n    resp = client.chat.completions.create(\n        model=\"gpt-4-turbo-preview\",\n        messages=[\n            {\"role\": \"system\", \"content\": \"You must always use tools\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto and dallas and who won the super bowl?\",\n            },\n        ],\n        response_model=Iterable[Union[Weather, GoogleSearch]],\n    )\n    print(f\"# Time: {time.perf_counter() - start:.2f}\")\n\n    print(\"# Instructor: Question with Toronto and Super Bowl\")\n    print([model for model in resp])\n\n    start = time.perf_counter()\n    resp = client.chat.completions.create(\n        model=\"gpt-4-turbo-preview\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto and dallas?\",\n            },\n        ],\n        tools=[\n            {\"type\": \"function\", \"function\": Weather.openai_schema},\n            {\"type\": \"function\", \"function\": GoogleSearch.openai_schema},\n        ],\n        tool_choice=\"auto\",\n    )\n    print(f\"# Time: {time.perf_counter() - start:.2f}\")\n\n    print(\"# Question with Toronto and Dallas\")\n    for tool_call in resp.choices[0].message.tool_calls:\n        print(tool_call.model_dump_json(indent=2))\n\n    start = time.perf_counter()\n    resp = client.chat.completions.create(\n        model=\"gpt-4-turbo-preview\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What is the weather in toronto? and who won the super bowl?\",\n            },\n        ],\n        tools=[\n            {\"type\": \"function\", \"function\": Weather.openai_schema},\n            {\"type\": \"function\", \"function\": GoogleSearch.openai_schema},\n        ],\n        tool_choice=\"auto\",\n    )\n    print(f\"# Time: {time.perf_counter() - start:.2f}\")\n\n    print(\"# Question with Toronto and Super Bowl\")\n    for tool_call in resp.choices[0].message.tool_calls:\n        print(tool_call.model_dump_json(indent=2))\n"
  },
  {
    "path": "examples/patching/together.py",
    "content": "import os\nimport openai\nfrom pydantic import BaseModel\nimport instructor\n\nclient = openai.OpenAI(\n    base_url=\"https://api.together.xyz/v1\",\n    api_key=os.environ[\"TOGETHER_API_KEY\"],\n)\n\n\n# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.acreate methods. to support response_model parameter\nclient = instructor.from_openai(client, mode=instructor.Mode.TOOLS)\n\n\n# Now, we can use the response_model parameter using only a base model\n# rather than having to use the ResponseSchema class\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nuser: UserExtract = client.chat.completions.create(\n    model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n    response_model=UserExtract,\n    messages=[\n        {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n    ],\n)  # type: ignore\n\nprint(user.model_dump_json(indent=2))\n{\n    \"name\": \"Jason\",\n    \"age\": 25,\n}\n"
  },
  {
    "path": "examples/proscons/run.py",
    "content": "from openai import OpenAI\nfrom pydantic import BaseModel, Field\n\nimport instructor\n\n\nclass Character(BaseModel):\n    name: str\n    age: int\n    fact: list[str] = Field(..., description=\"A list of facts about the character\")\n\n\n# enables `response_model` in create call\nclient = instructor.from_openai(\n    OpenAI(\n        base_url=\"http://localhost:11434/v1\",\n        api_key=\"ollama\",  # required, but unused\n    ),\n    mode=instructor.Mode.JSON,\n)\n\nresp = client.chat.completions.create(\n    model=\"llama2\",\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Tell me about the Harry Potter\",\n        }\n    ],\n    response_model=Character,\n)\nprint(resp.model_dump_json(indent=2))\n\"\"\" \n{\n  \"name\": \"Harry James Potter\",\n  \"age\": 37,\n  \"fact\": [\n    \"He is the chosen one.\",\n    \"He has a lightning-shaped scar on his forehead.\",\n    \"He is the son of James and Lily Potter.\",\n    \"He attended Hogwarts School of Witchcraft and Wizardry.\",\n    \"He is a skilled wizard and sorcerer.\",\n    \"He fought against Lord Voldemort and his followers.\",\n    \"He has a pet owl named Snowy.\"\n  ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/query_planner_execution/diagram.py",
    "content": "from erdantic import erd\n\nfrom query_planner_execution import QueryPlan\n\ndiagram = erd.create(QueryPlan)\ndiagram.draw(\"examples/query_planner_execution/schema.png\")\n"
  },
  {
    "path": "examples/query_planner_execution/query_planner_execution.py",
    "content": "import asyncio\nimport enum\nimport instructor\n\nfrom openai import OpenAI\nfrom pydantic import Field, BaseModel\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass QueryType(str, enum.Enum):\n    \"\"\"\n    Enumeration representing the types of queries that can be asked to a question answer system.\n    \"\"\"\n\n    # When i call it anything beyond 'merge multiple responses' the accuracy drops significantly.\n    SINGLE_QUESTION = \"SINGLE\"\n    MERGE_MULTIPLE_RESPONSES = \"MERGE_MULTIPLE_RESPONSES\"\n\n\nclass ComputeQuery(BaseModel):\n    \"\"\"\n    Models a computation of a query, assume this can be some RAG system like llamaindex\n    \"\"\"\n\n    query: str\n    response: str = \"...\"\n\n\nclass MergedResponses(BaseModel):\n    \"\"\"\n    Models a merged response of multiple queries.\n    Currently we just concatinate them but we can do much more complex things.\n    \"\"\"\n\n    responses: list[ComputeQuery]\n\n\nclass Query(BaseModel):\n    \"\"\"\n    Class representing a single question in a question answer subquery.\n    Can be either a single question or a multi question merge.\n    \"\"\"\n\n    id: int = Field(..., description=\"Unique id of the query\")\n    question: str = Field(\n        ...,\n        description=\"Question we are asking using a question answer system, if we are asking multiple questions, this question is asked by also providing the answers to the sub questions\",\n    )\n    dependancies: list[int] = Field(\n        default_factory=list,\n        description=\"List of sub questions that need to be answered before we can ask the question. Use a subquery when anything may be unknown, and we need to ask multiple questions to get the answer. Dependences must only be other queries.\",\n    )\n    node_type: QueryType = Field(\n        default=QueryType.SINGLE_QUESTION,\n        description=\"Type of question we are asking, either a single question or a multi question merge when there are multiple questions\",\n    )\n\n    async def execute(self, dependency_func):\n        print(\"Executing\", \"`self.question`\")\n        print(\"Executing with\", len(self.dependancies), \"dependancies\")\n\n        if self.node_type == QueryType.SINGLE_QUESTION:\n            resp = ComputeQuery(\n                query=self.question,\n            )\n            await asyncio.sleep(1)\n            pprint(resp.model_dump())\n            return resp\n\n        sub_queries = dependency_func(self.dependancies)\n        computed_queries = await asyncio.gather(\n            *[q.execute(dependency_func=dependency_func) for q in sub_queries]\n        )\n        sub_answers = MergedResponses(responses=computed_queries)\n        merged_query = f\"{self.question}\\nContext: {sub_answers.model_dump_json()}\"\n        resp = ComputeQuery(\n            query=merged_query,\n        )\n        await asyncio.sleep(2)\n        pprint(resp.model_dump())\n        return resp\n\n\nclass QueryPlan(BaseModel):\n    \"\"\"\n    Container class representing a tree of questions to ask a question answer system.\n    and its dependencies. Make sure every question is in the tree, and every question is asked only once.\n    \"\"\"\n\n    query_graph: list[Query] = Field(\n        ..., description=\"The original question we are asking\"\n    )\n\n    async def execute(self):\n        # this should be done with a topological sort, but this is easier to understand\n        original_question = self.query_graph[-1]\n        print(f\"Executing query plan from `{original_question.question}`\")\n        return await original_question.execute(dependency_func=self.dependencies)\n\n    def dependencies(self, idz: list[int]) -> list[Query]:\n        \"\"\"\n        Returns the dependencies of the query with the given id.\n        \"\"\"\n        return [q for q in self.query_graph if q.id in idz]\n\n\nQuery.model_rebuild()\nQueryPlan.model_rebuild()\n\n\ndef query_planner(question: str, plan=False) -> QueryPlan:\n    PLANNING_MODEL = \"gpt-4\"\n    ANSWERING_MODEL = \"gpt-4o-mini\"\n\n    messages = [\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a world class query planning algorithm capable of breaking apart questions into its depenencies queries such that the answers can be used to inform the parent question. Do not answer the questions, simply provide correct compute graph with good specific questions to ask and relevant dependencies. Before you call the function, think step by step to get a better understanding the problem.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"Consider: {question}\\nGenerate the correct query plan.\",\n        },\n    ]\n\n    if plan:\n        messages.append(\n            {\n                \"role\": \"assistant\",\n                \"content\": \"Lets think step by step to find correct set of queries and its dependencies and not make any assuptions on what is known.\",\n            },\n        )\n        completion = client.chat.completions.create(\n            model=PLANNING_MODEL, temperature=0, messages=messages, max_tokens=1000\n        )\n\n        messages.append(completion[\"choices\"][0][\"message\"])\n\n        messages.append(\n            {\n                \"role\": \"user\",\n                \"content\": \"Using that information produce the complete and correct query plan.\",\n            }\n        )\n\n    completion = client.chat.completions.create(\n        model=ANSWERING_MODEL,\n        temperature=0,\n        functions=[QueryPlan.openai_schema],\n        function_call={\"name\": QueryPlan.openai_schema[\"name\"]},\n        messages=messages,\n        max_tokens=1000,\n    )\n    root = QueryPlan.from_response(completion)\n    return root\n\n\nif __name__ == \"__main__\":\n    from pprint import pprint\n\n    plan = query_planner(\n        \"What is the difference in populations of Canada and the Jason's home country?\",\n        plan=False,\n    )\n    pprint(plan.dict())\n    \"\"\"\n    {'query_graph': [{'dependancies': [],\n                    'id': 1,\n                    'node_type': <QueryType.SINGLE_QUESTION: 'SINGLE'>,\n                    'question': \"Identify Jason's home country\"},\n                    {'dependancies': [],\n                    'id': 2,\n                    'node_type': <QueryType.SINGLE_QUESTION: 'SINGLE'>,\n                    'question': 'Find the population of Canada'},\n                    {'dependancies': [1],\n                    'id': 3,\n                    'node_type': <QueryType.SINGLE_QUESTION: 'SINGLE'>,\n                    'question': \"Find the population of Jason's home country\"},\n                    {'dependancies': [2, 3],\n                    'id': 4,\n                    'node_type': <QueryType.SINGLE_QUESTION: 'SINGLE'>,\n                    'question': 'Calculate the difference in populations between '\n                                \"Canada and Jason's home country\"}]}    \n    \"\"\"\n\n    asyncio.run(plan.execute())\n    \"\"\"\n    Executing query plan from `What is the difference in populations of Canada and Jason's home country?`\n    Executing `What is the difference in populations of Canada and Jason's home country?`\n    Executing with 2 dependancies\n    Executing `What is the population of Canada?`\n    Executing `What is the population of Jason's home country?`\n    {'query': 'What is the population of Canada?', 'response': '...'}\n    {'query': \"What is the population of Jason's home country?\", 'response': '...'}\n    {'query': \"What is the difference in populations of Canada and Jason's home \"\n            'country?'\n            'Context: {\"responses\": [{\"query\": \"What is the population of '\n            'Canada?\", \"response\": \"...\"}, {\"query\": \"What is the population of '\n            'Jason's home country?\", \"response\": \"...\"}]}',\n    'response': '...'}\n    \"\"\"\n"
  },
  {
    "path": "examples/recursive_filepaths/diagram.py",
    "content": "import erdantic as erd\n\nfrom parse_recursive_paths import DirectoryTree\n\ndiagram = erd.create(DirectoryTree)\ndiagram.draw(\"examples/parse_recursive_paths/schema.png\")\n"
  },
  {
    "path": "examples/recursive_filepaths/parse_recursive_paths.py",
    "content": "import enum\nimport instructor\n\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\n\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass NodeType(str, enum.Enum):\n    \"\"\"Enumeration representing the types of nodes in a filesystem.\"\"\"\n\n    FILE = \"file\"\n    FOLDER = \"folder\"\n\n\nclass Node(BaseModel):\n    \"\"\"\n    Class representing a single node in a filesystem. Can be either a file or a folder.\n    Note that a file cannot have children, but a folder can.\n\n    Args:\n        name (str): The name of the node.\n        children (List[Node]): The list of child nodes (if any).\n        node_type (NodeType): The type of the node, either a file or a folder.\n\n    Methods:\n        print_paths: Prints the path of the node and its children.\n    \"\"\"\n\n    name: str = Field(..., description=\"Name of the folder\")\n    children: list[\"Node\"] = Field(\n        default_factory=list,\n        description=\"List of children nodes, only applicable for folders, files cannot have children\",\n    )\n    node_type: NodeType = Field(\n        default=NodeType.FILE,\n        description=\"Either a file or folder, use the name to determine which it could be\",\n    )\n\n    def print_paths(self, parent_path=\"\"):\n        \"\"\"Prints the path of the node and its children.\"\"\"\n\n        if self.node_type == NodeType.FOLDER:\n            path = f\"{parent_path}/{self.name}\" if parent_path != \"\" else self.name\n\n            print(path, self.node_type)\n\n            if self.children is not None:\n                for child in self.children:\n                    child.print_paths(path)\n        else:\n            print(f\"{parent_path}/{self.name}\", self.node_type)\n\n\nclass DirectoryTree(BaseModel):\n    \"\"\"\n    Container class representing a directory tree.\n\n    Args:\n        root (Node): The root node of the tree.\n\n    Methods:\n        print_paths: Prints the paths of the root node and its children.\n    \"\"\"\n\n    root: Node = Field(..., description=\"Root folder of the directory tree\")\n\n    def print_paths(self):\n        \"\"\"Prints the paths of the root node and its children.\"\"\"\n\n        self.root.print_paths()\n\n\nNode.model_rebuild()\nDirectoryTree.model_rebuild()\n\n\ndef parse_tree_to_filesystem(data: str) -> DirectoryTree:\n    \"\"\"\n    Convert a string representing a directory tree into a filesystem structure\n    using OpenAI's GPT-3 model.\n\n    Args:\n        data (str): The string to convert into a filesystem.\n\n    Returns:\n        DirectoryTree: The directory tree representing the filesystem.\n    \"\"\"\n\n    completion = client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=DirectoryTree,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a perfect file system parsing algorithm. You are given a string representing a directory tree. You must return the correct filesystem structure.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Consider the data below:\\n{data} and return the correctly labeled filesystem\",\n            },\n        ],\n        max_tokens=1000,\n    )\n    root = DirectoryTree.from_response(completion)\n    return root\n\n\nif __name__ == \"__main__\":\n    root = parse_tree_to_filesystem(\n        \"\"\"\n        root\n        ├── folder1\n        │   ├── file1.txt\n        │   └── file2.txt\n        └── folder2\n            ├── file3.txt\n            └── subfolder1\n                └── file4.txt\n        \"\"\"\n    )\n    root.print_paths()\n    # >>> root                                  NodeType.FOLDER\n    # >>> root/folder1                          NodeType.FOLDER\n    # >>> root/folder1/file1.txt                NodeType.FILE\n    # >>> root/folder1/file2.txt                NodeType.FILE\n    # >>> root/folder2                          NodeType.FOLDER\n    # >>> root/folder2/file3.txt                NodeType.FILE\n    # >>> root/folder2/subfolder1               NodeType.FOLDER\n    # >>> root/folder2/subfolder1/file4.txt     NodeType.FILE\n"
  },
  {
    "path": "examples/reranker/run.py",
    "content": "import instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field, field_validator, ValidationInfo\n\n# Initialize the OpenAI client with Instructor\nclient = instructor.from_openai(OpenAI())\n\n\nclass Label(BaseModel):\n    chunk_id: str = Field(description=\"The unique identifier of the text chunk\")\n    chain_of_thought: str = Field(\n        description=\"The reasoning process used to evaluate the relevance\"\n    )\n    relevancy: int = Field(\n        description=\"Relevancy score from 0 to 10, where 10 is most relevant\",\n        ge=0,\n        le=10,\n    )\n\n    @field_validator(\"chunk_id\")\n    @classmethod\n    def validate_chunk_id(cls, v: str, info: ValidationInfo) -> str:\n        context = info.context\n        chunks = context.get(\"chunks\", [])\n        if v not in [chunk[\"id\"] for chunk in chunks]:\n            raise ValueError(\n                f\"Chunk with id {v} not found, must be one of {[chunk['id'] for chunk in chunks]}\"\n            )\n        return v\n\n\nclass RerankedResults(BaseModel):\n    labels: list[Label] = Field(description=\"List of labeled and ranked chunks\")\n\n    @field_validator(\"labels\")\n    @classmethod\n    def model_validate(cls, v: list[Label]) -> list[Label]:\n        return sorted(v, key=lambda x: x.relevancy, reverse=True)\n\n\ndef rerank_results(query: str, chunks: list[dict]) -> RerankedResults:\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=RerankedResults,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"\n                You are an expert search result ranker. Your task is to evaluate the relevance of each text chunk to the given query and assign a relevancy score.\n\n                For each chunk:\n                1. Analyze its content in relation to the query.\n                2. Provide a chain of thought explaining your reasoning.\n                3. Assign a relevancy score from 0 to 10, where 10 is most relevant.\n\n                Be objective and consistent in your evaluations.\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n                <query>{{ query }}</query>\n\n                <chunks_to_rank>\n                {% for chunk in chunks %}\n                <chunk chunk_id=\"{{ chunk.id }}\">\n                    {{ chunk.text }}\n                </chunk>\n                {% endfor %}\n                </chunks_to_rank>\n\n                Please provide a RerankedResults object with a Label for each chunk.\n                \"\"\",\n            },\n        ],\n        context={\"query\": query, \"chunks\": chunks},\n    )\n\n\ndef main():\n    # Sample query and chunks\n    query = \"What are the health benefits of regular exercise?\"\n    chunks = [\n        {\n            \"id\": \"a1b2c3d4-e5f6-7890-abcd-ef1234567890\",\n            \"text\": \"Regular exercise can improve cardiovascular health and reduce the risk of heart disease.\",\n        },\n        {\n            \"id\": \"b2c3d4e5-f6g7-8901-bcde-fg2345678901\",\n            \"text\": \"The price of gym memberships varies widely depending on location and facilities.\",\n        },\n        {\n            \"id\": \"c3d4e5f6-g7h8-9012-cdef-gh3456789012\",\n            \"text\": \"Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.\",\n        },\n        {\n            \"id\": \"d4e5f6g7-h8i9-0123-defg-hi4567890123\",\n            \"text\": \"Proper nutrition is essential for maintaining a healthy lifestyle.\",\n        },\n        {\n            \"id\": \"e5f6g7h8-i9j0-1234-efgh-ij5678901234\",\n            \"text\": \"Strength training can increase muscle mass and improve bone density, especially important as we age.\",\n        },\n    ]\n\n    # Rerank the results\n    results = rerank_results(query, chunks)\n\n    # Print the reranked results\n    print(\"Reranked results:\")\n    for label in results.labels:\n        print(f\"Chunk {label.chunk_id} (Relevancy: {label.relevancy}):\")\n        print(\n            f\"Text: {next(chunk['text'] for chunk in chunks if chunk['id'] == label.chunk_id)}\"\n        )\n        print(f\"Reasoning: {label.chain_of_thought}\")\n        print()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "examples/resolving-complex-entities/run.py",
    "content": "from graphviz import Digraph\nfrom pydantic import BaseModel, Field\n\nimport instructor\nfrom openai import OpenAI\n\nclient = OpenAI()\n\n# Patch openai to use instructor\n# allows for response_model\ninstructor.from_openai()\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n    resolved_absolute_value: str\n\n\nclass Entity(BaseModel):\n    id: int = Field(\n        ...,\n        description=\"Unique identifier for the entity, used for deduplication, design a scheme allows multiple entities\",\n    )\n    subquote_string: list[str] = Field(\n        ...,\n        description=\"Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution\",\n    )\n    entity_title: str\n    properties: list[Property] = Field(\n        ..., description=\"List of properties of the entity\"\n    )\n    dependencies: list[int] = Field(\n        ...,\n        description=\"List of entity ids that this entity depends  or relies on to resolve it\",\n    )\n\n\nclass DocumentExtraction(BaseModel):\n    entities: list[Entity] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be its separate object with a body and a list of sources\",\n    )\n\n\ndef ask_ai(content) -> DocumentExtraction:\n    resp: DocumentExtraction = client.chat.completions.create(\n        model=\"gpt-4\",\n        response_model=DocumentExtraction,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a perfect entity resolution system that extracts facts from the document. Extract and resolve a list of entities from the following document:\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n    )  # type: ignore\n    return resp\n\n\ndef generate_html_label(entity: Entity) -> str:\n    rows = [\n        f\"<tr><td>{prop.key}</td><td>{prop.resolved_absolute_value}</td></tr>\"\n        for prop in entity.properties\n    ]\n    table_rows = \"\".join(rows)\n    return f\"\"\"<\n    <table border=\"0\" cellborder=\"1\" cellspacing=\"0\">\n    <tr><td colspan=\"2\"><b>{entity.entity_title}</b></td></tr>\n    {table_rows}\n    </table>>\"\"\"\n\n\ndef generate_graph(data: DocumentExtraction):\n    dot = Digraph(comment=\"Entity Graph\", node_attr={\"shape\": \"plaintext\"})\n\n    # Add nodes\n    for entity in data.entities:\n        label = generate_html_label(entity)\n        dot.node(str(entity.id), label)\n\n    # Add edges\n    for entity in data.entities:\n        for dep_id in entity.dependencies:\n            dot.edge(str(entity.id), str(dep_id))\n\n    # Render graph\n    dot.render(\"entity.gz\", view=True)\n\n\ncontent = \"\"\"\nSample Legal Contract\nAgreement Contract\n\nThis Agreement is made and entered into on 2020-01-01 by and between Company A (\"the Client\") and Company B (\"the Service Provider\").\n\nArticle 1: Scope of Work\n\nThe Service Provider will deliver the software product to the Client 30 days after the agreement date.\n\nArticle 2: Payment Terms\n\nThe total payment for the service is $50,000.\nAn initial payment of $10,000 will be made within 7 days of the the signed date.\nThe final payment will be due 45 days after [SignDate].\n\nArticle 3: Confidentiality\n\nThe parties agree not to disclose any confidential information received from the other party for 3 months after the final payment date.\n\nArticle 4: Termination\n\nThe contract can be terminated with a 30-day notice, unless there are outstanding obligations that must be fulfilled after the [DeliveryDate].\n\"\"\"\n\nmodel = ask_ai(content)\ngenerate_graph(model)\n"
  },
  {
    "path": "examples/retry/run.py",
    "content": "from pydantic import BaseModel, field_validator\nfrom openai import OpenAI\nimport instructor\nimport tenacity\n\nclient = OpenAI()\nclient = instructor.from_openai(client)\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"name\")\n    def name_is_uppercase(cls, v: str):\n        assert v.isupper(), \"Name must be uppercase\"\n        return v\n\n\nresp = client.messages.create(\n    model=\"gpt-3.5-turbo\",\n    max_tokens=1024,\n    max_retries=tenacity.Retrying(\n        stop=tenacity.stop_after_attempt(3),\n        before=lambda _: print(\"before:\", _),\n        after=lambda _: print(\"after:\", _),\n    ),\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract John is 18 years old.\",\n        }\n    ],\n    response_model=User,\n)  # type: ignore\n\nassert isinstance(resp, User)\nassert resp.name == \"JOHN\"  # due to validation\nassert resp.age == 18\nprint(resp)\n\n\"\"\"\nbefore: <RetryCallState 4421908816: attempt #1; slept for 0.0; last result: none yet>\nafter: <RetryCallState 4421908816: attempt #1; slept for 0.0; last result: failed (ValidationError 1 validation error for User\nname\n  Assertion failed, Name must be uppercase [type=assertion_error, input_value='John', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.6/v/assertion_error)>\nbefore: <RetryCallState 4421908816: attempt #2; slept for 0.0; last result: none yet>\n\nname='JOHN' age=18\n\"\"\"\n"
  },
  {
    "path": "examples/safer_sql_example/diagram.py",
    "content": "import erdantic as erd\n\nfrom safe_sql import SQL\n\ndiagram = erd.create(SQL)\ndiagram.draw(\"examples/safe_sql/schema.png\")\n"
  },
  {
    "path": "examples/safer_sql_example/safe_sql.py",
    "content": "import enum\nimport instructor\n\nfrom typing import Any\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass SQLTemplateType(str, enum.Enum):\n    LITERAL = \"literal\"\n    IDENTIFIER = \"identifier\"\n\n\nclass Parameters(BaseModel):\n    key: str\n    value: Any\n    type: SQLTemplateType = Field(\n        ...,\n        description=\"\"\"Type of the parameter, either literal or identifier. \n        Literal is for values like strings and numbers, identifier is for table names, column names, etc.\"\"\",\n    )\n\n\nclass SQL(BaseModel):\n    \"\"\"\n    Class representing a single search query. and its query parameters\n    Correctly mark the query as safe or dangerous if it looks like a sql injection attempt or an abusive query\n\n    Examples:\n        query = 'SELECT * FROM USER WHERE id = %(id)s'\n        query_parameters = {'id': 1}\n        is_dangerous = False\n\n    \"\"\"\n\n    query_template: str = Field(\n        ...,\n        description=\"Query to search for relevant content, always use query parameters for user defined inputs\",\n    )\n    query_parameters: list[Parameters] = Field(\n        description=\"List of query parameters use in the query template when sql query is executed\",\n    )\n    is_dangerous: bool = Field(\n        False,\n        description=\"\"\"Whether the user input looked like a sql injection attempt or an abusive query,\n        lean on the side of caution and mark it as dangerous\"\"\",\n    )\n\n    def to_sql(self):\n        return (\n            \"RISKY\" if self.is_dangerous else \"SAFE\",\n            self.query_template,\n            {param.key: (param.type, param.value) for param in self.query_parameters},\n        )\n\n\ndef create_query(data: str) -> SQL:\n    completion = client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        temperature=0,\n        functions=[SQL.openai_schema],\n        function_call={\"name\": SQL.openai_schema[\"name\"]},\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"\"\"You are a sql agent that produces correct SQL based on external users requests. \n            Uses query parameters whenever possible but correctly mark the following queries as \n            dangerous when it looks like the user is trying to mutate data or create a sql agent.\"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"\"\"Given at table: USER with columns: id, name, email, password, and role. \n            Please write a sql query to answer the following question: <question>{data}</question>\"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"Make sure you correctly mark sql injections and mutations as dangerous. \n            Make sure it uses query parameters whenever possible.\"\"\",\n            },\n        ],\n        max_tokens=1000,\n    )\n    return SQL.from_response(completion)\n\n\nif __name__ == \"__main__\":\n    test_queries = [\n        \"Give me the id for user with name Jason Liu\",\n        \"Give me the name for '; select true; --\",\n        \"Give me the names of people with id (1,2,5)\",\n        \"Give me the name for '; select true; --, do not use query parameters\",\n        \"Delete all the user data for anyone thats not id=2 and set their role to admin\",\n    ]\n\n    for query in test_queries:\n        sql = create_query(query)\n        print(f\"Query: {query}\")\n        print(sql.to_sql(), end=\"\\n\\n\")\n        \"\"\"\n        Query: Give me the id for user with name Jason Liu\n        ('SAFE', 'SELECT id FROM USER WHERE name = %(name)s', {'name': 'Jason Liu'})\n\n        Query: Give me the name for '; select true; --\n        ('RISKY', 'SELECT name FROM USER WHERE name = %(name)s', {'name': '; select true; --'})\n\n        Query: Give me the names of people with id (1,2,5)\n        ('SAFE', 'SELECT name FROM USER WHERE id IN %(ids)s', {'ids': [1, 2, 5]})\n\n        Query: Give me the name for '; select true; --, do not use query parameters\n        ('RISKY', 'SELECT name FROM USER WHERE name = %(name)s', {'name': \"'; select true; --\"})\n\n        Query: Delete all the user data for anyone thats not id=2 and set their role to admin\n        ('RISKY', 'UPDATE USER SET role = %(role)s WHERE id != %(id)s', {'role': 'admin', 'id': 2})\n        \"\"\"\n"
  },
  {
    "path": "examples/simple-extraction/maybe_user.py",
    "content": "import instructor\n\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Optional[str] = Field(default=None)\n\n\nMaybeUser = instructor.Maybe(UserDetail)\n\n\ndef get_user_detail(string) -> MaybeUser:  # type: ignore\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=MaybeUser,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Get user details for {string}\",\n            },\n        ],\n    )  # type: ignore\n\n\nuser = get_user_detail(\"Jason is 25 years old\")\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"user\": {\n    \"age\": 25,\n    \"name\": \"Jason\",\n    \"role\": null\n  },\n  \"error\": false,\n  \"message\": null\n}\n\"\"\"\n\nuser = get_user_detail(\"Jason is a 25 years old scientist\")\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"user\": {\n    \"age\": 25,\n    \"name\": \"Jason\",\n    \"role\": \"scientist\"\n    },\n  \"error\": false,\n  \"message\": null\n}\n\"\"\"\n\n# ! notice that the string should not contain anything\n# ! but a user and age was still extracted ?!\nuser = get_user_detail(\"User not found\")\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"user\": null,\n  \"error\": true,\n  \"message\": \"User not found\"\n}\n\"\"\"\n\n# ! due to the __bool__ method, you can use the MaybeUser object as a boolean\n\nif not user:\n    print(\"Detected error\")\n\"\"\"\nDetected error\n\"\"\"\n"
  },
  {
    "path": "examples/simple-extraction/user.py",
    "content": "import instructor\n\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n    role: Optional[str] = Field(default=None)\n\n\ndef get_user_detail(string) -> UserDetail:\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=UserDetail,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Get user details for {string}\",\n            },\n        ],\n    )  # type: ignore\n\n\nuser = get_user_detail(\"Jason is 25 years old\")\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"age\": 25,\n  \"name\": \"Jason\",\n  \"role\": null\n}\n\"\"\"\n\nuser = get_user_detail(\"Jason is a 25 years old scientist\")\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"age\": 25,\n  \"name\": \"Jason\",\n  \"role\": \"scientist\"\n}\n\"\"\"\n\n# ! notice that the string should not contain anything\n# ! but a user and age was still extracted ?!\nuser = get_user_detail(\"User not found\")\nprint(user.model_dump_json(indent=2))\n\"\"\"\n{\n  \"age\": 25,\n  \"name\": \"John Doe\",\n  \"role\": \"null\"\n}\n\"\"\"\n"
  },
  {
    "path": "examples/situate_context/run.py",
    "content": "from instructor import AsyncInstructor, Mode, patch\nfrom anthropic import AsyncAnthropic\nfrom pydantic import BaseModel, Field\n\n# Initialize the Instructor client with prompt caching\nclient = AsyncInstructor(\n    client=AsyncAnthropic(),\n    create=patch(\n        create=AsyncAnthropic().beta.prompt_caching.messages.create,\n        mode=Mode.TOOLS,\n    ),\n    mode=Mode.TOOLS,\n)\n\n\nclass SituatedContext(BaseModel):\n    \"\"\"\n    The context to situate the chunk within the document. The situated context should be as long as the original chunk.\n\n    Example:\n       - original chunk: \"The company's revenue grew by 3% over the previous quarter.\"\n       - situated context: \"This chunk is from an SEC filing on ACME corp's performance in Q2 2023; the previous quarter's revenue was $314 million. The company's revenue grew by 3% over the previous quarter.\"\n    \"\"\"\n\n    situated_context: str = Field(\n        ..., description=\"The situated context of the chunk within the document.\"\n    )\n\n\nasync def situate_context(doc: str, chunk: str) -> SituatedContext:\n    response = await client.chat.completions.create(\n        model=\"claude-3-haiku-20240307\",\n        max_tokens=1024,\n        temperature=0.0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                        <document>\n                        {{doc}}\n                        </document>\n                        \"\"\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"\n                        Here is the chunk we want to situate within the whole document\n                        <chunk>\n                        {{chunk}}\n                        </chunk>\n\n                        Please give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk.\n                        Answer only with the succinct context and nothing else.\n                        \"\"\",\n                    },\n                ],\n            }\n        ],\n        response_model=SituatedContext,\n        context={\n            \"doc\": doc,\n            \"chunk\": chunk,\n        },\n    )\n    return response\n\n\ndef chunking_function(\n    doc: str, chunk_size: int = 1000, overlap: int = 200\n) -> list[str]:\n    \"\"\"\n    Chunk the document into `chunk_size` character segments with `overlap` overlap.\n    \"\"\"\n    chunks = []\n    start = 0\n    while start < len(doc):\n        end = start + chunk_size\n        chunks.append(doc[start:end])\n        start += chunk_size - overlap\n    return chunks\n\n\nimport asyncio\n\n\nasync def process_chunk(doc: str, chunk: str) -> dict[str, str]:\n    \"\"\"\n    Process a single chunk by situating it within the context of the full document.\n\n    Args:\n    doc (str): The full document text\n    chunk (str): A chunk of the document\n\n    Returns:\n    Dict[str, str]: A dictionary containing the chunk and its situated context\n    \"\"\"\n    context = await situate_context(doc, chunk)\n    return {\"chunk\": chunk, \"context\": context}\n\n\nasync def process(\n    doc: str, chunk_size: int = 1000, overlap: int = 200\n) -> list[dict[str, str]]:\n    \"\"\"\n    Process the document by chunking it and situating each chunk within the context of the full document.\n    Uses asyncio.gather for concurrent processing.\n\n    Args:\n    doc (str): The full document text\n\n    Returns:\n    List[Dict[str, str]]: A list of dictionaries, each containing a chunk and its situated context\n    \"\"\"\n    chunks = chunking_function(doc, chunk_size, overlap)\n    tasks = [process_chunk(doc, chunk) for chunk in chunks]\n    results = await asyncio.gather(*tasks)\n    return results\n\n\nif __name__ == \"__main__\":\n    # Example usage\n    document = \"\"\"\n    ACME Corporation Financial Report for Fiscal Year 2023\n\n    Executive Summary:\n    ACME Corp. has demonstrated exceptional performance in the latest fiscal year, showcasing significant growth across multiple key areas. This report provides a comprehensive overview of our financial achievements, operational successes, and strategic outlook for the future.\n\n    Financial Highlights:\n    1. Revenue: The company's revenue experienced a robust growth of 12% compared to the previous fiscal year, reaching an impressive $1.3 billion. This figure represents a substantial 45% increase from three years ago, underscoring our consistent upward trajectory.\n\n    2. Net Income: Our net income for the fiscal year stood at $180 million, marking a 20% increase from the previous year and an even more impressive 60% rise from three years ago. This growth in net income reflects our improved operational efficiency and successful cost management strategies.\n\n    3. Earnings Per Share (EPS): The EPS for the year reached $4.50, showing a notable improvement from $3.75 in the previous year and $2.80 three years ago. This upward trend in EPS demonstrates our commitment to delivering increasing value to our shareholders.\n\n    4. Gross Margin: Our gross margin improved to 45% in the current fiscal year, up from 41% in the previous year and 38% three years ago, indicating enhanced production efficiency and effective pricing strategies.\n\n    5. Operating Cash Flow: We generated a strong operating cash flow of $250 million in the fiscal year, representing a 50% increase from three years ago.\n\n    Operational Highlights:\n    1. Product Launch: ACME's innovative XYZ product line, launched at the beginning of the fiscal year, has surpassed all expectations with an impressive 2 million units sold. This successful launch has significantly contributed to our revenue growth and market share expansion.\n\n    2. Market Expansion: In line with our global growth strategy, we have successfully penetrated ten new international markets during this fiscal year. These new markets have already made a substantial impact, contributing to 20% of the year's total revenue. This expansion not only diversifies our revenue streams but also strengthens our global presence.\n\n    3. Cost Optimization: The implementation of our cutting-edge AI-driven supply chain management system has yielded excellent results, leading to a 10% reduction in operational costs over the past three years. This initiative is part of our ongoing commitment to leveraging technology for improved efficiency and profitability.\n\n    4. Customer Satisfaction: Our customer satisfaction score has improved to 95%, up from 85% three years ago, reflecting our dedication to product quality and customer service excellence.\n\n    5. Sustainability Initiatives: We've made significant strides in our sustainability efforts, reducing our carbon footprint by 30% compared to three years ago. This aligns with our long-term goal of becoming a carbon-neutral organization by 2030.\n\n    Research and Development:\n    1. R&D Investment: We've increased our R&D spending by 50% compared to three years ago, focusing on next-generation technologies and sustainable product development.\n\n    2. Patent Filings: ACME filed 60 new patents in the fiscal year, bringing our total patent portfolio to over 1000, further solidifying our position as an industry innovator.\n\n    Looking Ahead:\n    Based on the strong fiscal year performance and positive market indicators, ACME Corp. is revising its five-year revenue growth forecast from 50% to a more ambitious 70-80%. This upward revision reflects our confidence in the company's growth trajectory and the effectiveness of our strategic initiatives.\n\n    Key focus areas for the coming years include:\n    1. Further expansion into emerging markets in Asia, Africa, and South America.\n    2. Continued investment in AI and machine learning to drive operational efficiencies.\n    3. Launch of our next-generation sustainable product line, scheduled for the upcoming fiscal year.\n    4. Enhancement of our digital transformation initiatives to improve customer experience and internal processes.\n\n    The company remains steadfastly committed to innovation, operational excellence, and sustainable practices to drive long-term growth and shareholder value. We are confident that our strategic initiatives, coupled with our strong market position, will enable us to capitalize on emerging opportunities and navigate potential challenges in the global business landscape.\n    In conclusion, ACME Corp.'s performance in Q2 2023 has set a solid foundation for continued success. We thank our employees, customers, and shareholders for their ongoing support and look forward to building on this momentum in the quarters to come.\n    \"\"\"\n\n    async def main():\n        import time\n\n        start_time = time.time()\n        processed_chunks = await process(document, chunk_size=800, overlap=200)\n        end_time = time.time()\n        print(f\"Time taken: {end_time - start_time} seconds\")\n        for i, item in enumerate(processed_chunks):\n            print(f\"Chunk {i + 1}:\")\n            print(f\"Text: {item['chunk']}...\")\n            print(f\"Context: {item['context']}\")\n            print()\n\n    asyncio.run(main())\n"
  },
  {
    "path": "examples/sqlmodel/run.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nSQLModel with Instructor - Comprehensive Example\n\nThis example demonstrates AI-powered database operations with advanced patterns.\n\nRequirements:\n    pip install instructor sqlmodel openai\n\nUsage:\n    python run.py\n\nNote: Make sure to set your OPENAI_API_KEY environment variable.\n\"\"\"\n\nimport asyncio\nimport logging\nimport time\nfrom datetime import datetime\nfrom functools import wraps\nfrom typing import Optional\nfrom uuid import UUID, uuid4\n\nimport instructor\nfrom openai import AsyncOpenAI, OpenAI\nfrom pydantic import validator\nfrom pydantic.json_schema import SkipJsonSchema\nfrom sqlmodel import Field, Session, SQLModel, create_engine, select, Relationship\n\n# Configure logging\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\n# Initialize clients\nsync_client = instructor.from_openai(OpenAI())\nasync_client = instructor.from_openai(AsyncOpenAI())\n\n# Database setup\nengine = create_engine(\"sqlite:///heroes_demo.db\", echo=False)\n\n\n# Performance monitoring decorator\ndef monitor_ai_calls(func):\n    @wraps(func)\n    async def async_wrapper(*args, **kwargs):\n        start_time = time.time()\n        result = await func(*args, **kwargs)\n        duration = time.time() - start_time\n        logger.info(f\"AI call took {duration:.2f} seconds\")\n        return result\n\n    @wraps(func)\n    def sync_wrapper(*args, **kwargs):\n        start_time = time.time()\n        result = func(*args, **kwargs)\n        duration = time.time() - start_time\n        logger.info(f\"AI call took {duration:.2f} seconds\")\n        return result\n\n    return async_wrapper if asyncio.iscoroutinefunction(func) else sync_wrapper\n\n\n# Models with relationships and advanced patterns\nclass Team(SQLModel, table=True):\n    \"\"\"Team model with relationship to heroes\"\"\"\n\n    id: Optional[int] = Field(default=None, primary_key=True)\n    name: str = Field(min_length=2, max_length=50)\n    city: str = Field(min_length=2, max_length=50)\n    founded_year: Optional[int] = Field(default=None, ge=1900, le=2024)\n\n    # Relationship to heroes\n    heroes: list[\"Hero\"] = Relationship(back_populates=\"team\")\n\n\nclass Hero(SQLModel, instructor.ResponseSchema, table=True):\n    \"\"\"Hero model with auto-generated fields and validation\"\"\"\n\n    __table_args__ = {\"extend_existing\": True}\n\n    # Auto-generated fields excluded from AI generation\n    id: SkipJsonSchema[Optional[int]] = Field(default=None, primary_key=True)\n    created_at: SkipJsonSchema[datetime] = Field(default_factory=datetime.utcnow)\n    uuid: SkipJsonSchema[UUID] = Field(default_factory=uuid4)\n\n    # AI-generated fields with validation\n    name: str = Field(min_length=2, max_length=50, description=\"Hero's public name\")\n    secret_name: str = Field(\n        min_length=2, max_length=50, description=\"Hero's secret identity\"\n    )\n    age: Optional[int] = Field(default=None, ge=16, le=100, description=\"Hero's age\")\n    power_level: int = Field(ge=1, le=100, description=\"Power level from 1-100\")\n    origin_story: str = Field(\n        min_length=10, max_length=200, description=\"Brief origin story\"\n    )\n\n    # Foreign key relationship\n    team_id: SkipJsonSchema[Optional[int]] = Field(default=None, foreign_key=\"team.id\")\n    team: Optional[Team] = Relationship(back_populates=\"heroes\")\n\n    @validator(\"name\")\n    def validate_name_format(cls, v):\n        \"\"\"Ensure hero name doesn't contain inappropriate words\"\"\"\n        forbidden_words = [\"villain\", \"evil\", \"bad\"]\n        if any(word in v.lower() for word in forbidden_words):\n            raise ValueError(f\"Hero name cannot contain: {', '.join(forbidden_words)}\")\n        return v\n\n\nclass Product(SQLModel, instructor.ResponseSchema, table=True):\n    \"\"\"Product model demonstrating different AI generation patterns\"\"\"\n\n    __table_args__ = {\"extend_existing\": True}\n\n    # Auto-generated fields\n    id: SkipJsonSchema[UUID] = Field(default_factory=uuid4, primary_key=True)\n    created_at: SkipJsonSchema[datetime] = Field(default_factory=datetime.utcnow)\n\n    # AI-generated fields\n    name: str = Field(description=\"Product name\")\n    description: str = Field(description=\"Detailed product description\")\n    price: float = Field(gt=0, description=\"Product price in USD\")\n    category: str = Field(description=\"Product category\")\n\n\n# Functions for AI data generation\n@monitor_ai_calls\ndef create_hero(prompt: str = \"Create a unique superhero\") -> Hero:\n    \"\"\"Generate a single hero using AI\"\"\"\n    try:\n        return sync_client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=Hero,\n            messages=[\n                {\"role\": \"user\", \"content\": prompt},\n            ],\n            max_retries=3,\n        )\n    except Exception as e:\n        logger.error(f\"Failed to create hero: {str(e)}\")\n        raise\n\n\n@monitor_ai_calls\nasync def create_hero_async(prompt: str = \"Create a unique superhero\") -> Hero:\n    \"\"\"Generate a single hero using AI (async)\"\"\"\n    try:\n        return await async_client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=Hero,\n            messages=[\n                {\"role\": \"user\", \"content\": prompt},\n            ],\n            max_retries=3,\n        )\n    except Exception as e:\n        logger.error(f\"Failed to create hero: {str(e)}\")\n        raise\n\n\n@monitor_ai_calls\nasync def create_hero_team_async(team_size: int = 5) -> list[Hero]:\n    \"\"\"Generate multiple heroes concurrently\"\"\"\n    try:\n        return await async_client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=list[Hero],\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Create a team of {team_size} diverse superheroes with different powers\",\n                },\n            ],\n            max_retries=3,\n        )\n    except Exception as e:\n        logger.error(f\"Failed to create hero team: {str(e)}\")\n        raise\n\n\nasync def create_heroes_batch(prompts: list[str]) -> list[Hero]:\n    \"\"\"Generate multiple heroes concurrently from different prompts\"\"\"\n    tasks = []\n    for prompt in prompts:\n        task = create_hero_async(prompt)\n        tasks.append(task)\n\n    return await asyncio.gather(*tasks, return_exceptions=True)\n\n\ndef create_product(category: str) -> Product:\n    \"\"\"Generate a product for a specific category\"\"\"\n    try:\n        return sync_client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=Product,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Create a {category} product with realistic pricing\",\n                },\n            ],\n        )\n    except Exception as e:\n        logger.error(f\"Failed to create product: {str(e)}\")\n        raise\n\n\n# Database operations\ndef setup_database():\n    \"\"\"Create all tables\"\"\"\n    SQLModel.metadata.create_all(engine)\n    logger.info(\"Database tables created successfully\")\n\n\ndef create_sample_teams():\n    \"\"\"Create sample teams for heroes to join\"\"\"\n    teams_data = [\n        {\"name\": \"Justice League\", \"city\": \"Metropolis\", \"founded_year\": 1960},\n        {\"name\": \"Avengers\", \"city\": \"New York\", \"founded_year\": 1963},\n        {\"name\": \"X-Men\", \"city\": \"Westchester\", \"founded_year\": 1963},\n    ]\n\n    with Session(engine) as session:\n        for team_data in teams_data:\n            # Check if team already exists\n            existing_team = session.exec(\n                select(Team).where(Team.name == team_data[\"name\"])\n            ).first()\n\n            if not existing_team:\n                team = Team(**team_data)\n                session.add(team)\n\n        session.commit()\n        logger.info(\"Sample teams created\")\n\n\ndef assign_hero_to_team(hero: Hero, team_name: str):\n    \"\"\"Assign a hero to a team\"\"\"\n    with Session(engine) as session:\n        # Get the team\n        team = session.exec(select(Team).where(Team.name == team_name)).first()\n        if team:\n            hero.team_id = team.id\n            session.add(hero)\n            session.commit()\n            session.refresh(hero)\n            logger.info(f\"Assigned {hero.name} to {team_name}\")\n        else:\n            logger.warning(f\"Team {team_name} not found\")\n\n\ndef list_heroes_with_teams():\n    \"\"\"List all heroes with their team information\"\"\"\n    with Session(engine) as session:\n        statement = select(Hero, Team).join(Team, Hero.team_id == Team.id, isouter=True)\n        results = session.exec(statement).all()\n\n        logger.info(\"Heroes and their teams:\")\n        for hero, team in results:\n            team_name = team.name if team else \"No team\"\n            logger.info(\n                f\"- {hero.name} ({hero.secret_name}) - Power Level: {hero.power_level} - Team: {team_name}\"\n            )\n\n\ndef demonstrate_validation_errors():\n    \"\"\"Show how validation works with invalid data\"\"\"\n    logger.info(\"Testing validation...\")\n\n    try:\n        # This should fail due to validator\n        Hero(\n            name=\"Evil Villain\",  # Contains forbidden word\n            secret_name=\"Bad Guy\",\n            power_level=50,\n            origin_story=\"A story of evil deeds\",\n        )\n    except ValueError as e:\n        logger.info(f\"Validation caught invalid name: {e}\")\n\n    try:\n        # This should fail due to field constraints\n        Hero(\n            name=\"Good Hero\",\n            secret_name=\"G\",  # Too short\n            power_level=150,  # Too high\n            origin_story=\"Short\",  # Too short\n        )\n    except ValueError as e:\n        logger.info(f\"Validation caught field constraint violation: {e}\")\n\n\nasync def main():\n    \"\"\"Main demonstration function\"\"\"\n    logger.info(\"Starting SQLModel with Instructor demonstration...\")\n\n    # Setup\n    setup_database()\n    create_sample_teams()\n\n    # Demonstrate validation\n    demonstrate_validation_errors()\n\n    # 1. Basic hero creation\n    logger.info(\"\\n1. Creating a single hero...\")\n    hero1 = create_hero(\"Create a tech-based superhero\")\n\n    with Session(engine) as session:\n        session.add(hero1)\n        session.commit()\n        session.refresh(hero1)\n\n    logger.info(f\"Created hero: {hero1.name} (Power Level: {hero1.power_level})\")\n    logger.info(f\"Origin: {hero1.origin_story}\")\n    assign_hero_to_team(hero1, \"Avengers\")\n\n    # 2. Async hero creation\n    logger.info(\"\\n2. Creating a hero asynchronously...\")\n    hero2 = await create_hero_async(\"Create a magic-based superhero\")\n\n    with Session(engine) as session:\n        session.add(hero2)\n        session.commit()\n        session.refresh(hero2)\n\n    logger.info(f\"Created async hero: {hero2.name} (Power Level: {hero2.power_level})\")\n    assign_hero_to_team(hero2, \"Justice League\")\n\n    # 3. Bulk hero creation\n    logger.info(\"\\n3. Creating a team of heroes...\")\n    hero_team = await create_hero_team_async(3)\n\n    with Session(engine) as session:\n        for hero in hero_team:\n            session.add(hero)\n        session.commit()\n\n        for hero in hero_team:\n            session.refresh(hero)\n\n    logger.info(f\"Created team of {len(hero_team)} heroes\")\n    for hero in hero_team:\n        assign_hero_to_team(hero, \"X-Men\")\n\n    # 4. Concurrent hero creation with different prompts\n    logger.info(\"\\n4. Creating heroes concurrently...\")\n    prompts = [\n        \"Create a fire-based superhero\",\n        \"Create a water-based superhero\",\n        \"Create an earth-based superhero\",\n        \"Create a wind-based superhero\",\n    ]\n\n    concurrent_heroes = await create_heroes_batch(prompts)\n\n    with Session(engine) as session:\n        for hero in concurrent_heroes:\n            if isinstance(hero, Hero):  # Check if not an exception\n                session.add(hero)\n        session.commit()\n\n    logger.info(\n        f\"Created {len([h for h in concurrent_heroes if isinstance(h, Hero)])} heroes concurrently\"\n    )\n\n    # 5. Product creation (different model)\n    logger.info(\"\\n5. Creating products...\")\n    categories = [\"electronics\", \"clothing\", \"books\"]\n\n    for category in categories:\n        product = create_product(category)\n        with Session(engine) as session:\n            session.add(product)\n            session.commit()\n            session.refresh(product)\n\n        logger.info(\n            f\"Created {category} product: {product.name} - ${product.price:.2f}\"\n        )\n\n    # 6. Display results\n    logger.info(\"\\n6. Final results:\")\n    list_heroes_with_teams()\n\n    # 7. Database statistics\n    with Session(engine) as session:\n        total_heroes = len(session.exec(select(Hero)).all())\n        total_teams = len(session.exec(select(Team)).all())\n        total_products = len(session.exec(select(Product)).all())\n\n    logger.info(f\"\\nDatabase contains:\")\n    logger.info(f\"- {total_heroes} heroes\")\n    logger.info(f\"- {total_teams} teams\")\n    logger.info(f\"- {total_products} products\")\n\n\nif __name__ == \"__main__\":\n    # Run the async main function\n    asyncio.run(main())\n"
  },
  {
    "path": "examples/sqlmodel/test_basic.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nBasic SQLModel test to verify core functionality\n\"\"\"\n\nimport logging\nfrom datetime import datetime\nfrom typing import Optional\nfrom uuid import UUID, uuid4\nfrom pydantic import validator\nfrom sqlmodel import Field, SQLModel, create_engine, Session, select, Relationship\n\n# Configure logging\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\n# Database setup\nengine = create_engine(\"sqlite:///test_basic.db\", echo=False)\n\n\n# Models with relationships\nclass Team(SQLModel, table=True):\n    \"\"\"Team model with relationship to heroes\"\"\"\n\n    id: Optional[int] = Field(default=None, primary_key=True)\n    name: str = Field(min_length=2, max_length=50)\n    city: str = Field(min_length=2, max_length=50)\n    founded_year: Optional[int] = Field(default=None, ge=1900, le=2024)\n\n    # Relationship to heroes\n    heroes: list[\"Hero\"] = Relationship(back_populates=\"team\")\n\n\nclass Hero(SQLModel, table=True):\n    \"\"\"Hero model with auto-generated fields and validation\"\"\"\n\n    __table_args__ = {\"extend_existing\": True}\n\n    # Auto-generated fields\n    id: Optional[int] = Field(default=None, primary_key=True)\n    created_at: datetime = Field(default_factory=datetime.utcnow)\n    uuid: UUID = Field(default_factory=uuid4)\n\n    # Regular fields with validation\n    name: str = Field(min_length=2, max_length=50, description=\"Hero's public name\")\n    secret_name: str = Field(\n        min_length=2, max_length=50, description=\"Hero's secret identity\"\n    )\n    age: Optional[int] = Field(default=None, ge=16, le=100, description=\"Hero's age\")\n    power_level: int = Field(ge=1, le=100, description=\"Power level from 1-100\")\n    origin_story: str = Field(\n        min_length=10, max_length=200, description=\"Brief origin story\"\n    )\n\n    # Foreign key relationship\n    team_id: Optional[int] = Field(default=None, foreign_key=\"team.id\")\n    team: Optional[Team] = Relationship(back_populates=\"heroes\")\n\n    @validator(\"name\")\n    def validate_name_format(cls, v):\n        \"\"\"Ensure hero name doesn't contain inappropriate words\"\"\"\n        forbidden_words = [\"villain\", \"evil\", \"bad\"]\n        if any(word in v.lower() for word in forbidden_words):\n            raise ValueError(f\"Hero name cannot contain: {', '.join(forbidden_words)}\")\n        return v\n\n\ndef test_basic_functionality():\n    \"\"\"Test basic SQLModel functionality\"\"\"\n\n    # Create tables\n    SQLModel.metadata.create_all(engine)\n    logger.info(\"✓ Database tables created\")\n\n    # Create a team\n    with Session(engine) as session:\n        team = Team(name=\"Avengers\", city=\"New York\", founded_year=1963)\n        session.add(team)\n        session.commit()\n        session.refresh(team)\n        logger.info(f\"✓ Created team: {team.name}\")\n\n    # Create heroes\n    heroes_data = [\n        {\n            \"name\": \"Iron Man\",\n            \"secret_name\": \"Tony Stark\",\n            \"age\": 45,\n            \"power_level\": 85,\n            \"origin_story\": \"Genius inventor who built a powered suit of armor\",\n        },\n        {\n            \"name\": \"Captain America\",\n            \"secret_name\": \"Steve Rogers\",\n            \"age\": 100,\n            \"power_level\": 90,\n            \"origin_story\": \"Super soldier enhanced with the super soldier serum\",\n        },\n        {\n            \"name\": \"Thor\",\n            \"secret_name\": \"Thor Odinson\",\n            \"age\": 1500,  # This will be clamped to 100 by validation\n            \"power_level\": 95,\n            \"origin_story\": \"God of Thunder from Asgard with mystical hammer\",\n        },\n    ]\n\n    created_heroes = []\n    with Session(engine) as session:\n        # Get the team\n        team = session.exec(select(Team).where(Team.name == \"Avengers\")).first()\n\n        if not team:\n            logger.error(\"Team not found!\")\n            return\n\n        for hero_data in heroes_data:\n            try:\n                # Handle age validation\n                if hero_data[\"age\"] > 100:\n                    hero_data[\"age\"] = 100\n\n                hero = Hero(**hero_data, team_id=team.id)\n                session.add(hero)\n                created_heroes.append(hero)\n                logger.info(f\"✓ Created hero: {hero.name}\")\n            except ValueError as e:\n                logger.error(f\"✗ Failed to create hero {hero_data['name']}: {e}\")\n\n        session.commit()\n\n        # Refresh all heroes\n        for hero in created_heroes:\n            session.refresh(hero)\n\n    # Test validation\n    logger.info(\"\\n--- Testing Validation ---\")\n    try:\n        Hero(\n            name=\"Evil Villain\",  # Should trigger validator\n            secret_name=\"Bad Guy\",\n            power_level=50,\n            origin_story=\"A story of evil deeds\",\n        )\n    except ValueError as e:\n        logger.info(f\"✓ Validation caught invalid name: {e}\")\n\n    # Query with relationships\n    logger.info(\"\\n--- Testing Relationships ---\")\n    with Session(engine) as session:\n        # Get team with heroes\n        team_with_heroes = session.exec(\n            select(Team).where(Team.name == \"Avengers\")\n        ).first()\n\n        if team_with_heroes:\n            logger.info(\n                f\"✓ {team_with_heroes.name} has {len(team_with_heroes.heroes)} heroes\"\n            )\n\n            for hero in team_with_heroes.heroes:\n                logger.info(\n                    f\"  - {hero.name} ({hero.secret_name}) - Power: {hero.power_level}\"\n                )\n\n    # Test queries\n    logger.info(\"\\n--- Testing Queries ---\")\n    with Session(engine) as session:\n        # Find high-power heroes\n        high_power_heroes = session.exec(\n            select(Hero).where(Hero.power_level >= 90)\n        ).all()\n\n        logger.info(f\"✓ Found {len(high_power_heroes)} high-power heroes:\")\n        for hero in high_power_heroes:\n            logger.info(f\"  - {hero.name}: {hero.power_level}\")\n\n    logger.info(\"\\n✓ All basic functionality tests passed!\")\n\n\nif __name__ == \"__main__\":\n    test_basic_functionality()\n"
  },
  {
    "path": "examples/stream_action_items/run.py",
    "content": "import instructor\n\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\nfrom collections.abc import Iterable\nfrom openai import OpenAI\nfrom rich.console import Console\n\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass ActionItem(BaseModel):\n    slug: str = Field(..., description=\"compact short slug\")\n    title: str = Field(description=\"The title of the action item\")\n    chain_of_thought: str = Field(\n        description=\"Short chain of thought that led to this action item, specifically think about whether or not a task should be marked as completed\"\n    )\n    is_completed: Optional[bool] = Field(\n        False, description=\"Whether the action item is completed\"\n    )\n\n\nclass ActionItemResponse(BaseModel):\n    action_items: Optional[list[ActionItem]] = Field(\n        ..., title=\"The list of action items\"\n    )\n\n    def patch(self, action_item: ActionItem):\n        current_items = {item.slug: item for item in self.action_items}\n        current_items[action_item.slug] = action_item\n        new_response = ActionItemResponse(action_items=list(current_items.values()))\n        print(f\"BEFORE\\n{self}\\n\\nAFTER\\n{new_response}\")\n        return new_response\n\n    def __repr__(self):\n        completed_str = \"DONE -\"\n        pending_str = \"TODO -\"\n\n        def format_item(item):\n            return f\"{completed_str if item.is_completed else pending_str} {item.title}\"\n\n        return \"\\n\\n\".join([format_item(item) for item in self.action_items])\n\n    def __str__(self) -> str:\n        return self.__repr__()\n\n\nconsole = Console()\n\n\ndef yield_action_items(transcript: str, state: ActionItemResponse):\n    action_items = client.chat.completions.create(\n        model=\"gpt-4-turbo-preview\",\n        temperature=0,\n        seed=42,\n        response_model=Iterable[ActionItem],\n        stream=True,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                You're a world-class note taker. \n                You are given the current state of the notes and an additional piece of the transcript. \n                Use this to update the action.\n                \n                If you return an action item with the same ID as something in the set, It will be overwritten.\n                Use this to update the complete status or change the title if there's more context. \n\n                - If they are distinct items, do not repeat the slug.\n                - Only repeat a slug if we need to update the title or completion status.\n                - If the completion status is not mentioned, it should be assumed to be incomplete.\n                - For each task describe the success / completion criteria as well.\n                - If something is explicitly mentioned as being done, mark it as done. \n\n                {state.model_dump_json(indent=2)}\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Take the following transcript to return a set of transactions from the transcript\\n\\n{transcript}\",\n            },\n        ],\n    )\n\n    for action_item in action_items:\n        state = state.patch(action_item)\n        yield state\n\n\ntranscript = \"\"\"\nBob: Great, Carol. I'll handle the back-end optimization then.\n\nAlice: Perfect. Now, after the authentication system is improved, we have to integrate it with our new billing system. That's a medium priority task.\n\nBob: Sure, but I'll need to complete the back-end optimization of the authentication system first, so it's dependent on that.\n\nJason: The backend optimization was finished last week actually.\n\nAlice: Understood. Lastly, we also need to update our user documentation to reflect all these changes. It's a low-priority task but still important.\n\"\"\".strip().split(\"\\n\\n\")\n\n\ndef text_to_speech(chunk):\n    \"\"\"\n    Uses a subprocess to convert text to speech via the `say` command on macOS.\n    \"\"\"\n    import subprocess\n\n    subprocess.run([\"say\", chunk], check=True)\n\n\ndef process_transcript(transcript: list[str]):\n    state = ActionItemResponse(action_items=[])\n    for chunk in transcript:\n        console.print(f\"update: {chunk}\")\n        for new_state in yield_action_items(chunk, state):\n            state = new_state\n            console.clear()\n            console.print(\"# Action Items\")\n            console.print(str(state))\n            console.print(\"\\n\")\n\n\nif __name__ == \"__main__\":\n    process_transcript(transcript)\n"
  },
  {
    "path": "examples/synethic-data/run.py",
    "content": "import openai\nimport instructor\nfrom collections.abc import Iterable\nfrom pydantic import BaseModel, ConfigDict\n\nclient = instructor.from_openai(openai.OpenAI())\n\n\nclass SyntheticQA(BaseModel):\n    question: str\n    answer: str\n\n    model_config = ConfigDict(\n        json_schema_extra={\n            \"examples\": [\n                {\"question\": \"What is the capital of France?\", \"answer\": \"Paris\"},\n                {\n                    \"question\": \"What is the largest planet in our solar system?\",\n                    \"answer\": \"Jupiter\",\n                },\n                {\n                    \"question\": \"Who wrote 'To Kill a Mockingbird'?\",\n                    \"answer\": \"Harper Lee\",\n                },\n                {\n                    \"question\": \"What element does 'O' represent on the periodic table?\",\n                    \"answer\": \"Oxygen\",\n                },\n            ]\n        }\n    )\n\n\ndef get_synthetic_data() -> Iterable[SyntheticQA]:\n    return client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\"role\": \"system\", \"content\": \"Generate synthetic examples\"},\n            {\n                \"role\": \"user\",\n                \"content\": \"Generate the exact examples you see in the examples of this prompt. \",\n            },\n        ],\n        response_model=Iterable[SyntheticQA],\n    )  # type: ignore\n\n\nif __name__ == \"__main__\":\n    for example in get_synthetic_data():\n        print(example)\n        \"\"\"\n        question='What is the capital of France?' answer='Paris'\n        question='What is the largest planet in our solar system?' answer='Jupiter'\n        question=\"Who wrote 'To Kill a Mockingbird'?\" answer='Harper Lee'\n        question=\"What element does 'O' represent on the periodic table?\" answer='Oxygen'\n        \"\"\"\n"
  },
  {
    "path": "examples/task_planner/diagram.py",
    "content": "import erdantic as erd\n\nfrom task_planner_topological_sort import TaskPlan\n\ndiagram = erd.create(TaskPlan)\ndiagram.draw(\"examples/task_planner_topological_sort/schema.png\")\n"
  },
  {
    "path": "examples/task_planner/task_planner_topological_sort.py",
    "content": "\"\"\"\nProof of Concept for a task planning and execution system using\nOpenAIs Functions and topological sort, based on the idea in\nquery_planner_execution.py.py.\n\nAdditionally: There are also cases where the \"pure\" recursive approach has advantages;\nIf subtasks for different parent tasks that start in parallel have different runtimes,\nwe will wait unnecessarily with my current implementation.\n\nAdded by Jan Philipp Harries / @jpdus\n\"\"\"\n\nimport asyncio\nfrom collections.abc import Generator\n\nfrom openai import OpenAI\n\nfrom pydantic import Field, BaseModel\n\nimport instructor\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass TaskResult(BaseModel):\n    task_id: int\n    result: str\n\n\nclass TaskResults(BaseModel):\n    results: list[TaskResult]\n\n\nclass Task(BaseModel):\n    \"\"\"\n    Class representing a single task in a task plan.\n    \"\"\"\n\n    id: int = Field(..., description=\"Unique id of the task\")\n    task: str = Field(\n        ...,\n        description=\"\"\"Contains the task in text form. If there are multiple tasks,\n        this task can only be executed when all dependant subtasks have been answered.\"\"\",\n    )\n    subtasks: list[int] = Field(\n        default_factory=list,\n        description=\"\"\"List of the IDs of subtasks that need to be answered before\n        we can answer the main question. Use a subtask when anything may be unknown\n        and we need to ask multiple questions to get the answer.\n        Dependencies must only be other tasks.\"\"\",\n    )\n\n    async def aexecute(self, with_results: TaskResults) -> TaskResult:\n        \"\"\"\n        Executes the task by asking the question and returning the answer.\n        \"\"\"\n\n        # We do nothing with the subtask answers, since this is an example however\n        # we could use intermediate results to compute the answer to the main task.\n        return TaskResult(task_id=self.id, result=f\"`{self.task}`\")\n\n\nclass TaskPlan(BaseModel):\n    \"\"\"\n    Container class representing a tree of tasks and subtasks.\n    Make sure every task is in the tree, and every task is done only once.\n    \"\"\"\n\n    task_graph: list[Task] = Field(\n        ...,\n        description=\"List of tasks and subtasks that need to be done to complete the main task. Consists of the main task and its dependencies.\",\n    )\n\n    def _get_execution_order(self) -> list[int]:\n        \"\"\"\n        Returns the order in which the tasks should be executed using topological sort.\n        Inspired by https://gitlab.com/ericvsmith/toposort/-/blob/master/src/toposort.py\n        \"\"\"\n        tmp_dep_graph = {item.id: set(item.subtasks) for item in self.task_graph}\n\n        def topological_sort(\n            dep_graph: dict[int, set[int]],\n        ) -> Generator[set[int], None, None]:\n            while True:\n                ordered = set(item for item, dep in dep_graph.items() if len(dep) == 0)\n                if not ordered:\n                    break\n                yield ordered\n                dep_graph = {\n                    item: (dep - ordered)\n                    for item, dep in dep_graph.items()\n                    if item not in ordered\n                }\n            if len(dep_graph) != 0:\n                raise ValueError(\n                    f\"Circular dependencies exist among these items: {{{', '.join(f'{key}:{value}' for key, value in dep_graph.items())}}}\"\n                )\n\n        result = []\n        for d in topological_sort(tmp_dep_graph):\n            result.extend(sorted(d))\n        return result\n\n    async def execute(self) -> dict[int, TaskResult]:\n        \"\"\"\n        Executes the tasks in the task plan in the correct order using asyncio and chunks with answered dependencies.\n        \"\"\"\n        execution_order = self._get_execution_order()\n        tasks = {q.id: q for q in self.task_graph}\n        task_results = {}\n        while True:\n            ready_to_execute = [\n                tasks[task_id]\n                for task_id in execution_order\n                if task_id not in task_results\n                and all(\n                    subtask_id in task_results for subtask_id in tasks[task_id].subtasks\n                )\n            ]\n            # prints chunks to visualize execution order\n            print(ready_to_execute)\n            computed_answers = await asyncio.gather(\n                *[\n                    q.aexecute(\n                        with_results=TaskResults(\n                            results=[\n                                result\n                                for result in task_results.values()\n                                if result.task_id in q.subtasks\n                            ]\n                        )\n                    )\n                    for q in ready_to_execute\n                ]\n            )\n            for answer in computed_answers:\n                task_results[answer.task_id] = answer\n            if len(task_results) == len(execution_order):\n                break\n        return task_results\n\n\nTask.model_rebuild()\nTaskPlan.model_rebuild()\n\n\ndef task_planner(question: str) -> TaskPlan:\n    messages = [\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a world class task planning algorithm capable of breaking apart tasks into dependant subtasks, such that the answers can be used to enable the system completing the main task. Do not complete the user task, simply provide a correct compute graph with good specific tasks to ask and relevant subtasks. Before completing the list of tasks, think step by step to get a better understanding the problem.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"{question}\",\n        },\n    ]\n\n    completion = client.chat.completions.create(\n        model=\"gpt-4-0613\",\n        temperature=0,\n        response_model=TaskPlan,\n        messages=messages,\n        max_tokens=1000,\n    )\n    root = TaskPlan.from_response(completion)\n\n    return root\n\n\nif __name__ == \"__main__\":\n    plan = task_planner(\n        \"What is the difference in populations betweend the adjacent countries of Jan's home country and the adjacent countries of Jason's home country?\"\n    )\n    print(plan.model_dump_json(indent=2))\n    {\n        \"task_graph\": [\n            {\"id\": 1, \"subtasks\": [], \"task\": \"Identify Jan's home country\"},\n            {\n                \"id\": 2,\n                \"subtasks\": [1],\n                \"task\": \"Identify the adjacent countries of Jan's home country\",\n            },\n            {\n                \"id\": 3,\n                \"subtasks\": [2],\n                \"task\": \"Calculate the total population of the adjacent \"\n                \"countries of Jan's home country\",\n            },\n            {\"id\": 4, \"subtasks\": [], \"task\": \"Identify Jason's home country\"},\n            {\n                \"id\": 5,\n                \"subtasks\": [4],\n                \"task\": \"Identify the adjacent countries of Jason's home country\",\n            },\n            {\n                \"id\": 6,\n                \"subtasks\": [5],\n                \"task\": \"Calculate the total population of the adjacent \"\n                \"countries of Jason's home country\",\n            },\n            {\n                \"id\": 7,\n                \"subtasks\": [3, 6],\n                \"task\": \"Calculate the difference in populations between the \"\n                \"adjacent countries of Jan's home country and the \"\n                \"adjacent countries of Jason's home country\",\n            },\n        ]\n    }\n"
  },
  {
    "path": "examples/tenacity-benchmarks/run.py",
    "content": "\"\"\"\nTenacity Retry Logic Benchmarks with Instructor\n\nThis script demonstrates and benchmarks different retry patterns for LLM processing:\n- Basic retry with exponential backoff\n- Conditional retries for specific errors\n- Validation error retries\n- Custom retry conditions\n- Rate limit handling\n- Network error recovery\n- Logging and monitoring\n- Circuit breaker patterns\n\nRun this script to see retry behavior and verify all code examples work.\n\"\"\"\n\nimport instructor\nfrom tenacity import (\n    retry,\n    stop_after_attempt,\n    wait_exponential,\n    retry_if_exception_type,\n    retry_if_result,\n    before_log,\n    after_log,\n    wait_random_exponential,\n)\nfrom pydantic import BaseModel, field_validator, ValidationError\nfrom openai import OpenAI, RateLimitError, APIError\nimport time\nimport logging\nimport random\nimport os\nfrom functools import lru_cache\nimport httpx\n\n# Set up logging\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\n# Set up the client with Instructor\nclient = instructor.from_openai(OpenAI())\n\n\nclass UserInfo(BaseModel):\n    name: str\n    age: int\n    email: str\n\n    @field_validator(\"age\")\n    @classmethod\n    def validate_age(cls, v):\n        if v < 0 or v > 150:\n            raise ValueError(f\"Age {v} is invalid\")\n        return v\n\n    @field_validator(\"email\")\n    @classmethod\n    def validate_email(cls, v):\n        if \"@\" not in v:\n            raise ValueError(f\"Invalid email: {v}\")\n        return v.lower()\n\n\n# Sample data for testing\ntest_texts = [\n    \"John is 30 years old with email john@example.com\",\n    \"Sarah is 25 with email sarah@test.com\",\n    \"Mike is 35 and his email is mike@demo.org\",\n    \"Alice is 28 with email alice@example.com\",\n    \"Bob is 32 with email bob@test.com\",\n]\n\n\n# Error simulation for testing\nclass MockError:\n    def __init__(self):\n        self.call_count = 0\n        self.fail_until = 2  # Fail first 2 calls, succeed on 3rd\n\n    def maybe_fail(self):\n        self.call_count += 1\n        if self.call_count <= self.fail_until:\n            # Simulate different types of errors\n            error_type = random.choice(\n                [ValidationError, RateLimitError, APIError, Exception]\n            )\n            if error_type == ValidationError:\n                raise ValidationError.from_exception_data(\"UserInfo\", [])\n            elif error_type == RateLimitError:\n                # Create a simple mock response for RateLimitError\n                mock_response = httpx.Response(\n                    status_code=429, headers={}, content=b\"Rate limit exceeded\"\n                )\n                raise RateLimitError(\n                    \"Rate limit exceeded\",\n                    response=mock_response,\n                    body=\"Rate limit exceeded\",\n                )\n            elif error_type == APIError:\n                # Create a simple mock request for APIError\n                mock_request = httpx.Request(\n                    \"POST\", \"https://api.openai.com/v1/chat/completions\"\n                )\n                raise APIError(\n                    \"API error occurred\",\n                    request=mock_request,\n                    body=\"API error occurred\",\n                )\n            else:\n                raise Exception(\"Generic error occurred\")\n\n\nmock_error = MockError()\n\n\ndef extract_user_info_with_mock_errors(text: str) -> UserInfo:\n    \"\"\"Extract user info with simulated errors for testing.\"\"\"\n    if not os.getenv(\"OPENAI_API_KEY\"):\n        # Simulate errors for testing when no API key\n        mock_error.maybe_fail()\n        # Return mock data if no errors\n        return UserInfo(name=\"Mock User\", age=30, email=\"mock@example.com\")\n\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": f\"Extract user info: {text}\"}],\n    )\n\n\n# Method 1: Basic Retry with Exponential Backoff\n@retry(\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=1, max=5),  # Shorter waits for demo\n)\ndef extract_user_info(text: str) -> UserInfo:\n    \"\"\"Extract user information with basic retry logic.\"\"\"\n    print(f\"  Attempting extraction for: {text[:30]}...\")\n    if not os.getenv(\"OPENAI_API_KEY\"):\n        mock_error.maybe_fail()\n        return UserInfo(name=\"Test User\", age=25, email=\"test@example.com\")\n\n    return client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        response_model=UserInfo,\n        messages=[{\"role\": \"user\", \"content\": f\"Extract user info: {text}\"}],\n    )\n\n\n# Method 2: Conditional Retries for Specific Errors\n@retry(\n    retry=retry_if_exception_type((RateLimitError, APIError)),\n    stop=stop_after_attempt(5),\n    wait=wait_exponential(multiplier=1, min=1, max=5),\n)\ndef robust_extraction(text: str) -> UserInfo:\n    \"\"\"Retry only on specific API errors.\"\"\"\n    print(f\"  Robust extraction for: {text[:30]}...\")\n    return extract_user_info_with_mock_errors(text)\n\n\n# Method 3: Validation Error Retries\n@retry(\n    retry=retry_if_exception_type(ValidationError),\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=1, max=3),\n)\ndef extract_with_validation(text: str) -> UserInfo:\n    \"\"\"Retry when Pydantic validation fails.\"\"\"\n    print(f\"  Validation retry for: {text[:30]}...\")\n    return extract_user_info_with_mock_errors(text)\n\n\n# Method 4: Custom Retry Conditions\ndef should_retry(result: UserInfo) -> bool:\n    \"\"\"Custom retry logic based on result content.\"\"\"\n    # Retry if age is invalid or email is missing\n    return result.age < 18 or result.age > 100 or not result.email\n\n\n@retry(\n    retry=retry_if_result(should_retry),\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=1, max=3),\n)\ndef extract_valid_user(text: str) -> UserInfo:\n    \"\"\"Retry based on result validation.\"\"\"\n    print(f\"  Custom retry for: {text[:30]}...\")\n    # Simulate returning invalid data first time\n    if not hasattr(extract_valid_user, \"call_count\"):\n        extract_valid_user.call_count = 0\n    extract_valid_user.call_count += 1\n\n    if extract_valid_user.call_count == 1:\n        # Return invalid data first time\n        return UserInfo(name=\"Invalid User\", age=200, email=\"invalid\")\n    else:\n        # Return valid data on retry\n        return UserInfo(name=\"Valid User\", age=30, email=\"valid@example.com\")\n\n\n# Method 5: Rate Limit Specific Retry\n@retry(\n    retry=retry_if_exception_type(RateLimitError),\n    stop=stop_after_attempt(5),\n    wait=wait_exponential(multiplier=2, min=1, max=10),\n    before_sleep=lambda retry_state: print(\n        f\"    Rate limited, waiting... (attempt {retry_state.attempt_number})\"\n    ),\n)\ndef rate_limit_safe_extraction(text: str) -> UserInfo:\n    \"\"\"Handle rate limits with longer delays.\"\"\"\n    print(f\"  Rate limit safe for: {text[:30]}...\")\n    return extract_user_info_with_mock_errors(text)\n\n\n# Method 6: Network Error Retry\n@retry(\n    retry=retry_if_exception_type((ConnectionError, TimeoutError)),\n    stop=stop_after_attempt(4),\n    wait=wait_random_exponential(multiplier=1, min=1, max=5),\n)\ndef network_resilient_extraction(text: str) -> UserInfo:\n    \"\"\"Handle network issues with random exponential backoff.\"\"\"\n    print(f\"  Network resilient for: {text[:30]}...\")\n    return extract_user_info_with_mock_errors(text)\n\n\n# Method 7: Logging and Monitoring\n@retry(\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=1, max=5),\n    before=before_log(logger, logging.INFO),\n    after=after_log(logger, logging.ERROR),\n)\ndef logged_extraction(text: str) -> UserInfo:\n    \"\"\"Extract with comprehensive logging.\"\"\"\n    print(f\"  Logged extraction for: {text[:30]}...\")\n    return extract_user_info_with_mock_errors(text)\n\n\n# Method 8: Circuit Breaker Pattern\n@lru_cache(maxsize=1)\ndef get_client():\n    \"\"\"Cache the client to avoid repeated initialization.\"\"\"\n    return instructor.from_openai(OpenAI())\n\n\n@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=1, max=5))\ndef circuit_breaker_extraction(text: str) -> UserInfo:\n    \"\"\"Extract with circuit breaker pattern.\"\"\"\n    print(f\"  Circuit breaker for: {text[:30]}...\")\n    client = get_client()\n    return extract_user_info_with_mock_errors(text)\n\n\n# Method 9: Performance Monitoring\n@retry(stop=stop_after_attempt(3))\ndef monitored_extraction(text: str) -> UserInfo:\n    \"\"\"Extract with performance monitoring.\"\"\"\n    start_time = time.time()\n\n    try:\n        print(f\"  Monitored extraction for: {text[:30]}...\")\n        result = extract_user_info_with_mock_errors(text)\n\n        end_time = time.time()\n        print(f\"    Extraction took {end_time - start_time:.2f} seconds\")\n        return result\n\n    except Exception as e:\n        end_time = time.time()\n        print(f\"    Extraction failed after {end_time - start_time:.2f} seconds: {e}\")\n        raise\n\n\ndef benchmark_retry_methods():\n    \"\"\"Test all retry methods and measure their behavior.\"\"\"\n    print(\"=== Python Tenacity Retry Logic with Instructor Benchmarks ===\\n\")\n\n    if not os.getenv(\"OPENAI_API_KEY\"):\n        print(\"⚠️  OPENAI_API_KEY not set. Using mock responses for demonstration.\\n\")\n\n    # Test different retry strategies\n    strategies = [\n        (\"Basic Retry\", extract_user_info),\n        (\"Conditional Retry\", robust_extraction),\n        (\"Validation Retry\", extract_with_validation),\n        (\"Custom Retry\", extract_valid_user),\n        (\"Rate Limit Retry\", rate_limit_safe_extraction),\n        (\"Network Retry\", network_resilient_extraction),\n        (\"Logged Retry\", logged_extraction),\n        (\"Circuit Breaker\", circuit_breaker_extraction),\n        (\"Monitored Retry\", monitored_extraction),\n    ]\n\n    results = {}\n    test_text = test_texts[0]  # Use first text for all tests\n\n    for name, strategy in strategies:\n        print(f\"\\n{'=' * 60}\")\n        print(f\"Testing: {name}\")\n        print(\"=\" * 60)\n\n        # Reset mock error for each test\n        global mock_error\n        mock_error = MockError()\n\n        # Reset call count for custom retry\n        if hasattr(extract_valid_user, \"call_count\"):\n            delattr(extract_valid_user, \"call_count\")\n\n        start_time = time.time()\n        try:\n            user = strategy(test_text)\n            end_time = time.time()\n            duration = end_time - start_time\n\n            results[name] = {\n                \"success\": True,\n                \"duration\": duration,\n                \"user\": user,\n                \"attempts\": getattr(mock_error, \"call_count\", 1),\n            }\n\n            print(f\"✓ Success: {user.name} ({duration:.2f}s)\")\n            print(f\"  Age: {user.age}, Email: {user.email}\")\n            print(f\"  Attempts made: {results[name]['attempts']}\")\n\n        except Exception as e:\n            end_time = time.time()\n            duration = end_time - start_time\n\n            results[name] = {\n                \"success\": False,\n                \"duration\": duration,\n                \"error\": str(e),\n                \"attempts\": getattr(mock_error, \"call_count\", 1),\n            }\n\n            print(f\"✗ Failed: {e} ({duration:.2f}s)\")\n            print(f\"  Attempts made: {results[name]['attempts']}\")\n\n    # Print summary table\n    print(f\"\\n{'=' * 80}\")\n    print(\"RETRY STRATEGY SUMMARY\")\n    print(\"=\" * 80)\n    print(\n        f\"{'Strategy':<20} {'Status':<10} {'Time (s)':<10} {'Attempts':<10} {'Result'}\"\n    )\n    print(\"-\" * 80)\n\n    for name, result in results.items():\n        status = \"✓ Success\" if result[\"success\"] else \"✗ Failed\"\n        attempts = result[\"attempts\"]\n\n        if result[\"success\"]:\n            result_text = f\"{result['user'].name}\"\n        else:\n            result_text = \"Failed\"\n\n        print(\n            f\"{name:<20} {status:<10} {result['duration']:<10.2f} {attempts:<10} {result_text}\"\n        )\n\n    # Show retry efficiency\n    print(f\"\\nRetry Efficiency Analysis:\")\n    successful_strategies = {k: v for k, v in results.items() if v[\"success\"]}\n\n    if successful_strategies:\n        avg_attempts = sum(r[\"attempts\"] for r in successful_strategies.values()) / len(\n            successful_strategies\n        )\n        avg_duration = sum(r[\"duration\"] for r in successful_strategies.values()) / len(\n            successful_strategies\n        )\n\n        print(f\"  Average attempts: {avg_attempts:.1f}\")\n        print(f\"  Average duration: {avg_duration:.2f}s\")\n\n        # Find most efficient strategy\n        most_efficient = min(\n            successful_strategies.items(),\n            key=lambda x: x[1][\"attempts\"] * x[1][\"duration\"],\n        )\n        print(\n            f\"  Most efficient: {most_efficient[0]} ({most_efficient[1]['attempts']} attempts, {most_efficient[1]['duration']:.2f}s)\"\n        )\n\n\ndef test_batch_processing():\n    \"\"\"Test batch processing with retries.\"\"\"\n    print(f\"\\n{'=' * 60}\")\n    print(\"Batch Processing Test\")\n    print(\"=\" * 60)\n\n    @retry(stop=stop_after_attempt(2))\n    def process_batch(texts: list[str]) -> list[UserInfo]:\n        \"\"\"Process multiple texts with retry logic.\"\"\"\n        results = []\n\n        for text in texts:\n            try:\n                # Reset mock error for each item\n                global mock_error\n                mock_error = MockError()\n\n                result = extract_user_info_with_mock_errors(text)\n                results.append(result)\n                print(f\"  ✓ Processed: {result.name}\")\n            except Exception as e:\n                print(f\"  ✗ Failed to process: {text[:30]}... - {e}\")\n                continue\n\n        return results\n\n    start_time = time.time()\n    try:\n        results = process_batch(test_texts[:3])  # Process first 3 texts\n        end_time = time.time()\n        duration = end_time - start_time\n\n        print(f\"\\nBatch processing completed:\")\n        print(f\"  Successfully processed: {len(results)}/{len(test_texts[:3])} items\")\n        print(f\"  Total time: {duration:.2f} seconds\")\n        print(f\"  Average time per item: {duration / len(test_texts[:3]):.2f} seconds\")\n\n    except Exception as e:\n        print(f\"Batch processing failed: {e}\")\n\n\ndef demonstrate_error_types():\n    \"\"\"Demonstrate handling different error types.\"\"\"\n    print(f\"\\n{'=' * 60}\")\n    print(\"Error Type Demonstration\")\n    print(\"=\" * 60)\n\n    # Simulate different error scenarios\n    error_scenarios = [\n        (\"Validation Error\", ValidationError),\n        (\"Rate Limit Error\", RateLimitError),\n        (\"API Error\", APIError),\n        (\"Generic Error\", Exception),\n    ]\n\n    for error_name, error_type in error_scenarios:\n        print(f\"\\nTesting {error_name}:\")\n\n        def create_error_handler(error_type):\n            @retry(\n                retry=retry_if_exception_type(error_type),\n                stop=stop_after_attempt(3),\n                wait=wait_exponential(multiplier=1, min=0.5, max=2),\n            )\n            def handle_specific_error():\n                # Simulate the specific error type\n                if error_type == ValidationError:\n                    raise ValidationError.from_exception_data(\"UserInfo\", [])\n                elif error_type == RateLimitError:\n                    # Create a simple mock response for RateLimitError\n                    mock_response = httpx.Response(\n                        status_code=429, headers={}, content=b\"Rate limit exceeded\"\n                    )\n                    raise RateLimitError(\n                        \"Rate limit exceeded\",\n                        response=mock_response,\n                        body=\"Rate limit exceeded\",\n                    )\n                elif error_type == APIError:\n                    # Create a simple mock request for APIError\n                    mock_request = httpx.Request(\n                        \"POST\", \"https://api.openai.com/v1/chat/completions\"\n                    )\n                    raise APIError(\n                        \"API error occurred\",\n                        request=mock_request,\n                        body=\"API error occurred\",\n                    )\n                else:\n                    raise Exception(\"Generic error occurred\")\n\n            return handle_specific_error\n\n        error_handler = create_error_handler(error_type)\n\n        try:\n            error_handler()\n        except Exception as e:\n            print(f\"  Expected failure: {type(e).__name__}: {e}\")\n\n\ndef main():\n    \"\"\"Main function to run all benchmarks and demonstrations.\"\"\"\n    try:\n        benchmark_retry_methods()\n        test_batch_processing()\n        demonstrate_error_types()\n\n        print(f\"\\n{'=' * 80}\")\n        print(\"🎉 All tenacity retry patterns demonstrated successfully!\")\n        print(\"💡 Key takeaways:\")\n        print(\"   - Different retry strategies serve different purposes\")\n        print(\"   - Exponential backoff prevents overwhelming APIs\")\n        print(\"   - Conditional retries optimize for specific error types\")\n        print(\"   - Monitoring helps debug and optimize retry behavior\")\n        print(\"=\" * 80)\n\n    except KeyboardInterrupt:\n        print(\"\\n⚠️  Interrupted by user\")\n    except Exception as e:\n        print(f\"❌ Error: {e}\")\n        logger.exception(\"Unexpected error occurred\")\n\n\nif __name__ == \"__main__\":\n    print(\"🚀 Starting tenacity retry benchmarks with Instructor...\")\n    print(\"💡 This script demonstrates retry patterns with simulated errors\")\n    print(\"⏱️  Each test includes artificial delays and error scenarios\\n\")\n\n    main()\n"
  },
  {
    "path": "examples/timestamps/run.py",
    "content": "from pydantic import BaseModel, Field, model_validator\nfrom typing import Literal\n\n\n# Turns out this doesn't work well. since longer videos will be HH:MM:SS\n# but shorter videos will be MM:SS, and the language model does not do 00:MM:SS well\n# then we run into issues where 2:00 is parsed as 200 seconds\nclass Segment(BaseModel):\n    title: str = Field(..., description=\"The title of the segment\")\n    timestamp: str = Field(..., description=\"The timestamp of the event as HH:MM:SS\")\n\n\n# We fix this by doing twi things\n# Tell the LMM which format it wants to use\n# And then we use a custom parser to parse the timestamp\nclass SegmentWithTimestamp(BaseModel):\n    title: str = Field(..., description=\"The title of the segment\")\n    time_format: Literal[\"HH:MM:SS\", \"MM:SS\"] = Field(\n        ..., description=\"The format of the timestamp\"\n    )\n    timestamp: str = Field(\n        ..., description=\"The timestamp of the event as either HH:MM:SS or MM:SS\"\n    )\n\n    @model_validator(mode=\"after\")\n    def parse_timestamp(self):\n        if self.time_format == \"HH:MM:SS\":\n            hours, minutes, seconds = map(int, self.timestamp.split(\":\"))\n        elif self.time_format == \"MM:SS\":\n            hours, minutes, seconds = 0, *map(int, self.timestamp.split(\":\"))\n        else:\n            raise ValueError(\"Invalid time format, must be HH:MM:SS or MM:SS\")\n\n        # Normalize seconds and minutes\n        total_seconds = hours * 3600 + minutes * 60 + seconds\n        hours, remainder = divmod(total_seconds, 3600)\n        minutes, seconds = divmod(remainder, 60)\n\n        if hours > 0:\n            self.timestamp = f\"{hours:02d}:{minutes:02d}:{seconds:02d}\"\n        else:\n            self.timestamp = f\"00:{minutes:02d}:{seconds:02d}\"\n\n        return self\n\n\nif __name__ == \"__main__\":\n    # Make tests\n    # Test cases for SegmentWithTimestamp\n    test_cases = [\n        (\n            SegmentWithTimestamp(\n                title=\"Introduction\", time_format=\"MM:SS\", timestamp=\"00:30\"\n            ),\n            \"00:00:30\",\n        ),\n        (\n            SegmentWithTimestamp(\n                title=\"Main Topic\", time_format=\"HH:MM:SS\", timestamp=\"00:15:45\"\n            ),\n            \"00:15:45\",\n        ),\n        (\n            SegmentWithTimestamp(\n                title=\"Conclusion\", time_format=\"MM:SS\", timestamp=\"65:00\"\n            ),\n            \"01:05:00\",\n        ),\n    ]\n\n    for input_data, expected_output in test_cases:\n        try:\n            assert input_data.timestamp == expected_output\n            print(f\"Test passed: {input_data.timestamp} == {expected_output}\")\n        except AssertionError:\n            print(f\"Test failed: {input_data.timestamp} != {expected_output}\")\n\n    # > Test passed: 00:00:30 == 00:00:30\n    # > Test passed: 00:15:45 == 00:15:45\n    # > Test passed: 01:05:00 == 01:05:00\n"
  },
  {
    "path": "examples/union/run.py",
    "content": "from pydantic import BaseModel, Field\nfrom typing import Union\nimport instructor\nfrom openai import OpenAI\n\n\nclass Search(BaseModel):\n    \"\"\"Search action class with a 'query' field and a process method.\"\"\"\n\n    query: str = Field(description=\"The search query\")\n\n    def process(self):\n        \"\"\"Process the search action.\"\"\"\n        return f\"Search method called for query: {self.query}\"\n\n\nclass Lookup(BaseModel):\n    \"\"\"Lookup action class with a 'keyword' field and a process method.\"\"\"\n\n    keyword: str = Field(description=\"The lookup keyword\")\n\n    def process(self):\n        \"\"\"Process the lookup action.\"\"\"\n        return f\"Lookup method called for keyword: {self.keyword}\"\n\n\nclass Finish(BaseModel):\n    \"\"\"Finish action class with an 'answer' field and a process method.\"\"\"\n\n    answer: str = Field(description=\"The answer for finishing the process\")\n\n    def process(self):\n        \"\"\"Process the finish action.\"\"\"\n        return f\"Finish method called with answer: {self.answer}\"\n\n\n# Union of Search, Lookup, and Finish\nclass TakeAction(BaseModel):\n    action: Union[Search, Lookup, Finish]\n\n    def process(self):\n        \"\"\"Process the action.\"\"\"\n        return self.action.process()\n\n\ntry:\n    # Enables `response_model`\n    client = instructor.from_openai(OpenAI())\n    action = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=TakeAction,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Please choose one action\"},\n        ],\n    )\n    assert isinstance(action, TakeAction), \"The action is not TakeAction\"\n    print(action.process())\nexcept Exception as e:\n    print(f\"An error occurred: {e}\")\n"
  },
  {
    "path": "examples/validated-multiclass/output.json",
    "content": "{\n  \"texts\": [\n    \"What is your phone number?\",\n    \"What is your email address?\",\n    \"What is your address?\",\n    \"What is your privacy policy?\"\n  ],\n  \"predictions\": [\n    {\n      \"id\": 1,\n      \"name\": \"phone\"\n    },\n    {\n      \"id\": 2,\n      \"name\": \"email\"\n    },\n    {\n      \"id\": 3,\n      \"name\": \"address\"\n    },\n    {\n      \"id\": 4,\n      \"name\": \"Other\"\n    }\n  ]\n}"
  },
  {
    "path": "examples/validated-multiclass/run.py",
    "content": "from pydantic import BaseModel, ValidationInfo, model_validator\nimport openai\nimport instructor\nimport asyncio\n\nclient = instructor.from_openai(\n    openai.AsyncOpenAI(),\n)\n\n\nclass Tag(BaseModel):\n    id: int\n    name: str\n\n    @model_validator(mode=\"after\")\n    def validate_ids(self, info: ValidationInfo):\n        context = info.context\n        if context:\n            tags: list[Tag] = context.get(\"tags\")\n            assert self.id in {tag.id for tag in tags}, (\n                f\"Tag ID {self.id} not found in context\"\n            )\n            assert self.name in {tag.name for tag in tags}, (\n                f\"Tag name {self.name} not found in context\"\n            )\n        return self\n\n\nclass TagWithInstructions(Tag):\n    instructions: str\n\n\nclass TagRequest(BaseModel):\n    texts: list[str]\n    tags: list[TagWithInstructions]\n\n\nclass TagResponse(BaseModel):\n    texts: list[str]\n    predictions: list[Tag]\n\n\nasync def tag_single_request(text: str, tags: list[Tag]) -> Tag:\n    allowed_tags = [(tag.id, tag.name) for tag in tags]\n    allowed_tags_str = \", \".join([f\"`{tag}`\" for tag in allowed_tags])\n    return await client.chat.completions.create(\n        model=\"gpt-4-turbo-preview\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a world-class text tagging system.\",\n            },\n            {\"role\": \"user\", \"content\": f\"Describe the following text: `{text}`\"},\n            {\n                \"role\": \"user\",\n                \"content\": f\"Here are the allowed tags: {allowed_tags_str}\",\n            },\n        ],\n        response_model=Tag,\n        # Minizises the hallucination of tags that are not in the allowed tags.\n        validation_context={\"tags\": tags},\n    )\n\n\nasync def tag_request(request: TagRequest) -> TagResponse:\n    predictions = await asyncio.gather(\n        *[tag_single_request(text, request.tags) for text in request.texts]\n    )\n    return TagResponse(\n        texts=request.texts,\n        predictions=predictions,\n    )\n\n\nif __name__ == \"__main__\":\n    # Tags will be a range of different topics.\n    # Such as personal, phone, email, etc.\n    tags = [\n        TagWithInstructions(id=0, name=\"personal\", instructions=\"Personal information\"),\n        TagWithInstructions(id=1, name=\"phone\", instructions=\"Phone number\"),\n        TagWithInstructions(id=2, name=\"email\", instructions=\"Email address\"),\n        TagWithInstructions(id=3, name=\"address\", instructions=\"Address\"),\n        TagWithInstructions(id=4, name=\"Other\", instructions=\"Other information\"),\n    ]\n\n    # Texts will be a range of different questions.\n    # Such as \"How much does it cost?\", \"What is your privacy policy?\", etc.\n    texts = [\n        \"What is your phone number?\",\n        \"What is your email address?\",\n        \"What is your address?\",\n        \"What is your privacy policy?\",\n    ]\n\n    # The request will contain the texts and the tags.\n    request = TagRequest(texts=texts, tags=tags)\n\n    # The response will contain the texts, the predicted tags, and the confidence.\n    response = asyncio.run(tag_request(request))\n    print(response.model_dump_json(indent=2))\n"
  },
  {
    "path": "examples/validators/allm_validator.py",
    "content": "import asyncio\nfrom typing import Annotated\nfrom pydantic import BaseModel, BeforeValidator\nfrom instructor import llm_validator, patch\nfrom openai import AsyncOpenAI\n\naclient = AsyncOpenAI()\n\npatch()\n\n\nclass QuestionAnswerNoEvil(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\"don't say objectionable things\", allow_override=True)\n        ),\n    ]\n\n\nasync def main():\n    context = \"The according to the devil is to live a life of sin and debauchery.\"\n    question = \"What is the meaning of life?\"\n\n    try:\n        qa: QuestionAnswerNoEvil = await aclient.chat.completions.create(\n            model=\"gpt-3.5-turbo\",\n            response_model=QuestionAnswerNoEvil,\n            max_retries=2,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a system that answers questions based on the context. Answer exactly what the question asks using the context.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n                },\n            ],\n        )  # type: ignore\n        print(qa)\n    except Exception as e:\n        print(e)\n\n\nif __name__ == \"__main__\":\n    asyncio.run(main())\n"
  },
  {
    "path": "examples/validators/annotator.py",
    "content": "from typing import Annotated\nfrom pydantic import BaseModel, ValidationError\nfrom pydantic.functional_validators import AfterValidator\n\n\ndef name_must_contain_space(v: str) -> str:\n    if \" \" not in v:\n        raise ValueError(\"name must be a first and last name separated by a space\")\n    return v.lower()\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: Annotated[str, AfterValidator(name_must_contain_space)]\n\n\n# Example 1) Valid input, notice that the name is lowercased\nperson: UserDetail = UserDetail(age=29, name=\"Jason Liu\")\nprint(person.model_dump_json(indent=2))\n\"\"\"\n{\n    \"age\": 29,\n    \"name\": \"jason liu\"\n}\n\"\"\"\n\n# Example 2) Invalid input, we'll get a validation error\n# In the future this validation error will be raised by the API and\n# used by the LLM to generate a better response\ntry:\n    person: UserDetail = UserDetail(age=29, name=\"Jason\")\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserDetail\n    name\n        Value error, name must be a first and last name separated by a space [type=value_error, input_value='Jason', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.3/v/value_error\n    \"\"\"\n"
  },
  {
    "path": "examples/validators/chain_of_thought_validator.py",
    "content": "import instructor\nfrom openai import OpenAI\n\nfrom pydantic import BaseModel, Field, model_validator\nfrom typing import Optional\n\n# Enables `response_model` and `max_retries` parameters\nclient = instructor.from_openai(OpenAI())\n\n\nclass Validation(BaseModel):\n    is_valid: bool = Field(\n        ..., description=\"Whether the value is valid given the rules\"\n    )\n    error_message: Optional[str] = Field(\n        ...,\n        description=\"The error message if the value is not valid, to be used for re-asking the model\",\n    )\n\n\ndef validator(values):\n    chain_of_thought = values[\"chain_of_thought\"]\n    answer = values[\"answer\"]\n    resp = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a validator. Determine if the value is valid for the statement. If it is not, explain why.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Verify that `{answer}` follows the chain of thought: {chain_of_thought}\",\n            },\n        ],\n        # this comes from instructor.from_openai()\n        response_model=Validation,\n    )\n    if not resp.is_valid:\n        raise ValueError(resp.error_message)\n    return values\n\n\nclass Response(BaseModel):\n    chain_of_thought: str\n    answer: str\n\n    @model_validator(mode=\"before\")\n    @classmethod\n    def chain_of_thought_makes_sense(cls, data):\n        return validator(data)\n\n\nif __name__ == \"__main__\":\n    try:\n        resp = Response(\n            chain_of_thought=\"1 + 1 = 2\", answer=\"The meaning of life is 42\"\n        )\n        print(resp)\n    except Exception as e:\n        print(e)\n        \"\"\"\n        1 validation error for Response\n            Value error, The statement 'The meaning of life is 42' does not follow the chain of thought: 1 + 1 = 2. \n            [type=value_error, input_value={'chain_of_thought': '1 +... meaning of life is 42'}, input_type=dict]\n        \"\"\"\n"
  },
  {
    "path": "examples/validators/citations.py",
    "content": "from typing import Annotated\nfrom pydantic import BaseModel, ValidationError, ValidationInfo, AfterValidator\nfrom openai import OpenAI\nimport instructor\n\nclient = instructor.from_openai(OpenAI())\n\n\ndef citation_exists(v: str, info: ValidationInfo):\n    context = info.context\n    if context:\n        context = context.get(\"text_chunk\")\n        if v not in context:\n            raise ValueError(f\"Citation `{v}` not found in text\")\n    return v\n\n\nCitation = Annotated[str, AfterValidator(citation_exists)]\n\n\nclass AnswerWithCitation(BaseModel):\n    answer: str\n    citation: Citation\n\n\ntry:\n    q = \"Are blue berries high in protein?\"\n    text_chunk = \"\"\"\n    Blueberries are a good source of vitamin K.\n    They also contain vitamin C, fibre, manganese and other antioxidants (notably anthocyanins).    \n    \"\"\"\n\n    resp = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=AnswerWithCitation,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Answer the question `{q}` using the text chunk\\n`{text_chunk}`\",\n            },\n        ],\n        validation_context={\"text_chunk\": text_chunk},\n    )  # type: ignore\n    print(resp.model_dump_json(indent=2))\nexcept ValidationError as e:\n    print(e)\n"
  },
  {
    "path": "examples/validators/competitors.py",
    "content": "from typing import Annotated\nfrom pydantic import BaseModel, ValidationError, AfterValidator\nfrom openai import OpenAI\n\nimport instructor\n\nclient = instructor.from_openai(OpenAI())\n\n\ndef no_competitors(v: str) -> str:\n    # does not allow the competitors of mcdonalds\n    competitors = [\"burger king\", \"wendy's\", \"carl's jr\", \"jack in the box\"]\n    for competitor in competitors:\n        if competitor in v.lower():\n            raise ValueError(\n                f\"\"\"Let them know that you are work for and are only allowed to talk about mcdonalds.\n                Do not apologize. Do not even mention `{competitor}` since they are a a competitor of McDonalds\"\"\"\n            )\n    return v\n\n\nclass Response(BaseModel):\n    message: Annotated[str, AfterValidator(no_competitors)]\n\n\ntry:\n    resp = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=Response,\n        max_retries=2,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What is your favourite order at burger king?\",\n            },\n        ],\n    )  # type: ignore\n    print(resp.model_dump_json(indent=2))\nexcept ValidationError as e:\n    print(e)\n"
  },
  {
    "path": "examples/validators/field_validator.py",
    "content": "from pydantic import BaseModel, ValidationError, field_validator\n\n\nclass UserDetail(BaseModel):\n    age: int\n    name: str\n\n    @field_validator(\"name\", mode=\"before\")\n    def name_must_contain_space(cls, v):\n        \"\"\"\n        This validator will be called after the default validator,\n        and will raise a validation error if the name does not contain a space.\n        then it will set the name to be lower case\n        \"\"\"\n        if \" \" not in v:\n            raise ValueError(\"name be a first and last name separated by a space\")\n        return v.lower()\n\n\n# Example 1) Valid input, notice that the name is lowercased\nperson = UserDetail(age=29, name=\"Jason Liu\")\nprint(person.model_dump_json(indent=2))\n\"\"\"\n{\n    \"age\": 29,\n    \"name\": \"jason liu\"\n}\n\"\"\"\n\n# Example 2) Invalid input, we'll get a validation error\n# In the future this validation error will be raised by the API and\n# used by the LLM to generate a better response\ntry:\n    person = UserDetail(age=29, name=\"Jason\")\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for UserDetail \n        name\n    Value error, must contain a space [type=value_error, input_value='Jason', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.3/v/value_error\n    \"\"\"\n"
  },
  {
    "path": "examples/validators/just_a_guy.py",
    "content": "from pydantic import BaseModel, ValidationError, field_validator, ValidationInfo\n\n\nclass AnswerWithCitation(BaseModel):\n    answer: str\n    citation: str\n\n    @field_validator(\"citation\")\n    @classmethod\n    def remove_stopwords(cls, v: str, info: ValidationInfo):\n        context = info.context\n        if context:\n            text_chunks = context.get(\"text_chunk\")\n            if v not in text_chunks:\n                raise ValueError(f\"Citation `{v}` not found in text chunks\")\n        return v\n\n\ntry:\n    AnswerWithCitation.model_validate(\n        {\"answer\": \"Jason is a cool guy\", \"citation\": \"Jason is cool\"},\n        context={\"text_chunk\": \"Jason is just a guy\"},\n    )\nexcept ValidationError as e:\n    print(e)\n    \"\"\"\n    1 validation error for AnswerWithCitation\n    citation\n    Value error, Citation `Jason is cool`` not found in text chunks [type=value_error, input_value='Jason is cool', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.4/v/value_error\n    \"\"\"\n"
  },
  {
    "path": "examples/validators/llm_validator.py",
    "content": "import instructor\n\nfrom openai import OpenAI\nfrom instructor import llm_validator\nfrom pydantic import BaseModel, ValidationError, BeforeValidator\nfrom typing import Annotated\n\n# Apply the patch to the OpenAI client\nclient = instructor.from_openai(OpenAI())\n\n\nclass QuestionAnswer(BaseModel):\n    question: str\n    answer: str\n\n\nquestion = \"What is the meaning of life?\"\ncontext = \"The according to the devil is to live a life of sin and debauchery.\"\n\nqa: QuestionAnswer = client.chat.completions.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=QuestionAnswer,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n        },\n    ],\n)  # type: ignore\n\nprint(\"Before validation with `llm_validator`\")\nprint(qa.model_dump_json(indent=2), end=\"\\n\\n\")\n\"\"\"\nBefore validation with `llm_validator`\n{\n    \"question\": \"What is the meaning of life?\",\n    \"answer\": \"The meaning of life, according to the context, is to live a life of sin and debauchery.\",\n}\n\"\"\"\n\n\nclass QuestionAnswerNoEvil(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\"don't say objectionable things\", openai_client=client)\n        ),\n    ]\n\n\ntry:\n    qa = QuestionAnswerNoEvil(\n        question=\"What is the meaning of life?\",\n        answer=\"The meaning of life is to be evil and steal\",\n    )\nexcept ValidationError as e:\n    print(e)\n\"\"\"\n1 validation error for QuestionAnswerNoEvil\nanswer\n  Assertion failed, The statement promotes objectionable behavior. [type=assertion_error, input_value='The meaning of life is to be evil and steal', input_type=str]\n    For further information visit https://errors.pydantic.dev/2.4/v/assertion_error\n\"\"\"\n\ntry:\n    qa: QuestionAnswerNoEvil = client.chat.completions.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=QuestionAnswerNoEvil,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n            },\n        ],\n    )  # type: ignore\nexcept Exception as e:\n    print(e, end=\"\\n\\n\")\n    \"\"\"\n    1 validation error for QuestionAnswerNoEvil\n    answer\n        Assertion failed, The statement promotes sin and debauchery, which is objectionable. [type=assertion_error, input_value='The meaning of life is t... of sin and debauchery.', input_type=str]\n        For further information visit https://errors.pydantic.dev/2.3/v/assertion_error\n    \"\"\"\n\nqa: QuestionAnswerNoEvil = client.chat.completions.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=QuestionAnswerNoEvil,\n    max_retries=2,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n        },\n    ],\n)  # type: ignore\n\nprint(\"After validation with `llm_validator` with `max_retries=2`\")\nprint(qa.model_dump_json(indent=2), end=\"\\n\\n\")\n\"\"\"\nAfter validation with `llm_validator` with `max_retries=2`\n{\n  \"question\": \"What is the meaning of life?\",\n  \"answer\": \"The meaning of life is subjective and can vary depending on individual beliefs and philosophies.\"\n}\n\"\"\"\n"
  },
  {
    "path": "examples/validators/moderation.py",
    "content": "import instructor\n\nfrom instructor import openai_moderation\n\nfrom typing import Annotated\nfrom pydantic import BaseModel, AfterValidator\nfrom openai import OpenAI\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Response(BaseModel):\n    message: Annotated[str, AfterValidator(openai_moderation(client=client))]\n\n\nresponse = Response(message=\"I want to make them suffer the consequences\")\n"
  },
  {
    "path": "examples/validators/readme.md",
    "content": "# Using `llm_validator` with OpenAI's GPT-3.5 Turbo and Pydantic for Text Validation with Output Examples\n\n## Overview\n\nThis document outlines how to use a custom text validation logic (`llm_validator`) with OpenAI's GPT-3.5 Turbo and Pydantic, including the outputs for each operation.\n\n## Code Explanation\n\n### Basic Setup\n\nImport necessary modules and apply patches for compatibility.\n\n```python\nfrom typing_extensions import Annotated\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n)\nfrom instructor import llm_validator, patch\nimport openai\n\npatch()\n```\n\n### Defining Response Models\n\nDefine a basic Pydantic model named `QuestionAnswer`.\n\n```python\nclass QuestionAnswer(BaseModel):\n    question: str\n    answer: str\n```\n\n### Generating a Response\n\nGenerate a response from GPT-3.5 Turbo.\n\n```python\nquestion = \"What is the meaning of life?\"\ncontext = \"The according to the devil is to live a life of sin and debauchery.\"\n\nqa: QuestionAnswer = openai.ChatCompletion.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=QuestionAnswer,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n        },\n    ],\n)\n```\n\n#### Output\n\nBefore validation with `llm_validator`:\n\n```json\n{\n  \"question\": \"What is the meaning of life?\",\n  \"answer\": \"The meaning of life, according to the context, is to live a life of sin and debauchery.\"\n}\n```\n\n### Adding Custom Validation\n\nAdd custom validation using `llm_validator`.\n\n```python\nclass QuestionAnswerNoEvil(BaseModel):\n    question: str\n    answer: Annotated[\n        str,\n        BeforeValidator(\n            llm_validator(\"don't say objectionable things\", allow_override=True)\n        ),\n    ]\n```\n\n#### Output\n\n```text\n1 validation error for QuestionAnswerNoEvil\nanswer\n    Assertion failed, The statement promotes sin and debauchery, which is objectionable.\n```\n\n### Handling Validation Errors\n\nCatch exceptions raised by the validation.\n\n```python\ntry:\n    qa: QuestionAnswerNoEvil = openai.ChatCompletion.create(\n        model=\"gpt-3.5-turbo\",\n        response_model=QuestionAnswerNoEvil,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n            },\n        ],\n    )\nexcept Exception as e:\n    print(e)\n```\n\n### Retrying Validation\n\nAllow for retries by setting `max_retries=2`.\n\n```python\nqa: QuestionAnswerNoEvil = openai.ChatCompletion.create(\n    model=\"gpt-3.5-turbo\",\n    response_model=QuestionAnswerNoEvil,\n    max_retries=2,\n    messages=[\n        {\n            \"role\": \"system\",\n            \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n        },\n        {\n            \"role\": \"user\",\n            \"content\": f\"using the context: {context}\\n\\nAnswer the following question: {question}\",\n        },\n    ],\n)\n```\n\n#### Output\n\nAfter validation with `llm_validator` and `max_retries=2`:\n\n```json\n{\n  \"question\": \"What is the meaning of life?\",\n  \"answer\": \"The meaning of life is subjective and can vary depending on individual beliefs and philosophies.\"\n}\n```\n\n## Summary\n\nThis document described how to use `llm_validator` with OpenAI's GPT-3.5 Turbo and Pydantic, including example outputs. This approach allows for controlled and filtered responses.\n"
  },
  {
    "path": "examples/vision/image_to_ad_copy.py",
    "content": "import json\nimport logging\nimport os\nimport sys\nfrom typing import Optional\n\nfrom dotenv import find_dotenv, load_dotenv\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom rich import print as rprint\n\nimport instructor\n\nload_dotenv(find_dotenv())\n\n# Add logger\nlogging.basicConfig()\nlogger = logging.getLogger(\"app\")\nlogger.setLevel(\"INFO\")\n\n\n# Define models\nclass Product(BaseModel):\n    \"\"\"\n    Represents a product extracted from an image using AI.\n\n    The product attributes are dynamically determined based on the content\n    of the image and the AI's interpretation. This class serves as a structured\n    representation of the identified product characteristics.\n    \"\"\"\n\n    name: str = Field(\n        description=\"A generic name for the product.\", example=\"Headphones\"\n    )\n    key_features: Optional[list[str]] = Field(\n        description=\"A list of key features of the product that stand out.\",\n        example=[\"Wireless\", \"Noise Cancellation\"],\n        default=None,\n    )\n\n    description: Optional[str] = Field(\n        description=\"A description of the product.\",\n        example=\"Wireless headphones with noise cancellation.\",\n        default=None,\n    )\n\n    def generate_prompt(self):\n        prompt = f\"Product: {self.name}\\n\"\n        if self.description:\n            prompt += f\"Description: {self.description}\\n\"\n        if self.key_features:\n            prompt += f\"Key Features: {', '.join(self.key_features)}\\n\"\n        return prompt\n\n\nclass IdentifiedProduct(BaseModel):\n    \"\"\"\n    Represents a list of products identified in the images.\n    \"\"\"\n\n    products: Optional[list[Product]] = Field(\n        description=\"A list of products identified by the AI.\",\n        example=[\n            Product(\n                name=\"Headphones\",\n                description=\"Wireless headphones with noise cancellation.\",\n                key_features=[\"Wireless\", \"Noise Cancellation\"],\n            )\n        ],\n        default=None,\n    )\n\n    error: bool = Field(default=False)\n    message: Optional[str] = Field(default=None)\n\n    def __bool__(self):\n        return self.products is not None and len(self.products) > 0\n\n\nclass AdCopy(BaseModel):\n    \"\"\"\n    Represents a generated ad copy.\n    \"\"\"\n\n    headline: str = Field(\n        description=\"A short, catchy, and memorable headline for the given product. The headline should invoke curiosity and interest in the product.\",\n        example=\"Wireless Headphones\",\n    )\n    ad_copy: str = Field(\n        description=\"A long-form advertisement copy for the given product. This will be used in campaigns to promote the product with a persuasive message and a call-to-action with the objective of driving sales.\",\n        example=\"\"\"\n        \"Experience the ultimate sound quality with our wireless headphones, featuring high-definition audio, noise-cancellation, and a comfortable, ergonomic design for all-day listening.\"\n        \"\"\",\n    )\n    name: str = Field(\n        description=\"The name of the product being advertised.\",\n        example=\"Headphones\",\n    )\n\n\n# Define clients\nclient_image = instructor.from_openai(\n    OpenAI(api_key=os.getenv(\"OPENAI_API_KEY\")), mode=instructor.Mode.MD_JSON\n)\nclient_copy = instructor.from_openai(\n    OpenAI(api_key=os.getenv(\"OPENAI_API_KEY\")), mode=instructor.Mode.TOOLS\n)\n\n\n# Define functions\ndef read_images(image_urls: list[str]) -> IdentifiedProduct:\n    \"\"\"\n    Given a list of image URLs, identify the products in the images.\n    \"\"\"\n\n    logger.info(f\"Identifying products in images... {len(image_urls)} images\")\n\n    return client_image.chat.completions.create(\n        model=\"gpt-4-vision-preview\",\n        response_model=IdentifiedProduct,\n        max_tokens=1024,\n        temperature=0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Identify products using the given images and generate key features for each product.\",\n                    },\n                    *[\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": url}}\n                        for url in image_urls\n                    ],\n                ],\n            }\n        ],\n    )\n\n\ndef generate_ad_copy(product: Product) -> AdCopy:\n    \"\"\"\n    Given a product, generate an ad copy for the product.\n    \"\"\"\n\n    logger.info(f\"Generating ad copy for product: {product.name}\")\n\n    return client_copy.chat.completions.create(\n        model=\"gpt-4-1106-preview\",\n        response_model=AdCopy,\n        temperature=0.3,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are an expert marketing assistant for all products. Your task is to generate an advertisement copy for a product using the name, description, and key features.\",\n            },\n            {\"role\": \"user\", \"content\": product.generate_prompt()},\n        ],\n    )\n\n\ndef run(images: list[str]) -> tuple[list[Product], list[AdCopy]]:\n    \"\"\"\n    Given a list of images, identify the products in the images and generate ad copy for each product.\n    \"\"\"\n\n    identified_products: IdentifiedProduct = read_images(images)\n    ad_copies = []\n\n    if identified_products.error:\n        rprint(f\"[red]Error: {identified_products.message}[/red]\")\n        return []\n\n    if not identified_products:\n        rprint(\"[yellow]No products identified.[/yellow]\")\n        return []\n\n    for product in identified_products.products:\n        ad_copy: AdCopy = generate_ad_copy(product)\n        ad_copies.append(ad_copy)\n\n    return identified_products.products, ad_copies\n\n\nif __name__ == \"__main__\":\n    # Run logger\n    logger.info(\"Starting app...\")\n\n    if len(sys.argv) != 2:\n        print(\"Usage: python app.py <path_to_image_list_file>\")\n        sys.exit(1)\n\n    image_file = sys.argv[1]\n    with open(image_file) as file:\n        logger.info(f\"Reading images from file: {image_file}\")\n        try:\n            image_list = file.read().splitlines()\n            logger.info(f\"{len(image_list)} images read from file: {image_file}\")\n        except Exception as e:\n            logger.error(f\"Error reading images from file: {image_file}\")\n            logger.error(e)\n            sys.exit(1)\n\n    products, ad_copies = run(image_list)\n\n    rprint(f\"[green]{len(products)} products identified:[/green]\")\n    for product, ad_copy in zip(products, ad_copies):\n        rprint(f\"[green]{product}[/green]\")\n        rprint(f\"[blue]Ad Copy: {ad_copy.ad_copy}[/blue]\")\n\n    logger.info(\"Writing results to file...\")\n\n    with open(\"results.json\", \"w\") as f:\n        json.dump(\n            {\n                \"products\": [prod.model_dump() for prod in products],\n                \"ad_copies\": [ad.model_dump() for ad in ad_copies],\n            },\n            f,\n            indent=4,\n        )\n\n\"\"\" \nExample output:\n{\n    \"products\": [\n        {\n            \"name\": \"Ice Skates\",\n            \"key_features\": [\n                \"Lace-up closure\",\n                \"Durable blade\",\n                \"Ankle support\"\n            ],\n            \"description\": \"A pair of ice skates with lace-up closure for secure fit, durable blade for ice skating, and reinforced ankle support.\"\n        },\n        {\n            \"name\": \"Hiking Boots\",\n            \"key_features\": [\n                \"High-top design\",\n                \"Rugged outsole\",\n                \"Water-resistant\"\n            ],\n            \"description\": \"Sturdy hiking boots featuring a high-top design for ankle support, rugged outsole for grip on uneven terrain, and water-resistant construction.\"\n        },\n        {\n            \"name\": \"Winter Boots\",\n            \"key_features\": [\n                \"Insulated lining\",\n                \"Waterproof lower\",\n                \"Slip-resistant sole\"\n            ],\n            \"description\": \"Warm winter boots with insulated lining for cold weather, waterproof lower section to keep feet dry, and a slip-resistant sole for stability.\"\n        }\n    ],\n    \"ad_copies\": [\n        {\n            \"headline\": \"Glide with Confidence - Discover the Perfect Ice Skates!\",\n            \"ad_copy\": \"Step onto the ice with poise and precision with our premium Ice Skates. Designed for both beginners and seasoned skaters, these skates offer a perfect blend of comfort and performance. The lace-up closure ensures a snug fit that keeps you stable as you carve through the ice. With a durable blade that withstands the test of time, you can focus on perfecting your moves rather than worrying about your equipment. The reinforced ankle support provides the necessary protection and aids in preventing injuries, allowing you to skate with peace of mind. Whether you're practicing your spins, jumps, or simply enjoying a leisurely glide across the rink, our Ice Skates are the ideal companion for your ice adventures. Lace up and get ready to experience the thrill of ice skating like never before!\",\n            \"name\": \"Ice Skates\"\n        },\n        {\n            \"headline\": \"Conquer Every Trail with Confidence!\",\n            \"ad_copy\": \"Embark on your next adventure with our top-of-the-line Hiking Boots! Designed for the trail-blazing spirits, these boots boast a high-top design that provides unparalleled ankle support to keep you steady on any path. The rugged outsole ensures a firm grip on the most uneven terrains, while the water-resistant construction keeps your feet dry as you traverse through streams and muddy trails. Whether you're a seasoned hiker or just starting out, our Hiking Boots are the perfect companion for your outdoor escapades. Lace up and step into the wild with confidence - your journey awaits!\",\n            \"name\": \"Hiking Boots\"\n        },\n        {\n            \"headline\": \"Conquer the Cold with Comfort!\",\n            \"ad_copy\": \"Step into the season with confidence in our Winter Boots, the ultimate ally against the chill. Designed for those who don't let the cold dictate their moves, these boots feature an insulated lining that wraps your feet in a warm embrace, ensuring that the biting cold is a worry of the past. But warmth isn't their only virtue. With a waterproof lower section, your feet will remain dry and cozy, come rain, snow, or slush. And let's not forget the slip-resistant sole that stands between you and the treacherous ice, offering stability and peace of mind with every step you take. Whether you're braving a blizzard or just nipping out for a coffee, our Winter Boots are your trusty companions, keeping you warm, dry, and upright. Don't let winter slow you down. Lace up and embrace the elements!\",\n            \"name\": \"Winter Boots\"\n        }\n    ]\n}\n\"\"\"\n"
  },
  {
    "path": "examples/vision/run.py",
    "content": "import instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel\nimport base64\n\nclient = instructor.from_openai(OpenAI(), mode=instructor.Mode.MD_JSON)\n\n\nclass Circle(BaseModel):\n    x: int\n    y: int\n    color: str\n\n\ndef encode_image(image_path):\n    with open(image_path, \"rb\") as image_file:\n        return base64.b64encode(image_file.read()).decode(\"utf-8\")\n\n\ndef draw_circle(image_size, num_circles, path):\n    from PIL import Image, ImageDraw\n    import random\n\n    image = Image.new(\"RGB\", image_size, \"white\")\n\n    draw = ImageDraw.Draw(image)\n    for _ in range(num_circles):\n        # Randomize the circle properties\n        radius = 100  # random.randint(10, min(image_size)//5)  # Radius between 10 and 1/5th of the smallest dimension\n        x = random.randint(radius, image_size[0] - radius)\n        y = random.randint(radius, image_size[1] - radius)\n        color = [\"red\", \"black\", \"blue\", \"green\"][random.randint(0, 3)]\n\n        circle_position = (x - radius, y - radius, x + radius, y + radius)\n        print(f\"Generating circle at {x, y} with color {color}\")\n        draw.ellipse(circle_position, fill=color, outline=\"black\")\n\n    image.save(path)\n\n\nimg_path = \"circle.jpg\"\ndraw_circle((1024, 1024), 1, img_path)\nbase64_image = encode_image(img_path)\n\nresponse = client.chat.completions.create(\n    model=\"gpt-4-vision-preview\",\n    max_tokens=1800,\n    response_model=Circle,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\"type\": \"text\", \"text\": \"find the circle\"},\n                {\n                    \"type\": \"image_url\",\n                    \"image_url\": {\"url\": f\"data:image/jpeg;base64,{base64_image}\"},\n                },\n            ],\n        }\n    ],\n)\n\nprint(\n    f\"Found circle with center at x: {response.x}, y: {response.y} and color: {response.color}\"\n)\n"
  },
  {
    "path": "examples/vision/run_raw.py",
    "content": "from openai import OpenAI\nfrom pydantic import BaseModel, Field\n\nclient = OpenAI()\n\n\nclass SearchQuery(BaseModel):\n    product_name: str\n    query: str = Field(\n        ...,\n        description=\"A descriptive query to search for the product, include adjectives, and the product type. will be used to serve relevant products to the user.\",\n    )\n\n\nclass MultiSearchQuery(BaseModel):\n    products: list[SearchQuery]\n\n\ndef extract_table(url: str):\n    completion = client.chat.completions.create(\n        model=\"gpt-4-vision-preview\",\n        max_tokens=1800,\n        temperature=0,\n        stop=[\"```\"],\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": f\"\"\"\n                You are an expert system designed to extract products from images for a ecommerse application\n                Please provide the product name and a descriptive query to search for the product.\n                Accuratly identify every product in an image and provide a descriptive query to search for the product\n                \n                You just return a correctly formatted JSON object with the product name and query for each product in the image\n                and follows the schema below:\n\n                {MultiSearchQuery.model_json_schema()}\n                \"\"\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Extract the products from the image, and describe them in a query in JSON format\",\n                    },\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                ],\n            },\n            {\n                \"role\": \"assistant\",\n                \"content\": \"Here is the following search queries for the products in the image\\n ```json\",\n            },\n        ],\n    )\n    return MultiSearchQuery.model_validate_json(completion.choices[0].message.content)\n\n\nif __name__ == \"__main__\":\n    url = \"https://mensfashionpostingcom.files.wordpress.com/2020/03/fbe79-img_5052.jpg?w=768\"\n    products = extract_table(url)\n    print(products.model_dump_json(indent=2))\n    \"\"\"\n    {\n    \"products\": [\n        {\n            \"product_name\": \"Olive Green Shirt\",\n            \"query\": \"Olive green casual long sleeve button-down shirt\"\n        },\n        {\n            \"product_name\": \"Black Jeans\",\n            \"query\": \"Slim fit black jeans for men\"\n        },\n        {\n            \"product_name\": \"Sunglasses\",\n            \"query\": \"Classic brown aviator sunglasses\"\n        },\n        {\n            \"product_name\": \"Leather Strap Watch\",\n            \"query\": \"Minimalist men's watch with black leather strap\"\n        },\n        {\n            \"product_name\": \"Beige Sneakers\",\n            \"query\": \"Men's beige lace-up fashion sneakers with white soles\"\n        }\n    ]}\n    \"\"\"\n"
  },
  {
    "path": "examples/vision/run_table.py",
    "content": "from io import StringIO\nfrom typing import Annotated, Any\nfrom openai import OpenAI\nfrom pydantic import (\n    BaseModel,\n    BeforeValidator,\n    PlainSerializer,\n    InstanceOf,\n    WithJsonSchema,\n)\nimport pandas as pd\nimport instructor\n\n\nclient = instructor.from_openai(OpenAI(), mode=instructor.Mode.MD_JSON)\n\n\ndef to_markdown(df: pd.DataFrame) -> str:\n    return df.to_markdown()\n\n\ndef md_to_df(data: Any) -> Any:\n    if isinstance(data, str):\n        return (\n            pd.read_csv(\n                StringIO(data),  # Get rid of whitespaces\n                sep=\"|\",\n                index_col=1,\n            )\n            .dropna(axis=1, how=\"all\")\n            .iloc[1:]\n            .map(lambda x: x.strip())\n        )  # type: ignore\n    return data\n\n\nMarkdownDataFrame = Annotated[\n    InstanceOf[pd.DataFrame],\n    BeforeValidator(md_to_df),\n    PlainSerializer(to_markdown),\n    WithJsonSchema(\n        {\n            \"type\": \"string\",\n            \"description\": \"\"\"\n                The markdown representation of the table, \n                each one should be tidy, do not try to join tables\n                that should be separate\"\"\",\n        }\n    ),\n]\n\n\nclass Table(BaseModel):\n    caption: str\n    dataframe: MarkdownDataFrame\n\n\ndef extract_table(url: str):\n    return client.chat.completions.create_iterable(\n        model=\"gpt-4-vision-preview\",\n        response_model=Table,\n        max_tokens=1800,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"\"\"Extract the table from the image, and describe it. \n                        Each table should be tidy, do not try to join tables that \n                        should be separately described.\"\"\",\n                    },\n                    {\n                        \"type\": \"image_url\",\n                        \"image_url\": {\"url\": url},\n                    },\n                ],\n            }\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    url = \"https://a.storyblok.com/f/47007/2400x2000/bf383abc3c/231031_uk-ireland-in-three-charts_table_v01_b.png\"\n    tables = extract_table(url)\n    for tbl in tables:\n        print(tbl.caption, end=\"\\n\")\n        print(tbl.dataframe)\n    \"\"\"\n    Top 10 grossing apps in October 2023 (Ireland) for Android platforms, listing the rank, app name, and category.\n\n                App Name                    Category         \n    Rank                                                    \n    1                          Google One       Productivity\n    2                             Disney+      Entertainment\n    3       TikTok - Videos, Music & LIVE      Entertainment\n    4                    Candy Crush Saga              Games\n    5      Tinder: Dating, Chat & Friends  Social networking\n    6                         Coin Master              Games\n    7                              Roblox              Games\n    8      Bumble - Dating & Make Friends             Dating\n    9                         Royal Match              Games\n    10        Spotify: Music and Podcasts      Music & Audio\n\n    Top 10 grossing apps in October 2023 (Ireland) for iOS platforms, listing the rank, app name, and category.\n\n                App Name                    Category         \n    Rank                                                    \n    1      Tinder: Dating, Chat & Friends  Social networking\n    2                             Disney+      Entertainment\n    3      YouTube: Watch, Listen, Stream      Entertainment\n    4        Audible: Audio Entertainment      Entertainment\n    5                    Candy Crush Saga              Games\n    6       TikTok - Videos, Music & LIVE      Entertainment\n    7      Bumble - Dating & Make Friends             Dating\n    8                              Roblox              Games\n    9         LinkedIn: Job Search & News           Business\n    10        Duolingo - Language Lessons          Education\n    \"\"\"\n"
  },
  {
    "path": "examples/vision/slides.py",
    "content": "import json\nimport logging\nimport sys\nfrom typing import Optional\n\nfrom dotenv import find_dotenv, load_dotenv\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom rich import print as rprint\n\nimport instructor\n\nload_dotenv(find_dotenv())\n\nIMAGE_FILE = \"image-file.txt\"  # file with all the images to be processed\n\n# Add logger\nlogging.basicConfig()\nlogger = logging.getLogger(\"app\")\nlogger.setLevel(\"INFO\")\n\n\nclass Competitor(BaseModel):\n    name: str\n    features: Optional[list[str]]\n\n\n# Define models\nclass Industry(BaseModel):\n    \"\"\"\n    Represents competitors from a specific industry extracted from an image using AI.\n    \"\"\"\n\n    name: str = Field(description=\"The name of the industry\")\n    competitor_list: list[Competitor] = Field(\n        description=\"A list of competitors for this industry\"\n    )\n\n\nclass Competition(BaseModel):\n    \"\"\"\n    Represents competitors extracted from an image using AI.\n\n    This class serves as a structured representation of\n    competitors and their qualities.\n    \"\"\"\n\n    industry_list: list[Industry] = Field(\n        description=\"A list of industries and their competitors\"\n    )\n\n\n# Define clients\nclient_image = instructor.from_openai(OpenAI(), mode=instructor.Mode.MD_JSON)\n\n\n# Define functions\ndef read_images(image_urls: list[str]) -> Competition:\n    \"\"\"\n    Given a list of image URLs, identify the competitors in the images.\n    \"\"\"\n\n    logger.info(f\"Identifying competitors in images... {len(image_urls)} images\")\n\n    return client_image.chat.completions.create(\n        model=\"gpt-4-vision-preview\",\n        response_model=Competition,\n        max_tokens=2048,\n        temperature=0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"Identify competitors and generate key features for each competitor.\",\n                    },\n                    *[\n                        {\"type\": \"image_url\", \"image_url\": {\"url\": url}}\n                        for url in image_urls\n                    ],\n                ],\n            }\n        ],\n    )\n\n\ndef process_and_identify_competitors():\n    \"\"\"\n    Main function to process the image list file and identify competitors.\n    \"\"\"\n\n    logger.info(\"Starting app...\")\n\n    try:\n        with open(IMAGE_FILE) as file:\n            logger.info(f\"Reading images from file: {IMAGE_FILE}\")\n            image_list = file.read().splitlines()\n            logger.info(f\"{len(image_list)} images read from file: {IMAGE_FILE}\")\n    except Exception as e:\n        logger.error(f\"Error reading images from file: {IMAGE_FILE}\")\n        logger.error(e)\n        sys.exit(1)\n\n    competitors = read_images(image_list)\n\n    rprint(f\"[green]{len(competitors.industry_list)} industries identified:[/green]\")\n    for industry in competitors.industry_list:\n        rprint(f\"[green]{industry.name}[/green]\")\n        rprint(f\"[blue]Features: {industry.competitor_list}[/blue]\")\n\n    logger.info(\"Writing results to file...\")\n\n    with open(\"results.json\", \"w\") as f:\n        json.dump(\n            {\n                \"competitors\": competitors.model_dump(),\n            },\n            f,\n            indent=4,\n        )\n\n\nif __name__ == \"__main__\":\n    process_and_identify_competitors()\n\n\"\"\"\nExample output:\n{\n    \"competitors\": {\n        \"industry_list\": [\n            {\n                \"name\": \"Accommodation and Hospitality\",\n                \"competitor_list\": [\n                    {\n                        \"name\": \"craigslist\",\n                        \"features\": [\n                            \"Transactions Offline\",\n                            \"Inexpensive\"\n                        ]\n                    },\n                    {\n                        \"name\": \"couchsurfing\",\n                        \"features\": [\n                            \"Transactions Offline\",\n                            \"Inexpensive\"\n                        ]\n                    },\n                    {\n                        \"name\": \"BedandBreakfast.com\",\n                        \"features\": [\n                            \"Transactions Offline\",\n                            \"Inexpensive\"\n                        ]\n                    },\n                    {\n                        \"name\": \"airbnb\",\n                        \"features\": [\n                            \"Transactions Online\",\n                            \"Inexpensive\"\n                        ]\n                    },\n                    {\n                        \"name\": \"HOSTELS.com\",\n                        \"features\": [\n                            \"Transactions Online\",\n                            \"Inexpensive\"\n                        ]\n                    },\n                    {\n                        \"name\": \"VRBO\",\n                        \"features\": [\n                            \"Transactions Offline\",\n                            \"Costly\"\n                        ]\n                    },\n                    {\n                        \"name\": \"Rentahome\",\n                        \"features\": [\n                            \"Transactions Online\",\n                            \"Costly\"\n                        ]\n                    },\n                        {\n                        \"name\": \"Orbitz\",\n                        \"features\": [\n                            \"Transactions Online\",\n                            \"Costly\"\n                        ]\n                    },\n                    {\n                        \"name\": \"Hotels.com\",\n                        \"features\": [\n                            \"Transactions Online\",\n                            \"Costly\"\n                        ]\n                    }\n                ]\n            },\n            {\n                \"name\": \"E-commerce Wine Retailers\",\n                \"competitor_list\": [\n                    {\n                        \"name\": \"winesimple\",\n                        \"features\": [\n                            \"Ecommerce Retailers\",\n                            \"True Personalized Selections\",\n                            \"Brand Name Wine\",\n                            \"No Inventory Cost\",\n                            \"Target Mass Market\"\n                        ]\n                    },\n                    {\n                        \"name\": \"nakedwines.com\",\n                        \"features\": [\n                            \"Ecommerce Retailers\",\n                            \"Target Mass Market\"\n                        ]\n                    },\n                    {\n                        \"name\": \"Club W\",\n                        \"features\": [\n                            \"Ecommerce Retailers\",\n                            \"Brand Name Wine\",\n                            \"Target Mass Market\"\n                        ]\n                    },\n                    {\n                        \"name\": \"Tasting Room\",\n                        \"features\": [\n                            \"Ecommerce Retailers\",\n                            \"True Personalized Selections\",\n                            \"Brand Name Wine\"\n                        ]\n                    },\n                    {\n                        \"name\": \"hellovino\",\n                        \"features\": [\n                            \"Ecommerce Retailers\",\n                            \"True Personalized Selections\",\n                            \"No Inventory Cost\",\n                            \"Target Mass Market\"\n                        ]\n                    }\n                ]\n            }\n        ]\n    }\n}\n\"\"\"\n"
  },
  {
    "path": "examples/watsonx/watsonx.py",
    "content": "import os\n\nimport litellm\nfrom litellm import completion\nfrom pydantic import BaseModel, Field\n\nimport instructor\nfrom instructor import Mode\n\nlitellm.drop_params = True  # watsonx.ai doesn't support `json_mode`\n\nos.environ[\"WATSONX_URL\"] = \"https://us-south.ml.cloud.ibm.com\"\nos.environ[\"WATSONX_API_KEY\"] = \"\"\nos.environ[\"WATSONX_PROJECT_ID\"] = \"\"\n# Additional options: https://docs.litellm.ai/docs/providers/watsonx\n\n\nclass Company(BaseModel):\n    name: str = Field(description=\"name of the company\")\n    year_founded: int = Field(description=\"year the company was founded\")\n\n\nclient = instructor.from_litellm(completion, mode=Mode.JSON)\n\nresp = client.chat.completions.create(\n    model=\"watsonx/meta-llama/llama-3-8b-instruct\",\n    max_tokens=1024,\n    messages=[\n        {\n            \"role\": \"user\",\n            \"content\": \"\"\"\\\nGiven the following text, create a Company object:\n\nIBM was founded in 1911 as the Computing-Tabulating-Recording Company (CTR), a holding company of manufacturers of record-keeping and measuring systems.\n\"\"\",\n        }\n    ],\n    project_id=os.environ[\"WATSONX_PROJECT_ID\"],\n    response_model=Company,\n)\n\nprint(resp.model_dump_json(indent=2))\n\"\"\"\n{\n  \"name\": \"IBM\",\n  \"year_founded\": 1911\n}\n\"\"\"\n"
  },
  {
    "path": "examples/youtube/run.py",
    "content": "import instructor\nfrom openai import OpenAI\nfrom pydantic import BaseModel, Field\nfrom youtube_transcript_api import YouTubeTranscriptApi\nfrom rich.console import Console\nfrom rich.table import Table\nfrom rich.live import Live\n\nclient = instructor.from_openai(OpenAI())\n\n\nclass Chapter(BaseModel):\n    start_ts: float = Field(\n        ...,\n        description=\"The start timestamp indicating when the chapter starts in the video.\",\n    )\n    end_ts: float = Field(\n        ...,\n        description=\"The end timestamp indicating when the chapter ends in the video.\",\n    )\n    title: str = Field(\n        ..., description=\"A concise and descriptive title for the chapter.\"\n    )\n    summary: str = Field(\n        ...,\n        description=\"A brief summary of the chapter's content, don't use words like 'the speaker'\",\n    )\n\n\ndef get_youtube_transcript(video_id: str) -> str:\n    try:\n        transcript = YouTubeTranscriptApi.get_transcript(video_id)\n        return \" \".join(\n            [f\"ts={entry['start']} - {entry['text']}\" for entry in transcript]\n        )\n    except Exception as e:\n        print(f\"Error fetching transcript: {e}\")\n        return \"\"\n\n\ndef extract_chapters(transcript: str):\n    class Chapters(BaseModel):\n        chapters: list[Chapter]\n\n    return client.chat.completions.create_partial(\n        model=\"gpt-4o\",  # You can experiment with different models\n        response_model=Chapters,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Analyze the given YouTube transcript and extract chapters. For each chapter, provide a start timestamp, end timestamp, title, and summary.\",\n            },\n            {\"role\": \"user\", \"content\": transcript},\n        ],\n    )\n\n\nif __name__ == \"__main__\":\n    video_id = input(\"Enter a Youtube Url: \")\n    video_id = video_id.split(\"v=\")[1]\n    console = Console()\n\n    with console.status(\"[bold green]Processing YouTube URL...\") as status:\n        transcripts = get_youtube_transcript(video_id)\n        status.update(\"[bold blue]Generating Clips...\")\n        chapters = extract_chapters(transcripts)\n\n        table = Table(title=\"Video Chapters\")\n        table.add_column(\"Title\", style=\"magenta\")\n        table.add_column(\"Description\", style=\"green\")\n        table.add_column(\"Start\", style=\"cyan\")\n        table.add_column(\"End\", style=\"cyan\")\n\n        with Live(refresh_per_second=4) as live:\n            for extraction in chapters:\n                if not extraction.chapters:\n                    continue\n\n                new_table = Table(title=\"Video Chapters\")\n                new_table.add_column(\"Title\", style=\"magenta\")\n                new_table.add_column(\"Description\", style=\"green\")\n                new_table.add_column(\"Start\", style=\"cyan\")\n                new_table.add_column(\"End\", style=\"cyan\")\n\n                for chapter in extraction.chapters:\n                    new_table.add_row(\n                        chapter.title,\n                        chapter.summary,\n                        f\"{chapter.start_ts:.2f}\" if chapter.start_ts else \"\",\n                        f\"{chapter.end_ts:.2f}\" if chapter.end_ts else \"\",\n                    )\n                    new_table.add_row(\"\", \"\", \"\", \"\")  # Add an empty row for spacing\n\n                live.update(new_table)\n\n    console.print(\"\\nChapter extraction complete!\")\n"
  },
  {
    "path": "examples/youtube-clips/run.py",
    "content": "from youtube_transcript_api import YouTubeTranscriptApi\nfrom pydantic import BaseModel, Field\nfrom collections.abc import Generator, Iterable\nimport instructor\nimport openai\n\nclient = instructor.from_openai(openai.OpenAI())\n\n\ndef extract_video_id(url: str) -> str | None:\n    import re\n\n    match = re.search(r\"v=([a-zA-Z0-9_-]+)\", url)\n    if match:\n        return match.group(1)\n\n\nclass TranscriptSegment(BaseModel):\n    source_id: int\n    start: float\n    text: str\n\n\ndef get_transcript_with_timing(\n    video_id: str,\n) -> Generator[TranscriptSegment, None, None]:\n    \"\"\"\n    Fetches the transcript of a YouTube video along with the start and end times for each text segment,\n    and returns them as a list of Pydantic models.\n\n    Parameters:\n    - video_id (str): The YouTube video ID for which the transcript is to be fetched.\n\n    Returns:\n    - A generator that yields TranscriptSegment models, each containing 'index', 'start', and 'text' keys.\n    \"\"\"\n    transcript = YouTubeTranscriptApi.get_transcript(video_id)\n    for ii, segment in enumerate(transcript):\n        yield TranscriptSegment(\n            source_id=ii, start=segment[\"start\"], text=segment[\"text\"]\n        )\n\n\nclass YoutubeClip(BaseModel):\n    title: str = Field(\n        description=\"Specific and informative title for the individual clip.\"\n    )\n    description: str = Field(\n        description=\"A detailed description of the clip, including any notable quotes or phrases. should be a summary of sorts.\"\n    )\n    start: float\n    end: float\n    source_ids: list[int] = Field(exclude=True)\n\n\nclass YoutubeClips(BaseModel):\n    clips: list[YoutubeClip]\n\n\ndef yield_clips(segments: Iterable[TranscriptSegment]) -> Iterable[YoutubeClips]:\n    \"\"\"\n    Extracts a list of YouTube clips from a list of transcript segments.\n\n    Parameters:\n    - segments (Iterable[TranscriptSegment]): A list of TranscriptSegment models, each containing 'index', 'start', and 'text' keys.\n\n    Returns:\n    - A generator that yields YoutubeClipw models, each containing 'title', 'description', 'start', 'end', and 'source_ids' keys.\n    \"\"\"\n\n    return client.chat.completions.create(\n        model=\"gpt-4-turbo-preview\",\n        stream=True,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are given a sequence of YouTube transcripts and your job is to return notable clips that can be recut as smaller videos. give very specific titles and descriptions. Make sure the length of clips is proportional to the length of the video. Note that this is a transcript and so there might be spelling errors. Note that and correct any spellings. Use the context to make sure you're spelling things correctly. \",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": f\"Let's use the following transcript segments.\\n{segments}\",\n            },\n        ],\n        response_model=instructor.Partial[YoutubeClips],\n        validation_context={\"segments\": segments},\n    )  # type: ignore\n\n\n# Example usage\nif __name__ == \"__main__\":\n    from rich.table import Table\n    from rich.console import Console\n    from rich.prompt import Prompt\n\n    console = Console()\n    url = Prompt.ask(\"Enter a YouTube URL\")\n\n    with console.status(\"[bold green]Processing YouTube URL...\") as status:\n        video_id = extract_video_id(url)\n\n        if video_id is None:\n            raise ValueError(\"Invalid YouTube video URL\")\n\n        transcript = list(get_transcript_with_timing(video_id))\n        status.update(\"[bold green]Generating clips...\")\n\n        for clip in yield_clips(transcript):\n            console.clear()\n\n            table = Table(title=\"YouTube Clips\", padding=(0, 1))\n\n            table.add_column(\"Title\", style=\"cyan\")\n            table.add_column(\"Description\", style=\"magenta\")\n            table.add_column(\"Start\", justify=\"right\", style=\"green\")\n            table.add_column(\"End\", justify=\"right\", style=\"green\")\n            for youtube_clip in clip.clips or []:\n                table.add_row(\n                    youtube_clip.title,\n                    youtube_clip.description,\n                    str(youtube_clip.start),\n                    str(youtube_clip.end),\n                )\n            console.print(table)\n"
  },
  {
    "path": "examples/youtube-flashcards/run.py",
    "content": "import uuid\n\nimport instructor\nimport openai\nfrom burr.core import action, State, ApplicationBuilder\nfrom pydantic import BaseModel, Field\nfrom pydantic.json_schema import SkipJsonSchema\nfrom youtube_transcript_api import YouTubeTranscriptApi\n\n\nclass QuestionAnswer(BaseModel):\n    question: str = Field(description=\"Question about the topic\")\n    options: list[str] = Field(\n        description=\"Potential answers to the question.\", min_items=3, max_items=5\n    )\n    answer_index: int = Field(\n        description=\"Index of the correct answer options (starting from 0).\", ge=0, lt=5\n    )\n    difficulty: int = Field(\n        description=\"Difficulty of this question from 1 to 5, 5 being the most difficult.\",\n        gt=0,\n        le=5,\n    )\n    youtube_url: SkipJsonSchema[str | None] = None\n    id: uuid.UUID = Field(description=\"Unique identifier\", default_factory=uuid.uuid4)\n\n\n@action(reads=[], writes=[\"youtube_url\"])\ndef process_user_input(state: State, user_input: str) -> State:\n    \"\"\"Process user input and update the YouTube URL.\"\"\"\n    youtube_url = (\n        user_input  # In practice, we would have more complex validation logic.\n    )\n    return state.update(youtube_url=youtube_url)\n\n\n@action(reads=[\"youtube_url\"], writes=[\"transcript\"])\ndef get_youtube_transcript(state: State) -> State:\n    \"\"\"Get the official YouTube transcript for a video given it's URL\"\"\"\n    youtube_url = state[\"youtube_url\"]\n\n    _, _, video_id = youtube_url.partition(\"?v=\")\n    transcript = YouTubeTranscriptApi.get_transcript(video_id)\n    full_transcript = \" \".join([entry[\"text\"] for entry in transcript])\n\n    # store the transcript in state\n    return state.update(transcript=full_transcript, youtube_url=youtube_url)\n\n\n@action(reads=[\"transcript\", \"youtube_url\"], writes=[\"question_answers\"])\ndef generate_question_and_answers(state: State) -> State:\n    \"\"\"Generate `QuestionAnswer` from a YouTube transcript using an LLM.\"\"\"\n    # read the transcript from state\n    transcript = state[\"transcript\"]\n    youtube_url = state[\"youtube_url\"]\n\n    # create the instructor client\n    instructor_client = instructor.from_openai(openai.OpenAI())\n    system_prompt = (\n        \"Analyze the given YouTube transcript and generate question-answer pairs\"\n        \" to help study and understand the topic better. Please rate all questions from 1 to 5\"\n        \" based on their difficulty.\"\n    )\n    response = instructor_client.chat.completions.create_iterable(\n        model=\"gpt-4o-mini\",\n        response_model=QuestionAnswer,\n        messages=[\n            {\"role\": \"system\", \"content\": system_prompt},\n            {\"role\": \"user\", \"content\": transcript},\n        ],\n    )\n\n    # iterate over QuestionAnswer, add the `youtube_url`, and append to state\n    for qna in response:\n        qna.youtube_url = youtube_url\n        # `State` is immutable, so `.append()` returns a new object with the appended value\n        state = state.append(question_answers=qna)\n\n    return state\n\n\ndef build_application():\n    return (\n        ApplicationBuilder()\n        .with_actions(\n            process_user_input,\n            get_youtube_transcript,\n            generate_question_and_answers,\n        )\n        .with_transitions(\n            (\"process_user_input\", \"get_youtube_transcript\"),\n            (\"get_youtube_transcript\", \"generate_question_and_answers\"),\n            (\"generate_question_and_answers\", \"process_user_input\"),\n        )\n        .with_entrypoint(\"process_user_input\")\n        .with_tracker(project=\"youtube-qna\")\n        .build()\n    )\n\n\nif __name__ == \"__main__\":\n    app = build_application()\n\n    while True:\n        user_input = input(\"Enter a YouTube URL (q to quit): \")\n        if user_input.lower() == \"q\":\n            break\n\n        action_name, result, state = app.run(\n            halt_before=[\"process_user_input\"],\n            inputs={\"user_input\": user_input},\n        )\n        print(f\"{len(state['question_answers'])} question-answer pairs generated\")\n\n        print(\"Preview:\\n\")\n        count = 0\n        for qna in state[\"question_answers\"]:\n            if count > 3:\n                break\n            print(qna.question)\n            print(qna.options)\n            print()\n            count += 1\n"
  },
  {
    "path": "github_issue.md",
    "content": "# Refactor OpenAISchema class methods to standalone functions\n\n## Summary\n\nCurrently, schema generation for different LLM providers requires models to inherit from `OpenAISchema` or be wrapped with the `@openai_schema` decorator. This creates an unnecessary inheritance requirement and couples schema generation to class-based patterns.\n\nWe should refactor the schema generation logic into standalone, provider-agnostic functions.\n\n## Current State Analysis\n\n**Current usage pattern**: `response_model.openai_schema` (where response_model inherits from OpenAISchema)\n\n**Affected files with usage counts**:\n- `instructor/utils/` (12 calls across cerebras.py, writer.py, fireworks.py, openai.py, mistral.py)\n- `instructor/process_response.py` (11 calls)\n- `instructor/dsl/parallel.py` (3 calls - handles parallel tools)\n- `instructor/distil.py` (1 call)\n- `instructor/function_calls.py` (13 calls - method definitions and internal usage)\n- `instructor/utils/core.py` (1 call - decorator application)\n- `instructor/utils/anthropic.py` (1 call - anthropic_schema)\n- `instructor/utils/google.py` (1 call - gemini_schema)\n- Examples and tests (20+ calls)\n\n**Total**: ~60 usages across codebase\n\n## Proposed Solution\n\n### 1. Create `instructor/schema_utils.py` with standalone functions:\n\n```python\nfrom __future__ import annotations\nimport functools\nfrom typing import Any, Type\nfrom docstring_parser import parse\nfrom pydantic import BaseModel\n\n@functools.lru_cache(maxsize=256)\ndef generate_openai_schema(model: Type[BaseModel]) -> dict[str, Any]:\n    \"\"\"Generate OpenAI function schema from Pydantic model.\"\"\"\n    # Move logic from OpenAISchema.openai_schema here\n\ndef generate_anthropic_schema(model: Type[BaseModel]) -> dict[str, Any]:\n    \"\"\"Generate Anthropic tool schema from Pydantic model.\"\"\"\n    # Move logic from OpenAISchema.anthropic_schema here\n\ndef generate_gemini_schema(model: Type[BaseModel]) -> Any:\n    \"\"\"Generate Gemini function schema from Pydantic model.\"\"\"\n    # Move logic from OpenAISchema.gemini_schema here\n```\n\n### 2. Update OpenAISchema class to delegate to new functions:\n\n```python\nclass OpenAISchema(BaseModel):\n    @classproperty\n    def openai_schema(cls):\n        return generate_openai_schema(cls)\n\n    @classproperty  \n    def anthropic_schema(cls):\n        return generate_anthropic_schema(cls)\n\n    @classproperty\n    def gemini_schema(cls):\n        return generate_gemini_schema(cls)\n```\n\n### 3. Migration path:\n\n**Phase 1**: Add new functions, maintain backward compatibility\n- All existing `response_model.openai_schema` calls continue working\n- New code can use `generate_openai_schema(response_model)` directly\n\n**Phase 2**: Internal migration  \n- Replace internal usage in utils/ and process_response.py\n- Update parallel tools handling in dsl/parallel.py\n\n**Phase 3**: Deprecation\n- Mark `@openai_schema` decorator as deprecated\n- Encourage users to migrate to standalone functions\n\n## Benefits\n\n1. **No inheritance requirement** - Any Pydantic model can generate schemas\n2. **Provider-agnostic** - Clean separation of schema generation logic\n3. **Better testability** - Functions are easier to unit test\n4. **Performance** - LRU cache maintains current performance characteristics\n5. **Backward compatibility** - Zero breaking changes during transition\n6. **Cleaner API** - More functional approach vs class-based inheritance\n\n## Implementation Checklist\n\n- [ ] Create `instructor/schema_utils.py` with standalone functions\n- [ ] Update `OpenAISchema` class to delegate to new functions  \n- [ ] Add comprehensive tests comparing old vs new output\n- [ ] Update internal usage in utils/ (12 locations)\n- [ ] Update process_response.py (11 locations)\n- [ ] Update parallel tools handling in dsl/parallel.py\n- [ ] Update distil.py usage\n- [ ] Mark decorator as deprecated with warning\n- [ ] Update documentation and examples\n- [ ] Run full test suite to ensure no regressions\n\n## Special Considerations\n\n- **Parallel tools**: `dsl/parallel.py` uses both `openai_schema(model).openai_schema` and `openai_schema(model).anthropic_schema` patterns\n- **Caching**: Current `@classproperty` provides implicit memoization - maintain with `@lru_cache`\n- **Error handling**: Preserve current validation and error behavior\n- **Provider compatibility**: Ensure schema output remains identical for all providers\n\nThis refactoring will modernize the schema generation approach while maintaining full backward compatibility.\n"
  },
  {
    "path": "instructor/__init__.py",
    "content": "import importlib.util\n\n__version__ = \"1.14.4\"\n\nfrom .mode import Mode\nfrom .processing.multimodal import Image, Audio\n\nfrom .dsl import (\n    CitationMixin,\n    Maybe,\n    Partial,\n    IterableModel,\n)\n\nfrom .validation import llm_validator, openai_moderation\nfrom .processing.function_calls import OpenAISchema, openai_schema\nfrom .processing.schema import (\n    generate_openai_schema,\n    generate_anthropic_schema,\n    generate_gemini_schema,\n)\nfrom .core.patch import apatch, patch\nfrom .core.client import (\n    Instructor,\n    AsyncInstructor,\n    from_openai,\n    from_litellm,\n)\nfrom .core import hooks\nfrom .utils.providers import Provider\nfrom .auto_client import from_provider\nfrom .batch import BatchProcessor, BatchRequest, BatchJob\nfrom .distil import FinetuneFormat, Instructions\n\n# Backward compatibility: Re-export removed functions\nfrom .processing.response import handle_response_model\nfrom .dsl.parallel import handle_parallel_model\n\n__all__ = [\n    \"Instructor\",\n    \"Image\",\n    \"Audio\",\n    \"from_openai\",\n    \"from_litellm\",\n    \"from_provider\",\n    \"AsyncInstructor\",\n    \"Provider\",\n    \"OpenAISchema\",\n    \"CitationMixin\",\n    \"IterableModel\",\n    \"Maybe\",\n    \"Partial\",\n    \"openai_schema\",\n    \"generate_openai_schema\",\n    \"generate_anthropic_schema\",\n    \"generate_gemini_schema\",\n    \"Mode\",\n    \"patch\",\n    \"apatch\",\n    \"FinetuneFormat\",\n    \"Instructions\",\n    \"BatchProcessor\",\n    \"BatchRequest\",\n    \"BatchJob\",\n    \"llm_validator\",\n    \"openai_moderation\",\n    \"hooks\",\n    \"client\",  # Backward compatibility\n    # Backward compatibility exports\n    \"handle_response_model\",\n    \"handle_parallel_model\",\n]\n\n# Backward compatibility: Make instructor.client available as an attribute\n# This allows code like `instructor.client.Instructor` to work\nfrom . import client\n\n\nif importlib.util.find_spec(\"anthropic\") is not None:\n    from .providers.anthropic.client import from_anthropic\n\n    __all__ += [\"from_anthropic\"]\n\n# Keep from_gemini for backward compatibility but it's deprecated\nif (\n    importlib.util.find_spec(\"google\")\n    and importlib.util.find_spec(\"google.generativeai\") is not None\n):\n    from .providers.gemini.client import from_gemini\n\n    __all__ += [\"from_gemini\"]\n\nif importlib.util.find_spec(\"fireworks\") is not None:\n    from .providers.fireworks.client import from_fireworks\n\n    __all__ += [\"from_fireworks\"]\n\nif importlib.util.find_spec(\"cerebras\") is not None:\n    from .providers.cerebras.client import from_cerebras\n\n    __all__ += [\"from_cerebras\"]\n\nif importlib.util.find_spec(\"groq\") is not None:\n    from .providers.groq.client import from_groq\n\n    __all__ += [\"from_groq\"]\n\nif importlib.util.find_spec(\"mistralai\") is not None:\n    from .providers.mistral.client import from_mistral\n\n    __all__ += [\"from_mistral\"]\n\nif importlib.util.find_spec(\"cohere\") is not None:\n    from .providers.cohere.client import from_cohere\n\n    __all__ += [\"from_cohere\"]\n\nif all(importlib.util.find_spec(pkg) for pkg in (\"vertexai\", \"jsonref\")):\n    try:\n        from .providers.vertexai.client import from_vertexai\n    except Exception:\n        # Optional dependency may be present but broken/misconfigured at import time.\n        # Avoid failing `import instructor` in that case.\n        pass\n    else:\n        __all__ += [\"from_vertexai\"]\n\nif importlib.util.find_spec(\"boto3\") is not None:\n    from .providers.bedrock.client import from_bedrock\n\n    __all__ += [\"from_bedrock\"]\n\nif importlib.util.find_spec(\"writerai\") is not None:\n    from .providers.writer.client import from_writer\n\n    __all__ += [\"from_writer\"]\n\nif importlib.util.find_spec(\"xai_sdk\") is not None:\n    from .providers.xai.client import from_xai\n\n    __all__ += [\"from_xai\"]\n\nif importlib.util.find_spec(\"openai\") is not None:\n    from .providers.perplexity.client import from_perplexity\n\n    __all__ += [\"from_perplexity\"]\n\nif (\n    importlib.util.find_spec(\"google\")\n    and importlib.util.find_spec(\"google.genai\") is not None\n):\n    from .providers.genai.client import from_genai\n\n    __all__ += [\"from_genai\"]\n"
  },
  {
    "path": "instructor/_types/__init__.py",
    "content": ""
  },
  {
    "path": "instructor/_types/_alias.py",
    "content": "from typing import Literal\n\nfrom typing_extensions import TypeAlias\n\nModelNames: TypeAlias = Literal[\n    \"gpt-4o\",\n    \"gpt-4-0125-preview\",\n    \"gpt-4-turbo-preview\",\n    \"gpt-4-1106-preview\",\n    \"gpt-4-vision-preview\",\n    \"gpt-4\",\n    \"gpt-4-0314\",\n    \"gpt-4-0613\",\n    \"gpt-4-32k\",\n    \"gpt-4-32k-0314\",\n    \"gpt-4-32k-0613\",\n    \"gpt-3.5-turbo\",\n    \"gpt-3.5-turbo-16k\",\n    \"gpt-3.5-turbo-0301\",\n    \"gpt-3.5-turbo-0613\",\n    \"gpt-3.5-turbo-1106\",\n    \"gpt-3.5-turbo-0125\",\n    \"gpt-3.5-turbo-16k-0613\",\n    \"gpt-3.5-turbo-instruct\",\n    \"text-embedding-ada-002\",\n    \"text-embedding-ada-002-v2\",\n    \"text-embedding-3-small\",\n    \"text-embedding-3-large\",\n]\n"
  },
  {
    "path": "instructor/auto_client.py",
    "content": "from __future__ import annotations\nfrom typing import Any, Union, Literal, overload\nfrom .core.client import AsyncInstructor, Instructor\nimport instructor\nfrom instructor.models import KnownModelName\nfrom instructor.cache import BaseCache\nimport warnings\nimport logging\n\n# Type alias for the return type\nInstructorType = Union[Instructor, AsyncInstructor]\n\nlogger = logging.getLogger(\"instructor.auto_client\")\n\n\n# List of supported providers\nsupported_providers = [\n    \"openai\",\n    \"azure_openai\",\n    \"databricks\",\n    \"anthropic\",\n    \"google\",\n    \"generative-ai\",\n    \"vertexai\",\n    \"mistral\",\n    \"cohere\",\n    \"perplexity\",\n    \"groq\",\n    \"writer\",\n    \"bedrock\",\n    \"cerebras\",\n    \"deepseek\",\n    \"fireworks\",\n    \"ollama\",\n    \"openrouter\",\n    \"xai\",\n    \"litellm\",\n]\n\n\n@overload\ndef from_provider(\n    model: KnownModelName,\n    async_client: Literal[True] = True,\n    cache: BaseCache | None = None,  # noqa: ARG001\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\n@overload\ndef from_provider(\n    model: KnownModelName,\n    async_client: Literal[False] = False,\n    cache: BaseCache | None = None,  # noqa: ARG001\n    **kwargs: Any,\n) -> Instructor: ...\n\n\n@overload\ndef from_provider(\n    model: str,\n    async_client: Literal[True] = True,\n    cache: BaseCache | None = None,  # noqa: ARG001\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\n@overload\ndef from_provider(\n    model: str,\n    async_client: Literal[False] = False,\n    cache: BaseCache | None = None,  # noqa: ARG001\n    **kwargs: Any,\n) -> Instructor: ...\n\n\ndef from_provider(\n    model: Union[str, KnownModelName],  # noqa: UP007\n    async_client: bool = False,\n    cache: BaseCache | None = None,\n    mode: Union[instructor.Mode, None] = None,  # noqa: ARG001, UP007\n    **kwargs: Any,\n) -> Union[Instructor, AsyncInstructor]:  # noqa: UP007\n    \"\"\"Create an Instructor client from a model string.\n\n    Args:\n        model: String in format \"provider/model-name\"\n              (e.g., \"openai/gpt-4\", \"anthropic/claude-3-sonnet\", \"google/gemini-pro\")\n        async_client: Whether to return an async client\n        cache: Optional cache adapter (e.g., ``AutoCache`` or ``RedisCache``)\n               to enable transparent response caching. Automatically flows through\n               **kwargs to all provider implementations.\n        mode: Override the default mode for the provider. If not specified, uses the\n              recommended default mode for each provider.\n        **kwargs: Additional arguments passed to the provider client functions.\n                 This includes the cache parameter and any provider-specific options.\n\n    Returns:\n        Instructor or AsyncInstructor instance\n\n    Raises:\n        ValueError: If provider is not supported or model string is invalid\n        ImportError: If required package for provider is not installed\n\n    Examples:\n        >>> import instructor\n        >>> from instructor.cache import AutoCache\n        >>>\n        >>> # Basic usage\n        >>> client = instructor.from_provider(\"openai/gpt-4\")\n        >>> client = instructor.from_provider(\"anthropic/claude-3-sonnet\")\n        >>>\n        >>> # With caching\n        >>> cache = AutoCache(maxsize=1000)\n        >>> client = instructor.from_provider(\"openai/gpt-4\", cache=cache)\n        >>>\n        >>> # Async clients\n        >>> async_client = instructor.from_provider(\"openai/gpt-4\", async_client=True)\n    \"\"\"\n    # Add cache to kwargs if provided so it flows through to provider functions\n    if cache is not None:\n        kwargs[\"cache\"] = cache\n\n    try:\n        provider, model_name = model.split(\"/\", 1)\n    except ValueError:\n        from .core.exceptions import ConfigurationError\n\n        raise ConfigurationError(\n            'Model string must be in format \"provider/model-name\" '\n            '(e.g. \"openai/gpt-4\" or \"anthropic/claude-3-sonnet\")'\n        ) from None\n\n    provider_info = {\"provider\": provider, \"operation\": \"initialize\"}\n    logger.info(\n        \"Initializing %s provider with model %s\",\n        provider,\n        model_name,\n        extra=provider_info,\n    )\n    logger.debug(\n        \"Provider configuration: async_client=%s, mode=%s\",\n        async_client,\n        mode,\n        extra=provider_info,\n    )\n    api_key = None\n    if \"api_key\" in kwargs:\n        api_key = kwargs.pop(\"api_key\")\n        if api_key:\n            logger.debug(\n                \"API key provided for %s provider (length: %d characters)\",\n                provider,\n                len(api_key),\n                extra=provider_info,\n            )\n\n    if provider == \"openai\":\n        try:\n            import openai\n            import httpx\n            from instructor import from_openai  # type: ignore[attr-defined]\n            from openai import DEFAULT_MAX_RETRIES, NotGiven, Timeout, not_given\n            from collections.abc import Mapping\n            from typing import cast\n\n            # Extract base_url and other OpenAI client parameters from kwargs\n            base_url = kwargs.pop(\"base_url\", None)\n            organization = cast(str | None, kwargs.pop(\"organization\", None))\n\n            timeout_raw = kwargs.pop(\"timeout\", not_given)\n            timeout: float | Timeout | None | NotGiven\n            timeout = (\n                not_given\n                if timeout_raw is not_given\n                else cast(float | Timeout | None, timeout_raw)\n            )\n\n            max_retries_raw = kwargs.pop(\"max_retries\", None)\n            max_retries = (\n                DEFAULT_MAX_RETRIES\n                if max_retries_raw is None\n                else int(cast(int, max_retries_raw))\n            )\n\n            default_headers = cast(\n                Mapping[str, str] | None, kwargs.pop(\"default_headers\", None)\n            )\n            default_query = cast(\n                Mapping[str, object] | None, kwargs.pop(\"default_query\", None)\n            )\n            http_client_raw = kwargs.pop(\"http_client\", None)\n            strict_response_validation = bool(\n                kwargs.pop(\"_strict_response_validation\", False)\n            )\n\n            if async_client:\n                http_client = cast(httpx.AsyncClient | None, http_client_raw)\n                client = openai.AsyncOpenAI(\n                    api_key=api_key,\n                    base_url=base_url,\n                    organization=organization,\n                    timeout=timeout,\n                    max_retries=max_retries,\n                    default_headers=default_headers,\n                    default_query=default_query,\n                    http_client=http_client,\n                    _strict_response_validation=strict_response_validation,\n                )\n            else:\n                http_client = cast(httpx.Client | None, http_client_raw)\n                client = openai.OpenAI(\n                    api_key=api_key,\n                    base_url=base_url,\n                    organization=organization,\n                    timeout=timeout,\n                    max_retries=max_retries,\n                    default_headers=default_headers,\n                    default_query=default_query,\n                    http_client=http_client,\n                    _strict_response_validation=strict_response_validation,\n                )\n\n            result = from_openai(\n                client,\n                model=model_name,\n                mode=mode if mode else instructor.Mode.TOOLS,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the OpenAI provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"azure_openai\":\n        try:\n            import os\n            from openai import AzureOpenAI, AsyncAzureOpenAI\n            from instructor import from_openai  # type: ignore[attr-defined]\n\n            # Get required Azure OpenAI configuration from environment\n            api_key = api_key or os.environ.get(\"AZURE_OPENAI_API_KEY\")\n            azure_endpoint = kwargs.pop(\n                \"azure_endpoint\", os.environ.get(\"AZURE_OPENAI_ENDPOINT\")\n            )\n            api_version = kwargs.pop(\"api_version\", \"2024-02-01\")\n\n            if not api_key:\n                from .core.exceptions import ConfigurationError\n\n                raise ConfigurationError(\n                    \"AZURE_OPENAI_API_KEY is not set. \"\n                    \"Set it with `export AZURE_OPENAI_API_KEY=<your-api-key>` or pass it as kwarg api_key=<your-api-key>\"\n                )\n\n            if not azure_endpoint:\n                from .core.exceptions import ConfigurationError\n\n                raise ConfigurationError(\n                    \"AZURE_OPENAI_ENDPOINT is not set. \"\n                    \"Set it with `export AZURE_OPENAI_ENDPOINT=<your-endpoint>` or pass it as kwarg azure_endpoint=<your-endpoint>\"\n                )\n\n            client = (\n                AsyncAzureOpenAI(\n                    api_key=api_key,\n                    api_version=api_version,\n                    azure_endpoint=azure_endpoint,\n                )\n                if async_client\n                else AzureOpenAI(\n                    api_key=api_key,\n                    api_version=api_version,\n                    azure_endpoint=azure_endpoint,\n                )\n            )\n            result = from_openai(\n                client,\n                model=model_name,\n                mode=mode if mode else instructor.Mode.TOOLS,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the Azure OpenAI provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"databricks\":\n        try:\n            import os\n            import openai\n            from instructor import from_openai  # type: ignore[attr-defined]\n\n            api_key = (\n                api_key\n                or os.environ.get(\"DATABRICKS_TOKEN\")\n                or os.environ.get(\"DATABRICKS_API_KEY\")\n            )\n            if not api_key:\n                from .core.exceptions import ConfigurationError\n\n                raise ConfigurationError(\n                    \"DATABRICKS_TOKEN is not set. \"\n                    \"Set it with `export DATABRICKS_TOKEN=<your-token>` or `export DATABRICKS_API_KEY=<your-token>` \"\n                    \"or pass it as kwarg `api_key=<your-token>`.\"\n                )\n\n            base_url = kwargs.pop(\"base_url\", None)\n            if base_url is None:\n                base_url = (\n                    os.environ.get(\"DATABRICKS_BASE_URL\")\n                    or os.environ.get(\"DATABRICKS_HOST\")\n                    or os.environ.get(\"DATABRICKS_WORKSPACE_URL\")\n                )\n\n            if not base_url:\n                from .core.exceptions import ConfigurationError\n\n                raise ConfigurationError(\n                    \"DATABRICKS_HOST is not set. \"\n                    \"Set it with `export DATABRICKS_HOST=<your-workspace-url>` or `export DATABRICKS_WORKSPACE_URL=<your-workspace-url>` \"\n                    \"or pass `base_url=<your-workspace-url>`.\"\n                )\n\n            base_url = str(base_url).rstrip(\"/\")\n            if not base_url.endswith(\"/serving-endpoints\"):\n                base_url = f\"{base_url}/serving-endpoints\"\n\n            openai_client_kwargs = {}\n            for key in (\n                \"organization\",\n                \"timeout\",\n                \"max_retries\",\n                \"default_headers\",\n                \"http_client\",\n                \"app_info\",\n            ):\n                if key in kwargs:\n                    openai_client_kwargs[key] = kwargs.pop(key)\n\n            client = (\n                openai.AsyncOpenAI(\n                    api_key=api_key, base_url=base_url, **openai_client_kwargs\n                )\n                if async_client\n                else openai.OpenAI(\n                    api_key=api_key, base_url=base_url, **openai_client_kwargs\n                )\n            )\n            result = from_openai(\n                client,\n                model=model_name,\n                mode=mode if mode else instructor.Mode.TOOLS,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the Databricks provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n    elif provider == \"anthropic\":\n        try:\n            import anthropic\n            from instructor import from_anthropic  # type: ignore[attr-defined]  # type: ignore[attr-defined]\n\n            client = (\n                anthropic.AsyncAnthropic(api_key=api_key)\n                if async_client\n                else anthropic.Anthropic(api_key=api_key)\n            )\n            max_tokens = kwargs.pop(\"max_tokens\", 4096)\n            result = from_anthropic(\n                client,\n                model=model_name,\n                mode=mode if mode else instructor.Mode.ANTHROPIC_TOOLS,\n                max_tokens=max_tokens,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The anthropic package is required to use the Anthropic provider. \"\n                \"Install it with `pip install anthropic`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"google\":\n        # Import google-genai package - catch ImportError only for actual imports\n        try:\n            import google.genai as genai\n            from instructor import from_genai  # type: ignore[attr-defined]\n        except ImportError as e:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The google-genai package is required to use the Google provider. \"\n                \"Install it with `pip install google-genai`.\"\n            ) from e\n\n        try:\n            import os\n\n            # Remove vertexai from kwargs if present to avoid passing it twice\n            vertexai_flag = kwargs.pop(\"vertexai\", False)\n\n            # Get API key from kwargs or environment\n            api_key = api_key or os.environ.get(\"GOOGLE_API_KEY\")\n\n            # Extract client-specific parameters\n            client_kwargs = {}\n            for key in [\n                \"debug_config\",\n                \"http_options\",\n                \"credentials\",\n                \"project\",\n                \"location\",\n            ]:\n                if key in kwargs:\n                    client_kwargs[key] = kwargs.pop(key)\n\n            client = genai.Client(\n                vertexai=vertexai_flag,\n                api_key=api_key,\n                **client_kwargs,\n            )  # type: ignore\n            if async_client:\n                result = from_genai(\n                    client,\n                    use_async=True,\n                    model=model_name,\n                    mode=mode if mode else instructor.Mode.GENAI_TOOLS,\n                    **kwargs,\n                )  # type: ignore\n            else:\n                result = from_genai(\n                    client,\n                    model=model_name,\n                    mode=mode if mode else instructor.Mode.GENAI_TOOLS,\n                    **kwargs,\n                )  # type: ignore\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"mistral\":\n        try:\n            from mistralai import Mistral\n            from instructor import from_mistral  # type: ignore[attr-defined]\n            import os\n\n            api_key = api_key or os.environ.get(\"MISTRAL_API_KEY\")\n\n            if api_key:\n                client = Mistral(api_key=api_key)\n            else:\n                raise ValueError(\n                    \"MISTRAL_API_KEY is not set. \"\n                    \"Set it with `export MISTRAL_API_KEY=<your-api-key>`.\"\n                )\n\n            if async_client:\n                result = from_mistral(\n                    client, model=model_name, use_async=True, **kwargs\n                )\n            else:\n                result = from_mistral(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The mistralai package is required to use the Mistral provider. \"\n                \"Install it with `pip install mistralai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"cohere\":\n        try:\n            import cohere\n            from instructor import from_cohere  # type: ignore[attr-defined]\n\n            client = (\n                cohere.AsyncClientV2(api_key=api_key)\n                if async_client\n                else cohere.ClientV2(api_key=api_key)\n            )\n            result = from_cohere(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The cohere package is required to use the Cohere provider. \"\n                \"Install it with `pip install cohere`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"perplexity\":\n        try:\n            import openai\n            from instructor import from_perplexity  # type: ignore[attr-defined]\n            import os\n\n            api_key = api_key or os.environ.get(\"PERPLEXITY_API_KEY\")\n            if not api_key:\n                raise ValueError(\n                    \"PERPLEXITY_API_KEY is not set. \"\n                    \"Set it with `export PERPLEXITY_API_KEY=<your-api-key>` or pass it as a kwarg api_key=<your-api-key>\"\n                )\n\n            client = (\n                openai.AsyncOpenAI(\n                    api_key=api_key, base_url=\"https://api.perplexity.ai\"\n                )\n                if async_client\n                else openai.OpenAI(\n                    api_key=api_key, base_url=\"https://api.perplexity.ai\"\n                )\n            )\n            result = from_perplexity(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the Perplexity provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"groq\":\n        try:\n            import groq\n            from instructor import from_groq  # type: ignore[attr-defined]\n\n            client = (\n                groq.AsyncGroq(api_key=api_key)\n                if async_client\n                else groq.Groq(api_key=api_key)\n            )\n            result = from_groq(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The groq package is required to use the Groq provider. \"\n                \"Install it with `pip install groq`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"writer\":\n        try:\n            from writerai import AsyncWriter, Writer\n            from instructor import from_writer  # type: ignore[attr-defined]\n\n            client = (\n                AsyncWriter(api_key=api_key)\n                if async_client\n                else Writer(api_key=api_key)\n            )\n            result = from_writer(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The writerai package is required to use the Writer provider. \"\n                \"Install it with `pip install writer-sdk`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"bedrock\":\n        try:\n            import os\n            import boto3\n            from instructor import from_bedrock  # type: ignore[attr-defined]\n\n            # Get AWS configuration from environment or kwargs\n            if \"region\" in kwargs:\n                region = kwargs.pop(\"region\")\n            else:\n                logger.debug(\n                    \"AWS_DEFAULT_REGION is not set. Using default region us-east-1\"\n                )\n                region = os.environ.get(\"AWS_DEFAULT_REGION\", \"us-east-1\")\n\n            # Extract AWS-specific parameters\n            # Dictionary to collect AWS credentials and session parameters for boto3 client\n            aws_kwargs = {}\n            for key in [\n                \"aws_access_key_id\",\n                \"aws_secret_access_key\",\n                \"aws_session_token\",\n            ]:\n                if key in kwargs:\n                    aws_kwargs[key] = kwargs.pop(key)\n                elif key.upper() in os.environ:\n                    logger.debug(f\"Using {key.upper()} from environment variable\")\n                    aws_kwargs[key] = os.environ[key.upper()]\n\n            # Add region to client configuration\n            aws_kwargs[\"region_name\"] = region\n\n            # Create bedrock-runtime client\n            client = boto3.client(\"bedrock-runtime\", **aws_kwargs)\n\n            # Determine default mode based on model\n            if mode is None:\n                # Anthropic models (Claude) support tools, others use JSON\n                if model_name and (\n                    \"anthropic\" in model_name.lower() or \"claude\" in model_name.lower()\n                ):\n                    default_mode = instructor.Mode.BEDROCK_TOOLS\n                else:\n                    default_mode = instructor.Mode.BEDROCK_JSON\n            else:\n                default_mode = mode\n\n            result = from_bedrock(\n                client,\n                mode=default_mode,\n                async_client=async_client,\n                _async=async_client,  # for backward compatibility\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The boto3 package is required to use the AWS Bedrock provider. \"\n                \"Install it with `pip install boto3`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"cerebras\":\n        try:\n            from cerebras.cloud.sdk import AsyncCerebras, Cerebras\n            from instructor import from_cerebras  # type: ignore[attr-defined]\n\n            client = (\n                AsyncCerebras(api_key=api_key)\n                if async_client\n                else Cerebras(api_key=api_key)\n            )\n            result = from_cerebras(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The cerebras package is required to use the Cerebras provider. \"\n                \"Install it with `pip install cerebras`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"fireworks\":\n        try:\n            from fireworks.client import AsyncFireworks, Fireworks\n            from instructor import from_fireworks  # type: ignore[attr-defined]\n\n            client = (\n                AsyncFireworks(api_key=api_key)\n                if async_client\n                else Fireworks(api_key=api_key)\n            )\n            result = from_fireworks(client, model=model_name, **kwargs)\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The fireworks-ai package is required to use the Fireworks provider. \"\n                \"Install it with `pip install fireworks-ai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"vertexai\":\n        warnings.warn(\n            \"The 'vertexai' provider is deprecated. Use 'google' provider with vertexai=True instead. \"\n            \"Example: instructor.from_provider('google/gemini-pro', vertexai=True)\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        # Import google-genai package - catch ImportError only for actual imports\n        try:\n            import google.genai as genai  # type: ignore\n            from instructor import from_genai  # type: ignore[attr-defined]\n        except ImportError as e:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The google-genai package is required to use the VertexAI provider. \"\n                \"Install it with `pip install google-genai`.\"\n            ) from e\n\n        try:\n            import os\n\n            # Get project and location from kwargs or environment\n            project = kwargs.pop(\"project\", os.environ.get(\"GOOGLE_CLOUD_PROJECT\"))\n            location = kwargs.pop(\n                \"location\", os.environ.get(\"GOOGLE_CLOUD_LOCATION\", \"us-central1\")\n            )\n\n            if not project:\n                raise ValueError(\n                    \"Project ID is required for Vertex AI. \"\n                    \"Set it with `export GOOGLE_CLOUD_PROJECT=<your-project-id>` \"\n                    \"or pass it as kwarg project=<your-project-id>\"\n                )\n\n            client = genai.Client(\n                vertexai=True,\n                project=project,\n                location=location,\n                **kwargs,\n            )  # type: ignore\n            kwargs[\"model\"] = model_name  # Pass model as part of kwargs\n            if async_client:\n                result = from_genai(\n                    client,\n                    use_async=True,\n                    mode=mode if mode else instructor.Mode.GENAI_TOOLS,\n                    **kwargs,\n                )  # type: ignore\n            else:\n                result = from_genai(\n                    client, mode=mode if mode else instructor.Mode.GENAI_TOOLS, **kwargs\n                )  # type: ignore\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"generative-ai\":\n        warnings.warn(\n            \"The 'generative-ai' provider is deprecated. Use 'google' provider instead. \"\n            \"Example: instructor.from_provider('google/gemini-pro')\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        # Import google-genai package - catch ImportError only for actual imports\n        try:\n            from google import genai\n            from instructor import from_genai  # type: ignore[attr-defined]\n        except ImportError as e:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The google-genai package is required to use the Google GenAI provider. \"\n                \"Install it with `pip install google-genai`.\"\n            ) from e\n\n        try:\n            import os\n\n            # Get API key from kwargs or environment\n            api_key = api_key or os.environ.get(\"GOOGLE_API_KEY\")\n\n            client = genai.Client(vertexai=False, api_key=api_key)\n            if async_client:\n                result = from_genai(\n                    client,\n                    use_async=True,\n                    model=model_name,\n                    mode=mode if mode else instructor.Mode.GENAI_TOOLS,\n                    **kwargs,\n                )  # type: ignore\n            else:\n                result = from_genai(\n                    client,\n                    model=model_name,\n                    mode=mode if mode else instructor.Mode.GENAI_TOOLS,\n                    **kwargs,\n                )  # type: ignore\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"ollama\":\n        try:\n            import openai\n            from instructor import from_openai  # type: ignore[attr-defined]\n\n            # Get base_url from kwargs or use default\n            base_url = kwargs.pop(\"base_url\", \"http://localhost:11434/v1\")\n            api_key = kwargs.pop(\"api_key\", \"ollama\")  # required but unused\n\n            client = (\n                openai.AsyncOpenAI(base_url=base_url, api_key=api_key)\n                if async_client\n                else openai.OpenAI(base_url=base_url, api_key=api_key)\n            )\n\n            # Models that support function calling (tools mode)\n            tool_capable_models = {\n                \"llama3.1\",\n                \"llama3.2\",\n                \"llama4\",\n                \"mistral-nemo\",\n                \"firefunction-v2\",\n                \"command-a\",\n                \"command-r\",\n                \"command-r-plus\",\n                \"command-r7b\",\n                \"qwen2.5\",\n                \"qwen2.5-coder\",\n                \"qwen3\",\n                \"devstral\",\n            }\n\n            # Check if model supports tools by looking at model name\n            supports_tools = any(\n                capable_model in model_name.lower()\n                for capable_model in tool_capable_models\n            )\n\n            default_mode = (\n                instructor.Mode.TOOLS if supports_tools else instructor.Mode.JSON\n            )\n\n            result = from_openai(\n                client,\n                model=model_name,\n                mode=mode if mode else default_mode,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the Ollama provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"deepseek\":\n        try:\n            import openai\n            from instructor import from_openai  # type: ignore[attr-defined]\n            import os\n\n            # Get API key from kwargs or environment\n            api_key = api_key or os.environ.get(\"DEEPSEEK_API_KEY\")\n\n            if not api_key:\n                from .core.exceptions import ConfigurationError\n\n                raise ConfigurationError(\n                    \"DEEPSEEK_API_KEY is not set. \"\n                    \"Set it with `export DEEPSEEK_API_KEY=<your-api-key>` or pass it as kwarg api_key=<your-api-key>\"\n                )\n\n            # DeepSeek uses OpenAI-compatible API\n            base_url = kwargs.pop(\"base_url\", \"https://api.deepseek.com\")\n\n            client = (\n                openai.AsyncOpenAI(api_key=api_key, base_url=base_url)\n                if async_client\n                else openai.OpenAI(api_key=api_key, base_url=base_url)\n            )\n\n            result = from_openai(\n                client,\n                model=model_name,\n                mode=mode if mode else instructor.Mode.TOOLS,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the DeepSeek provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"xai\":\n        try:\n            from xai_sdk.sync.client import Client as SyncClient\n            from xai_sdk.aio.client import Client as AsyncClient\n            from instructor import from_xai  # type: ignore[attr-defined]\n\n            client = (\n                AsyncClient(api_key=api_key)\n                if async_client\n                else SyncClient(api_key=api_key)\n            )\n            result = from_xai(\n                client,\n                mode=mode if mode else instructor.Mode.XAI_JSON,\n                model=model_name,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The xAI provider needs the optional dependency `xai-sdk`. \"\n                'Install it with `uv pip install \"instructor[xai]\"` (or `pip install \"instructor[xai]\"`). '\n                \"Note: xai-sdk requires Python 3.10+.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"openrouter\":\n        try:\n            import openai\n            from instructor import from_openai  # type: ignore[attr-defined]\n            import os\n\n            # Get API key from kwargs or environment\n            api_key = api_key or os.environ.get(\"OPENROUTER_API_KEY\")\n\n            if not api_key:\n                from .core.exceptions import ConfigurationError\n\n                raise ConfigurationError(\n                    \"OPENROUTER_API_KEY is not set. \"\n                    \"Set it with `export OPENROUTER_API_KEY=<your-api-key>` or pass it as kwarg api_key=<your-api-key>\"\n                )\n\n            # OpenRouter uses OpenAI-compatible API\n            base_url = kwargs.pop(\"base_url\", \"https://openrouter.ai/api/v1\")\n\n            client = (\n                openai.AsyncOpenAI(api_key=api_key, base_url=base_url)\n                if async_client\n                else openai.OpenAI(api_key=api_key, base_url=base_url)\n            )\n\n            result = from_openai(\n                client,\n                model=model_name,\n                mode=mode if mode else instructor.Mode.TOOLS,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The openai package is required to use the OpenRouter provider. \"\n                \"Install it with `pip install openai`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    elif provider == \"litellm\":\n        try:\n            from litellm import completion, acompletion\n            from instructor import from_litellm\n\n            completion_func = acompletion if async_client else completion\n            result = from_litellm(\n                completion_func,\n                mode=mode if mode else instructor.Mode.TOOLS,\n                **kwargs,\n            )\n            logger.info(\n                \"Client initialized\",\n                extra={**provider_info, \"status\": \"success\"},\n            )\n            return result\n        except ImportError:\n            from .core.exceptions import ConfigurationError\n\n            raise ConfigurationError(\n                \"The litellm package is required to use the LiteLLM provider. \"\n                \"Install it with `pip install litellm`.\"\n            ) from None\n        except Exception as e:\n            logger.error(\n                \"Error initializing %s client: %s\",\n                provider,\n                e,\n                exc_info=True,\n                extra={**provider_info, \"status\": \"error\"},\n            )\n            raise\n\n    else:\n        from .core.exceptions import ConfigurationError\n\n        logger.error(\n            \"Error initializing %s client: unsupported provider\",\n            provider,\n            extra={**provider_info, \"status\": \"error\"},\n        )\n        raise ConfigurationError(\n            f\"Unsupported provider: {provider}. \"\n            f\"Supported providers are: {supported_providers}\"\n        )\n"
  },
  {
    "path": "instructor/batch/__init__.py",
    "content": "\"\"\"\nUnified Batch Processing API for Multiple Providers\n\nThis module provides a unified interface for batch processing across OpenAI and Anthropic\nproviders. The API uses a Maybe/Result-like pattern with custom_id\ntracking for type-safe handling of batch results.\n\nSupported Providers:\n- OpenAI: 50% cost savings on batch requests\n- Anthropic: 50% cost savings on batch requests (Message Batches API)\n\nFeatures:\n- Type-safe Maybe/Result pattern for handling successes and errors\n- Custom ID tracking for correlating results to original requests\n- Unified interface across all providers\n- Helper functions for filtering and extracting results\n\nExample usage:\n    from instructor.batch import BatchProcessor, filter_successful, extract_results\n    from pydantic import BaseModel\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    processor = BatchProcessor(\"openai/gpt-4o-mini\", User)\n    batch_id = processor.submit_batch(\"requests.jsonl\")\n\n    # Results are BatchSuccess[T] | BatchError union types\n    all_results = processor.retrieve_results(batch_id)\n    successful_results = filter_successful(all_results)\n    extracted_users = extract_results(all_results)\n\nDocumentation:\n- OpenAI Batch API: https://platform.openai.com/docs/guides/batch\n- Anthropic Message Batches: https://docs.anthropic.com/en/api/creating-message-batches\n\"\"\"\n\nfrom typing import Any, Optional\n\n# Import all public symbols from the modules\nfrom .models import (\n    BatchSuccess,\n    BatchError,\n    BatchStatus,\n    BatchTimestamps,\n    BatchRequestCounts,\n    BatchErrorInfo,\n    BatchFiles,\n    BatchJobInfo,\n    BatchResult,\n    T,\n)\nfrom .utils import (\n    filter_successful,\n    filter_errors,\n    extract_results,\n    get_results_by_custom_id,\n)\nfrom .request import (\n    BatchRequest,\n    Function,\n    Tool,\n    RequestBody,\n    BatchModel,\n)\nfrom .processor import BatchProcessor\n\n\nclass BatchJob:\n    \"\"\"Legacy BatchJob class for backward compatibility\"\"\"\n\n    @classmethod\n    def parse_from_file(\n        cls, file_path: str, response_model: type[T]\n    ) -> tuple[list[T], list[dict[Any, Any]]]:\n        with open(file_path) as file:\n            content = file.read()\n        return cls.parse_from_string(content, response_model)\n\n    @classmethod\n    def parse_from_string(\n        cls, content: str, response_model: type[T]\n    ) -> tuple[list[T], list[dict[Any, Any]]]:\n        \"\"\"Enhanced parser that works with all providers using JSON schema\"\"\"\n        import json\n\n        res: list[T] = []\n        error_objs: list[dict[Any, Any]] = []\n\n        lines = content.strip().split(\"\\n\")\n        for line in lines:\n            if not line.strip():\n                continue\n\n            try:\n                data = json.loads(line)\n                extracted_data = cls._extract_structured_data(data)\n\n                if extracted_data:\n                    try:\n                        result = response_model(**extracted_data)\n                        res.append(result)\n                    except Exception:\n                        error_objs.append(data)\n                else:\n                    error_objs.append(data)\n\n            except Exception:\n                error_objs.append({\"error\": \"Failed to parse JSON\", \"raw_line\": line})\n\n        return res, error_objs\n\n    @classmethod\n    def _extract_structured_data(cls, data: dict[str, Any]) -> Optional[dict[str, Any]]:\n        \"\"\"Extract structured data from various provider response formats\"\"\"\n        import json\n\n        try:\n            # Try OpenAI JSON schema format first\n            if \"response\" in data and \"body\" in data[\"response\"]:\n                choices = data[\"response\"][\"body\"].get(\"choices\", [])\n                if choices:\n                    message = choices[0].get(\"message\", {})\n\n                    # JSON schema response\n                    if \"content\" in message:\n                        content = message[\"content\"]\n                        if isinstance(content, str):\n                            return json.loads(content)\n\n                    # Tool calls (legacy)\n                    if \"tool_calls\" in message:\n                        tool_call = message[\"tool_calls\"][0]\n                        return json.loads(tool_call[\"function\"][\"arguments\"])\n\n            # Try Anthropic format\n            if \"result\" in data and \"message\" in data[\"result\"]:\n                content = data[\"result\"][\"message\"][\"content\"]\n                if isinstance(content, list) and len(content) > 0:\n                    # Tool use response\n                    for item in content:\n                        if item.get(\"type\") == \"tool_use\":\n                            return item.get(\"input\", {})\n                    # Text response with JSON\n                    for item in content:\n                        if item.get(\"type\") == \"text\":\n                            text = item.get(\"text\", \"\")\n                            return json.loads(text)\n\n        except Exception:\n            pass\n\n        return None\n\n\n# Define what gets exported when someone does \"from instructor.batch import *\"\n__all__ = [\n    # Core types\n    \"T\",\n    \"BatchResult\",\n    # Models\n    \"BatchSuccess\",\n    \"BatchError\",\n    \"BatchStatus\",\n    \"BatchTimestamps\",\n    \"BatchRequestCounts\",\n    \"BatchErrorInfo\",\n    \"BatchFiles\",\n    \"BatchJobInfo\",\n    # Utility functions\n    \"filter_successful\",\n    \"filter_errors\",\n    \"extract_results\",\n    \"get_results_by_custom_id\",\n    # Request models\n    \"BatchRequest\",\n    \"Function\",\n    \"Tool\",\n    \"RequestBody\",\n    \"BatchModel\",\n    # Main processor\n    \"BatchProcessor\",\n    # Legacy\n    \"BatchJob\",\n]\n"
  },
  {
    "path": "instructor/batch/models.py",
    "content": "\"\"\"\nData models and types for batch processing.\n\nThis module contains all the Pydantic models, enums, and type definitions\nused throughout the batch processing system.\n\"\"\"\n\nfrom __future__ import annotations\nfrom typing import Any, Union, TypeVar, Generic\nfrom typing_extensions import TypeAlias\nfrom pydantic import BaseModel, Field, ConfigDict\nfrom datetime import datetime, timezone\nfrom enum import Enum\n\nT = TypeVar(\"T\", bound=BaseModel)\n\n\nclass BatchSuccess(BaseModel, Generic[T]):\n    \"\"\"Successful batch result with custom_id\"\"\"\n\n    custom_id: str\n    result: T\n    success: bool = True\n\n    model_config = ConfigDict(arbitrary_types_allowed=True)\n\n\nclass BatchError(BaseModel):\n    \"\"\"Error information for failed batch requests\"\"\"\n\n    custom_id: str\n    error_type: str\n    error_message: str\n    success: bool = False\n    raw_data: dict[str, Any] | None = None\n\n\nclass BatchStatus(str, Enum):\n    \"\"\"Normalized batch status across providers\"\"\"\n\n    PENDING = \"pending\"\n    PROCESSING = \"processing\"\n    COMPLETED = \"completed\"\n    FAILED = \"failed\"\n    CANCELLED = \"cancelled\"\n    EXPIRED = \"expired\"\n\n\nclass BatchTimestamps(BaseModel):\n    \"\"\"Comprehensive timestamp tracking\"\"\"\n\n    created_at: datetime | None = None\n    started_at: datetime | None = None  # in_progress_at, processing start\n    completed_at: datetime | None = None  # completed_at, ended_at\n    failed_at: datetime | None = None\n    cancelled_at: datetime | None = None\n    expired_at: datetime | None = None\n    expires_at: datetime | None = None\n\n\nclass BatchRequestCounts(BaseModel):\n    \"\"\"Unified request counts across providers\"\"\"\n\n    total: int | None = None\n\n    # OpenAI fields\n    completed: int | None = None\n    failed: int | None = None\n\n    # Anthropic fields\n    processing: int | None = None\n    succeeded: int | None = None\n    errored: int | None = None\n    cancelled: int | None = None\n    expired: int | None = None\n\n\nclass BatchErrorInfo(BaseModel):\n    \"\"\"Batch-level error information\"\"\"\n\n    error_type: str | None = None\n    error_message: str | None = None\n    error_code: str | None = None\n\n\nclass BatchFiles(BaseModel):\n    \"\"\"File references for batch job\"\"\"\n\n    input_file_id: str | None = None\n    output_file_id: str | None = None\n    error_file_id: str | None = None\n    results_url: str | None = None  # Anthropic\n\n\nclass BatchJobInfo(BaseModel):\n    \"\"\"Enhanced unified batch job information with comprehensive provider support\"\"\"\n\n    # Core identifiers\n    id: str\n    provider: str\n\n    # Status information\n    status: BatchStatus\n    raw_status: str  # Original provider status\n\n    # Timing information\n    timestamps: BatchTimestamps\n\n    # Request tracking\n    request_counts: BatchRequestCounts\n\n    # File references\n    files: BatchFiles\n\n    # Error information\n    error: BatchErrorInfo | None = None\n\n    # Provider-specific data\n    metadata: dict[str, Any] = Field(default_factory=dict)\n    raw_data: dict[str, Any] | None = None\n\n    # Additional fields\n    model: str | None = None\n    endpoint: str | None = None\n    completion_window: str | None = None\n\n    @classmethod\n    def from_openai(cls, batch_data: dict[str, Any]) -> BatchJobInfo:\n        \"\"\"Create from OpenAI batch response\"\"\"\n        # Normalize status\n        status_map = {\n            \"validating\": BatchStatus.PENDING,\n            \"in_progress\": BatchStatus.PROCESSING,\n            \"finalizing\": BatchStatus.PROCESSING,\n            \"completed\": BatchStatus.COMPLETED,\n            \"failed\": BatchStatus.FAILED,\n            \"expired\": BatchStatus.EXPIRED,\n            \"cancelled\": BatchStatus.CANCELLED,\n            \"cancelling\": BatchStatus.CANCELLED,\n        }\n\n        # Parse timestamps\n        timestamps = BatchTimestamps(\n            created_at=(\n                datetime.fromtimestamp(batch_data[\"created_at\"], tz=timezone.utc)\n                if batch_data.get(\"created_at\")\n                else None\n            ),\n            started_at=(\n                datetime.fromtimestamp(batch_data[\"in_progress_at\"], tz=timezone.utc)\n                if batch_data.get(\"in_progress_at\")\n                else None\n            ),\n            completed_at=(\n                datetime.fromtimestamp(batch_data[\"completed_at\"], tz=timezone.utc)\n                if batch_data.get(\"completed_at\")\n                else None\n            ),\n            failed_at=(\n                datetime.fromtimestamp(batch_data[\"failed_at\"], tz=timezone.utc)\n                if batch_data.get(\"failed_at\")\n                else None\n            ),\n            cancelled_at=(\n                datetime.fromtimestamp(batch_data[\"cancelled_at\"], tz=timezone.utc)\n                if batch_data.get(\"cancelled_at\")\n                else None\n            ),\n            expired_at=(\n                datetime.fromtimestamp(batch_data[\"expired_at\"], tz=timezone.utc)\n                if batch_data.get(\"expired_at\")\n                else None\n            ),\n            expires_at=(\n                datetime.fromtimestamp(batch_data[\"expires_at\"], tz=timezone.utc)\n                if batch_data.get(\"expires_at\")\n                else None\n            ),\n        )\n\n        # Parse request counts\n        request_counts_data = batch_data.get(\"request_counts\", {})\n        request_counts = BatchRequestCounts(\n            total=request_counts_data.get(\"total\"),\n            completed=request_counts_data.get(\"completed\"),\n            failed=request_counts_data.get(\"failed\"),\n        )\n\n        # Parse files\n        files = BatchFiles(\n            input_file_id=batch_data.get(\"input_file_id\"),\n            output_file_id=batch_data.get(\"output_file_id\"),\n            error_file_id=batch_data.get(\"error_file_id\"),\n        )\n\n        # Parse error information\n        error = None\n        if batch_data.get(\"errors\"):\n            error_data = batch_data[\"errors\"]\n            error = BatchErrorInfo(\n                error_type=error_data.get(\"type\"),\n                error_message=error_data.get(\"message\"),\n                error_code=error_data.get(\"code\"),\n            )\n\n        return cls(\n            id=batch_data[\"id\"],\n            provider=\"openai\",\n            status=status_map.get(batch_data[\"status\"], BatchStatus.PENDING),\n            raw_status=batch_data[\"status\"],\n            timestamps=timestamps,\n            request_counts=request_counts,\n            files=files,\n            error=error,\n            metadata=batch_data.get(\"metadata\", {}),\n            raw_data=batch_data,\n            endpoint=batch_data.get(\"endpoint\"),\n            completion_window=batch_data.get(\"completion_window\"),\n        )\n\n    @classmethod\n    def from_anthropic(cls, batch_data: dict[str, Any]) -> BatchJobInfo:\n        \"\"\"Create from Anthropic batch response\"\"\"\n        # Normalize status\n        status_map = {\n            \"in_progress\": BatchStatus.PROCESSING,\n            \"ended\": BatchStatus.COMPLETED,\n            \"failed\": BatchStatus.FAILED,\n            \"cancelled\": BatchStatus.CANCELLED,\n            \"expired\": BatchStatus.EXPIRED,\n        }\n\n        # Parse timestamps\n        def parse_iso_timestamp(timestamp_value):\n            if not timestamp_value:\n                return None\n            try:\n                # Handle different timestamp format variations\n                if isinstance(timestamp_value, datetime):\n                    return timestamp_value\n                elif isinstance(timestamp_value, str):\n                    return datetime.fromisoformat(\n                        timestamp_value.replace(\"Z\", \"+00:00\")\n                    )\n                else:\n                    return None\n            except (ValueError, AttributeError):\n                return None\n\n        timestamps = BatchTimestamps(\n            created_at=parse_iso_timestamp(batch_data.get(\"created_at\")),\n            started_at=parse_iso_timestamp(\n                batch_data.get(\"created_at\")\n            ),  # Anthropic doesn't provide started_at, use created_at\n            cancelled_at=parse_iso_timestamp(batch_data.get(\"cancel_initiated_at\")),\n            completed_at=parse_iso_timestamp(batch_data.get(\"ended_at\")),\n            expires_at=parse_iso_timestamp(batch_data.get(\"expires_at\")),\n        )\n\n        # Parse request counts\n        request_counts_data = batch_data.get(\"request_counts\", {})\n        request_counts = BatchRequestCounts(\n            processing=request_counts_data.get(\"processing\"),\n            succeeded=request_counts_data.get(\"succeeded\"),\n            errored=request_counts_data.get(\"errored\"),\n            cancelled=request_counts_data.get(\n                \"canceled\"\n            ),  # Note: Anthropic uses \"canceled\"\n            expired=request_counts_data.get(\"expired\"),\n            total=request_counts_data.get(\"processing\", 0)\n            + request_counts_data.get(\"succeeded\", 0)\n            + request_counts_data.get(\"errored\", 0),\n        )\n\n        # Parse files\n        files = BatchFiles(\n            results_url=batch_data.get(\"results_url\"),\n        )\n\n        return cls(\n            id=batch_data[\"id\"],\n            provider=\"anthropic\",\n            status=status_map.get(batch_data[\"processing_status\"], BatchStatus.PENDING),\n            raw_status=batch_data[\"processing_status\"],\n            timestamps=timestamps,\n            request_counts=request_counts,\n            files=files,\n            raw_data=batch_data,\n        )\n\n\n# Union type for batch results - like a Maybe/Result type\nBatchResult: TypeAlias = Union[BatchSuccess[T], BatchError]  # type: ignore\n"
  },
  {
    "path": "instructor/batch/processor.py",
    "content": "\"\"\"\nBatch processor for unified batch processing across providers.\n\nThis module contains the BatchProcessor class that provides a unified interface\nfor batch processing across different LLM providers.\n\"\"\"\n\nfrom __future__ import annotations\nfrom typing import Any, Generic\nimport json\nimport os\nimport io\nfrom .models import BatchResult, BatchSuccess, BatchError, BatchJobInfo, T\nfrom .request import BatchRequest\nfrom .providers import get_provider\n\n\nclass BatchProcessor(Generic[T]):\n    \"\"\"Unified batch processor that works across all providers\"\"\"\n\n    def __init__(self, model: str, response_model: type[T]):\n        self.model = model\n        self.response_model = response_model\n\n        # Parse provider from model string\n        try:\n            self.provider_name, self.model_name = model.split(\"/\", 1)\n        except ValueError as err:\n            raise ValueError(\n                'Model string must be in format \"provider/model-name\" '\n                '(e.g. \"openai/gpt-4\" or \"anthropic/claude-3-sonnet\")'\n            ) from err\n\n        # Get the batch provider instance\n        self.provider = get_provider(self.provider_name)\n\n    def create_batch_from_messages(\n        self,\n        messages_list: list[list[dict[str, Any]]],\n        file_path: str | None = None,\n        max_tokens: int | None = 1000,\n        temperature: float | None = 0.1,\n    ) -> str | io.BytesIO:\n        \"\"\"Create batch file from list of message conversations\n\n        Args:\n            messages_list: List of message conversations, each as a list of message dicts\n            file_path: Path to save the batch request file. If None, returns BytesIO buffer\n            max_tokens: Maximum tokens per request\n            temperature: Temperature for generation\n\n        Returns:\n            The file path where the batch was saved, or BytesIO buffer if file_path is None\n        \"\"\"\n        if file_path is not None:\n            if os.path.exists(file_path):\n                os.remove(file_path)\n\n            batch_requests = []\n            for i, messages in enumerate(messages_list):\n                batch_request = BatchRequest[self.response_model](\n                    custom_id=f\"request-{i}\",\n                    messages=messages,\n                    response_model=self.response_model,\n                    model=self.model_name,\n                    max_tokens=max_tokens,\n                    temperature=temperature,\n                )\n                batch_request.save_to_file(file_path, self.provider_name)\n                batch_requests.append(batch_request)\n\n            print(f\"Created batch file {file_path} with {len(batch_requests)} requests\")\n            return file_path\n        else:\n            # Create BytesIO buffer - caller is responsible for cleanup\n            buffer = io.BytesIO()\n            batch_requests = []\n            for i, messages in enumerate(messages_list):\n                batch_request = BatchRequest[self.response_model](\n                    custom_id=f\"request-{i}\",\n                    messages=messages,\n                    response_model=self.response_model,\n                    model=self.model_name,\n                    max_tokens=max_tokens,\n                    temperature=temperature,\n                )\n                batch_request.save_to_file(buffer, self.provider_name)\n                batch_requests.append(batch_request)\n\n            print(f\"Created batch buffer with {len(batch_requests)} requests\")\n            buffer.seek(0)  # Reset buffer position for reading\n            return buffer\n\n    def submit_batch(\n        self,\n        file_path_or_buffer: str | io.BytesIO,\n        metadata: dict[str, Any] | None = None,\n        **kwargs,\n    ) -> str:\n        \"\"\"Submit batch job to the provider and return job ID\n\n        Args:\n            file_path_or_buffer: Path to the batch request file or BytesIO buffer\n            metadata: Optional metadata to attach to the batch job\n            **kwargs: Additional provider-specific arguments\n        \"\"\"\n        if metadata is None:\n            metadata = {\"description\": \"Instructor batch job\"}\n\n        return self.provider.submit_batch(\n            file_path_or_buffer, metadata=metadata, **kwargs\n        )\n\n    def get_batch_status(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Get batch job status from the provider\"\"\"\n        return self.provider.get_status(batch_id)\n\n    def retrieve_results(self, batch_id: str) -> list[BatchResult]:\n        \"\"\"Retrieve and parse batch results from the provider\"\"\"\n        results_content = self.provider.retrieve_results(batch_id)\n        return self.parse_results(results_content)\n\n    def list_batches(self, limit: int = 10) -> list[BatchJobInfo]:\n        \"\"\"List batch jobs for the current provider\n\n        Args:\n            limit: Maximum number of batch jobs to return\n\n        Returns:\n            List of BatchJobInfo objects with normalized batch information\n        \"\"\"\n        return self.provider.list_batches(limit)\n\n    def get_results(\n        self, batch_id: str, file_path: str | None = None\n    ) -> list[BatchResult]:\n        \"\"\"Get batch results, optionally saving raw results to a file\n\n        Args:\n            batch_id: The batch job ID\n            file_path: Optional file path to save raw results. If provided,\n                      raw results will be saved to this file. If not provided,\n                      results are only kept in memory.\n\n        Returns:\n            List of BatchResult objects (BatchSuccess[T] or BatchError)\n        \"\"\"\n        # Retrieve results directly to memory\n        results_content = self.retrieve_results(batch_id)\n\n        # If file path is provided, save raw results to file\n        if file_path is not None:\n            self.provider.download_results(batch_id, file_path)\n\n        return results_content\n\n    def cancel_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Cancel a batch job\n\n        Args:\n            batch_id: The batch job ID to cancel\n\n        Returns:\n            Dict containing the cancelled batch information\n        \"\"\"\n        return self.provider.cancel_batch(batch_id)\n\n    def delete_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Delete a batch job (only available for completed batches)\n\n        Args:\n            batch_id: The batch job ID to delete\n\n        Returns:\n            Dict containing the deletion confirmation\n        \"\"\"\n        return self.provider.delete_batch(batch_id)\n\n    def parse_results(self, results_content: str) -> list[BatchResult]:\n        \"\"\"Parse batch results from content string into Maybe-like results with custom_id tracking\"\"\"\n        results: list[BatchResult] = []\n\n        lines = results_content.strip().split(\"\\n\")\n        for line in lines:\n            if not line.strip():\n                continue\n\n            try:\n                data = json.loads(line)\n                custom_id = data.get(\"custom_id\", \"unknown\")\n                extracted_data = self._extract_from_response(data)\n\n                if extracted_data:\n                    try:\n                        # Parse into response model\n                        result = self.response_model(**extracted_data)\n                        batch_result = BatchSuccess[T](\n                            custom_id=custom_id, result=result\n                        )\n                        results.append(batch_result)\n                    except Exception as e:\n                        error_result = BatchError(\n                            custom_id=custom_id,\n                            error_type=\"parsing_error\",\n                            error_message=f\"Failed to parse into {self.response_model.__name__}: {e}\",\n                            raw_data=extracted_data,\n                        )\n                        results.append(error_result)\n                else:\n                    # Check if this is a provider error response\n                    error_message = \"Unknown error\"\n                    error_type = \"extraction_error\"\n\n                    if self.provider_name == \"anthropic\" and \"result\" in data:\n                        result = data[\"result\"]\n                        if result.get(\"type\") == \"error\":\n                            error_info = result.get(\"error\", {})\n                            if isinstance(error_info, dict) and \"error\" in error_info:\n                                error_details = error_info[\"error\"]\n                                error_message = error_details.get(\n                                    \"message\", \"Unknown Anthropic error\"\n                                )\n                                error_type = error_details.get(\n                                    \"type\", \"anthropic_error\"\n                                )\n                            else:\n                                error_message = str(error_info)\n                                error_type = \"anthropic_error\"\n\n                    error_result = BatchError(\n                        custom_id=custom_id,\n                        error_type=error_type,\n                        error_message=error_message,\n                        raw_data=data,\n                    )\n                    results.append(error_result)\n\n            except Exception as e:\n                error_result = BatchError(\n                    custom_id=\"unknown\",\n                    error_type=\"json_parse_error\",\n                    error_message=f\"Failed to parse JSON: {e}\",\n                    raw_data={\"raw_line\": line},\n                )\n                results.append(error_result)\n\n        return results\n\n    def _extract_from_response(self, data: dict[str, Any]) -> dict[str, Any] | None:\n        \"\"\"Extract structured data from provider-specific response format\"\"\"\n        try:\n            if self.provider_name == \"openai\":\n                # OpenAI JSON schema response\n                content = data[\"response\"][\"body\"][\"choices\"][0][\"message\"][\"content\"]\n                return json.loads(content)\n\n            elif self.provider_name == \"anthropic\":\n                # Anthropic batch response format\n                if \"result\" not in data:\n                    return None\n\n                result = data[\"result\"]\n\n                # Check if result is an error\n                if result.get(\"type\") == \"error\":\n                    # Return None to indicate error, let caller handle\n                    return None\n\n                # Handle successful message result\n                if result.get(\"type\") == \"succeeded\" and \"message\" in result:\n                    content = result[\"message\"][\"content\"]\n                    if isinstance(content, list) and len(content) > 0:\n                        # Try tool_use first\n                        for item in content:\n                            if item.get(\"type\") == \"tool_use\":\n                                return item.get(\"input\", {})\n\n                        # Fallback to text content and parse JSON\n                        for item in content:\n                            if item.get(\"type\") == \"text\":\n                                text = item.get(\"text\", \"\")\n                                try:\n                                    return json.loads(text)\n                                except json.JSONDecodeError:\n                                    continue\n\n                return None\n\n        except Exception:\n            return None\n\n        return None\n"
  },
  {
    "path": "instructor/batch/providers/__init__.py",
    "content": "\"\"\"\nProvider-specific batch processing implementations.\n\nThis module contains provider-specific implementations for OpenAI and Anthropic\nbatch processing APIs.\n\"\"\"\n\nfrom .base import BatchProvider\nimport importlib.util\n\nif importlib.util.find_spec(\"openai\") is not None:\n    from .openai import OpenAIProvider\nif importlib.util.find_spec(\"anthropic\") is not None:\n    from .anthropic import AnthropicProvider\n\n\ndef get_provider(provider_name: str) -> BatchProvider:\n    \"\"\"Factory function to get the appropriate provider instance\"\"\"\n    if provider_name == \"openai\":\n        if OpenAIProvider is None:\n            raise ValueError(\"OpenAI is not installed\")\n        return OpenAIProvider()\n    elif provider_name == \"anthropic\":\n        if AnthropicProvider is None:\n            raise ValueError(\"Anthropic is not installed\")\n        return AnthropicProvider()\n    else:\n        raise ValueError(f\"Unsupported provider: {provider_name}\")\n\n\n__all__ = [\"BatchProvider\", \"OpenAIProvider\", \"AnthropicProvider\", \"get_provider\"]\n"
  },
  {
    "path": "instructor/batch/providers/anthropic.py",
    "content": "\"\"\"\nAnthropic-specific batch processing implementation.\n\nThis module contains the Anthropic batch processing provider class.\n\"\"\"\n\nimport json\nfrom typing import Any, Optional, Union\nimport io\nimport logging\nfrom .base import BatchProvider\nfrom ..models import BatchJobInfo\n\nlogger = logging.getLogger(__name__)\n\n\nclass AnthropicProvider(BatchProvider):\n    \"\"\"Anthropic batch processing provider\"\"\"\n\n    def submit_batch(\n        self,\n        file_path_or_buffer: Union[str, io.BytesIO],\n        metadata: Optional[dict[str, Any]] = None,\n        **kwargs,\n    ) -> str:\n        \"\"\"Submit Anthropic batch job\"\"\"\n        _ = kwargs  # Unused but accepted for API consistency\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # Note: Anthropic doesn't support metadata in batch creation\n            # but we accept it for API consistency\n            if metadata:\n                print(\n                    f\"Note: Anthropic batches don't support metadata. Ignoring: {metadata}\"\n                )\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            if isinstance(file_path_or_buffer, str):\n                with open(file_path_or_buffer) as f:\n                    requests = [json.loads(line) for line in f if line.strip()]\n            elif isinstance(file_path_or_buffer, io.BytesIO):\n                file_path_or_buffer.seek(0)\n                content = file_path_or_buffer.read().decode(\"utf-8\")\n                requests = [\n                    json.loads(line) for line in content.split(\"\\n\") if line.strip()\n                ]\n            else:\n                raise ValueError(\n                    f\"Unsupported file_path_or_buffer type: {type(file_path_or_buffer)}\"\n                )\n\n            batch = batches_client.create(requests=requests)\n            return batch.id\n        except (ValueError, TypeError) as e:\n            # Re-raise validation errors as-is\n            logger.error(f\"Validation error in Anthropic batch submission: {e}\")\n            raise\n        except Exception as e:\n            raise RuntimeError(f\"Failed to submit Anthropic batch: {e}\") from e\n\n    def get_status(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Get Anthropic batch status\"\"\"\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            batch = batches_client.retrieve(batch_id)\n            return {\n                \"id\": batch.id,\n                \"status\": batch.processing_status,\n                \"created_at\": batch.created_at,\n                \"request_counts\": getattr(batch, \"request_counts\", {}),\n            }\n        except Exception as e:\n            raise Exception(f\"Failed to get Anthropic batch status: {e}\") from e\n\n    def retrieve_results(self, batch_id: str) -> str:\n        \"\"\"Retrieve Anthropic batch results\"\"\"\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            batch = batches_client.retrieve(batch_id)\n\n            # Check for various terminal states\n            if batch.processing_status in [\"failed\", \"cancelled\", \"expired\"]:\n                raise Exception(\n                    f\"Batch job failed with status: {batch.processing_status}\"\n                )\n\n            if batch.processing_status != \"ended\":\n                raise Exception(\n                    f\"Batch not completed, status: {batch.processing_status}\"\n                )\n\n            # Check if all requests failed\n            request_counts = getattr(batch, \"request_counts\", None)\n            if request_counts:\n                succeeded = getattr(request_counts, \"succeeded\", 0)\n                errored = getattr(request_counts, \"errored\", 0)\n                total = getattr(request_counts, \"total\", 0)\n\n                if errored > 0 and succeeded == 0:\n                    raise RuntimeError(\n                        f\"All {total} batch requests failed. No results will be available.\"\n                    )\n\n            results = batches_client.results(batch_id)\n            results_lines = []\n            for result in results:\n                results_lines.append(result.model_dump_json())\n\n            return \"\\n\".join(results_lines)\n        except Exception as e:\n            raise Exception(f\"Failed to retrieve Anthropic results: {e}\") from e\n\n    def download_results(self, batch_id: str, file_path: str) -> None:\n        \"\"\"Download Anthropic batch results to a file\"\"\"\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            batch = batches_client.retrieve(batch_id)\n\n            # Check for various terminal states\n            if batch.processing_status in [\"failed\", \"cancelled\", \"expired\"]:\n                raise Exception(\n                    f\"Batch job failed with status: {batch.processing_status}\"\n                )\n\n            if batch.processing_status != \"ended\":\n                raise Exception(\n                    f\"Batch not completed, status: {batch.processing_status}\"\n                )\n\n            # Check if all requests failed\n            request_counts = getattr(batch, \"request_counts\", None)\n            if request_counts:\n                succeeded = getattr(request_counts, \"succeeded\", 0)\n                errored = getattr(request_counts, \"errored\", 0)\n                total = getattr(request_counts, \"total\", 0)\n\n                if errored > 0 and succeeded == 0:\n                    raise RuntimeError(\n                        f\"All {total} batch requests failed. No results will be available.\"\n                    )\n\n            results = batches_client.results(batch_id)\n            with open(file_path, \"w\") as f:\n                for result in results:\n                    f.write(result.model_dump_json() + \"\\n\")\n        except Exception as e:\n            raise Exception(f\"Failed to download Anthropic results: {e}\") from e\n\n    def cancel_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Cancel Anthropic batch job\"\"\"\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            batch = batches_client.cancel(batch_id)\n            return batch.model_dump()\n        except Exception as e:\n            raise Exception(f\"Failed to cancel Anthropic batch: {e}\") from e\n\n    def delete_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Delete Anthropic batch job\"\"\"\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            batch = batches_client.retrieve(batch_id)\n            return {\n                \"id\": batch.id,\n                \"status\": batch.processing_status,\n                \"message\": \"Anthropic does not support batch deletion\",\n            }\n        except Exception as e:\n            raise Exception(f\"Failed to delete Anthropic batch: {e}\") from e\n\n    def list_batches(self, limit: int = 10) -> list[BatchJobInfo]:\n        \"\"\"List Anthropic batch jobs\"\"\"\n        try:\n            import anthropic\n\n            client = anthropic.Anthropic()\n\n            # TODO(#batch-api-stable): Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n\n            batches = batches_client.list(limit=limit)\n            return [\n                BatchJobInfo.from_anthropic(batch.model_dump())\n                for batch in batches.data\n            ]\n        except Exception as e:\n            raise Exception(f\"Failed to list Anthropic batches: {e}\") from e\n"
  },
  {
    "path": "instructor/batch/providers/base.py",
    "content": "\"\"\"\nBase provider class for batch processing.\n\nThis module defines the abstract base class that all batch providers must implement.\n\"\"\"\n\nfrom abc import ABC, abstractmethod\nfrom typing import Any, Optional, Union\nimport io\nimport logging\nfrom ..models import BatchJobInfo\n\nlogger = logging.getLogger(__name__)\n\n\nclass BatchProvider(ABC):\n    \"\"\"Abstract base class for batch processing providers\"\"\"\n\n    @abstractmethod\n    def submit_batch(\n        self,\n        file_path_or_buffer: Union[str, io.BytesIO],\n        metadata: Optional[dict[str, Any]] = None,\n        **kwargs,\n    ) -> str:\n        \"\"\"Submit a batch job and return the job ID\"\"\"\n        pass\n\n    @abstractmethod\n    def get_status(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Get the status of a batch job\"\"\"\n        pass\n\n    @abstractmethod\n    def retrieve_results(self, batch_id: str) -> str:\n        \"\"\"Retrieve batch results as a string\"\"\"\n        pass\n\n    @abstractmethod\n    def download_results(self, batch_id: str, file_path: str) -> None:\n        \"\"\"Download batch results to a file\"\"\"\n        pass\n\n    @abstractmethod\n    def cancel_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Cancel a batch job\"\"\"\n        pass\n\n    @abstractmethod\n    def delete_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Delete a batch job\"\"\"\n        pass\n\n    @abstractmethod\n    def list_batches(self, limit: int = 10) -> list[BatchJobInfo]:\n        \"\"\"List batch jobs\"\"\"\n        pass\n"
  },
  {
    "path": "instructor/batch/providers/openai.py",
    "content": "\"\"\"\nOpenAI-specific batch processing implementation.\n\nThis module contains the OpenAI batch processing provider class.\n\"\"\"\n\nfrom typing import Any, Optional, Union\nimport io\nimport logging\nfrom .base import BatchProvider\nfrom ..models import BatchJobInfo\n\nlogger = logging.getLogger(__name__)\n\n\nclass OpenAIProvider(BatchProvider):\n    \"\"\"OpenAI batch processing provider\"\"\"\n\n    def submit_batch(\n        self,\n        file_path_or_buffer: Union[str, io.BytesIO],\n        metadata: Optional[dict[str, Any]] = None,\n        **kwargs,\n    ) -> str:\n        \"\"\"Submit OpenAI batch job\"\"\"\n        try:\n            from openai import OpenAI\n\n            client = OpenAI()\n\n            if metadata is None:\n                metadata = {\"description\": \"Instructor batch job\"}\n\n            logger.debug(f\"Submitting batch job with metadata: {metadata}\")\n\n            if isinstance(file_path_or_buffer, str):\n                logger.debug(f\"Creating batch file from path: {file_path_or_buffer}\")\n                with open(file_path_or_buffer, \"rb\") as f:\n                    batch_file = client.files.create(file=f, purpose=\"batch\")\n            elif isinstance(file_path_or_buffer, io.BytesIO):\n                logger.debug(\"Creating batch file from BytesIO buffer\")\n                file_path_or_buffer.seek(0)\n                batch_file = client.files.create(\n                    file=file_path_or_buffer, purpose=\"batch\"\n                )\n            else:\n                raise ValueError(\n                    f\"Unsupported file_path_or_buffer type: {type(file_path_or_buffer)}\"\n                )\n\n            batch_job = client.batches.create(\n                input_file_id=batch_file.id,\n                endpoint=\"/v1/chat/completions\",\n                completion_window=kwargs.get(\"completion_window\", \"24h\"),\n                metadata=metadata,\n            )\n            logger.info(f\"Successfully submitted batch job: {batch_job.id}\")\n            return batch_job.id\n        except (ValueError, TypeError) as e:\n            # Re-raise validation errors as-is\n            logger.error(f\"Validation error in OpenAI batch submission: {e}\")\n            raise\n        except Exception as e:\n            logger.error(f\"Failed to submit OpenAI batch: {e}\")\n            raise RuntimeError(f\"Failed to submit OpenAI batch: {e}\") from e\n\n    def get_status(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Get OpenAI batch status\"\"\"\n        try:\n            from openai import OpenAI\n\n            client = OpenAI()\n            batch = client.batches.retrieve(batch_id)\n            return {\n                \"id\": batch.id,\n                \"status\": batch.status,\n                \"created_at\": batch.created_at,\n                \"request_counts\": {\n                    \"total\": getattr(batch.request_counts, \"total\", 0),\n                    \"completed\": getattr(batch.request_counts, \"completed\", 0),\n                    \"failed\": getattr(batch.request_counts, \"failed\", 0),\n                },\n            }\n        except Exception as e:\n            raise Exception(f\"Failed to get OpenAI batch status: {e}\") from e\n\n    def retrieve_results(self, batch_id: str) -> str:\n        \"\"\"Retrieve OpenAI batch results\"\"\"\n        try:\n            from openai import OpenAI\n            import time\n\n            client = OpenAI()\n            batch = client.batches.retrieve(batch_id)\n\n            if batch.status != \"completed\":\n                raise Exception(f\"Batch not completed, status: {batch.status}\")\n\n            # Check if all requests failed\n            request_counts = getattr(batch, \"request_counts\", None)\n            if request_counts:\n                completed = getattr(request_counts, \"completed\", 0)\n                failed = getattr(request_counts, \"failed\", 0)\n                total = getattr(request_counts, \"total\", 0)\n\n                if failed > 0 and completed == 0:\n                    raise RuntimeError(\n                        f\"All {total} batch requests failed. No output file will be available. \"\n                    )\n\n            if not batch.output_file_id:\n                # Sometimes output file isn't immediately available, wait longer and retry more\n                max_retries = 10\n                for attempt in range(max_retries):\n                    wait_time = min(\n                        5 + attempt, 15\n                    )  # Progressive backoff: 5s, 6s, 7s... up to 15s\n                    print(\n                        f\"Output file not ready, waiting {wait_time}s (attempt {attempt + 1}/{max_retries})...\"\n                    )\n                    time.sleep(wait_time)\n                    batch = client.batches.retrieve(batch_id)\n                    if batch.output_file_id:\n                        print(f\"Output file now available: {batch.output_file_id}\")\n                        break\n                    # Check if batch failed during our wait\n                    if batch.status != \"completed\":\n                        raise Exception(\n                            f\"Batch status changed to {batch.status} while waiting for output file\"\n                        )\n                    if attempt == max_retries - 1:\n                        # Final attempt - provide detailed error info\n                        raise RuntimeError(\n                            f\"No output file available after {max_retries} retries over {sum(range(5, 5 + max_retries))} seconds. \"\n                            f\"Batch status: {batch.status}, Request counts: {getattr(batch, 'request_counts', 'unknown')}. \"\n                        )\n\n            if batch.output_file_id is None:\n                raise RuntimeError(\"Batch has no output file ID available\")\n            file_response = client.files.content(batch.output_file_id)\n            return file_response.text\n        except Exception as e:\n            raise Exception(f\"Failed to retrieve OpenAI results: {e}\") from e\n\n    def download_results(self, batch_id: str, file_path: str) -> None:\n        \"\"\"Download OpenAI batch results to a file\"\"\"\n        try:\n            from openai import OpenAI\n            import time\n\n            client = OpenAI()\n            batch = client.batches.retrieve(batch_id)\n\n            if batch.status != \"completed\":\n                raise Exception(f\"Batch not completed, status: {batch.status}\")\n\n            # Check if all requests failed\n            request_counts = getattr(batch, \"request_counts\", None)\n            if request_counts:\n                completed = getattr(request_counts, \"completed\", 0)\n                failed = getattr(request_counts, \"failed\", 0)\n                total = getattr(request_counts, \"total\", 0)\n\n                if failed > 0 and completed == 0:\n                    raise RuntimeError(\n                        f\"All {total} batch requests failed. No output file will be available.\"\n                    )\n\n            if not batch.output_file_id:\n                # Sometimes output file isn't immediately available, wait longer and retry more\n                max_retries = 10\n                for attempt in range(max_retries):\n                    wait_time = min(\n                        5 + attempt, 15\n                    )  # Progressive backoff: 5s, 6s, 7s... up to 15s\n                    print(\n                        f\"Output file not ready, waiting {wait_time}s (attempt {attempt + 1}/{max_retries})...\"\n                    )\n                    time.sleep(wait_time)\n                    batch = client.batches.retrieve(batch_id)\n                    if batch.output_file_id:\n                        print(f\"Output file now available: {batch.output_file_id}\")\n                        break\n                    # Check if batch failed during our wait\n                    if batch.status != \"completed\":\n                        raise Exception(\n                            f\"Batch status changed to {batch.status} while waiting for output file\"\n                        )\n                    if attempt == max_retries - 1:\n                        # Final attempt - provide detailed error info\n                        raise Exception(\n                            f\"No output file available after {max_retries} retries over {sum(range(5, 5 + max_retries))} seconds. \"\n                            f\"Batch status: {batch.status}, Request counts: {getattr(batch, 'request_counts', 'unknown')}.\"\n                        )\n\n            if batch.output_file_id is None:\n                raise RuntimeError(\"Batch has no output file ID available\")\n            file_response = client.files.content(batch.output_file_id)\n            with open(file_path, \"w\") as f:\n                f.write(file_response.text)\n        except Exception as e:\n            raise Exception(f\"Failed to download OpenAI results: {e}\") from e\n\n    def cancel_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Cancel OpenAI batch job\"\"\"\n        try:\n            from openai import OpenAI\n\n            client = OpenAI()\n            batch = client.batches.cancel(batch_id)\n            return batch.model_dump()\n        except Exception as e:\n            raise Exception(f\"Failed to cancel OpenAI batch: {e}\") from e\n\n    def delete_batch(self, batch_id: str) -> dict[str, Any]:\n        \"\"\"Delete OpenAI batch job\"\"\"\n        try:\n            from openai import OpenAI\n\n            client = OpenAI()\n            # OpenAI doesn't have a delete endpoint, so we'll return the batch info\n            batch = client.batches.retrieve(batch_id)\n            return {\n                \"id\": batch.id,\n                \"status\": batch.status,\n                \"message\": \"OpenAI does not support batch deletion\",\n            }\n        except Exception as e:\n            raise Exception(f\"Failed to delete OpenAI batch: {e}\") from e\n\n    def list_batches(self, limit: int = 10) -> list[BatchJobInfo]:\n        \"\"\"List OpenAI batch jobs\"\"\"\n        try:\n            from openai import OpenAI\n\n            client = OpenAI()\n            batches = client.batches.list(limit=limit)\n            return [\n                BatchJobInfo.from_openai(batch.model_dump()) for batch in batches.data\n            ]\n        except Exception as e:\n            raise Exception(f\"Failed to list OpenAI batches: {e}\") from e\n"
  },
  {
    "path": "instructor/batch/request.py",
    "content": "\"\"\"\nBatch request models and schema utilities.\n\nThis module contains the BatchRequest class and related models for creating\nprovider-specific batch requests with JSON schema generation.\n\"\"\"\n\nfrom __future__ import annotations\nfrom typing import Any, Generic\nfrom pydantic import BaseModel, Field, ConfigDict\nimport json\nimport io\nfrom .models import T\n\n\nclass Function(BaseModel):\n    name: str\n    description: str\n    parameters: Any\n\n\nclass Tool(BaseModel):\n    type: str\n    function: Function\n\n\nclass RequestBody(BaseModel):\n    model: str\n    messages: list[dict[str, Any]]\n    max_tokens: int | None = Field(default=1000)\n    temperature: float | None = Field(default=1.0)\n    tools: list[Tool] | None\n    tool_choice: dict[str, Any] | None\n\n\nclass BatchModel(BaseModel):\n    custom_id: str\n    body: RequestBody\n    url: str\n    method: str\n\n\nclass BatchRequest(BaseModel, Generic[T]):\n    \"\"\"Unified batch request that works across all providers using JSON schema\"\"\"\n\n    custom_id: str\n    messages: list[dict[str, Any]]\n    response_model: type[T]\n    model: str\n    max_tokens: int | None = Field(default=1000)\n    temperature: float | None = Field(default=0.1)\n\n    model_config = ConfigDict(arbitrary_types_allowed=True)\n\n    def get_json_schema(self) -> dict[str, Any]:\n        \"\"\"Generate JSON schema from response_model\"\"\"\n        return self.response_model.model_json_schema()\n\n    def to_openai_format(self) -> dict[str, Any]:\n        \"\"\"Convert to OpenAI batch format with JSON schema\"\"\"\n        schema = self.get_json_schema()\n\n        # OpenAI strict mode requires additionalProperties to be false\n        def make_strict_schema(schema_dict):\n            \"\"\"Recursively add additionalProperties: false for OpenAI strict mode\"\"\"\n            if isinstance(schema_dict, dict):\n                if \"type\" in schema_dict:\n                    if schema_dict[\"type\"] == \"object\":\n                        schema_dict[\"additionalProperties\"] = False\n                    elif schema_dict[\"type\"] == \"array\" and \"items\" in schema_dict:\n                        schema_dict[\"items\"] = make_strict_schema(schema_dict[\"items\"])\n\n                # Recursively process properties\n                if \"properties\" in schema_dict:\n                    for prop_name, prop_schema in schema_dict[\"properties\"].items():\n                        schema_dict[\"properties\"][prop_name] = make_strict_schema(\n                            prop_schema\n                        )\n\n                # Process definitions/defs\n                for key in [\"definitions\", \"$defs\"]:\n                    if key in schema_dict:\n                        for def_name, def_schema in schema_dict[key].items():\n                            schema_dict[key][def_name] = make_strict_schema(def_schema)\n\n            return schema_dict\n\n        strict_schema = make_strict_schema(schema.copy())\n\n        return {\n            \"custom_id\": self.custom_id,\n            \"method\": \"POST\",\n            \"url\": \"/v1/chat/completions\",\n            \"body\": {\n                \"model\": self.model,\n                \"messages\": self.messages,\n                \"max_tokens\": self.max_tokens,\n                \"temperature\": self.temperature,\n                \"response_format\": {\n                    \"type\": \"json_schema\",\n                    \"json_schema\": {\n                        \"name\": self.response_model.__name__,\n                        \"strict\": True,\n                        \"schema\": strict_schema,\n                    },\n                },\n            },\n        }\n\n    def to_anthropic_format(self) -> dict[str, Any]:\n        \"\"\"Convert to Anthropic batch format with JSON schema\"\"\"\n        schema = self.get_json_schema()\n\n        # Ensure schema has proper format for Anthropic\n        if \"type\" not in schema:\n            schema[\"type\"] = \"object\"\n        if \"additionalProperties\" not in schema:\n            schema[\"additionalProperties\"] = False\n\n        # Extract system message and convert to system parameter\n        system_message = None\n        filtered_messages = []\n\n        for message in self.messages:\n            if message.get(\"role\") == \"system\":\n                system_message = message.get(\"content\", \"\")\n            else:\n                filtered_messages.append(message)\n\n        params = {\n            \"model\": self.model,\n            \"max_tokens\": self.max_tokens,\n            \"temperature\": self.temperature,\n            \"messages\": filtered_messages,\n            \"tools\": [\n                {\n                    \"name\": \"extract_data\",\n                    \"description\": f\"Extract data matching the {self.response_model.__name__} schema\",\n                    \"input_schema\": schema,\n                }\n            ],\n            \"tool_choice\": {\"type\": \"tool\", \"name\": \"extract_data\"},\n        }\n\n        # Add system parameter if system message exists\n        if system_message:\n            params[\"system\"] = system_message\n\n        return {\n            \"custom_id\": self.custom_id,\n            \"params\": params,\n        }\n\n    def save_to_file(\n        self, file_path_or_buffer: str | io.BytesIO, provider: str\n    ) -> None:\n        \"\"\"Save batch request to file or BytesIO buffer in provider-specific format\"\"\"\n        if provider == \"openai\":\n            data = self.to_openai_format()\n        elif provider == \"anthropic\":\n            data = self.to_anthropic_format()\n        else:\n            raise ValueError(f\"Unsupported provider: {provider}\")\n\n        json_line = json.dumps(data) + \"\\n\"\n\n        if isinstance(file_path_or_buffer, str):\n            with open(file_path_or_buffer, \"a\") as f:\n                f.write(json_line)\n        elif isinstance(file_path_or_buffer, io.BytesIO):\n            file_path_or_buffer.write(json_line.encode(\"utf-8\"))\n        else:\n            raise ValueError(\n                f\"Unsupported file_path_or_buffer type: {type(file_path_or_buffer)}\"\n            )\n"
  },
  {
    "path": "instructor/batch/utils.py",
    "content": "\"\"\"\nUtility functions for batch processing.\n\nThis module contains helper functions for filtering, extracting, and manipulating\nbatch results.\n\"\"\"\n\nfrom .models import BatchResult, BatchSuccess, BatchError, T\n\n\ndef filter_successful(results: list[BatchResult]) -> list[BatchSuccess[T]]:\n    \"\"\"Filter to only successful results\"\"\"\n    return [r for r in results if r.success]\n\n\ndef filter_errors(results: list[BatchResult]) -> list[BatchError]:\n    \"\"\"Filter to only error results\"\"\"\n    return [r for r in results if not r.success]\n\n\ndef extract_results(results: list[BatchResult]) -> list[T]:\n    \"\"\"Extract just the result objects from successful results\"\"\"\n    return [r.result for r in results if r.success]\n\n\ndef get_results_by_custom_id(results: list[BatchResult]) -> dict[str, BatchResult]:\n    \"\"\"Create a dictionary mapping custom_id to results\"\"\"\n    return {r.custom_id: r for r in results}\n"
  },
  {
    "path": "instructor/cache/__init__.py",
    "content": "\"\"\"Caching utilities for Instructor.\n\nThis module provides a very small abstraction layer so that users can\nplug different cache back-ends (in-process LRU, `diskcache`, `redis`, …)\ninto the Instructor client via the ``cache=...`` keyword::\n\n    from instructor import from_provider\n    from instructor.cache import AutoCache\n\n    cache = AutoCache(maxsize=10_000)\n    client = from_provider(\"openai/gpt-4o\", cache=cache)\n\nThe cache object must implement :class:`BaseCache`.  A minimal\nrequirement is to expose synchronous ``get`` / ``set`` methods (async\nwrappers currently call them directly).  The default implementation\n``AutoCache`` is an in-process LRU cache with a configurable size.\n\nThis first iteration purposefully keeps the API narrow: no eviction\nhooks, no invalidation, no TTL for the LRU variant.  The objective is to\nprovide a safe foundation which we will extend in follow-up work.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport hashlib\nimport json\nimport threading\nfrom abc import ABC, abstractmethod\nfrom collections import OrderedDict\nfrom typing import Any\nimport logging\n\n# The project already depends on pydantic; type checker in some\n# environments might not have its stubs – silence if missing.\nfrom pydantic import BaseModel  # type: ignore[import-not-found]\n\n__all__ = [\n    \"BaseCache\",\n    \"AutoCache\",\n    \"DiskCache\",\n    \"make_cache_key\",\n]\n\n\nclass BaseCache(ABC):\n    \"\"\"Abstract cache contract.\n\n    Concrete subclasses *must* be thread-safe.\n    \"\"\"\n\n    @abstractmethod\n    def get(self, key: str) -> Any | None:  # noqa: ANN401 – value type arbitrary\n        \"\"\"Return *None* to indicate a cache miss.\"\"\"\n\n    @abstractmethod\n    def set(\n        self,\n        key: str,\n        value: Any,\n        ttl: int | None = None,  # noqa: ARG002\n    ) -> None:  # noqa: ANN401\n        \"\"\"Store *value* under *key*.\n\n        ``ttl`` is time-to-live in **seconds**.  Implementations *may*\n        ignore it (e.g. :class:`AutoCache`).\n        \"\"\"\n\n\nclass AutoCache(BaseCache):\n    \"\"\"Thread-safe in-process LRU cache using :class:`collections.OrderedDict`.\"\"\"\n\n    def __init__(self, maxsize: int = 128):\n        if maxsize <= 0:\n            raise ValueError(\"maxsize must be > 0\")\n        self._maxsize = maxsize\n        self._cache: OrderedDict[str, Any] = OrderedDict()\n        self._lock = threading.Lock()\n\n    # ---------------------------------------------------------------------\n    # BaseCache implementation\n    # ---------------------------------------------------------------------\n    def get(self, key: str) -> Any | None:  # noqa: ANN401\n        with self._lock:\n            try:\n                value = self._cache.pop(key)\n            except KeyError:\n                return None\n            # Move to the end (most recently used)\n            self._cache[key] = value\n            return value\n\n    def set(\n        self,\n        key: str,\n        value: Any,\n        ttl: int | None = None,  # noqa: ARG002\n    ) -> None:  # noqa: ANN401\n        # *ttl* is ignored for the in-process cache.\n        with self._lock:\n            if key in self._cache:\n                self._cache.pop(key, None)\n            self._cache[key] = value\n            if len(self._cache) > self._maxsize:\n                # popitem(last=False) pops the *least* recently used entry\n                self._cache.popitem(last=False)\n\n\n# -------------------------------------------------------------------------\n# Optional back-ends – imported lazily so users do not need extra deps\n# -------------------------------------------------------------------------\n\n\ndef _import_diskcache():  # pragma: no cover – only executed when requested\n    import importlib  # type: ignore[]\n\n    if importlib.util.find_spec(\"diskcache\") is None:  # type: ignore[attr-defined]\n        raise ImportError(\n            \"diskcache is not installed.  Install it with `pip install diskcache`.\"\n        )\n    import diskcache  # type: ignore\n\n    return diskcache\n\n\nclass DiskCache(BaseCache):\n    \"\"\"Wrapper around `diskcache.Cache`.\"\"\"\n\n    def __init__(self, directory: str = \".instructor_cache\", **kwargs: Any):\n        diskcache = _import_diskcache()\n        self._cache = diskcache.Cache(directory, **kwargs)\n\n    def get(self, key: str) -> Any | None:  # noqa: ANN401\n        return self._cache.get(key)\n\n    def set(self, key: str, value: Any, ttl: int | None = None) -> None:  # noqa: ANN401\n        if ttl is None:\n            self._cache.set(key, value)\n        else:\n            self._cache.set(key, value, expire=ttl)\n\n\n# -------------------------------------------------------------------------\n# Cache-key helper\n# -------------------------------------------------------------------------\n\n\ndef make_cache_key(\n    *,\n    messages: Any,\n    model: str | None,\n    response_model: type[BaseModel] | None,\n    mode: str | None = None,\n) -> str:  # noqa: ANN401\n    \"\"\"Compute a *deterministic* cache key.\n\n    The key space uses SHA-256(\"json payload\") to keep the final length\n    fixed regardless of input size.\n\n    Components that influence the key:\n        • provider/model name\n        • serialized *messages* (user + system prompt, etc.)\n        • *mode* (Tools, JSON, …) – helps when users change Instructor mode\n        • *response_model* schema – so edits to field definitions or\n          descriptions invalidate prior cache entries (critical!).\n    \"\"\"\n\n    payload: dict[str, Any] = {\n        \"model\": model,\n        \"messages\": messages,\n        \"mode\": mode,\n    }\n\n    if response_model is not None:\n        # Include the entire JSON schema – guarantees busting when either\n        # a field or its meta (title, description, constraints) changes.\n        payload[\"schema\"] = response_model.model_json_schema()\n\n    # ``default=str`` converts non-serializable objects (e.g. datetime) to\n    # string so dumps never fails.\n    data = json.dumps(payload, sort_keys=True, default=str)\n    return hashlib.sha256(data.encode()).hexdigest()\n\n\n# -------------------------------------------------------------------------\n# Convenience helpers used by patch.py to avoid duplication\n# -------------------------------------------------------------------------\n\nlogger = logging.getLogger(\"instructor.cache\")\n\n\ndef load_cached_response(cache: BaseCache, key: str, response_model: type[BaseModel]):  # noqa: ANN201\n    \"\"\"Return parsed model if *key* exists in *cache* else None.\"\"\"\n    cached = cache.get(key)\n    if cached is None:\n        return None\n    import json\n\n    try:\n        data = json.loads(cached)\n        model_json = data[\"model\"]\n        raw_json = data.get(\"raw\")\n    except Exception:  # noqa: BLE001\n        model_json = cached\n        raw_json = None\n\n    obj = response_model.model_validate_json(model_json)  # type: ignore[arg-type]\n    if raw_json is not None:\n        # `_raw_response` is an internal attribute used by Instructor; it may not\n        # be declared on the Pydantic model type.\n        try:\n            # Try to deserialize as JSON and reconstruct object structure\n            import json\n\n            raw_data = json.loads(raw_json)\n\n            # Check if this looks like a Pydantic-serialized object (has proper structure)\n            if isinstance(raw_data, dict) and any(\n                key in raw_data for key in [\"id\", \"object\", \"model\", \"choices\"]\n            ):\n                # Looks like a proper completion object - use SimpleNamespace reconstruction\n                from types import SimpleNamespace\n\n                obj._raw_response = json.loads(\n                    raw_json, object_hook=lambda d: SimpleNamespace(**d)\n                )  # type: ignore[attr-defined]\n                logger.debug(\"Restored raw response as SimpleNamespace object\")\n            else:\n                # Plain dict/list - keep as-is\n                obj._raw_response = raw_data  # type: ignore[attr-defined]\n                logger.debug(\"Restored raw response as plain data structure\")\n        except (json.JSONDecodeError, TypeError):\n            # Not valid JSON - probably string fallback\n            obj._raw_response = raw_json  # type: ignore[attr-defined]\n            logger.debug(\n                \"Restored raw response as string (original could not be fully serialized)\"\n            )\n    logger.debug(\"cache hit: %s\", key)\n    return obj\n\n\ndef store_cached_response(\n    cache: BaseCache, key: str, model: BaseModel, ttl: int | None = None\n) -> None:  # noqa: D401\n    \"\"\"Serialize *model* and optional raw response to JSON and cache it.\"\"\"\n    import json\n\n    raw_resp = getattr(model, \"_raw_response\", None)\n    if raw_resp is not None:\n        try:\n            # Try Pydantic model serialization first (OpenAI, Anthropic, etc.)\n            raw_json = raw_resp.model_dump_json()  # type: ignore[attr-defined]\n            logger.debug(\"Cached raw response as Pydantic JSON\")\n        except (AttributeError, TypeError) as e:\n            # Fallback for non-Pydantic responses (custom providers, plain dicts, etc.)\n            try:\n                import json\n\n                raw_json = json.dumps(raw_resp, default=str)\n                logger.debug(\n                    \"Cached raw response as plain JSON (provider may not support full reconstruction)\"\n                )\n            except (TypeError, ValueError):\n                # Final fallback - string representation\n                raw_json = str(raw_resp)\n                logger.warning(\n                    \"Raw response could not be serialized as JSON, using string fallback. \"\n                    \"create_with_completion may not fully restore original object structure.\"\n                )\n    else:\n        raw_json = None\n\n    payload = {\n        \"model\": model.model_dump_json(),  # type: ignore[attr-defined]\n        \"raw\": raw_json,\n    }\n    cache.set(key, json.dumps(payload), ttl=ttl)\n    logger.debug(\"cache store: %s\", key)\n"
  },
  {
    "path": "instructor/cli/__init__.py",
    "content": ""
  },
  {
    "path": "instructor/cli/batch.py",
    "content": "import os\nfrom rich.console import Console\nfrom rich.table import Table\nfrom rich.live import Live\nimport typer\nimport time\nimport json\nimport warnings\nfrom instructor.batch import BatchProcessor, BatchJobInfo\n\nfrom tqdm import tqdm\n\napp = typer.Typer()\n\nconsole = Console()\n\n\ndef generate_table(batch_jobs: list[BatchJobInfo], provider: str, full_id: bool = False):\n    \"\"\"Generate enhanced table for batch jobs using unified BatchJobInfo objects\n    \n    Args:\n        batch_jobs: List of batch job info objects\n        provider: Provider name (openai, anthropic)\n        full_id: If True, show full batch IDs without truncation\n    \"\"\"\n    table = Table(title=f\"{provider.title()} Batch Jobs\")\n\n    # Adjust column width based on full_id flag\n    id_max_width = None if full_id else 20\n    table.add_column(\"Batch ID\", style=\"dim\", max_width=id_max_width, no_wrap=True)\n    table.add_column(\"Status\", min_width=10)\n    table.add_column(\"Created\", style=\"dim\", min_width=10)\n    table.add_column(\"Started\", style=\"dim\", min_width=10)\n    table.add_column(\"Duration\", style=\"dim\", min_width=7)\n\n    # Add provider-specific columns for request counts\n    if provider == \"openai\":\n        table.add_column(\"Completed\", justify=\"right\", min_width=8)\n        table.add_column(\"Failed\", justify=\"right\", min_width=6)\n        table.add_column(\"Total\", justify=\"right\", min_width=6)\n    elif provider == \"anthropic\":\n        table.add_column(\"Succeeded\", justify=\"right\", min_width=8)\n        table.add_column(\"Errored\", justify=\"right\", min_width=7)\n        table.add_column(\"Processing\", justify=\"right\", min_width=9)\n\n    for batch_job in batch_jobs:\n        # Color code status\n        status_color = {\n            \"pending\": \"yellow\",\n            \"processing\": \"blue\",\n            \"completed\": \"green\",\n            \"failed\": \"red\",\n            \"cancelled\": \"red\",\n            \"expired\": \"red\",\n        }.get(batch_job.status.value, \"white\")\n\n        colored_status = f\"[{status_color}]{batch_job.status.value}[/{status_color}]\"\n\n        # Format timestamps\n        created_str = (\n            batch_job.timestamps.created_at.strftime(\"%m/%d %H:%M\")\n            if batch_job.timestamps.created_at\n            else \"N/A\"\n        )\n        started_str = (\n            batch_job.timestamps.started_at.strftime(\"%m/%d %H:%M\")\n            if batch_job.timestamps.started_at\n            else \"N/A\"\n        )\n\n        # Calculate duration\n        duration_str = \"N/A\"\n        if batch_job.timestamps.started_at and batch_job.timestamps.completed_at:\n            duration = (\n                batch_job.timestamps.completed_at - batch_job.timestamps.started_at\n            )\n            total_minutes = duration.total_seconds() / 60\n            if total_minutes < 60:\n                duration_str = f\"{int(total_minutes)}m\"\n            else:\n                hours = total_minutes / 60\n                duration_str = f\"{hours:.1f}h\"\n        elif batch_job.timestamps.started_at and batch_job.status.value == \"processing\":\n            from datetime import datetime, timezone\n\n            duration = datetime.now(timezone.utc) - batch_job.timestamps.started_at\n            total_minutes = duration.total_seconds() / 60\n            if total_minutes < 60:\n                duration_str = f\"{int(total_minutes)}m\"\n            else:\n                hours = total_minutes / 60\n                duration_str = f\"{hours:.1f}h\"\n\n        # Truncate batch ID for display only if full_id is False\n        batch_id_display = str(batch_job.id)\n        if not full_id and len(batch_id_display) > 18:\n            batch_id_display = batch_id_display[:15] + \"...\"\n\n        if provider == \"openai\":\n            table.add_row(\n                batch_id_display,\n                colored_status,\n                created_str,\n                started_str,\n                duration_str,\n                str(batch_job.request_counts.completed or 0),\n                str(batch_job.request_counts.failed or 0),\n                str(batch_job.request_counts.total or 0),\n            )\n        elif provider == \"anthropic\":\n            table.add_row(\n                str(batch_job.id),\n                colored_status,\n                created_str,\n                started_str,\n                duration_str,\n                str(batch_job.request_counts.succeeded or 0),\n                str(batch_job.request_counts.errored or 0),\n                str(batch_job.request_counts.processing or 0),\n            )\n\n    return table\n\n\ndef get_jobs(limit: int = 10, provider: str = \"openai\") -> list[BatchJobInfo]:\n    \"\"\"Get batch jobs for the specified provider using BatchProcessor\"\"\"\n\n    # Create a dummy model string for the provider\n    # We just need the provider part for listing batches\n    model_map = {\n        \"openai\": \"openai/gpt-4o-mini\",\n        \"anthropic\": \"anthropic/claude-3-sonnet\",\n    }\n\n    if provider not in model_map:\n        raise ValueError(f\"Unsupported provider: {provider}\")\n\n    # Create a dummy response model (not used for listing)\n    from pydantic import BaseModel\n\n    class DummyModel(BaseModel):\n        dummy: str = \"dummy\"\n\n    try:\n        # Create BatchProcessor instance\n        processor = BatchProcessor(model_map[provider], DummyModel)\n        # Get batch jobs\n        return processor.list_batches(limit=limit)\n    except Exception as e:\n        console.print(f\"[red]Error listing {provider} batch jobs: {e}[/red]\")\n        return []\n\n\n@app.command(name=\"list\", help=\"See all existing batch jobs\")\ndef watch(\n    limit: int = typer.Option(10, help=\"Total number of batch jobs to show\"),\n    poll: int = typer.Option(\n        10, help=\"Time in seconds to wait for the batch job to complete\"\n    ),\n    screen: bool = typer.Option(False, help=\"Enable or disable screen output\"),\n    live: bool = typer.Option(\n        False, help=\"Enable live polling to continuously update the table\"\n    ),\n    provider: str = typer.Option(\n        \"openai\",\n        help=\"Provider to use (e.g., 'openai', 'anthropic')\",\n    ),\n    # Deprecated flag for backward compatibility\n    use_anthropic: bool = typer.Option(\n        None,\n        help=\"[DEPRECATED] Use --model instead. Use Anthropic API instead of OpenAI\",\n    ),\n    full_id: bool = typer.Option(\n        False,\n        \"--full-id\",\n        help=\"Show full batch IDs without truncation\",\n    ),\n):\n    \"\"\"\n    Monitor the status of the most recent batch jobs\n    \"\"\"\n    # Handle deprecated flag\n    if use_anthropic is not None:\n        warnings.warn(\n            \"--use-anthropic is deprecated. Use --provider 'anthropic' instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        if use_anthropic:\n            provider = \"anthropic\"\n\n    # Check if required API key is available for the provider\n    required_keys = {\n        \"anthropic\": \"ANTHROPIC_API_KEY\",\n        \"openai\": \"OPENAI_API_KEY\",\n    }\n\n    if provider in required_keys and not os.getenv(required_keys[provider]):\n        console.print(\n            f\"[red]Error: {required_keys[provider]} environment variable not set for {provider}[/red]\"\n        )\n        return\n\n    batch_jobs = get_jobs(limit, provider)\n    table = generate_table(batch_jobs, provider, full_id=full_id)\n\n    if not live:\n        # Show table once and exit\n        console.print(table)\n        return\n\n    # Live polling mode\n    with Live(table, refresh_per_second=2, screen=screen) as live_table:\n        while True:\n            batch_jobs = get_jobs(limit, provider)\n            table = generate_table(batch_jobs, provider, full_id=full_id)\n            live_table.update(table)\n            time.sleep(poll)\n\n\n@app.command(\n    help=\"Create a batch job from a file\",\n)\ndef create_from_file(\n    file_path: str = typer.Option(help=\"File containing the batch job requests\"),\n    model: str = typer.Option(\n        \"openai/gpt-4o-mini\",\n        help=\"Model in format 'provider/model-name' (e.g., 'openai/gpt-4', 'anthropic/claude-3-sonnet')\",\n    ),\n    description: str = typer.Option(\n        \"Instructor batch job\",\n        help=\"Description/metadata for the batch job\",\n    ),\n    completion_window: str = typer.Option(\n        \"24h\",\n        help=\"Completion window for the batch job (OpenAI only)\",\n    ),\n    # Deprecated flag for backward compatibility\n    use_anthropic: bool = typer.Option(\n        None,\n        help=\"[DEPRECATED] Use --model instead. Use Anthropic API instead of OpenAI\",\n    ),\n):\n    \"\"\"Create a batch job from a file using the unified BatchProcessor\"\"\"\n    # Handle deprecated flag\n    if use_anthropic is not None:\n        warnings.warn(\n            \"--use-anthropic is deprecated. Use --model 'anthropic/claude-3-sonnet' instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        if use_anthropic:\n            model = \"anthropic/claude-3-sonnet\"\n\n    try:\n        # Create a dummy response model (not used for direct file submission)\n        from pydantic import BaseModel\n\n        class DummyModel(BaseModel):\n            dummy: str = \"dummy\"\n\n        # Create BatchProcessor instance\n        processor = BatchProcessor(model, DummyModel)\n\n        # Prepare metadata\n        metadata = {\n            \"description\": description,\n        }\n\n        with console.status(f\"[bold green]Submitting batch job...\", spinner=\"dots\"):\n            batch_id = processor.submit_batch(\n                file_path, metadata=metadata, completion_window=completion_window\n            )\n\n        console.print(f\"[bold green]Batch job created with ID: {batch_id}[/bold green]\")\n\n        # Show updated batch list\n        provider_name = model.split(\"/\", 1)[0]\n        watch(limit=5, poll=2, screen=False, live=False, provider=provider_name)\n\n    except Exception as e:\n        console.print(f\"[bold red]Error creating batch job: {e}[/bold red]\")\n\n\n@app.command(help=\"Cancel a batch job\")\ndef cancel(\n    batch_id: str = typer.Option(help=\"Batch job ID to cancel\"),\n    provider: str = typer.Option(\n        \"openai\",\n        help=\"Provider to use (e.g., 'openai', 'anthropic')\",\n    ),\n    # Deprecated flag for backward compatibility\n    use_anthropic: bool = typer.Option(\n        None,\n        help=\"[DEPRECATED] Use --provider 'anthropic' instead. Use Anthropic API instead of OpenAI\",\n    ),\n):\n    \"\"\"Cancel a batch job using the unified BatchProcessor\"\"\"\n    # Handle deprecated flag\n    if use_anthropic is not None:\n        warnings.warn(\n            \"--use-anthropic is deprecated. Use --provider 'anthropic' instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        if use_anthropic:\n            provider = \"anthropic\"\n\n    try:\n        # Create a dummy response model (not used for cancellation)\n        from pydantic import BaseModel\n\n        class DummyModel(BaseModel):\n            dummy: str = \"dummy\"\n\n        # Create a dummy model string for the provider\n        model_map = {\n            \"openai\": \"openai/gpt-4o-mini\",\n            \"anthropic\": \"anthropic/claude-3-sonnet\",\n        }\n\n        if provider not in model_map:\n            console.print(f\"[red]Unsupported provider: {provider}[/red]\")\n            return\n\n        # Create BatchProcessor instance\n        processor = BatchProcessor(model_map[provider], DummyModel)\n\n        with console.status(\n            f\"[bold yellow]Cancelling {provider} batch job...\", spinner=\"dots\"\n        ):\n            processor.cancel_batch(batch_id)\n\n        console.print(\n            f\"[bold green]Batch {batch_id} cancelled successfully![/bold green]\"\n        )\n\n        # Show updated status\n        watch(limit=5, poll=2, screen=False, live=False, provider=provider)\n\n    except NotImplementedError as e:\n        console.print(f\"[yellow]Note: {e}[/yellow]\")\n    except Exception as e:\n        console.print(f\"[bold red]Error cancelling batch {batch_id}: {e}[/bold red]\")\n\n\n@app.command(help=\"Delete a completed batch job\")\ndef delete(\n    batch_id: str = typer.Option(help=\"Batch job ID to delete\"),\n    provider: str = typer.Option(\n        \"openai\",\n        help=\"Provider to use (e.g., 'openai', 'anthropic')\",\n    ),\n):\n    \"\"\"Delete a batch job using the unified BatchProcessor\"\"\"\n    try:\n        # Create a dummy response model (not used for deletion)\n        from pydantic import BaseModel\n\n        class DummyModel(BaseModel):\n            dummy: str = \"dummy\"\n\n        # Create a dummy model string for the provider\n        model_map = {\n            \"openai\": \"openai/gpt-4o-mini\",\n            \"anthropic\": \"anthropic/claude-3-sonnet\",\n        }\n\n        if provider not in model_map:\n            console.print(f\"[red]Unsupported provider: {provider}[/red]\")\n            return\n\n        # Create BatchProcessor instance\n        processor = BatchProcessor(model_map[provider], DummyModel)\n\n        with console.status(\n            f\"[bold yellow]Deleting {provider} batch job...\", spinner=\"dots\"\n        ):\n            processor.delete_batch(batch_id)\n\n        console.print(\n            f\"[bold green]Batch {batch_id} deleted successfully![/bold green]\"\n        )\n\n        # Show updated status\n        watch(limit=5, poll=2, screen=False, live=False, provider=provider)\n\n    except NotImplementedError as e:\n        console.print(f\"[yellow]Note: {e}[/yellow]\")\n    except Exception as e:\n        console.print(f\"[bold red]Error deleting batch {batch_id}: {e}[/bold red]\")\n\n\n@app.command(help=\"Download the file associated with a batch job\")\ndef download_file(\n    batch_id: str = typer.Option(help=\"Batch job ID to download\"),\n    download_file_path: str = typer.Option(help=\"Path to download file to\"),\n    provider: str = typer.Option(\n        \"openai\",\n        help=\"Provider to use (e.g., 'openai', 'anthropic')\",\n    ),\n):\n    try:\n        if provider == \"anthropic\":\n            from anthropic import Anthropic\n\n            client = Anthropic()\n            # TODO: Remove beta fallback when stable API is available\n            try:\n                batches_client = client.messages.batches\n            except AttributeError:\n                batches_client = client.beta.messages.batches\n            batch = batches_client.retrieve(batch_id)\n            if batch.processing_status != \"ended\":\n                raise ValueError(\"Only completed Jobs can be downloaded\")\n\n            results_url = batch.results_url\n            if not results_url:\n                raise ValueError(\"Results URL not available\")\n\n            with open(download_file_path, \"w\") as file:\n                for result in tqdm(client.messages.batches.results(batch_id)):\n                    file.write(json.dumps(result.model_dump()) + \"\\n\")\n        else:\n            from openai import OpenAI\n\n            client = OpenAI()\n            batch = client.batches.retrieve(batch_id=batch_id)\n            status = batch.status\n\n            if status != \"completed\":\n                raise ValueError(\"Only completed Jobs can be downloaded\")\n\n            file_id = batch.output_file_id\n\n            assert file_id, f\"Equivalent Output File not found for {batch_id}\"\n            file_response = client.files.content(file_id)\n\n            with open(download_file_path, \"w\") as file:\n                file.write(file_response.text)\n\n    except Exception as e:\n        console.log(f\"[bold red]Error downloading file for {batch_id}: {e}\")\n\n\n@app.command(help=\"Retrieve results from a batch job\")\ndef results(\n    batch_id: str = typer.Option(help=\"Batch job ID to get results from\"),\n    output_file: str = typer.Option(help=\"File to save the results to\"),\n    model: str = typer.Option(\n        \"openai/gpt-4o-mini\",\n        help=\"Model in format 'provider/model-name' (e.g., 'openai/gpt-4', 'anthropic/claude-3-sonnet')\",\n    ),\n):\n    \"\"\"Retrieve and save batch job results\"\"\"\n    provider, _ = model.split(\"/\", 1)\n\n    try:\n        if provider == \"openai\":\n            from openai import OpenAI\n\n            client = OpenAI()\n            batch = client.batches.retrieve(batch_id=batch_id)\n\n            if batch.status != \"completed\":\n                console.print(\n                    f\"[yellow]Batch status is '{batch.status}', not completed[/yellow]\"\n                )\n                return\n\n            file_id = batch.output_file_id\n            if not file_id:\n                console.print(\"[red]No output file available[/red]\")\n                return\n\n            file_response = client.files.content(file_id)\n            with open(output_file, \"w\") as f:\n                f.write(file_response.text)\n            console.print(f\"[bold green]Results saved to: {output_file}[/bold green]\")\n\n        elif provider == \"anthropic\":\n            from anthropic import Anthropic\n\n            client = Anthropic()\n            batch = client.beta.messages.batches.retrieve(batch_id)\n\n            if batch.processing_status != \"ended\":\n                console.print(\n                    f\"[yellow]Batch status is '{batch.processing_status}', not ended[/yellow]\"\n                )\n                return\n\n            # Get results from Anthropic batch API\n            results_iter = client.beta.messages.batches.results(batch_id)\n\n            with open(output_file, \"w\") as f:\n                for result in results_iter:\n                    f.write(json.dumps(result.model_dump()) + \"\\n\")\n            console.print(f\"[bold green]Results saved to: {output_file}[/bold green]\")\n\n        else:\n            console.print(f\"[red]Unsupported provider: {provider}[/red]\")\n\n    except Exception as e:\n        console.log(f\"[bold red]Error retrieving results for {batch_id}: {e}\")\n\n\n@app.command(help=\"Create batch job using BatchProcessor\")\ndef create(\n    messages_file: str = typer.Option(help=\"JSONL file with message conversations\"),\n    model: str = typer.Option(\n        \"openai/gpt-4o-mini\",\n        help=\"Model in format 'provider/model-name' (e.g., 'openai/gpt-4', 'anthropic/claude-3-sonnet')\",\n    ),\n    response_model: str = typer.Option(\n        help=\"Python class path for response model (e.g., 'examples.User')\"\n    ),\n    output_file: str = typer.Option(\n        \"batch_requests.jsonl\", help=\"Output file for batch requests\"\n    ),\n    max_tokens: int = typer.Option(1000, help=\"Maximum tokens per request\"),\n    temperature: float = typer.Option(0.1, help=\"Temperature for generation\"),\n):\n    \"\"\"Create a batch job using the unified BatchProcessor\"\"\"\n    try:\n        # Import the response model dynamically\n        module_path, class_name = response_model.rsplit(\".\", 1)\n        import importlib\n\n        module = importlib.import_module(module_path)\n        response_class = getattr(module, class_name)\n\n        # Load messages from file\n        messages_list = []\n        with open(messages_file) as f:\n            for line in f:\n                if line.strip():\n                    messages_list.append(json.loads(line))\n\n        # Create batch processor\n        processor = BatchProcessor(model, response_class)\n\n        # Create batch file\n        with console.status(\n            f\"[bold green]Creating batch file with {len(messages_list)} requests...\",\n            spinner=\"dots\",\n        ):\n            processor.create_batch_from_messages(\n                messages_list, output_file, max_tokens, temperature\n            )\n\n        console.print(f\"[bold green]Batch file created: {output_file}[/bold green]\")\n        console.print(\n            f\"[yellow]Use 'instructor batch create-from-file --file-path {output_file}' to submit the batch[/yellow]\"\n        )\n\n    except Exception as e:\n        console.log(f\"[bold red]Error creating batch: {e}\")\n"
  },
  {
    "path": "instructor/cli/cli.py",
    "content": "from typing import Optional\nimport typer\nfrom typer import Typer, launch\nimport instructor.cli.jobs as jobs\nimport instructor.cli.files as files\nimport instructor.cli.usage as usage\nimport instructor.cli.deprecated_hub as hub\nimport instructor.cli.batch as batch\n\napp: Typer = typer.Typer()\n\napp.add_typer(jobs.app, name=\"jobs\", help=\"Monitor and create fine tuning jobs\")\napp.add_typer(files.app, name=\"files\", help=\"Manage files on OpenAI's servers\")\napp.add_typer(usage.app, name=\"usage\", help=\"Check OpenAI API usage data\")\napp.add_typer(\n    hub.app, name=\"hub\", help=\"[DEPRECATED] The instructor hub is no longer available\"\n)\napp.add_typer(batch.app, name=\"batch\", help=\"Manage OpenAI Batch jobs\")\n\n\n@app.command()\ndef docs(\n    query: Optional[str] = typer.Argument(None, help=\"Search the documentation\"),\n) -> None:\n    \"\"\"\n    Open the instructor documentation website.\n    \"\"\"\n    if query:\n        launch(f\"https://python.useinstructor.com/?q={query}\")\n    else:\n        launch(\"https://python.useinstructor.com/\")\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "instructor/cli/deprecated_hub.py",
    "content": "from typer import Exit, echo, Typer\n\napp: Typer = Typer(help=\"Instructor Hub CLI (Deprecated)\")\n\n\n@app.command(name=\"hub\")\ndef hub() -> None:\n    \"\"\"\n    This command has been deprecated. The instructor hub is no longer available.\n    Please refer to our cookbook examples at https://python.useinstructor.com/examples/\n    \"\"\"\n    echo(\n        \"The instructor hub has been deprecated. Please refer to our cookbook examples at https://python.useinstructor.com/examples/\"\n    )\n    raise Exit(1)\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "instructor/cli/files.py",
    "content": "# type: ignore - stub mismatched\n\nimport time\nfrom datetime import datetime\nfrom typing import Literal, cast\n\nimport openai\nimport typer\nfrom openai import OpenAI\nfrom rich.console import Console\nfrom rich.table import Table\n\nclient = OpenAI()\napp = typer.Typer()\nconsole = Console()\n\n\n# Sample response data\ndef generate_file_table(files: list[openai.types.FileObject]) -> Table:\n    table = Table(\n        title=\"OpenAI Files\",\n    )\n    table.add_column(\"File ID\", style=\"dim\")\n    table.add_column(\"Size (bytes)\", justify=\"right\")\n    table.add_column(\"Creation Time\")\n    table.add_column(\"Filename\")\n    table.add_column(\"Purpose\")\n\n    for file in files:\n        table.add_row(\n            file[\"id\"],\n            str(file[\"bytes\"]),\n            str(datetime.fromtimestamp(file[\"created_at\"])),\n            file[\"filename\"],\n            file[\"purpose\"],\n        )\n\n    return table\n\n\ndef get_files() -> list[openai.types.FileObject]:\n    files = client.files.list()\n    files = files.data\n    files = sorted(files, key=lambda x: x.created_at, reverse=True)\n    return files\n\n\ndef get_file_status(file_id: str) -> str:\n    response = client.files.retrieve(file_id)\n    return response.status\n\n\n@app.command(\n    help=\"Upload a file to OpenAI's servers, will monitor the upload status until it is processed\",\n)\ndef upload(\n    filepath: str = typer.Argument(help=\"Path to the file to upload\"),\n    purpose: str = typer.Option(\"fine-tune\", help=\"Purpose of the file\"),\n    poll: int = typer.Option(5, help=\"Polling interval in seconds\"),\n) -> None:\n    # Literals aren't supported by Typer yet.\n    file_purpose = cast(Literal[\"fine-tune\", \"assistants\"], purpose)\n    with open(filepath, \"rb\") as file:\n        response = client.files.create(file=file, purpose=file_purpose)\n    file_id = response[\"id\"]  # type: ignore - types might be out of date\n    with console.status(f\"Monitoring upload: {file_id}...\") as status:\n        status.spinner_style = \"dots\"\n        while True:\n            file_status = get_file_status(file_id)\n            if file_status == \"processed\":\n                console.log(f\"[bold green]File {file_id} uploaded successfully!\")\n                break\n            time.sleep(poll)\n\n\n@app.command(\n    help=\"Download a file from OpenAI's servers\",\n)\ndef download(\n    file_id: str = typer.Argument(help=\"ID of the file to download\"),\n    output: str = typer.Argument(help=\"Output path for the downloaded file\"),\n) -> None:\n    with console.status(f\"[bold green]Downloading file {file_id}...\", spinner=\"dots\"):\n        content = client.files.download(file_id)\n        with open(output, \"wb\") as file:\n            file.write(content)\n        console.log(f\"[bold green]File {file_id} downloaded successfully!\")\n\n\n@app.command(\n    help=\"Delete a file from OpenAI's servers\",\n)\ndef delete(file_id: str = typer.Argument(help=\"ID of the file to delete\")) -> None:\n    with console.status(f\"[bold red]Deleting file {file_id}...\", spinner=\"dots\"):\n        try:\n            client.files.delete(file_id)\n            console.log(f\"[bold red]File {file_id} deleted successfully!\")\n        except Exception as e:\n            console.log(f\"[bold red]Error deleting file {file_id}: {e}\")\n            return\n\n\n@app.command(\n    help=\"Monitor the status of a file on OpenAI's servers\",\n)\ndef status(\n    file_id: str = typer.Argument(help=\"ID of the file to check the status of\"),\n) -> None:\n    with console.status(f\"Monitoring status of file {file_id}...\") as status:\n        while True:\n            file_status = get_file_status(file_id)\n            status.update(f\"File status: {file_status}\")\n            if file_status in [\"pending\", \"processed\"]:\n                break\n            time.sleep(5)\n\n\n@app.command(\n    help=\"List the files on OpenAI's servers\",\n)\ndef list() -> None:\n    files = get_files()\n    console.log(generate_file_table(files))\n"
  },
  {
    "path": "instructor/cli/jobs.py",
    "content": "from typing import Optional, TypedDict\nfrom openai import OpenAI\n\nfrom openai.types.fine_tuning.job_create_params import Hyperparameters\nimport typer\nimport time\nfrom rich.live import Live\nfrom rich.table import Table\nfrom rich.console import Console\nfrom datetime import datetime\nfrom openai.types.fine_tuning import FineTuningJob\n\nclient = OpenAI()\napp = typer.Typer()\nconsole = Console()\n\n\nclass FuneTuningParams(TypedDict, total=False):\n    hyperparameters: Hyperparameters\n    validation_file: Optional[str]\n    suffix: Optional[str]\n\n\ndef generate_table(jobs: list[FineTuningJob]) -> Table:\n    # Sorting the jobs by creation time\n    jobs = sorted(jobs, key=lambda x: x.created_at, reverse=True)\n\n    table = Table(\n        title=\"OpenAI Fine Tuning Job Monitoring\",\n        caption=\"Automatically refreshes every 5 seconds, press Ctrl+C to exit\",\n    )\n\n    table.add_column(\"Job ID\", style=\"dim\")\n    table.add_column(\"Status\")\n    table.add_column(\"Creation Time\", justify=\"right\")\n    table.add_column(\"Completion Time\", justify=\"right\")\n    table.add_column(\"Model Name\")\n    table.add_column(\"File ID\")\n    table.add_column(\"Epochs\")\n    table.add_column(\"Base Model\")\n\n    for job in jobs:\n        status_emoji = {\n            \"running\": \"⏳\",\n            \"succeeded\": \"✅\",\n            \"failed\": \"❌\",\n            \"cancelled\": \"🚫\",\n        }.get(job.status, \"❓\")\n\n        finished_at = (\n            str(datetime.fromtimestamp(job.finished_at)) if job.finished_at else \"N/A\"\n        )\n\n        table.add_row(\n            job.id,\n            f\"{status_emoji} [{status_color(job.status)}]{job.status}[/]\",\n            str(datetime.fromtimestamp(job.created_at)),\n            finished_at,\n            job.fine_tuned_model,\n            job.training_file,\n            str(job.hyperparameters.n_epochs),\n            job.model,\n        )\n\n    return table\n\n\ndef status_color(status: str) -> str:\n    return {\"running\": \"yellow\", \"succeeded\": \"green\", \"failed\": \"red\"}.get(\n        status, \"white\"\n    )\n\n\ndef get_jobs(limit: int = 5) -> list[FineTuningJob]:\n    return client.fine_tuning.jobs.list(limit=limit).data\n\n\ndef get_file_status(file_id: str) -> str:\n    response = client.files.retrieve(file_id)\n    return response.status\n\n\n@app.command(\n    name=\"list\",\n    help=\"Monitor the status of the most recent fine-tuning jobs.\",\n)\ndef watch(\n    limit: int = typer.Option(5, help=\"Limit the number of jobs to monitor\"),\n    poll: int = typer.Option(5, help=\"Polling interval in seconds\"),\n    screen: bool = typer.Option(False, help=\"Enable or disable screen output\"),\n) -> None:\n    \"\"\"\n    Monitor the status of the most recent fine-tuning jobs.\n    \"\"\"\n    jobs = get_jobs(limit=limit)\n    with Live(generate_table(jobs), refresh_per_second=2, screen=screen) as live_table:\n        while True:\n            jobs = get_jobs(limit=limit)\n            live_table.update(generate_table(jobs))\n            time.sleep(poll)\n\n\n@app.command(\n    help=\"Create a fine-tuning job from an existing ID.\",\n)\ndef create_from_id(\n    id: str = typer.Argument(help=\"ID of the existing fine-tuning job\"),\n    model: str = typer.Option(\"gpt-3.5-turbo\", help=\"Model to use for fine-tuning\"),\n    n_epochs: Optional[int] = typer.Option(\n        None, help=\"Number of epochs for fine-tuning\", show_default=False\n    ),\n    batch_size: Optional[int] = typer.Option(\n        None, help=\"Batch size for fine-tuning\", show_default=False\n    ),\n    learning_rate_multiplier: Optional[float] = typer.Option(\n        None, help=\"Learning rate multiplier for fine-tuning\", show_default=False\n    ),\n    validation_file_id: Optional[str] = typer.Option(\n        None, help=\"ID of the uploaded validation file\"\n    ),\n) -> None:\n    hyperparameters_dict: Hyperparameters = {}\n    if n_epochs is not None:\n        hyperparameters_dict[\"n_epochs\"] = n_epochs\n    if batch_size is not None:\n        hyperparameters_dict[\"batch_size\"] = batch_size\n    if learning_rate_multiplier is not None:\n        hyperparameters_dict[\"learning_rate_multiplier\"] = learning_rate_multiplier\n\n    with console.status(\n        f\"[bold green]Creating fine-tuning job from ID {id}...\", spinner=\"dots\"\n    ):\n        job = client.fine_tuning.jobs.create(\n            training_file=id,\n            model=model,\n            hyperparameters=hyperparameters_dict,\n            validation_file=validation_file_id if validation_file_id else None,\n        )\n        console.log(f\"[bold green]Fine-tuning job created with ID: {job.id}\")\n    watch(limit=5, poll=2, screen=False)\n\n\n@app.command(\n    help=\"Create a fine-tuning job from a file.\",\n)\ndef create_from_file(\n    file: str = typer.Argument(help=\"Path to the file for fine-tuning\"),\n    model: str = typer.Option(\"gpt-3.5-turbo\", help=\"Model to use for fine-tuning\"),\n    poll: int = typer.Option(2, help=\"Polling interval in seconds\"),\n    n_epochs: Optional[int] = typer.Option(\n        None, help=\"Number of epochs for fine-tuning\", show_default=False\n    ),\n    batch_size: Optional[int] = typer.Option(\n        None, help=\"Batch size for fine-tuning\", show_default=False\n    ),\n    learning_rate_multiplier: Optional[float] = typer.Option(\n        None, help=\"Learning rate multiplier for fine-tuning\", show_default=False\n    ),\n    validation_file: Optional[str] = typer.Option(\n        None, help=\"Path to the validation file\"\n    ),\n    model_suffix: Optional[str] = typer.Option(\n        None, help=\"Suffix to identify the model\"\n    ),\n) -> None:\n    hyperparameters_dict: Hyperparameters = {}\n    if n_epochs is not None:\n        hyperparameters_dict[\"n_epochs\"] = n_epochs\n    if batch_size is not None:\n        hyperparameters_dict[\"batch_size\"] = batch_size\n    if learning_rate_multiplier is not None:\n        hyperparameters_dict[\"learning_rate_multiplier\"] = learning_rate_multiplier\n\n    with open(file, \"rb\") as file_buffer:\n        response = client.files.create(file=file_buffer, purpose=\"fine-tune\")\n\n    file_id = response.id\n\n    validation_file_id = None\n    if validation_file:\n        with open(validation_file, \"rb\") as val_file:\n            val_response = client.files.create(file=val_file, purpose=\"fine-tune\")\n        validation_file_id = val_response.id\n\n    with console.status(f\"Monitoring upload: {file_id} before finetuning...\") as status:\n        status.spinner_style = \"dots\"\n        while True:\n            file_status = get_file_status(file_id)\n            validation_file_status = (\n                get_file_status(validation_file_id) if validation_file_id else \"\"\n            )\n\n            if file_status == \"processed\" and (\n                not validation_file_id or validation_file_status == \"processed\"\n            ):\n                console.log(f\"[bold green]File {file_id} uploaded successfully!\")\n                if validation_file_id:\n                    console.log(\n                        f\"[bold green]Validation file {validation_file_id} uploaded successfully!\"\n                    )\n                break\n\n            time.sleep(poll)\n\n    additional_params: FuneTuningParams = {}\n    if hyperparameters_dict:\n        additional_params[\"hyperparameters\"] = hyperparameters_dict\n    if validation_file:\n        additional_params[\"validation_file\"] = validation_file\n    if model_suffix:\n        additional_params[\"suffix\"] = model_suffix\n\n    job = client.fine_tuning.jobs.create(\n        training_file=file_id,\n        model=model,\n        **additional_params,\n    )\n    if validation_file_id:\n        console.log(\n            f\"[bold green]Fine-tuning job created with ID: {job.id} from file ID: {file_id} and validation_file ID: {validation_file_id}\"\n        )\n    else:\n        console.log(\n            f\"[bold green]Fine-tuning job created with ID: {job.id} from file ID: {file_id}\"\n        )\n    watch(limit=5, poll=poll, screen=False)\n\n\n@app.command(\n    help=\"Cancel a fine-tuning job.\",\n)\ndef cancel(\n    id: str = typer.Argument(help=\"ID of the fine-tuning job to cancel\"),\n) -> None:\n    with console.status(f\"[bold red]Cancelling job {id}...\", spinner=\"dots\"):\n        try:\n            client.fine_tuning.jobs.cancel(id)\n            console.log(f\"[bold red]Job {id} cancelled successfully!\")\n        except Exception as e:\n            console.log(f\"[bold red]Error cancelling job {id}: {e}\")\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "instructor/cli/usage.py",
    "content": "from typing import Any, Union\nfrom collections.abc import Awaitable\nfrom datetime import datetime, timedelta\nimport typer\nimport os\nimport aiohttp\nimport asyncio\nfrom builtins import list as List\nfrom collections import defaultdict\nfrom rich.console import Console\nfrom rich.table import Table\nfrom rich.progress import Progress\n\nfrom instructor._types._alias import ModelNames\n\n\napp = typer.Typer()\nconsole = Console()\n\napi_key = os.environ.get(\"OPENAI_API_KEY\")\n\n\nasync def fetch_usage(date: str) -> dict[str, Any]:\n    headers = {\"Authorization\": f\"Bearer {api_key}\"}\n    url = f\"https://api.openai.com/v1/usage?date={date}\"\n    async with aiohttp.ClientSession() as session:\n        async with session.get(url, headers=headers) as resp:\n            return await resp.json()\n\n\nasync def get_usage_for_past_n_days(n_days: int) -> list[dict[str, Any]]:\n    tasks: List[Awaitable[dict[str, Any]]] = []  # noqa: UP006 - conflicting with the fn name\n    all_data: List[dict[str, Any]] = []  # noqa: UP006 - conflicting with the fn name\n    with Progress() as progress:\n        if n_days > 1:\n            task = progress.add_task(\"[green]Fetching usage data...\", total=n_days)\n            for i in range(n_days):\n                date = (datetime.now() - timedelta(days=i)).strftime(\"%Y-%m-%d\")\n                tasks.append(fetch_usage(date))\n                progress.update(task, advance=1)\n        else:\n            tasks.append(fetch_usage(datetime.now().strftime(\"%Y-%m-%d\")))\n\n        fetched_data = await asyncio.gather(*tasks)\n        for data in fetched_data:\n            all_data.extend(data.get(\"data\", []))\n    return all_data\n\n\n# Define the cost per unit for each model\nMODEL_COSTS = {\n    \"gpt-4o\": {\"prompt\": 0.005 / 1000, \"completion\": 0.015 / 1000},\n    \"gpt-4o-2024-05-13\": {\"prompt\": 0.005 / 1000, \"completion\": 0.015 / 1000},\n    \"gpt-4-turbo\": {\"prompt\": 0.01 / 1000, \"completion\": 0.03 / 1000},\n    \"gpt-4-turbo-2024-04-09\": {\"prompt\": 0.01 / 1000, \"completion\": 0.03 / 1000},\n    \"gpt-4-0125-preview\": {\"prompt\": 0.01 / 1000, \"completion\": 0.03 / 1000},\n    \"gpt-4-turbo-preview\": {\"prompt\": 0.01 / 1000, \"completion\": 0.03 / 1000},\n    \"gpt-4-1106-preview\": {\"prompt\": 0.01 / 1000, \"completion\": 0.03 / 1000},\n    \"gpt-4-vision-preview\": {\"prompt\": 0.01 / 1000, \"completion\": 0.03 / 1000},\n    \"gpt-4\": {\"prompt\": 0.03 / 1000, \"completion\": 0.06 / 1000},\n    \"gpt-4-0314\": {\"prompt\": 0.03 / 1000, \"completion\": 0.06 / 1000},\n    \"gpt-4-0613\": {\"prompt\": 0.03 / 1000, \"completion\": 0.06 / 1000},\n    \"gpt-4-32k\": {\"prompt\": 0.06 / 1000, \"completion\": 0.12 / 1000},\n    \"gpt-4-32k-0314\": {\"prompt\": 0.06 / 1000, \"completion\": 0.12 / 1000},\n    \"gpt-4-32k-0613\": {\"prompt\": 0.06 / 1000, \"completion\": 0.12 / 1000},\n    \"gpt-3.5-turbo\": {\"prompt\": 0.0005 / 1000, \"completion\": 0.0015 / 1000},\n    \"gpt-3.5-turbo-16k\": {\"prompt\": 0.0030 / 1000, \"completion\": 0.0040 / 1000},\n    \"gpt-3.5-turbo-0301\": {\"prompt\": 0.0015 / 1000, \"completion\": 0.0020 / 1000},\n    \"gpt-3.5-turbo-0613\": {\"prompt\": 0.0015 / 1000, \"completion\": 0.0020 / 1000},\n    \"gpt-3.5-turbo-1106\": {\"prompt\": 0.0010 / 1000, \"completion\": 0.0020 / 1000},\n    \"gpt-3.5-turbo-0125\": {\"prompt\": 0.0005 / 1000, \"completion\": 0.0015 / 1000},\n    \"gpt-3.5-turbo-16k-0613\": {\"prompt\": 0.0030 / 1000, \"completion\": 0.0040 / 1000},\n    \"gpt-3.5-turbo-instruct\": {\"prompt\": 0.0015 / 1000, \"completion\": 0.0020 / 1000},\n    \"text-embedding-3-small\": 0.00002 / 1000,\n    \"text-embedding-3-large\": 0.00013 / 1000,\n    \"text-embedding-ada-002\": 0.00010 / 1000,\n}\n\n\ndef get_model_cost(\n    model: ModelNames,\n) -> Union[dict[str, float], float]:\n    \"\"\"Get the cost details for a given model.\"\"\"\n    if model in MODEL_COSTS:\n        return MODEL_COSTS[model]\n\n    if model.startswith(\"gpt-3.5-turbo-16k\"):\n        return MODEL_COSTS[\"gpt-3.5-turbo-16k\"]\n    elif model.startswith(\"gpt-3.5-turbo\"):\n        return MODEL_COSTS[\"gpt-3.5-turbo\"]\n    elif model.startswith(\"gpt-4-turbo\"):\n        return MODEL_COSTS[\"gpt-4-turbo-preview\"]\n    elif model.startswith(\"gpt-4-32k\"):\n        return MODEL_COSTS[\"gpt-4-32k\"]\n    elif model.startswith(\"gpt-4o\"):\n        return MODEL_COSTS[\"gpt-4o\"]\n    elif model.startswith(\"gpt-4\"):\n        return MODEL_COSTS[\"gpt-4\"]\n    else:\n        raise ValueError(f\"Cost for model {model} not found\")\n\n\ndef calculate_cost(\n    snapshot_id: ModelNames,\n    n_context_tokens: int,\n    n_generated_tokens: int,\n) -> float:\n    \"\"\"Calculate the cost based on the snapshot ID and number of tokens.\"\"\"\n    cost = get_model_cost(snapshot_id)\n\n    if isinstance(cost, (float, int)):\n        return cost * (n_context_tokens + n_generated_tokens)\n\n    prompt_cost = cost[\"prompt\"] * n_context_tokens\n    completion_cost = cost[\"completion\"] * n_generated_tokens\n    return prompt_cost + completion_cost\n\n\ndef group_and_sum_by_date_and_snapshot(usage_data: list[dict[str, Any]]) -> Table:\n    \"\"\"Group and sum the usage data by date and snapshot, including costs.\"\"\"\n    summary: defaultdict[str, defaultdict[str, dict[str, Union[int, float]]]] = (\n        defaultdict(\n            lambda: defaultdict(\n                lambda: {\"total_requests\": 0, \"total_tokens\": 0, \"total_cost\": 0.0}\n            )\n        )\n    )\n\n    for usage in usage_data:\n        snapshot_id = usage[\"snapshot_id\"]\n        date = datetime.fromtimestamp(usage[\"aggregation_timestamp\"]).strftime(\n            \"%Y-%m-%d\"\n        )\n        summary[date][snapshot_id][\"total_requests\"] += usage[\"n_requests\"]\n        summary[date][snapshot_id][\"total_tokens\"] += usage[\"n_generated_tokens_total\"]\n\n        # Calculate and add the cost\n        cost = calculate_cost(\n            snapshot_id,\n            usage[\"n_context_tokens_total\"],\n            usage[\"n_generated_tokens_total\"],\n        )\n        summary[date][snapshot_id][\"total_cost\"] += cost\n\n    table = Table(title=\"Usage Summary by Date, Snapshot, and Cost\")\n    table.add_column(\"Date\", style=\"dim\")\n    table.add_column(\"Model\", style=\"dim\")\n    table.add_column(\"Total Requests\", justify=\"right\")\n    table.add_column(\"Total Cost ($)\", justify=\"right\")\n\n    # Sort dates and snapshots in descending order\n    sorted_dates = sorted(summary.keys(), reverse=True)\n    for date in sorted_dates:\n        sorted_snapshots = sorted(summary[date].keys(), reverse=True)\n        for snapshot_id in sorted_snapshots:\n            data = summary[date][snapshot_id]\n            table.add_row(\n                date,\n                snapshot_id,\n                str(data[\"total_requests\"]),\n                \"{:.2f}\".format(data[\"total_cost\"]),\n            )\n\n    return table\n\n\n@app.command(help=\"Displays OpenAI API usage data for the past N days.\")\ndef list(\n    n: int = typer.Option(0, help=\"Number of days.\"),\n) -> None:\n    all_data = asyncio.run(get_usage_for_past_n_days(n))\n    table = group_and_sum_by_date_and_snapshot(all_data)\n    console.print(table)\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "instructor/client.py",
    "content": "\"\"\"Backwards compatibility module for instructor.client.\n\nThis module provides lazy imports to maintain backwards compatibility.\n\"\"\"\n\nimport warnings\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to provide backward compatibility for client imports.\"\"\"\n    warnings.warn(\n        f\"Importing from 'instructor.client' is deprecated and will be removed in v2.0.0. \"\n        f\"Please update your imports to use 'instructor.core.client.{name}' instead:\\n\"\n        \"  from instructor.core.client import Instructor, AsyncInstructor, from_openai, from_litellm\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    from .core import client as core_client\n\n    # Try to get the attribute from the core.client module\n    if hasattr(core_client, name):\n        return getattr(core_client, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/core/__init__.py",
    "content": "\"\"\"Core components of the instructor package.\"\"\"\n\nfrom .client import Instructor, AsyncInstructor, Response, from_openai, from_litellm\nfrom .exceptions import (\n    InstructorRetryException,\n    InstructorError,\n    ConfigurationError,\n    IncompleteOutputException,\n    ValidationError,\n    ProviderError,\n    ModeError,\n    ClientError,\n    AsyncValidationError,\n    FailedAttempt,\n    ResponseParsingError,\n    MultimodalError,\n)\nfrom .hooks import Hooks, HookName\nfrom .patch import patch, apatch\nfrom .retry import retry_sync, retry_async\n\n__all__ = [\n    \"Instructor\",\n    \"AsyncInstructor\",\n    \"Response\",\n    \"InstructorRetryException\",\n    \"InstructorError\",\n    \"ConfigurationError\",\n    \"IncompleteOutputException\",\n    \"ValidationError\",\n    \"ProviderError\",\n    \"ModeError\",\n    \"ClientError\",\n    \"AsyncValidationError\",\n    \"FailedAttempt\",\n    \"ResponseParsingError\",\n    \"MultimodalError\",\n    \"Hooks\",\n    \"HookName\",\n    \"patch\",\n    \"apatch\",\n    \"from_openai\",\n    \"from_litellm\",\n    \"retry_sync\",\n    \"retry_async\",\n]\n"
  },
  {
    "path": "instructor/core/client.py",
    "content": "from __future__ import annotations\n\nimport openai\nimport inspect\nfrom functools import partial\nimport instructor\nfrom ..utils.providers import Provider, get_provider\nfrom openai.types.chat import ChatCompletionMessageParam\nfrom typing import (\n    TypeVar,\n    Callable,\n    overload,\n    Union,\n    Literal,\n    Any,\n    get_origin,\n    get_args,\n)\nfrom tenacity import (\n    AsyncRetrying,\n    Retrying,\n)\nfrom collections.abc import Generator, Iterable, Awaitable, AsyncGenerator\nfrom typing_extensions import Self\nfrom pydantic import BaseModel\nfrom ..dsl.partial import Partial\nfrom .hooks import Hooks, HookName\n\n\nT = TypeVar(\"T\", bound=Union[BaseModel, \"Iterable[Any]\", \"Partial[Any]\"])\n\n\nclass Response:\n    def __init__(\n        self,\n        client: Instructor,\n    ):\n        self.client = client\n\n    def create(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T] | None = None,\n        max_retries: int | Retrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        **kwargs,\n    ) -> T | Any:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return self.client.create(\n            response_model=response_model,\n            validation_context=validation_context,\n            context=context,\n            max_retries=max_retries,\n            strict=strict,\n            messages=input,\n            **kwargs,\n        )\n\n    def create_with_completion(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying = 3,\n        **kwargs,\n    ) -> tuple[T, Any]:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return self.client.create_with_completion(\n            messages=input,\n            response_model=response_model,\n            max_retries=max_retries,\n            **kwargs,\n        )\n\n    def create_iterable(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying = 3,\n        **kwargs,\n    ) -> Generator[T, None, None]:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return self.client.create_iterable(\n            messages=input,\n            response_model=response_model,\n            max_retries=max_retries,\n            **kwargs,\n        )\n\n    def create_partial(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying = 3,\n        **kwargs,\n    ) -> Generator[T, None, None]:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return self.client.create_partial(\n            messages=input,\n            response_model=response_model,\n            max_retries=max_retries,\n            **kwargs,\n        )\n\n\nclass AsyncResponse(Response):\n    def __init__(self, client: AsyncInstructor):\n        self.client = client\n\n    async def create(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T] | None = None,\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        **kwargs,\n    ) -> T | Any:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return await self.client.create(\n            response_model=response_model,\n            validation_context=validation_context,\n            context=context,\n            max_retries=max_retries,\n            strict=strict,\n            messages=input,\n            **kwargs,\n        )\n\n    async def create_with_completion(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | AsyncRetrying = 3,\n        **kwargs,\n    ) -> tuple[T, Any]:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return await self.client.create_with_completion(\n            messages=input,\n            response_model=response_model,\n            max_retries=max_retries,\n            **kwargs,\n        )\n\n    async def create_iterable(\n        self,\n        input: str | list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | AsyncRetrying = 3,\n        **kwargs,\n    ) -> AsyncGenerator[T, None]:\n        if isinstance(input, str):\n            input = [\n                {\n                    \"role\": \"user\",\n                    \"content\": input,\n                }\n            ]\n\n        return self.client.create_iterable(\n            messages=input,\n            response_model=response_model,\n            max_retries=max_retries,\n            **kwargs,\n        )\n\n\nclass Instructor:\n    client: Any | None\n    create_fn: Callable[..., Any]\n    mode: instructor.Mode\n    default_model: str | None = None\n    provider: Provider\n    hooks: Hooks\n\n    def __init__(\n        self,\n        client: Any | None,\n        create: Callable[..., Any],\n        mode: instructor.Mode = instructor.Mode.TOOLS,\n        provider: Provider = Provider.OPENAI,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ):\n        self.client = client\n        self.create_fn = create\n        self.mode = mode\n        if mode == instructor.Mode.FUNCTIONS:\n            instructor.Mode.warn_mode_functions_deprecation()\n\n        self.kwargs = kwargs\n        self.provider = provider\n        self.hooks = hooks or Hooks()\n\n        if mode in {\n            instructor.Mode.RESPONSES_TOOLS,\n            instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        }:\n            assert isinstance(client, (openai.OpenAI, openai.AsyncOpenAI))\n            self.responses = Response(client=self)\n\n    def on(\n        self,\n        hook_name: (\n            HookName\n            | Literal[\n                \"completion:kwargs\",\n                \"completion:response\",\n                \"completion:error\",\n                \"completion:last_attempt\",\n                \"parse:error\",\n            ]\n        ),\n        handler: Callable[[Any], None],\n    ) -> None:\n        self.hooks.on(hook_name, handler)\n\n    def off(\n        self,\n        hook_name: (\n            HookName\n            | Literal[\n                \"completion:kwargs\",\n                \"completion:response\",\n                \"completion:error\",\n                \"completion:last_attempt\",\n                \"parse:error\",\n            ]\n        ),\n        handler: Callable[[Any], None],\n    ) -> None:\n        self.hooks.off(hook_name, handler)\n\n    def clear(\n        self,\n        hook_name: (\n            HookName\n            | Literal[\n                \"completion:kwargs\",\n                \"completion:response\",\n                \"completion:error\",\n                \"completion:last_attempt\",\n                \"parse:error\",\n            ]\n        )\n        | None = None,\n    ) -> None:\n        self.hooks.clear(hook_name)\n\n    @property\n    def chat(self) -> Self:\n        return self\n\n    @property\n    def completions(self) -> Self:\n        return self\n\n    @property\n    def messages(self) -> Self:\n        return self\n\n    @overload\n    def create(\n        self: AsyncInstructor,\n        response_model: type[T],\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,  # {{ edit_1 }}\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Awaitable[T]: ...\n\n    @overload\n    def create(\n        self: Self,\n        response_model: type[T],\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | Retrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,  # {{ edit_1 }}\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> T: ...\n\n    @overload\n    def create(\n        self: AsyncInstructor,\n        response_model: None,\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,  # {{ edit_1 }}\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Awaitable[Any]: ...\n\n    @overload\n    def create(\n        self: Self,\n        response_model: None,\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | Retrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,  # {{ edit_1 }}\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Any: ...\n\n    def create(\n        self,\n        response_model: type[T] | None,\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | Retrying | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> T | Any | Awaitable[T] | Awaitable[Any]:\n        kwargs = self.handle_kwargs(kwargs)\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        return self.create_fn(\n            response_model=response_model,\n            messages=messages,\n            max_retries=max_retries,\n            validation_context=validation_context,\n            context=context,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        )\n\n    @overload\n    def create_partial(\n        self: AsyncInstructor,\n        response_model: type[T],\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,  # {{ edit_1 }}\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> AsyncGenerator[T, None]: ...\n\n    @overload\n    def create_partial(\n        self: Self,\n        response_model: type[T],\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | Retrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Generator[T, None, None]: ...\n\n    def create_partial(\n        self,\n        response_model: type[T],\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | Retrying | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Generator[T, None, None] | AsyncGenerator[T, None]:\n        kwargs[\"stream\"] = True\n\n        kwargs = self.handle_kwargs(kwargs)\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        response_model = instructor.Partial[response_model]  # type: ignore\n        return self.create_fn(\n            messages=messages,\n            response_model=response_model,\n            max_retries=max_retries,\n            validation_context=validation_context,\n            context=context,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        )\n\n    @overload\n    def create_iterable(\n        self: AsyncInstructor,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> AsyncGenerator[T, None]: ...\n\n    @overload\n    def create_iterable(\n        self: Self,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Generator[T, None, None]: ...\n\n    def create_iterable(\n        self,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Generator[T, None, None] | AsyncGenerator[T, None]:\n        kwargs[\"stream\"] = True\n        kwargs = self.handle_kwargs(kwargs)\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        response_model = Iterable[response_model]  # type: ignore\n        return self.create_fn(\n            messages=messages,\n            response_model=response_model,\n            max_retries=max_retries,\n            validation_context=validation_context,\n            context=context,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        )\n\n    @overload\n    def create_with_completion(\n        self: AsyncInstructor,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> Awaitable[tuple[T, Any]]: ...\n\n    @overload\n    def create_with_completion(\n        self: Self,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> tuple[T, Any]: ...\n\n    def create_with_completion(\n        self,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | Retrying | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> tuple[T, Any] | Awaitable[tuple[T, Any]]:\n        kwargs = self.handle_kwargs(kwargs)\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        model = self.create_fn(\n            messages=messages,\n            response_model=response_model,\n            max_retries=max_retries,\n            validation_context=validation_context,\n            context=context,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        )\n        return model, model._raw_response\n\n    def handle_kwargs(self, kwargs: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"\n        Handle and process keyword arguments for the API call.\n\n        This method merges the provided kwargs with the default kwargs stored in the instance.\n        It ensures that any kwargs passed to the method call take precedence over the default ones.\n        \"\"\"\n        for key, value in self.kwargs.items():\n            if key not in kwargs:\n                kwargs[key] = value\n        return kwargs\n\n    def __getattr__(self, attr: str) -> Any:\n        if attr not in {\"create\", \"chat\", \"messages\"}:\n            return getattr(self.client, attr)\n\n        return getattr(self, attr)\n\n\nclass AsyncInstructor(Instructor):\n    client: Any | None\n    create_fn: Callable[..., Any]\n    mode: instructor.Mode\n    default_model: str | None = None\n    provider: Provider\n    hooks: Hooks\n\n    def __init__(\n        self,\n        client: Any | None,\n        create: Callable[..., Any],\n        mode: instructor.Mode = instructor.Mode.TOOLS,\n        provider: Provider = Provider.OPENAI,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ):\n        self.client = client\n        self.create_fn = create\n        self.mode = mode\n        self.kwargs = kwargs\n        self.provider = provider\n        self.hooks = hooks or Hooks()\n\n        if mode in {\n            instructor.Mode.RESPONSES_TOOLS,\n            instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        }:\n            assert isinstance(client, (openai.OpenAI, openai.AsyncOpenAI))\n            self.responses = AsyncResponse(client=self)\n\n    async def create(  # type: ignore[override]\n        self,\n        response_model: type[T] | None,\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> T | Any:\n        kwargs = self.handle_kwargs(kwargs)\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        # Check if the response model is an iterable type\n        if (\n            get_origin(response_model) in {Iterable}\n            and get_args(response_model)\n            and get_args(response_model)[0] is not None\n            and self.mode\n            not in {\n                instructor.Mode.PARALLEL_TOOLS,\n                instructor.Mode.VERTEXAI_PARALLEL_TOOLS,\n                instructor.Mode.ANTHROPIC_PARALLEL_TOOLS,\n            }\n        ):\n            return self.create_iterable(\n                messages=messages,\n                response_model=get_args(response_model)[0],\n                max_retries=max_retries,\n                validation_context=validation_context,\n                context=context,\n                strict=strict,\n                hooks=hooks,  # Pass the per-call hooks to create_iterable\n                **kwargs,\n            )\n\n        return await self.create_fn(\n            response_model=response_model,\n            validation_context=validation_context,\n            context=context,\n            max_retries=max_retries,\n            messages=messages,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        )\n\n    async def create_partial(  # type: ignore[override]\n        self,\n        response_model: type[T],\n        messages: list[ChatCompletionMessageParam],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> AsyncGenerator[T, None]:\n        kwargs = self.handle_kwargs(kwargs)\n        kwargs[\"stream\"] = True\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        async for item in await self.create_fn(\n            response_model=instructor.Partial[response_model],  # type: ignore\n            validation_context=validation_context,\n            context=context,\n            max_retries=max_retries,\n            messages=messages,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        ):\n            yield item\n\n    async def create_iterable(  # type: ignore[override]\n        self,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> AsyncGenerator[T, None]:\n        kwargs = self.handle_kwargs(kwargs)\n        kwargs[\"stream\"] = True\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        async for item in await self.create_fn(\n            response_model=Iterable[response_model],\n            validation_context=validation_context,\n            context=context,\n            max_retries=max_retries,\n            messages=messages,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        ):\n            yield item\n\n    async def create_with_completion(  # type: ignore[override]\n        self,\n        messages: list[ChatCompletionMessageParam],\n        response_model: type[T],\n        max_retries: int | AsyncRetrying = 3,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        **kwargs: Any,\n    ) -> tuple[T, Any]:\n        kwargs = self.handle_kwargs(kwargs)\n\n        # Combine client hooks with per-call hooks\n        combined_hooks = self.hooks\n        if hooks is not None:\n            combined_hooks = self.hooks + hooks\n\n        response = await self.create_fn(\n            response_model=response_model,\n            validation_context=validation_context,\n            context=context,\n            max_retries=max_retries,\n            messages=messages,\n            strict=strict,\n            hooks=combined_hooks,\n            **kwargs,\n        )\n        return response, response._raw_response\n\n\n@overload\ndef from_openai(\n    client: openai.OpenAI,\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> Instructor:\n    pass\n\n\n@overload\ndef from_openai(\n    client: openai.AsyncOpenAI,\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> AsyncInstructor:\n    pass\n\n\ndef map_chat_completion_to_response(messages, client, *args, **kwargs) -> Any:\n    return client.responses.create(\n        *args,\n        input=messages,\n        **kwargs,\n    )\n\n\nasync def async_map_chat_completion_to_response(\n    messages, client, *args, **kwargs\n) -> Any:\n    return await client.responses.create(\n        *args,\n        input=messages,\n        **kwargs,\n    )\n\n\ndef from_openai(\n    client: openai.OpenAI | openai.AsyncOpenAI,\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    if hasattr(client, \"base_url\"):\n        provider = get_provider(str(client.base_url))\n    else:\n        provider = Provider.OPENAI\n\n    if not isinstance(client, (openai.OpenAI, openai.AsyncOpenAI)):\n        import warnings\n\n        warnings.warn(\n            \"Client should be an instance of openai.OpenAI or openai.AsyncOpenAI. Unexpected behavior may occur with other client types.\",\n            stacklevel=2,\n        )\n\n    if provider in {Provider.OPENROUTER}:\n        assert mode in {\n            instructor.Mode.TOOLS,\n            instructor.Mode.OPENROUTER_STRUCTURED_OUTPUTS,\n            instructor.Mode.JSON,\n        }\n\n    if provider in {Provider.ANYSCALE, Provider.TOGETHER}:\n        assert mode in {\n            instructor.Mode.TOOLS,\n            instructor.Mode.JSON,\n            instructor.Mode.JSON_SCHEMA,\n            instructor.Mode.MD_JSON,\n        }\n\n    if provider in {Provider.OPENAI, Provider.DATABRICKS}:\n        assert mode in {\n            instructor.Mode.TOOLS,\n            instructor.Mode.JSON,\n            instructor.Mode.FUNCTIONS,\n            instructor.Mode.PARALLEL_TOOLS,\n            instructor.Mode.MD_JSON,\n            instructor.Mode.TOOLS_STRICT,\n            instructor.Mode.JSON_O1,\n            instructor.Mode.RESPONSES_TOOLS,\n            instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        }\n\n    if isinstance(client, openai.OpenAI):\n        return Instructor(\n            client=client,\n            create=instructor.patch(\n                create=(\n                    client.chat.completions.create\n                    if mode\n                    not in {\n                        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n                        instructor.Mode.RESPONSES_TOOLS,\n                    }\n                    else partial(map_chat_completion_to_response, client=client)\n                ),\n                mode=mode,\n            ),\n            mode=mode,\n            provider=provider,\n            **kwargs,\n        )\n\n    if isinstance(client, openai.AsyncOpenAI):\n        return AsyncInstructor(\n            client=client,\n            create=instructor.patch(\n                create=(\n                    client.chat.completions.create\n                    if mode\n                    not in {\n                        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n                        instructor.Mode.RESPONSES_TOOLS,\n                    }\n                    else partial(async_map_chat_completion_to_response, client=client)\n                ),\n                mode=mode,\n            ),\n            mode=mode,\n            provider=provider,\n            **kwargs,\n        )\n\n\n@overload\ndef from_litellm(\n    completion: Callable[..., Awaitable[Any]],\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\n@overload\ndef from_litellm(\n    completion: Callable[..., Any],\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> Instructor: ...\n\n\ndef from_litellm(\n    completion: Callable[..., Any] | Callable[..., Awaitable[Any]],\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    is_async = inspect.iscoroutinefunction(completion)\n\n    if not is_async:\n        return Instructor(\n            client=None,\n            create=instructor.patch(create=completion, mode=mode),\n            mode=mode,\n            **kwargs,\n        )\n    else:\n        return AsyncInstructor(\n            client=None,\n            create=instructor.patch(create=completion, mode=mode),\n            mode=mode,\n            **kwargs,\n        )\n"
  },
  {
    "path": "instructor/core/exceptions.py",
    "content": "from __future__ import annotations\n\nfrom textwrap import dedent\nfrom typing import Any, NamedTuple\nfrom jinja2 import Template\n\n\nclass InstructorError(Exception):\n    \"\"\"Base exception for all Instructor-specific errors.\n\n    This is the root exception class for the Instructor library. All custom\n    exceptions in Instructor inherit from this class, allowing you to catch\n    any Instructor-related error with a single except clause.\n\n    Attributes:\n        failed_attempts: Optional list of FailedAttempt objects tracking\n            retry attempts that failed before this exception was raised.\n            Each attempt includes the attempt number, exception, and\n            partial completion data.\n\n    Examples:\n        Catch all Instructor errors:\n        ```python\n        try:\n            response = client.chat.completions.create(...)\n        except InstructorError as e:\n            logger.error(f\"Instructor error: {e}\")\n            # Handle any Instructor-specific error\n        ```\n\n        Create error from another exception:\n        ```python\n        try:\n            # some operation\n        except ValueError as e:\n            raise InstructorError.from_exception(e)\n        ```\n\n    See Also:\n        - FailedAttempt: NamedTuple containing retry attempt information\n        - InstructorRetryException: Raised when retries are exhausted\n    \"\"\"\n\n    failed_attempts: list[FailedAttempt] | None = None\n\n    @classmethod\n    def from_exception(\n        cls, exception: Exception, failed_attempts: list[FailedAttempt] | None = None\n    ):\n        \"\"\"Create an InstructorError from another exception.\n\n        Args:\n            exception: The original exception to wrap\n            failed_attempts: Optional list of failed retry attempts\n\n        Returns:\n            A new instance of this exception class with the message from\n            the original exception\n        \"\"\"\n        return cls(str(exception), failed_attempts=failed_attempts)\n\n    def __init__(\n        self,\n        *args: Any,\n        failed_attempts: list[FailedAttempt] | None = None,\n        **kwargs: dict[str, Any],\n    ):\n        self.failed_attempts = failed_attempts\n        super().__init__(*args, **kwargs)\n\n    def __str__(self) -> str:\n        # If no failed attempts, use the standard exception string representation\n        if not self.failed_attempts:\n            return super().__str__()\n\n        template = Template(\n            dedent(\n                \"\"\"\n                <failed_attempts>\n                {% for attempt in failed_attempts %}\n                <generation number=\"{{ attempt.attempt_number }}\">\n                <exception>\n                    {{ attempt.exception }}\n                </exception>\n                <completion>\n                    {{ attempt.completion }}\n                </completion>\n                </generation>\n                {% endfor %}\n                </failed_attempts>\n\n                <last_exception>\n                    {{ last_exception }}\n                </last_exception>\n                \"\"\"\n            ).strip()\n        )\n        return template.render(\n            last_exception=super().__str__(), failed_attempts=self.failed_attempts\n        )\n\n\nclass FailedAttempt(NamedTuple):\n    \"\"\"Represents a single failed retry attempt.\n\n    This immutable tuple stores information about a failed attempt during\n    the retry process, allowing users to inspect what went wrong across\n    multiple retry attempts.\n\n    Attributes:\n        attempt_number: The sequential number of this attempt (1-indexed)\n        exception: The exception that caused this attempt to fail\n        completion: Optional partial completion data from the LLM before\n            the failure occurred. This can be useful for debugging or\n            implementing custom recovery logic.\n\n    Examples:\n        ```python\n        from instructor.core.exceptions import InstructorRetryException\n\n        try:\n            response = client.chat.completions.create(...)\n        except InstructorRetryException as e:\n            for attempt in e.failed_attempts:\n                print(f\"Attempt {attempt.attempt_number} failed:\")\n                print(f\"  Error: {attempt.exception}\")\n                print(f\"  Partial data: {attempt.completion}\")\n        ```\n    \"\"\"\n\n    attempt_number: int\n    exception: Exception\n    completion: Any | None = None\n\n\nclass IncompleteOutputException(InstructorError):\n    \"\"\"Exception raised when LLM output is truncated due to token limits.\n\n    This exception occurs when the LLM hits the max_tokens limit before\n    completing its response. This is particularly common with:\n    - Large structured outputs\n    - Very detailed responses\n    - Low max_tokens settings\n\n    Attributes:\n        last_completion: The partial/incomplete response from the LLM\n            before truncation occurred\n\n    Common Solutions:\n        - Increase max_tokens in your request\n        - Simplify your response model\n        - Use streaming with Partial models to get incomplete data\n        - Break down complex extractions into smaller tasks\n\n    Examples:\n        ```python\n        try:\n            response = client.chat.completions.create(\n                response_model=DetailedReport,\n                max_tokens=100,  # Too low\n                ...\n            )\n        except IncompleteOutputException as e:\n            print(f\"Output truncated. Partial data: {e.last_completion}\")\n            # Retry with higher max_tokens\n            response = client.chat.completions.create(\n                response_model=DetailedReport,\n                max_tokens=2000,\n                ...\n            )\n        ```\n\n    See Also:\n        - instructor.dsl.Partial: For handling partial/incomplete responses\n    \"\"\"\n\n    def __init__(\n        self,\n        *args: Any,\n        last_completion: Any | None = None,\n        message: str = \"The output is incomplete due to a max_tokens length limit.\",\n        **kwargs: dict[str, Any],\n    ):\n        self.last_completion = last_completion\n        super().__init__(message, *args, **kwargs)\n\n\nclass InstructorRetryException(InstructorError):\n    \"\"\"Exception raised when all retry attempts have been exhausted.\n\n    This exception is raised after the maximum number of retries has been\n    reached without successfully validating the LLM response. It contains\n    detailed information about all failed attempts, making it useful for\n    debugging and implementing custom recovery logic.\n\n    Attributes:\n        last_completion: The final (unsuccessful) completion from the LLM\n        messages: The conversation history sent to the LLM (deprecated,\n            use create_kwargs instead)\n        n_attempts: The total number of attempts made\n        total_usage: The cumulative token usage across all attempts\n        create_kwargs: The parameters used in the create() call, including\n            model, messages, temperature, etc.\n        failed_attempts: List of FailedAttempt objects with details about\n            each failed retry\n\n    Common Causes:\n        - Response model too strict for the LLM's capabilities\n        - Ambiguous or contradictory requirements\n        - LLM model not powerful enough for the task\n        - Insufficient context or examples in the prompt\n\n    Examples:\n        ```python\n        try:\n            response = client.chat.completions.create(\n                response_model=StrictModel,\n                max_retries=3,\n                ...\n            )\n        except InstructorRetryException as e:\n            print(f\"Failed after {e.n_attempts} attempts\")\n            print(f\"Total tokens used: {e.total_usage}\")\n            print(f\"Model used: {e.create_kwargs.get('model')}\")\n\n            # Inspect failed attempts\n            for attempt in e.failed_attempts:\n                print(f\"Attempt {attempt.attempt_number}: {attempt.exception}\")\n\n            # Implement fallback strategy\n            response = fallback_handler(e.last_completion)\n        ```\n\n    See Also:\n        - FailedAttempt: Contains details about each retry attempt\n        - ValidationError: Raised when response validation fails\n    \"\"\"\n\n    def __init__(\n        self,\n        *args: Any,\n        last_completion: Any | None = None,\n        messages: list[Any] | None = None,\n        n_attempts: int,\n        total_usage: int,\n        create_kwargs: dict[str, Any] | None = None,\n        failed_attempts: list[FailedAttempt] | None = None,\n        **kwargs: dict[str, Any],\n    ):\n        self.last_completion = last_completion\n        self.messages = messages\n        self.n_attempts = n_attempts\n        self.total_usage = total_usage\n        self.create_kwargs = create_kwargs\n        super().__init__(*args, failed_attempts=failed_attempts, **kwargs)\n\n\nclass ValidationError(InstructorError):\n    \"\"\"Exception raised when LLM response validation fails.\n\n    This exception occurs when the LLM's response doesn't meet the\n    validation requirements defined in your Pydantic model, such as:\n    - Field validation failures\n    - Type mismatches\n    - Custom validator failures\n    - Missing required fields\n\n    Note: This is distinct from Pydantic's ValidationError and provides\n    Instructor-specific context through the failed_attempts attribute.\n\n    Examples:\n        ```python\n        from pydantic import BaseModel, field_validator\n\n        class User(BaseModel):\n            age: int\n\n            @field_validator('age')\n            def age_must_be_positive(cls, v):\n                if v < 0:\n                    raise ValueError('Age must be positive')\n                return v\n\n        try:\n            response = client.chat.completions.create(\n                response_model=User,\n                ...\n            )\n        except ValidationError as e:\n            print(f\"Validation failed: {e}\")\n            # Validation errors are automatically retried\n        ```\n\n    See Also:\n        - InstructorRetryException: Raised when validation fails repeatedly\n    \"\"\"\n\n    pass\n\n\nclass ProviderError(InstructorError):\n    \"\"\"Exception raised for provider-specific errors.\n\n    This exception is used to wrap errors specific to LLM providers\n    (OpenAI, Anthropic, etc.) and provides context about which provider\n    caused the error.\n\n    Attributes:\n        provider: The name of the provider that raised the error\n            (e.g., \"openai\", \"anthropic\", \"gemini\")\n\n    Common Causes:\n        - API authentication failures\n        - Rate limiting\n        - Invalid model names\n        - Provider-specific API errors\n        - Network connectivity issues\n\n    Examples:\n        ```python\n        try:\n            client = instructor.from_openai(openai_client)\n            response = client.chat.completions.create(...)\n        except ProviderError as e:\n            print(f\"Provider {e.provider} error: {e}\")\n            # Implement provider-specific error handling\n            if e.provider == \"openai\":\n                # Handle OpenAI-specific errors\n                pass\n        ```\n    \"\"\"\n\n    def __init__(self, provider: str, message: str, *args: Any, **kwargs: Any):\n        self.provider = provider\n        super().__init__(f\"{provider}: {message}\", *args, **kwargs)\n\n\nclass ConfigurationError(InstructorError):\n    \"\"\"Exception raised for configuration-related errors.\n\n    This exception occurs when there are issues with how Instructor\n    is configured or initialized, such as:\n    - Missing required dependencies\n    - Invalid parameters\n    - Incompatible settings\n    - Improper client initialization\n\n    Common Scenarios:\n        - Missing provider SDK (e.g., anthropic package not installed)\n        - Invalid model string format in from_provider()\n        - Incompatible parameter combinations\n        - Invalid max_retries configuration\n\n    Examples:\n        ```python\n        try:\n            # Missing provider SDK\n            client = instructor.from_provider(\"anthropic/claude-3\")\n        except ConfigurationError as e:\n            print(f\"Configuration issue: {e}\")\n            # e.g., \"The anthropic package is required...\"\n\n        try:\n            # Invalid model string\n            client = instructor.from_provider(\"invalid-format\")\n        except ConfigurationError as e:\n            print(f\"Configuration issue: {e}\")\n            # e.g., \"Model string must be in format 'provider/model-name'\"\n        ```\n    \"\"\"\n\n    pass\n\n\nclass ModeError(InstructorError):\n    \"\"\"Exception raised when an invalid mode is used for a provider.\n\n    Different LLM providers support different modes (e.g., TOOLS, JSON,\n    FUNCTIONS). This exception is raised when you try to use a mode that\n    isn't supported by the current provider.\n\n    Attributes:\n        mode: The invalid mode that was attempted\n        provider: The provider name\n        valid_modes: List of modes supported by this provider\n\n    Examples:\n        ```python\n        try:\n            client = instructor.from_openai(\n                openai_client,\n                mode=instructor.Mode.ANTHROPIC_TOOLS  # Wrong for OpenAI\n            )\n        except ModeError as e:\n            print(f\"Invalid mode '{e.mode}' for {e.provider}\")\n            print(f\"Use one of: {', '.join(e.valid_modes)}\")\n            # Retry with valid mode\n            client = instructor.from_openai(\n                openai_client,\n                mode=instructor.Mode.TOOLS\n            )\n        ```\n\n    See Also:\n        - instructor.Mode: Enum of all available modes\n    \"\"\"\n\n    def __init__(\n        self,\n        mode: str,\n        provider: str,\n        valid_modes: list[str],\n        *args: Any,\n        **kwargs: Any,\n    ):\n        self.mode = mode\n        self.provider = provider\n        self.valid_modes = valid_modes\n        message = f\"Invalid mode '{mode}' for provider '{provider}'. Valid modes: {', '.join(valid_modes)}\"\n        super().__init__(message, *args, **kwargs)\n\n\nclass ClientError(InstructorError):\n    \"\"\"Exception raised for client initialization or usage errors.\n\n    This exception covers errors related to improper client usage or\n    initialization that don't fit other categories.\n\n    Common Scenarios:\n        - Passing invalid client object to from_* functions\n        - Missing required client configuration\n        - Attempting operations on improperly initialized clients\n\n    Examples:\n        ```python\n        try:\n            # Invalid client type\n            client = instructor.from_openai(\"not_a_client\")\n        except ClientError as e:\n            print(f\"Client error: {e}\")\n        ```\n    \"\"\"\n\n    pass\n\n\nclass AsyncValidationError(ValueError, InstructorError):\n    \"\"\"Exception raised during async validation.\n\n    This exception is used specifically for errors that occur during\n    asynchronous validation operations. It inherits from both ValueError\n    and InstructorError to maintain compatibility with existing code.\n\n    Attributes:\n        errors: List of ValueError instances from failed validations\n\n    Examples:\n        ```python\n        from instructor.validation import async_field_validator\n\n        class Model(BaseModel):\n            urls: list[str]\n\n            @async_field_validator('urls')\n            async def validate_urls(cls, v):\n                # Async validation logic\n                ...\n\n        try:\n            response = await client.chat.completions.create(\n                response_model=Model,\n                ...\n            )\n        except AsyncValidationError as e:\n            print(f\"Async validation failed: {e.errors}\")\n        ```\n    \"\"\"\n\n    errors: list[ValueError]\n\n\nclass ResponseParsingError(ValueError, InstructorError):\n    \"\"\"Exception raised when unable to parse the LLM response.\n\n    This exception occurs when the LLM's raw response cannot be parsed\n    into the expected format. Common scenarios include:\n    - Malformed JSON in JSON mode\n    - Missing required fields in the response\n    - Unexpected response structure\n    - Invalid tool call format\n\n    Note: This exception inherits from both ValueError and InstructorError\n    to maintain backwards compatibility with code that catches ValueError.\n\n    Attributes:\n        mode: The mode being used when parsing failed\n        raw_response: The raw response that failed to parse (if available)\n\n    Examples:\n        ```python\n        try:\n            response = client.chat.completions.create(\n                response_model=User,\n                mode=instructor.Mode.JSON,\n                ...\n            )\n        except ResponseParsingError as e:\n            print(f\"Failed to parse response in {e.mode} mode\")\n            print(f\"Raw response: {e.raw_response}\")\n            # May indicate the model doesn't support this mode well\n        ```\n\n        Backwards compatible with ValueError:\n        ```python\n        try:\n            response = client.chat.completions.create(...)\n        except ValueError as e:\n            # Still catches ResponseParsingError\n            print(f\"Parsing error: {e}\")\n        ```\n    \"\"\"\n\n    def __init__(\n        self,\n        message: str,\n        *args: Any,\n        mode: str | None = None,\n        raw_response: Any | None = None,\n        **kwargs: Any,\n    ):\n        self.mode = mode\n        self.raw_response = raw_response\n        context = f\" (mode: {mode})\" if mode else \"\"\n        super().__init__(f\"{message}{context}\", *args, **kwargs)\n\n\nclass MultimodalError(ValueError, InstructorError):\n    \"\"\"Exception raised for multimodal content processing errors.\n\n    This exception is raised when there are issues processing multimodal\n    content (images, audio, PDFs, etc.), such as:\n    - Unsupported file formats\n    - File not found\n    - Invalid base64 encoding\n    - Provider doesn't support multimodal content\n\n    Note: This exception inherits from both ValueError and InstructorError\n    to maintain backwards compatibility with code that catches ValueError.\n\n    Attributes:\n        content_type: The type of content that failed (e.g., 'image', 'audio', 'pdf')\n        file_path: The file path if applicable\n\n    Examples:\n        ```python\n        from instructor import Image\n\n        try:\n            response = client.chat.completions.create(\n                response_model=Analysis,\n                messages=[{\n                    \"role\": \"user\",\n                    \"content\": [\n                        {\"type\": \"text\", \"text\": \"Analyze this image\"},\n                        Image.from_path(\"/invalid/path.jpg\")\n                    ]\n                }]\n            )\n        except MultimodalError as e:\n            print(f\"Multimodal error with {e.content_type}: {e}\")\n            if e.file_path:\n                print(f\"File path: {e.file_path}\")\n        ```\n\n        Backwards compatible with ValueError:\n        ```python\n        try:\n            img = Image.from_path(\"/path/to/image.jpg\")\n        except ValueError as e:\n            # Still catches MultimodalError\n            print(f\"Image error: {e}\")\n        ```\n    \"\"\"\n\n    def __init__(\n        self,\n        message: str,\n        *args: Any,\n        content_type: str | None = None,\n        file_path: str | None = None,\n        **kwargs: Any,\n    ):\n        self.content_type = content_type\n        self.file_path = file_path\n        context_parts = []\n        if content_type:\n            context_parts.append(f\"content_type: {content_type}\")\n        if file_path:\n            context_parts.append(f\"file: {file_path}\")\n        context = f\" ({', '.join(context_parts)})\" if context_parts else \"\"\n        super().__init__(f\"{message}{context}\", *args, **kwargs)\n"
  },
  {
    "path": "instructor/core/hooks.py",
    "content": "from __future__ import annotations\nfrom enum import Enum\nfrom collections import defaultdict\nfrom typing import Any, Literal, TypeVar, Protocol, Union\n\nimport traceback\nimport warnings\n\nT = TypeVar(\"T\")\n\n\nclass HookName(Enum):\n    COMPLETION_KWARGS = \"completion:kwargs\"\n    COMPLETION_RESPONSE = \"completion:response\"\n    COMPLETION_ERROR = \"completion:error\"\n    COMPLETION_LAST_ATTEMPT = \"completion:last_attempt\"\n    PARSE_ERROR = \"parse:error\"\n\n\n# Handler protocol types for type safety\nclass CompletionKwargsHandler(Protocol):\n    \"\"\"Protocol for completion kwargs handlers.\"\"\"\n\n    def __call__(self, *args: Any, **kwargs: Any) -> None: ...\n\n\nclass CompletionResponseHandler(Protocol):\n    \"\"\"Protocol for completion response handlers.\"\"\"\n\n    def __call__(self, response: Any) -> None: ...\n\n\nclass CompletionErrorHandler(Protocol):\n    \"\"\"Protocol for completion error and last attempt handlers.\"\"\"\n\n    def __call__(self, error: Exception) -> None: ...\n\n\nclass ParseErrorHandler(Protocol):\n    \"\"\"Protocol for parse error handlers.\"\"\"\n\n    def __call__(self, error: Exception) -> None: ...\n\n\n# Type alias for hook name parameter\nHookNameType = Union[\n    HookName,\n    Literal[\n        \"completion:kwargs\",\n        \"completion:response\",\n        \"completion:error\",\n        \"completion:last_attempt\",\n        \"parse:error\",\n    ],\n]\n\n# Type alias for all handler types\nHandlerType = Union[\n    CompletionKwargsHandler,\n    CompletionResponseHandler,\n    CompletionErrorHandler,\n    ParseErrorHandler,\n]\n\n\nclass Hooks:\n    \"\"\"\n    Hooks class for handling and emitting events related to completion processes.\n\n    This class provides a mechanism to register event handlers and emit events\n    for various stages of the completion process.\n    \"\"\"\n\n    def __init__(self) -> None:\n        \"\"\"Initialize the hooks container.\"\"\"\n        self._handlers: defaultdict[HookName, list[HandlerType]] = defaultdict(list)\n\n    def on(\n        self,\n        hook_name: HookNameType,\n        handler: HandlerType,\n    ) -> None:\n        \"\"\"\n        Register an event handler for a specific event.\n\n        This method allows you to attach a handler function to a specific event.\n        When the event is emitted, all registered handlers for that event will be called.\n\n        Args:\n            hook_name: The event to listen for. This can be either a HookName enum\n                       value or a string representation of the event name.\n            handler: The function to be called when the event is emitted.\n\n        Raises:\n            ValueError: If the hook_name is not a valid HookName enum or string representation.\n\n        Example:\n            >>> def on_completion_kwargs(*args: Any, **kwargs: Any) -> None:\n            ...     print(f\"Completion kwargs: {args}, {kwargs}\")\n            >>> hooks = Hooks()\n            >>> hooks.on(HookName.COMPLETION_KWARGS, on_completion_kwargs)\n            >>> hooks.emit_completion_arguments(model=\"gpt-3.5-turbo\", temperature=0.7)\n            Completion kwargs: (), {'model': 'gpt-3.5-turbo', 'temperature': 0.7}\n        \"\"\"\n        hook_name = self.get_hook_name(hook_name)\n        self._handlers[hook_name].append(handler)\n\n    def get_hook_name(self, hook_name: HookNameType) -> HookName:\n        \"\"\"\n        Convert a string hook name to its corresponding enum value.\n\n        Args:\n            hook_name: Either a HookName enum value or string representation.\n\n        Returns:\n            The corresponding HookName enum value.\n\n        Raises:\n            ValueError: If the string doesn't match any HookName enum value.\n        \"\"\"\n        if isinstance(hook_name, str):\n            try:\n                return HookName(hook_name)\n            except ValueError as err:\n                raise ValueError(f\"Invalid hook name: {hook_name}\") from err\n        return hook_name\n\n    def emit(self, hook_name: HookName, *args: Any, **kwargs: Any) -> None:\n        \"\"\"\n        Generic method to emit events for any hook type.\n\n        Args:\n            hook_name: The hook to emit\n            *args: Positional arguments to pass to handlers\n            **kwargs: Keyword arguments to pass to handlers\n        \"\"\"\n        for handler in self._handlers[hook_name]:\n            try:\n                handler(*args, **kwargs)  # type: ignore\n            except Exception:\n                error_traceback = traceback.format_exc()\n                warnings.warn(\n                    f\"Error in {hook_name.value} handler:\\n{error_traceback}\",\n                    stacklevel=2,\n                )\n\n    def emit_completion_arguments(self, *args: Any, **kwargs: Any) -> None:\n        \"\"\"\n        Emit a completion arguments event.\n\n        Args:\n            *args: Positional arguments to pass to handlers\n            **kwargs: Keyword arguments to pass to handlers\n        \"\"\"\n        self.emit(HookName.COMPLETION_KWARGS, *args, **kwargs)\n\n    def emit_completion_response(self, response: Any) -> None:\n        \"\"\"\n        Emit a completion response event.\n\n        Args:\n            response: The completion response to pass to handlers\n        \"\"\"\n        self.emit(HookName.COMPLETION_RESPONSE, response)\n\n    def emit_completion_error(self, error: Exception) -> None:\n        \"\"\"\n        Emit a completion error event.\n\n        Args:\n            error: The exception to pass to handlers\n        \"\"\"\n        self.emit(HookName.COMPLETION_ERROR, error)\n\n    def emit_completion_last_attempt(self, error: Exception) -> None:\n        \"\"\"\n        Emit a completion last attempt event.\n\n        Args:\n            error: The exception to pass to handlers\n        \"\"\"\n        self.emit(HookName.COMPLETION_LAST_ATTEMPT, error)\n\n    def emit_parse_error(self, error: Exception) -> None:\n        \"\"\"\n        Emit a parse error event.\n\n        Args:\n            error: The exception to pass to handlers\n        \"\"\"\n        self.emit(HookName.PARSE_ERROR, error)\n\n    def off(\n        self,\n        hook_name: HookNameType,\n        handler: HandlerType,\n    ) -> None:\n        \"\"\"\n        Remove a specific handler from an event.\n\n        Args:\n            hook_name: The name of the hook.\n            handler: The handler to remove.\n        \"\"\"\n        hook_name = self.get_hook_name(hook_name)\n        if hook_name in self._handlers:\n            if handler in self._handlers[hook_name]:\n                self._handlers[hook_name].remove(handler)\n                if not self._handlers[hook_name]:\n                    del self._handlers[hook_name]\n\n    def clear(\n        self,\n        hook_name: HookNameType | None = None,\n    ) -> None:\n        \"\"\"\n        Clear handlers for a specific event or all events.\n\n        Args:\n            hook_name: The name of the event to clear handlers for.\n                      If None, all handlers are cleared.\n        \"\"\"\n        if hook_name is not None:\n            hook_name = self.get_hook_name(hook_name)\n            self._handlers.pop(hook_name, None)\n        else:\n            self._handlers.clear()\n\n    def __add__(self, other: Hooks) -> Hooks:\n        \"\"\"\n        Combine two Hooks instances into a new one.\n\n        This creates a new Hooks instance that contains all handlers from both\n        the current instance and the other instance. Handlers are combined by\n        appending the other's handlers after the current instance's handlers.\n\n        Args:\n            other: Another Hooks instance to combine with this one.\n\n        Returns:\n            A new Hooks instance containing all handlers from both instances.\n\n        Example:\n            >>> hooks1 = Hooks()\n            >>> hooks2 = Hooks()\n            >>> hooks1.on(\"completion:kwargs\", lambda **kw: print(\"Hook 1\"))\n            >>> hooks2.on(\"completion:kwargs\", lambda **kw: print(\"Hook 2\"))\n            >>> combined = hooks1 + hooks2\n            >>> combined.emit_completion_arguments()  # Prints both \"Hook 1\" and \"Hook 2\"\n        \"\"\"\n        if not isinstance(other, Hooks):\n            return NotImplemented\n\n        combined = Hooks()\n\n        # Copy handlers from self\n        for hook_name, handlers in self._handlers.items():\n            combined._handlers[hook_name].extend(handlers.copy())\n\n        # Add handlers from other\n        for hook_name, handlers in other._handlers.items():\n            combined._handlers[hook_name].extend(handlers.copy())\n\n        return combined\n\n    def __iadd__(self, other: Hooks) -> Hooks:\n        \"\"\"\n        Add another Hooks instance to this one in-place.\n\n        This modifies the current instance by adding all handlers from the other\n        instance. The other instance's handlers are appended after the current\n        instance's handlers for each event type.\n\n        Args:\n            other: Another Hooks instance to add to this one.\n\n        Returns:\n            This Hooks instance (for method chaining).\n\n        Example:\n            >>> hooks1 = Hooks()\n            >>> hooks2 = Hooks()\n            >>> hooks1.on(\"completion:kwargs\", lambda **kw: print(\"Hook 1\"))\n            >>> hooks2.on(\"completion:kwargs\", lambda **kw: print(\"Hook 2\"))\n            >>> hooks1 += hooks2\n            >>> hooks1.emit_completion_arguments()  # Prints both \"Hook 1\" and \"Hook 2\"\n        \"\"\"\n        if not isinstance(other, Hooks):\n            return NotImplemented\n\n        # Add handlers from other to self\n        for hook_name, handlers in other._handlers.items():\n            self._handlers[hook_name].extend(handlers.copy())\n\n        return self\n\n    @classmethod\n    def combine(cls, *hooks_instances: Hooks) -> Hooks:\n        \"\"\"\n        Combine multiple Hooks instances into a new one.\n\n        This class method creates a new Hooks instance that contains all handlers\n        from all provided instances. Handlers are combined in the order of the\n        provided instances.\n\n        Args:\n            *hooks_instances: Variable number of Hooks instances to combine.\n\n        Returns:\n            A new Hooks instance containing all handlers from all instances.\n\n        Example:\n            >>> hooks1 = Hooks()\n            >>> hooks2 = Hooks()\n            >>> hooks3 = Hooks()\n            >>> hooks1.on(\"completion:kwargs\", lambda **kw: print(\"Hook 1\"))\n            >>> hooks2.on(\"completion:kwargs\", lambda **kw: print(\"Hook 2\"))\n            >>> hooks3.on(\"completion:kwargs\", lambda **kw: print(\"Hook 3\"))\n            >>> combined = Hooks.combine(hooks1, hooks2, hooks3)\n            >>> combined.emit_completion_arguments()  # Prints all three hooks\n        \"\"\"\n        combined = cls()\n\n        for hooks_instance in hooks_instances:\n            if not isinstance(hooks_instance, cls):\n                raise TypeError(f\"Expected Hooks instance, got {type(hooks_instance)}\")\n            combined += hooks_instance\n\n        return combined\n\n    def copy(self) -> Hooks:\n        \"\"\"\n        Create a deep copy of this Hooks instance.\n\n        Returns:\n            A new Hooks instance with all the same handlers.\n\n        Example:\n            >>> original = Hooks()\n            >>> original.on(\"completion:kwargs\", lambda **kw: print(\"Hook\"))\n            >>> copy = original.copy()\n            >>> copy.emit_completion_arguments()  # Prints \"Hook\"\n        \"\"\"\n        new_hooks = Hooks()\n        for hook_name, handlers in self._handlers.items():\n            new_hooks._handlers[hook_name].extend(handlers.copy())\n        return new_hooks\n"
  },
  {
    "path": "instructor/core/patch.py",
    "content": "from __future__ import annotations\nfrom functools import wraps\nfrom typing import (\n    Any,\n    Callable,\n    Protocol,\n    TypeVar,\n    overload,\n)\nfrom collections.abc import Awaitable\nfrom typing_extensions import ParamSpec\n\nfrom openai import AsyncOpenAI, OpenAI  # type: ignore[import-not-found]\nfrom pydantic import BaseModel  # type: ignore[import-not-found]\n\nfrom ..processing.response import handle_response_model\nfrom .retry import retry_async, retry_sync\nfrom ..utils import is_async\nfrom .hooks import Hooks\nfrom ..templating import handle_templating\n\nfrom ..mode import Mode\nimport logging\n\nfrom tenacity import (  # type: ignore[import-not-found]\n    AsyncRetrying,\n    Retrying,\n)\n\nlogger = logging.getLogger(\"instructor\")\n\nT_Model = TypeVar(\"T_Model\", bound=BaseModel)\nT_Retval = TypeVar(\"T_Retval\")\nT_ParamSpec = ParamSpec(\"T_ParamSpec\")\n\n\nclass InstructorChatCompletionCreate(Protocol):\n    def __call__(\n        self,\n        response_model: type[T_Model] | None = None,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        max_retries: int | Retrying = 1,\n        *args: Any,\n        **kwargs: Any,\n    ) -> T_Model: ...\n\n\nclass AsyncInstructorChatCompletionCreate(Protocol):\n    async def __call__(\n        self,\n        response_model: type[T_Model] | None = None,\n        validation_context: dict[str, Any] | None = None,  # Deprecate in 2.0\n        context: dict[str, Any] | None = None,\n        max_retries: int | AsyncRetrying = 1,\n        *args: Any,\n        **kwargs: Any,\n    ) -> T_Model: ...\n\n\ndef handle_context(\n    context: dict[str, Any] | None = None,\n    validation_context: dict[str, Any] | None = None,\n) -> dict[str, Any] | None:\n    \"\"\"\n    Handle the context and validation_context parameters.\n    If both are provided, raise an error.\n    If validation_context is provided, issue a deprecation warning and use it as context.\n    If neither is provided, return None.\n    \"\"\"\n    if context is not None and validation_context is not None:\n        from .exceptions import ConfigurationError\n\n        raise ConfigurationError(\n            \"Cannot provide both 'context' and 'validation_context'. Use 'context' instead.\"\n        )\n    if validation_context is not None and context is None:\n        import warnings\n\n        warnings.warn(\n            \"'validation_context' is deprecated. Use 'context' instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        context = validation_context\n    return context\n\n\n@overload\ndef patch(\n    client: OpenAI,\n    mode: Mode = Mode.TOOLS,\n) -> OpenAI: ...\n\n\n@overload\ndef patch(\n    client: AsyncOpenAI,\n    mode: Mode = Mode.TOOLS,\n) -> AsyncOpenAI: ...\n\n\n@overload\ndef patch(\n    create: Callable[T_ParamSpec, T_Retval],\n    mode: Mode = Mode.TOOLS,\n) -> InstructorChatCompletionCreate: ...\n\n\n@overload\ndef patch(\n    create: Awaitable[T_Retval],\n    mode: Mode = Mode.TOOLS,\n) -> InstructorChatCompletionCreate: ...\n\n\ndef patch(  # type: ignore\n    client: OpenAI | AsyncOpenAI | None = None,\n    create: Callable[T_ParamSpec, T_Retval] | None = None,\n    mode: Mode = Mode.TOOLS,\n) -> OpenAI | AsyncOpenAI:\n    \"\"\"\n    Patch the `client.chat.completions.create` method\n\n    Enables the following features:\n\n    - `response_model` parameter to parse the response from OpenAI's API\n    - `max_retries` parameter to retry the function if the response is not valid\n    - `validation_context` parameter to validate the response using the pydantic model\n    - `strict` parameter to use strict json parsing\n    - `hooks` parameter to hook into the completion process\n    \"\"\"\n\n    logger.debug(f\"Patching `client.chat.completions.create` with {mode=}\")\n\n    if create is not None:\n        func = create\n    elif client is not None:\n        func = client.chat.completions.create\n    else:\n        raise ValueError(\"Either client or create must be provided\")\n\n    func_is_async = is_async(func)\n\n    @wraps(func)  # type: ignore\n    async def new_create_async(\n        response_model: type[T_Model] | None = None,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,\n        max_retries: int | AsyncRetrying = 1,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        *args: T_ParamSpec.args,\n        **kwargs: T_ParamSpec.kwargs,\n    ) -> T_Model:\n        # -----------------------------\n        # Cache handling (async path)\n        # -----------------------------\n        from ..cache import BaseCache, make_cache_key, load_cached_response\n\n        cache: BaseCache | None = kwargs.pop(\"cache\", None)  # type: ignore[assignment]\n        cache_ttl_raw = kwargs.pop(\"cache_ttl\", None)\n        cache_ttl: int | None = (\n            cache_ttl_raw if isinstance(cache_ttl_raw, int) else None\n        )\n\n        context = handle_context(context, validation_context)\n\n        response_model, new_kwargs = handle_response_model(\n            response_model=response_model, mode=mode, **kwargs\n        )  # type: ignore\n        new_kwargs = handle_templating(new_kwargs, mode=mode, context=context)\n\n        # Attempt cache lookup **before** hitting retry layer\n        if cache is not None and response_model is not None:\n            key = make_cache_key(\n                messages=new_kwargs.get(\"messages\")\n                or new_kwargs.get(\"contents\")\n                or new_kwargs.get(\"chat_history\"),\n                model=new_kwargs.get(\"model\"),\n                response_model=response_model,\n                mode=mode.value if hasattr(mode, \"value\") else str(mode),\n            )\n            obj = load_cached_response(cache, key, response_model)\n            if obj is not None:\n                return obj  # type: ignore[return-value]\n\n        response = await retry_async(\n            func=func,  # type:ignore\n            response_model=response_model,\n            context=context,\n            max_retries=max_retries,\n            args=args,\n            kwargs=new_kwargs,\n            strict=strict,\n            mode=mode,\n            hooks=hooks,\n        )\n\n        # Store in cache *after* successful call\n        if cache is not None and response_model is not None:\n            try:\n                from pydantic import BaseModel as _BM  # type: ignore[import-not-found]\n\n                if isinstance(response, _BM):\n                    # mypy: ignore-next-line\n                    from ..cache import store_cached_response\n\n                    store_cached_response(cache, key, response, ttl=cache_ttl)\n            except ModuleNotFoundError:\n                pass\n        return response  # type: ignore\n\n    @wraps(func)  # type: ignore\n    def new_create_sync(\n        response_model: type[T_Model] | None = None,\n        validation_context: dict[str, Any] | None = None,\n        context: dict[str, Any] | None = None,\n        max_retries: int | Retrying = 1,\n        strict: bool = True,\n        hooks: Hooks | None = None,\n        *args: T_ParamSpec.args,\n        **kwargs: T_ParamSpec.kwargs,\n    ) -> T_Model:\n        # -----------------------------\n        # Cache handling (sync path)\n        # -----------------------------\n        from ..cache import BaseCache, make_cache_key, load_cached_response\n\n        cache: BaseCache | None = kwargs.pop(\"cache\", None)  # type: ignore[assignment]\n        cache_ttl_raw = kwargs.pop(\"cache_ttl\", None)\n        cache_ttl: int | None = (\n            cache_ttl_raw if isinstance(cache_ttl_raw, int) else None\n        )\n\n        context = handle_context(context, validation_context)\n        # print(f\"instructor.patch: patched_function {func.__name__}\")\n        response_model, new_kwargs = handle_response_model(\n            response_model=response_model, mode=mode, **kwargs\n        )  # type: ignore\n\n        new_kwargs = handle_templating(new_kwargs, mode=mode, context=context)\n\n        # Attempt cache lookup\n        if cache is not None and response_model is not None:\n            key = make_cache_key(\n                messages=new_kwargs.get(\"messages\")\n                or new_kwargs.get(\"contents\")\n                or new_kwargs.get(\"chat_history\"),\n                model=new_kwargs.get(\"model\"),\n                response_model=response_model,\n                mode=mode.value if hasattr(mode, \"value\") else str(mode),\n            )\n            obj = load_cached_response(cache, key, response_model)\n            if obj is not None:\n                return obj  # type: ignore[return-value]\n\n        response = retry_sync(\n            func=func,  # type: ignore\n            response_model=response_model,\n            context=context,\n            max_retries=max_retries,\n            args=args,\n            hooks=hooks,\n            strict=strict,\n            kwargs=new_kwargs,\n            mode=mode,\n        )\n\n        # Save to cache\n        if cache is not None and response_model is not None:\n            try:\n                from pydantic import BaseModel as _BM  # type: ignore[import-not-found]\n\n                if isinstance(response, _BM):\n                    # mypy: ignore-next-line\n                    from ..cache import store_cached_response\n\n                    store_cached_response(cache, key, response, ttl=cache_ttl)\n            except ModuleNotFoundError:\n                pass\n        return response  # type: ignore\n\n    new_create = new_create_async if func_is_async else new_create_sync\n\n    if client is not None:\n        client.chat.completions.create = new_create  # type: ignore\n        return client\n    else:\n        return new_create  # type: ignore\n\n\ndef apatch(client: AsyncOpenAI, mode: Mode = Mode.TOOLS) -> AsyncOpenAI:\n    \"\"\"\n    No longer necessary, use `patch` instead.\n\n    Patch the `client.chat.completions.create` method\n\n    Enables the following features:\n\n    - `response_model` parameter to parse the response from OpenAI's API\n    - `max_retries` parameter to retry the function if the response is not valid\n    - `validation_context` parameter to validate the response using the pydantic model\n    - `strict` parameter to use strict json parsing\n    \"\"\"\n    import warnings\n\n    warnings.warn(\n        \"apatch is deprecated, use patch instead\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n    return patch(client, mode=mode)\n"
  },
  {
    "path": "instructor/core/retry.py",
    "content": "# type: ignore[all]\n\nfrom __future__ import annotations\n\nimport logging\nfrom json import JSONDecodeError\nfrom typing import Any, Callable, TypeVar\n\nfrom .exceptions import (\n    InstructorRetryException,\n    AsyncValidationError,\n    FailedAttempt,\n    ValidationError as InstructorValidationError,\n)\nfrom .hooks import Hooks\nfrom ..mode import Mode\nfrom ..processing.response import (\n    process_response,\n    process_response_async,\n    handle_reask_kwargs,\n)\nfrom ..utils import update_total_usage\nfrom openai.types.chat import ChatCompletion\nfrom openai.types.completion_usage import (\n    CompletionUsage,\n    CompletionTokensDetails,\n    PromptTokensDetails,\n)\nfrom pydantic import BaseModel, ValidationError\nfrom tenacity import (\n    AsyncRetrying,\n    RetryError,\n    Retrying,\n    stop_after_attempt,\n    stop_after_delay,\n)\nfrom typing_extensions import ParamSpec\n\nlogger = logging.getLogger(\"instructor\")\n\n# Type Variables\nT_Model = TypeVar(\"T_Model\", bound=BaseModel)\nT_Retval = TypeVar(\"T_Retval\")\nT_ParamSpec = ParamSpec(\"T_ParamSpec\")\nT = TypeVar(\"T\")\n\n\ndef initialize_retrying(\n    max_retries: int | Retrying | AsyncRetrying,\n    is_async: bool,\n    timeout: float | None = None,\n):\n    \"\"\"\n    Initialize the retrying mechanism based on the type (synchronous or asynchronous).\n\n    Args:\n        max_retries (int | Retrying | AsyncRetrying): Maximum number of retries or a retrying object.\n        is_async (bool): Flag indicating if the retrying is asynchronous.\n        timeout (float | None): Optional timeout in seconds to limit total retry duration.\n\n    Returns:\n        Retrying | AsyncRetrying: Configured retrying object.\n    \"\"\"\n    if isinstance(max_retries, int):\n        logger.debug(f\"max_retries: {max_retries}, timeout: {timeout}\")\n\n        # Create stop conditions\n        stop_conditions = [stop_after_attempt(max_retries)]\n        if timeout is not None:\n            # Add global timeout: stop after timeout seconds total\n            stop_conditions.append(stop_after_delay(timeout))\n\n        # Combine stop conditions with OR logic (stop if ANY condition is met)\n        stop_condition = stop_conditions[0]\n        for condition in stop_conditions[1:]:\n            stop_condition = stop_condition | condition\n\n        if is_async:\n            max_retries = AsyncRetrying(stop=stop_condition)\n        else:\n            max_retries = Retrying(stop=stop_condition)\n    elif not isinstance(max_retries, (Retrying, AsyncRetrying)):\n        from .exceptions import ConfigurationError\n\n        raise ConfigurationError(\n            \"max_retries must be an int or a `tenacity.Retrying`/`tenacity.AsyncRetrying` object\"\n        )\n    return max_retries\n\n\ndef initialize_usage(mode: Mode) -> CompletionUsage | Any:\n    \"\"\"\n    Initialize the total usage based on the mode.\n\n    Args:\n        mode (Mode): The mode of operation.\n\n    Returns:\n        CompletionUsage | Any: Initialized usage object.\n    \"\"\"\n    total_usage = CompletionUsage(\n        completion_tokens=0,\n        prompt_tokens=0,\n        total_tokens=0,\n        completion_tokens_details=CompletionTokensDetails(\n            audio_tokens=0, reasoning_tokens=0\n        ),\n        prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0),\n    )\n    if mode in {Mode.ANTHROPIC_TOOLS, Mode.ANTHROPIC_JSON}:\n        from anthropic.types import Usage as AnthropicUsage\n\n        total_usage = AnthropicUsage(\n            input_tokens=0,\n            output_tokens=0,\n            cache_read_input_tokens=0,\n            cache_creation_input_tokens=0,\n        )\n    return total_usage\n\n\ndef extract_messages(kwargs: dict[str, Any]) -> Any:\n    \"\"\"\n    Extract messages from kwargs, helps handles the cohere and gemini chat history cases\n\n    Args:\n        kwargs (Dict[str, Any]): Keyword arguments containing message data.\n\n    Returns:\n        Any: Extracted messages.\n    \"\"\"\n    # Directly check for keys in an efficient order (most common first)\n    # instead of nested get() calls which are inefficient\n    if \"messages\" in kwargs:\n        return kwargs[\"messages\"]\n    if \"contents\" in kwargs:\n        return kwargs[\"contents\"]\n    if \"chat_history\" in kwargs:\n        return kwargs[\"chat_history\"]\n    return []\n\n\ndef retry_sync(\n    func: Callable[T_ParamSpec, T_Retval],\n    response_model: type[T_Model] | None,\n    args: Any,\n    kwargs: Any,\n    context: dict[str, Any] | None = None,\n    max_retries: int | Retrying = 1,\n    strict: bool | None = None,\n    mode: Mode = Mode.TOOLS,\n    hooks: Hooks | None = None,\n) -> T_Model | None:\n    \"\"\"\n    Retry a synchronous function upon specified exceptions.\n\n    Args:\n        func (Callable[T_ParamSpec, T_Retval]): The function to retry.\n        response_model (Optional[type[T_Model]]): The model to validate the response against.\n        args (Any): Positional arguments for the function.\n        kwargs (Any): Keyword arguments for the function.\n        context (Optional[Dict[str, Any]], optional): Additional context for validation. Defaults to None.\n        max_retries (int | Retrying, optional): Maximum number of retries or a retrying object. Defaults to 1.\n        strict (Optional[bool], optional): Strict mode flag. Defaults to None.\n        mode (Mode, optional): The mode of operation. Defaults to Mode.TOOLS.\n        hooks (Optional[Hooks], optional): Hooks for emitting events. Defaults to None.\n\n    Returns:\n        T_Model | None: The processed response model or None.\n\n    Raises:\n        InstructorRetryException: If all retry attempts fail.\n    \"\"\"\n    hooks = hooks or Hooks()\n    total_usage = initialize_usage(mode)\n    # Extract timeout from kwargs if available (for global timeout across retries)\n    timeout = kwargs.get(\"timeout\")\n    max_retries = initialize_retrying(max_retries, is_async=False, timeout=timeout)\n\n    # Pre-extract stream flag to avoid repeated lookup\n    stream = kwargs.get(\"stream\", False)\n\n    # Track all failed attempts\n    failed_attempts: list[FailedAttempt] = []\n\n    try:\n        response = None\n        for attempt in max_retries:\n            with attempt:\n                logger.debug(f\"Retrying, attempt: {attempt.retry_state.attempt_number}\")\n                try:\n                    hooks.emit_completion_arguments(*args, **kwargs)\n                    response = func(*args, **kwargs)\n                    hooks.emit_completion_response(response)\n                    response = update_total_usage(\n                        response=response, total_usage=total_usage\n                    )\n\n                    return process_response(  # type: ignore\n                        response=response,\n                        response_model=response_model,\n                        validation_context=context,\n                        strict=strict,\n                        mode=mode,\n                        stream=stream,\n                    )\n                except (\n                    ValidationError,\n                    JSONDecodeError,\n                    InstructorValidationError,\n                ) as e:\n                    logger.debug(f\"Parse error: {e}\")\n                    hooks.emit_parse_error(e)\n\n                    # Track this failed attempt\n                    failed_attempts.append(\n                        FailedAttempt(\n                            attempt_number=attempt.retry_state.attempt_number,\n                            exception=e,\n                            completion=response,\n                        )\n                    )\n\n                    # Check if this is the last attempt\n                    if isinstance(max_retries, Retrying) and hasattr(\n                        max_retries, \"stop\"\n                    ):\n                        # For tenacity Retrying objects, check if next attempt would exceed limit\n                        will_retry = (\n                            attempt.retry_state.outcome is None\n                            or not attempt.retry_state.outcome.failed\n                        )\n                        is_last_attempt = (\n                            not will_retry\n                            or attempt.retry_state.attempt_number\n                            >= getattr(\n                                max_retries.stop, \"max_attempt_number\", float(\"inf\")\n                            )\n                        )\n                        if is_last_attempt:\n                            hooks.emit_completion_last_attempt(e)\n\n                    kwargs = handle_reask_kwargs(\n                        kwargs=kwargs,\n                        mode=mode,\n                        response=response,\n                        exception=e,\n                        failed_attempts=failed_attempts,\n                    )\n                    raise e\n                except Exception as e:\n                    # Emit completion:error for non-validation errors (API errors, network errors, etc.)\n                    logger.debug(f\"Completion error: {e}\")\n                    hooks.emit_completion_error(e)\n\n                    # Track this failed attempt\n                    failed_attempts.append(\n                        FailedAttempt(\n                            attempt_number=attempt.retry_state.attempt_number,\n                            exception=e,\n                            completion=response,\n                        )\n                    )\n\n                    # Check if this is the last attempt for completion errors\n                    if isinstance(max_retries, Retrying) and hasattr(\n                        max_retries, \"stop\"\n                    ):\n                        will_retry = (\n                            attempt.retry_state.outcome is None\n                            or not attempt.retry_state.outcome.failed\n                        )\n                        is_last_attempt = (\n                            not will_retry\n                            or attempt.retry_state.attempt_number\n                            >= getattr(\n                                max_retries.stop, \"max_attempt_number\", float(\"inf\")\n                            )\n                        )\n                        if is_last_attempt:\n                            hooks.emit_completion_last_attempt(e)\n                    raise e\n    except RetryError as e:\n        logger.debug(f\"Retry error: {e}\")\n        raise InstructorRetryException(\n            e.last_attempt._exception,\n            last_completion=response,\n            n_attempts=attempt.retry_state.attempt_number,\n            #! deprecate messages soon\n            messages=extract_messages(\n                kwargs\n            ),  # Use the optimized function instead of nested lookups\n            create_kwargs=kwargs,\n            total_usage=total_usage,\n            failed_attempts=failed_attempts,\n        ) from e\n\n\nasync def retry_async(\n    func: Callable[T_ParamSpec, T_Retval],\n    response_model: type[T_Model] | None,\n    args: Any,\n    kwargs: Any,\n    context: dict[str, Any] | None = None,\n    max_retries: int | AsyncRetrying = 1,\n    strict: bool | None = None,\n    mode: Mode = Mode.TOOLS,\n    hooks: Hooks | None = None,\n) -> T_Model | None:\n    \"\"\"\n    Retry an asynchronous function upon specified exceptions.\n\n    Args:\n        func (Callable[T_ParamSpec, T_Retval]): The asynchronous function to retry.\n        response_model (Optional[type[T_Model]]): The model to validate the response against.\n        context (Optional[Dict[str, Any]]): Additional context for validation.\n        args (Any): Positional arguments for the function.\n        kwargs (Any): Keyword arguments for the function.\n        max_retries (int | AsyncRetrying, optional): Maximum number of retries or an async retrying object. Defaults to 1.\n        strict (Optional[bool], optional): Strict mode flag. Defaults to None.\n        mode (Mode, optional): The mode of operation. Defaults to Mode.TOOLS.\n        hooks (Optional[Hooks], optional): Hooks for emitting events. Defaults to None.\n\n    Returns:\n        T_Model | None: The processed response model or None.\n\n    Raises:\n        InstructorRetryException: If all retry attempts fail.\n    \"\"\"\n    hooks = hooks or Hooks()\n    total_usage = initialize_usage(mode)\n    # Extract timeout from kwargs if available (for global timeout across retries)\n    timeout = kwargs.get(\"timeout\")\n    max_retries = initialize_retrying(max_retries, is_async=True, timeout=timeout)\n\n    # Pre-extract stream flag to avoid repeated lookup\n    stream = kwargs.get(\"stream\", False)\n\n    # Track all failed attempts\n    failed_attempts: list[FailedAttempt] = []\n\n    try:\n        response = None\n        async for attempt in max_retries:\n            logger.debug(f\"Retrying, attempt: {attempt.retry_state.attempt_number}\")\n            with attempt:\n                try:\n                    hooks.emit_completion_arguments(*args, **kwargs)\n                    response: ChatCompletion = await func(*args, **kwargs)\n                    hooks.emit_completion_response(response)\n                    response = update_total_usage(\n                        response=response, total_usage=total_usage\n                    )\n\n                    return await process_response_async(\n                        response=response,\n                        response_model=response_model,\n                        validation_context=context,\n                        strict=strict,\n                        mode=mode,\n                        stream=stream,\n                    )\n                except (\n                    ValidationError,\n                    JSONDecodeError,\n                    AsyncValidationError,\n                    InstructorValidationError,\n                ) as e:\n                    logger.debug(f\"Parse error: {e}\")\n                    hooks.emit_parse_error(e)\n\n                    # Track this failed attempt\n                    failed_attempts.append(\n                        FailedAttempt(\n                            attempt_number=attempt.retry_state.attempt_number,\n                            exception=e,\n                            completion=response,\n                        )\n                    )\n\n                    # Check if this is the last attempt\n                    if isinstance(max_retries, AsyncRetrying) and hasattr(\n                        max_retries, \"stop\"\n                    ):\n                        # For tenacity AsyncRetrying objects, check if next attempt would exceed limit\n                        will_retry = (\n                            attempt.retry_state.outcome is None\n                            or not attempt.retry_state.outcome.failed\n                        )\n                        is_last_attempt = (\n                            not will_retry\n                            or attempt.retry_state.attempt_number\n                            >= getattr(\n                                max_retries.stop, \"max_attempt_number\", float(\"inf\")\n                            )\n                        )\n                        if is_last_attempt:\n                            hooks.emit_completion_last_attempt(e)\n\n                    kwargs = handle_reask_kwargs(\n                        kwargs=kwargs,\n                        mode=mode,\n                        response=response,\n                        exception=e,\n                        failed_attempts=failed_attempts,\n                    )\n                    raise e\n                except Exception as e:\n                    # Emit completion:error for non-validation errors (API errors, network errors, etc.)\n                    logger.debug(f\"Completion error: {e}\")\n                    hooks.emit_completion_error(e)\n\n                    # Track this failed attempt\n                    failed_attempts.append(\n                        FailedAttempt(\n                            attempt_number=attempt.retry_state.attempt_number,\n                            exception=e,\n                            completion=response,\n                        )\n                    )\n\n                    # Check if this is the last attempt for completion errors\n                    if isinstance(max_retries, AsyncRetrying) and hasattr(\n                        max_retries, \"stop\"\n                    ):\n                        will_retry = (\n                            attempt.retry_state.outcome is None\n                            or not attempt.retry_state.outcome.failed\n                        )\n                        is_last_attempt = (\n                            not will_retry\n                            or attempt.retry_state.attempt_number\n                            >= getattr(\n                                max_retries.stop, \"max_attempt_number\", float(\"inf\")\n                            )\n                        )\n                        if is_last_attempt:\n                            hooks.emit_completion_last_attempt(e)\n                    raise e\n    except RetryError as e:\n        logger.debug(f\"Retry error: {e}\")\n        raise InstructorRetryException(\n            e.last_attempt._exception,\n            last_completion=response,\n            n_attempts=attempt.retry_state.attempt_number,\n            #! deprecate messages soon\n            messages=extract_messages(\n                kwargs\n            ),  # Use the optimized function instead of nested lookups\n            create_kwargs=kwargs,\n            total_usage=total_usage,\n            failed_attempts=failed_attempts,\n        ) from e\n"
  },
  {
    "path": "instructor/distil.py",
    "content": "import enum\nimport json\nimport uuid\nimport logging\nimport inspect\nimport functools\n\nfrom typing import (\n    Any,\n    Callable,\n    Optional,\n    TypeVar,\n    TypedDict,\n    Literal,\n    Union,\n)\nfrom typing_extensions import ParamSpec, NotRequired\nfrom openai.types.chat.chat_completion import ChatCompletion\nfrom openai.types.chat.chat_completion_message_param import ChatCompletionMessageParam\nfrom pydantic import BaseModel, validate_call\n\nfrom openai import OpenAI\nfrom .processing.function_calls import openai_schema\n\n\nP = ParamSpec(\"P\")\nT_Retval = TypeVar(\"T_Retval\", bound=BaseModel)\n\n\nclass OpenAIChatKwargs(TypedDict):\n    messages: list[ChatCompletionMessageParam]\n    functions: NotRequired[list[dict[str, Any]]]\n\n\nclass FinetuneFormat(enum.Enum):\n    MESSAGES = \"messages\"\n    RAW = \"raw\"\n\n\ndef get_signature_from_fn(fn: Callable[..., Any]) -> str:\n    \"\"\"\n    Get the function signature as a string.\n\n    :Example:\n\n    >>> def my_function(a: int, b: int) -> int:\n    >>>     return a + b\n    >>>\n    >>> get_signature_from_fn(my_function)\n    \"def my_function(a: int, b: int) -> int\"\n\n    :param fn: Function to get the signature for.\n    :return: Function signature as a string.\n    \"\"\"\n    sig = inspect.signature(fn)\n    lines = f\"def {fn.__name__}{sig}\"  # type: ignore\n    docstring = inspect.getdoc(fn)\n    if docstring:\n        formatted_docstring = f'\"\"\"\\n{docstring}\\n\"\"\"'\n    else:\n        formatted_docstring = \"\"\n    return f\"{lines}\\n{formatted_docstring}\"\n\n\n@functools.lru_cache\ndef format_function(func: Callable[..., Any]) -> str:\n    \"\"\"\n    Format a function as a string with docstring and body.\n    \"\"\"\n    source_lines = inspect.getsourcelines(func)\n    definition = \" \".join(source_lines[0]).strip()\n\n    docstring = inspect.getdoc(func)\n    if docstring:\n        formatted_docstring = f'\"\"\"\\n{docstring}\\n\"\"\"'\n    else:\n        formatted_docstring = \"\"\n\n    body = inspect.getsource(func)\n    body = body.replace(f\"def {func.__name__}\", \"\")  # type: ignore\n\n    return f\"{definition}\\n{formatted_docstring}\\n{body}\"\n\n\ndef is_return_type_base_model_or_instance(func: Callable[..., Any]) -> bool:\n    \"\"\"\n    Check if the return type of a function is a pydantic BaseModel or an instance of it.\n\n    :param func: Function to check.\n    :return: True if the return type is a pydantic BaseModel or an instance of it.\n    \"\"\"\n    return_type = inspect.signature(func).return_annotation\n    assert return_type != inspect.Signature.empty, (\n        \"Must have a return type hint that is a pydantic BaseModel\"\n    )\n    return inspect.isclass(return_type) and issubclass(return_type, BaseModel)\n\n\nclass Instructions:\n    def __init__(\n        self,\n        name: Optional[str] = None,\n        id: Optional[str] = None,\n        log_handlers: Optional[list[logging.Handler]] = None,\n        finetune_format: FinetuneFormat = FinetuneFormat.MESSAGES,\n        indent: int = 2,\n        include_code_body: bool = False,\n        openai_client: Optional[OpenAI] = None,\n    ) -> None:\n        \"\"\"\n        Instructions for distillation and dispatch.\n\n        :param name: Name of the instructions.\n        :param id: ID of the instructions.\n        :param log_handlers: List of log handlers to use.\n        :param finetune_format: Format to use for finetuning.\n        :param indent: Indentation to use for finetuning.\n        :param include_code_body: Whether to include the code body in the finetuning.\n        \"\"\"\n        self.name = name\n        self.id = id or str(uuid.uuid4())\n        self.unique_id = str(uuid.uuid4())\n        self.finetune_format = finetune_format\n        self.indent = indent\n        self.include_code_body = include_code_body\n        self.client = openai_client or OpenAI()\n\n        self.logger = logging.getLogger(self.name)\n        for handler in log_handlers or []:\n            self.logger.addHandler(handler)\n\n    def distil(\n        self,\n        *args: Any,\n        name: Optional[str] = None,\n        mode: Literal[\"distil\", \"dispatch\"] = \"distil\",\n        model: str = \"gpt-3.5-turbo\",\n        fine_tune_format: Optional[FinetuneFormat] = None,\n    ) -> Union[\n        Callable[P, Union[T_Retval, ChatCompletion]],\n        Callable[[Callable[P, T_Retval]], Callable[P, Union[T_Retval, ChatCompletion]]],\n    ]:\n        \"\"\"\n        Decorator to track the function call and response, supports distillation and dispatch modes.\n\n        If used without arguments, it must be used as a decorator.\n\n        :Example:\n\n        >>> @distil\n        >>> def my_function() -> MyModel:\n        >>>     return MyModel()\n        >>>\n        >>> @distil(name=\"my_function\")\n        >>> def my_function() -> MyModel:\n        >>>     return MyModel()\n\n        :param fn: Function to track.\n        :param name: Name of the function to track. Defaults to the function name.\n        :param mode: Mode to use for distillation. Defaults to \"distil\".\n        \"\"\"\n        allowed_modes = {\"distil\", \"dispatch\"}\n        assert mode in allowed_modes, f\"Must be in {allowed_modes}\"\n\n        if fine_tune_format is None:\n            fine_tune_format = self.finetune_format\n\n        def _wrap_distil(\n            fn: Callable[P, T_Retval],\n        ) -> Callable[P, Union[T_Retval, ChatCompletion]]:\n            msg = f\"Return type hint for {fn} must subclass `pydantic.BaseModel'\"\n            assert is_return_type_base_model_or_instance(fn), msg\n            return_base_model = inspect.signature(fn).return_annotation\n\n            @functools.wraps(fn)\n            def _dispatch(*args: P.args, **kwargs: P.kwargs) -> ChatCompletion:\n                openai_kwargs = self.openai_kwargs(\n                    name=name if name else fn.__name__,  # type: ignore\n                    fn=fn,\n                    args=args,\n                    kwargs=kwargs,\n                    base_model=return_base_model,\n                )\n                return self.client.chat.completions.create(\n                    **openai_kwargs,\n                    model=model,\n                    response_model=return_base_model,  # type: ignore - TODO figure out why `response_model` is not recognized\n                )\n\n            @functools.wraps(fn)\n            def _distil(*args: P.args, **kwargs: P.kwargs) -> T_Retval:\n                resp = fn(*args, **kwargs)\n                self.track(\n                    fn,\n                    args,\n                    kwargs,\n                    resp,\n                    name=name,\n                    finetune_format=fine_tune_format,\n                )\n                return resp\n\n            return _dispatch if mode == \"dispatch\" else _distil\n\n        if len(args) == 1 and callable(args[0]):\n            return _wrap_distil(args[0])  # type: ignore\n\n        return _wrap_distil\n\n    @validate_call\n    def track(\n        self,\n        fn: Callable[..., Any],\n        args: tuple[Any, ...],\n        kwargs: dict[str, Any],\n        resp: BaseModel,\n        name: Optional[str] = None,\n        finetune_format: FinetuneFormat = FinetuneFormat.MESSAGES,\n    ) -> None:\n        \"\"\"\n        Track the function call and response in a log file, later used for finetuning.\n\n        :param fn: Function to track.\n        :param args: Arguments passed to the function.\n        :param kwargs: Keyword arguments passed to the function.\n        :param resp: Response returned by the function.\n        :param name: Name of the function to track. Defaults to the function name.\n        :param finetune_format: Format to use for finetuning. Defaults to \"raw\".\n        \"\"\"\n        name = name if name else fn.__name__  # type: ignore\n        base_model = type(resp)\n\n        if finetune_format == FinetuneFormat.MESSAGES:\n            openai_function_call = openai_schema(base_model).openai_schema\n            openai_kwargs = self.openai_kwargs(name, fn, args, kwargs, base_model)\n            openai_kwargs[\"messages\"].append(\n                {\n                    \"role\": \"assistant\",\n                    \"function_call\": {\n                        \"name\": base_model.__name__,\n                        \"arguments\": resp.model_dump_json(indent=self.indent),\n                    },\n                }\n            )\n            openai_kwargs[\"functions\"] = [openai_function_call]\n            self.logger.info(json.dumps(openai_kwargs))\n\n        if finetune_format == FinetuneFormat.RAW:\n            function_body = dict(\n                fn_name=name,\n                fn_repr=format_function(fn),\n                args=args,\n                kwargs=kwargs,\n                resp=resp.model_dump(),\n                schema=base_model.model_json_schema(),\n            )\n            self.logger.info(json.dumps(function_body))\n\n    def openai_kwargs(\n        self,\n        name: str,\n        fn: Callable[..., Any],\n        args: tuple[Any, ...],\n        kwargs: dict[str, Any],\n        base_model: type[BaseModel],\n    ) -> OpenAIChatKwargs:\n        if self.include_code_body:\n            func_def = format_function(fn)\n        else:\n            func_def = get_signature_from_fn(fn)\n\n        str_args = \", \".join(map(str, args))\n        str_kwargs = (\n            \", \".join(f\"{k}={json.dumps(v)}\" for k, v in kwargs.items()) or None\n        )\n        call_args = \", \".join(filter(None, [str_args, str_kwargs]))\n\n        function_body: OpenAIChatKwargs = {\n            \"messages\": [\n                {\n                    \"role\": \"system\",\n                    \"content\": f\"Predict the results of this function:\\n\\n{func_def}\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Return `{name}({call_args})`\",\n                },\n            ],\n        }\n        return function_body\n"
  },
  {
    "path": "instructor/dsl/__init__.py",
    "content": "from .iterable import IterableModel\nfrom .maybe import Maybe\nfrom .partial import Partial\nfrom .citation import CitationMixin\nfrom .simple_type import is_simple_type, ModelAdapter\nfrom .response_list import ListResponse, ResponseList\nfrom . import validators  # Backwards compatibility module\n\n__all__ = [  # noqa: F405\n    \"CitationMixin\",\n    \"IterableModel\",\n    \"ListResponse\",\n    \"Maybe\",\n    \"Partial\",\n    \"ResponseList\",\n    \"is_simple_type\",\n    \"ModelAdapter\",\n    \"validators\",\n]\n"
  },
  {
    "path": "instructor/dsl/citation.py",
    "content": "from pydantic import BaseModel, Field, model_validator, ValidationInfo\nfrom collections.abc import Generator\n\n\nclass CitationMixin(BaseModel):\n    \"\"\"\n    Helpful mixing that can use `validation_context={\"context\": context}` in `from_response` to find the span of the substring_phrase in the context.\n\n    ## Usage\n\n    ```python\n    from pydantic import BaseModel, Field\n    from instructor import CitationMixin\n\n    class User(BaseModel):\n        name: str = Field(description=\"The name of the person\")\n        age: int = Field(description=\"The age of the person\")\n        role: str = Field(description=\"The role of the person\")\n\n\n    context = \"Betty was a student. Jason was a student. Jason is 20 years old\"\n\n    user = openai.ChatCompletion.create(\n        model=\"gpt-3.5-turbo\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract jason from {context}\",\n            },\n        response_model=User,\n        validation_context={\"context\": context},\n        ]\n    )\n\n    for quote in user.substring_quotes:\n        assert quote in context\n\n    print(user.model_dump())\n    ```\n\n    ## Result\n    ```\n    {\n        \"name\": \"Jason Liu\",\n        \"age\": 20,\n        \"role\": \"student\",\n        \"substring_quotes\": [\n            \"Jason was a student\",\n            \"Jason is 20 years old\",\n        ]\n    }\n    ```\n\n    \"\"\"\n\n    substring_quotes: list[str] = Field(\n        description=\"List of unique and specific substrings of the quote that was used to answer the question.\",\n    )\n\n    @model_validator(mode=\"after\")  # type: ignore[misc]\n    def validate_sources(self, info: ValidationInfo) -> \"CitationMixin\":\n        \"\"\"\n        For each substring_phrase, find the span of the substring_phrase in the context.\n        If the span is not found, remove the substring_phrase from the list.\n        \"\"\"\n        if info.context is None:\n            return self\n\n        # Get the context from the info\n        text_chunks = info.context.get(\"context\", None)\n\n        # Get the spans of the substring_phrase in the context\n        spans = list(self.get_spans(text_chunks))\n        # Replace the substring_phrase with the actual substring\n        self.substring_quotes = [text_chunks[span[0] : span[1]] for span in spans]\n        return self\n\n    def _get_span(\n        self, quote: str, context: str, errs: int = 5\n    ) -> Generator[tuple[int, int], None, None]:\n        import regex\n\n        minor = quote\n        major = context\n\n        errs_ = 0\n        s = regex.search(f\"({minor}){{e<={errs_}}}\", major)\n        while s is None and errs_ <= errs:\n            errs_ += 1\n            s = regex.search(f\"({minor}){{e<={errs_}}}\", major)\n\n        if s is not None:\n            yield from s.spans()\n\n    def get_spans(self, context: str) -> Generator[tuple[int, int], None, None]:\n        for quote in self.substring_quotes:\n            yield from self._get_span(quote, context)\n"
  },
  {
    "path": "instructor/dsl/iterable.py",
    "content": "from collections.abc import AsyncGenerator, Generator, Iterable\nfrom typing import (\n    Any,\n    ClassVar,\n    Optional,\n    cast,\n    get_origin,\n    get_args,\n    Union,\n    TYPE_CHECKING,\n)\nimport json\nfrom pydantic import BaseModel, Field, create_model\nfrom ..mode import Mode\nfrom ..utils import extract_json_from_stream, extract_json_from_stream_async\n\nif TYPE_CHECKING:\n    pass\n\n\nclass IterableBase:\n    task_type: ClassVar[Optional[type[BaseModel]]] = None\n\n    @classmethod\n    def from_streaming_response(\n        cls, completion: Iterable[Any], mode: Mode, **kwargs: Any\n    ) -> Generator[BaseModel, None, None]:  # noqa: ARG003\n        json_chunks = cls.extract_json(completion, mode)\n\n        if mode in {Mode.MD_JSON, Mode.GEMINI_TOOLS}:\n            json_chunks = extract_json_from_stream(json_chunks)\n\n        if mode in {Mode.VERTEXAI_TOOLS, Mode.MISTRAL_TOOLS}:\n            response = next(json_chunks)\n            if not response:\n                return\n\n            json_response = json.loads(response)\n            if not json_response[\"tasks\"]:\n                return\n\n            for item in json_response[\"tasks\"]:\n                yield cls.extract_cls_task_type(json.dumps(item), **kwargs)\n\n        yield from cls.tasks_from_chunks(json_chunks, **kwargs)\n\n    @classmethod\n    async def from_streaming_response_async(\n        cls, completion: AsyncGenerator[Any, None], mode: Mode, **kwargs: Any\n    ) -> AsyncGenerator[BaseModel, None]:\n        json_chunks = cls.extract_json_async(completion, mode)\n\n        if mode in {Mode.MD_JSON, Mode.GEMINI_TOOLS}:\n            json_chunks = extract_json_from_stream_async(json_chunks)\n\n        if mode in {Mode.MISTRAL_TOOLS, Mode.VERTEXAI_TOOLS}:\n            async for item in cls.tasks_from_mistral_chunks(json_chunks, **kwargs):\n                yield item\n        else:\n            async for item in cls.tasks_from_chunks_async(json_chunks, **kwargs):\n                yield item\n\n    @classmethod\n    async def tasks_from_mistral_chunks(\n        cls, json_chunks: AsyncGenerator[str, None], **kwargs: Any\n    ) -> AsyncGenerator[BaseModel, None]:\n        \"\"\"Process streaming chunks from Mistral and VertexAI.\n\n        Handles the specific JSON format used by these providers when streaming.\"\"\"\n\n        async for chunk in json_chunks:\n            if not chunk:\n                continue\n            json_response = json.loads(chunk)\n            if not json_response[\"tasks\"]:\n                continue\n\n            for item in json_response[\"tasks\"]:\n                obj = cls.extract_cls_task_type(json.dumps(item), **kwargs)\n                yield obj\n\n    @classmethod\n    def tasks_from_chunks(\n        cls, json_chunks: Iterable[str], **kwargs: Any\n    ) -> Generator[BaseModel, None, None]:\n        started = False\n        potential_object = \"\"\n        for chunk in json_chunks:\n            potential_object += chunk\n            if not started:\n                if \"[\" in chunk:\n                    started = True\n                    potential_object = chunk[chunk.find(\"[\") + 1 :]\n\n            while True:\n                task_json, potential_object = cls.get_object(potential_object, 0)\n                if task_json:\n                    assert cls.task_type is not None\n                    obj = cls.extract_cls_task_type(task_json, **kwargs)\n                    yield obj\n                else:\n                    break\n\n    @classmethod\n    async def tasks_from_chunks_async(\n        cls, json_chunks: AsyncGenerator[str, None], **kwargs: Any\n    ) -> AsyncGenerator[BaseModel, None]:\n        started = False\n        potential_object = \"\"\n        async for chunk in json_chunks:\n            potential_object += chunk\n            if not started:\n                if \"[\" in chunk:\n                    started = True\n                    potential_object = chunk[chunk.find(\"[\") + 1 :]\n\n            while True:\n                task_json, potential_object = cls.get_object(potential_object, 0)\n                if task_json:\n                    assert cls.task_type is not None\n                    obj = cls.extract_cls_task_type(task_json, **kwargs)\n                    yield obj\n                else:\n                    break\n\n    @classmethod\n    def extract_cls_task_type(\n        cls,\n        task_json: str,\n        **kwargs: Any,\n    ):\n        assert cls.task_type is not None\n        if get_origin(cls.task_type) is Union:\n            union_members = get_args(cls.task_type)\n            for member in union_members:\n                try:\n                    obj = member.model_validate_json(task_json, **kwargs)\n                    return obj\n                except Exception:\n                    pass\n        else:\n            return cls.task_type.model_validate_json(task_json, **kwargs)\n        raise ValueError(\n            f\"Failed to extract task type with {task_json} for {cls.task_type}\"\n        )\n\n    @staticmethod\n    def extract_json(\n        completion: Iterable[Any], mode: Mode\n    ) -> Generator[str, None, None]:\n        json_started = False\n        for chunk in completion:\n            try:\n                if mode in {Mode.COHERE_TOOLS, Mode.COHERE_JSON_SCHEMA}:\n                    event_type = getattr(chunk, \"event_type\", None)\n                    if event_type == \"text-generation\":\n                        if text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-chunk\":\n                        delta = getattr(chunk, \"tool_call_delta\", None)\n                        args = getattr(delta, \"parameters\", None) or getattr(\n                            delta, \"text\", None\n                        )\n                        if args:\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-generation\":\n                        tool_calls = getattr(chunk, \"tool_calls\", None)\n                        if tool_calls:\n                            args = json.dumps(tool_calls[0].parameters)\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    else:\n                        chunk_type = getattr(chunk, \"type\", None)\n                        if chunk_type == \"content-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            content = getattr(message, \"content\", None)\n                            if text := getattr(content, \"text\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                text.find(\"{\"),\n                                                text.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    text = text[json_start:]\n                                yield text\n                        elif chunk_type == \"tool-call-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            tool_calls = getattr(message, \"tool_calls\", None)\n                            function = getattr(tool_calls, \"function\", None)\n                            if args := getattr(function, \"arguments\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                args.find(\"{\"),\n                                                args.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    args = args[json_start:]\n                                yield args\n                if mode == Mode.ANTHROPIC_JSON:\n                    if json_chunk := chunk.delta.text:\n                        yield json_chunk\n                if mode == Mode.ANTHROPIC_TOOLS:\n                    yield chunk.delta.partial_json\n                if mode == Mode.GEMINI_JSON:\n                    yield chunk.text\n                if mode == Mode.VERTEXAI_JSON:\n                    yield chunk.candidates[0].content.parts[0].text\n                if mode == Mode.VERTEXAI_TOOLS:\n                    yield json.dumps(\n                        chunk.candidates[0].content.parts[0].function_call.args\n                    )\n                if mode == Mode.MISTRAL_STRUCTURED_OUTPUTS:\n                    yield chunk.data.choices[0].delta.content\n                if mode == Mode.MISTRAL_TOOLS:\n                    if not chunk.data.choices[0].delta.tool_calls:\n                        continue\n                    yield chunk.data.choices[0].delta.tool_calls[0].function.arguments\n\n                if mode in {Mode.GENAI_TOOLS}:\n                    yield json.dumps(\n                        chunk.candidates[0].content.parts[0].function_call.args\n                    )\n                if mode in {Mode.GENAI_STRUCTURED_OUTPUTS}:\n                    yield chunk.candidates[0].content.parts[0].text\n\n                if mode in {Mode.GEMINI_TOOLS}:\n                    resp = chunk.candidates[0].content.parts[0].function_call\n                    resp_dict = type(resp).to_dict(resp)  # type:ignore\n\n                    if \"args\" in resp_dict:\n                        yield json.dumps(resp_dict[\"args\"])\n\n                if mode in {\n                    Mode.RESPONSES_TOOLS,\n                    Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n                }:\n                    from openai.types.responses import (\n                        ResponseFunctionCallArgumentsDeltaEvent,\n                    )\n\n                    if isinstance(chunk, ResponseFunctionCallArgumentsDeltaEvent):\n                        yield chunk.delta\n                elif chunk.choices:\n                    if mode == Mode.FUNCTIONS:\n                        Mode.warn_mode_functions_deprecation()\n                        if json_chunk := chunk.choices[0].delta.function_call.arguments:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.JSON,\n                        Mode.MD_JSON,\n                        Mode.JSON_SCHEMA,\n                        Mode.CEREBRAS_JSON,\n                        Mode.FIREWORKS_JSON,\n                        Mode.PERPLEXITY_JSON,\n                        Mode.WRITER_JSON,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.content:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.TOOLS,\n                        Mode.TOOLS_STRICT,\n                        Mode.FIREWORKS_TOOLS,\n                        Mode.WRITER_TOOLS,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.tool_calls:\n                            if json_chunk[0].function.arguments is not None:\n                                yield json_chunk[0].function.arguments\n                    else:\n                        raise NotImplementedError(\n                            f\"Mode {mode} is not supported for MultiTask streaming\"\n                        )\n            except AttributeError:\n                pass\n\n    @staticmethod\n    async def extract_json_async(\n        completion: AsyncGenerator[Any, None], mode: Mode\n    ) -> AsyncGenerator[str, None]:\n        json_started = False\n        async for chunk in completion:\n            try:\n                if mode in {Mode.COHERE_TOOLS, Mode.COHERE_JSON_SCHEMA}:\n                    event_type = getattr(chunk, \"event_type\", None)\n                    if event_type == \"text-generation\":\n                        if text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-chunk\":\n                        delta = getattr(chunk, \"tool_call_delta\", None)\n                        args = getattr(delta, \"parameters\", None) or getattr(\n                            delta, \"text\", None\n                        )\n                        if args:\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-generation\":\n                        tool_calls = getattr(chunk, \"tool_calls\", None)\n                        if tool_calls:\n                            args = json.dumps(tool_calls[0].parameters)\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    else:\n                        chunk_type = getattr(chunk, \"type\", None)\n                        if chunk_type == \"content-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            content = getattr(message, \"content\", None)\n                            if text := getattr(content, \"text\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                text.find(\"{\"),\n                                                text.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    text = text[json_start:]\n                                yield text\n                        elif chunk_type == \"tool-call-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            tool_calls = getattr(message, \"tool_calls\", None)\n                            function = getattr(tool_calls, \"function\", None)\n                            if args := getattr(function, \"arguments\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                args.find(\"{\"),\n                                                args.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    args = args[json_start:]\n                                yield args\n                if mode == Mode.ANTHROPIC_JSON:\n                    if json_chunk := chunk.delta.text:\n                        yield json_chunk\n                if mode == Mode.ANTHROPIC_TOOLS:\n                    yield chunk.delta.partial_json\n                if mode == Mode.VERTEXAI_JSON:\n                    yield chunk.candidates[0].content.parts[0].text\n                if mode == Mode.VERTEXAI_TOOLS:\n                    yield json.dumps(\n                        chunk.candidates[0].content.parts[0].function_call.args\n                    )\n                if mode == Mode.MISTRAL_STRUCTURED_OUTPUTS:\n                    yield chunk.data.choices[0].delta.content\n                if mode == Mode.MISTRAL_TOOLS:\n                    if not chunk.data.choices[0].delta.tool_calls:\n                        continue\n                    yield chunk.data.choices[0].delta.tool_calls[0].function.arguments\n                if mode == Mode.GENAI_STRUCTURED_OUTPUTS:\n                    yield chunk.text\n                if mode in {Mode.GENAI_TOOLS}:\n                    yield json.dumps(\n                        chunk.candidates[0].content.parts[0].function_call.args\n                    )\n                if mode in {\n                    Mode.RESPONSES_TOOLS,\n                    Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n                }:\n                    from openai.types.responses import (\n                        ResponseFunctionCallArgumentsDeltaEvent,\n                    )\n\n                    if isinstance(chunk, ResponseFunctionCallArgumentsDeltaEvent):\n                        yield chunk.delta\n                elif chunk.choices:\n                    if mode == Mode.FUNCTIONS:\n                        Mode.warn_mode_functions_deprecation()\n                        if json_chunk := chunk.choices[0].delta.function_call.arguments:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.JSON,\n                        Mode.MD_JSON,\n                        Mode.JSON_SCHEMA,\n                        Mode.CEREBRAS_JSON,\n                        Mode.FIREWORKS_JSON,\n                        Mode.PERPLEXITY_JSON,\n                        Mode.WRITER_JSON,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.content:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.TOOLS,\n                        Mode.TOOLS_STRICT,\n                        Mode.FIREWORKS_TOOLS,\n                        Mode.WRITER_TOOLS,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.tool_calls:\n                            if json_chunk[0].function.arguments is not None:\n                                yield json_chunk[0].function.arguments\n                    else:\n                        raise NotImplementedError(\n                            f\"Mode {mode} is not supported for MultiTask streaming\"\n                        )\n            except AttributeError:\n                pass\n\n    @staticmethod\n    def get_object(s: str, stack: int) -> tuple[Optional[str], str]:\n        start_index = s.find(\"{\")\n        for i, c in enumerate(s):\n            if c == \"{\":\n                stack += 1\n            if c == \"}\":\n                stack -= 1\n                if stack == 0:\n                    return s[start_index : i + 1], s[i + 2 :]\n        return None, s\n\n\ndef IterableModel(\n    subtask_class: type[BaseModel],\n    name: Optional[str] = None,\n    description: Optional[str] = None,\n) -> type[BaseModel]:\n    # Import at runtime to avoid circular import\n    from ..processing.function_calls import OpenAISchema\n\n    \"\"\"\n    Dynamically create a IterableModel OpenAISchema that can be used to segment multiple\n    tasks given a base class. This creates class that can be used to create a toolkit\n    for a specific task, names and descriptions are automatically generated. However\n    they can be overridden.\n\n    ## Usage\n\n    ```python\n    from pydantic import BaseModel, Field\n    from instructor import IterableModel\n\n    class User(BaseModel):\n        name: str = Field(description=\"The name of the person\")\n        age: int = Field(description=\"The age of the person\")\n        role: str = Field(description=\"The role of the person\")\n\n    MultiUser = IterableModel(User)\n    ```\n\n    ## Result\n\n    ```python\n    class MultiUser(OpenAISchema, MultiTaskBase):\n        tasks: List[User] = Field(\n            default_factory=list,\n            repr=False,\n            description=\"Correctly segmented list of `User` tasks\",\n        )\n\n        @classmethod\n        def from_streaming_response(cls, completion) -> Generator[User]:\n            '''\n            Parse the streaming response from OpenAI and yield a `User` object\n            for each task in the response\n            '''\n            json_chunks = cls.extract_json(completion)\n            yield from cls.tasks_from_chunks(json_chunks)\n    ```\n\n    Parameters:\n        subtask_class (Type[OpenAISchema]): The base class to use for the MultiTask\n        name (Optional[str]): The name of the MultiTask class, if None then the name\n            of the subtask class is used as `Multi{subtask_class.__name__}`\n        description (Optional[str]): The description of the MultiTask class, if None\n            then the description is set to `Correct segmentation of `{subtask_class.__name__}` tasks`\n\n    Returns:\n        schema (OpenAISchema): A new class that can be used to segment multiple tasks\n    \"\"\"\n    if name is not None:\n        task_name = name\n    else:\n        # Handle `Union[A, B]` / `A | B` task types.\n        # `types.UnionType` does not have `__name__`, so fall back to a stable name.\n        task_name = getattr(subtask_class, \"__name__\", None)\n        if task_name is None and get_origin(subtask_class) is Union:\n            members = get_args(subtask_class)\n            task_name = \"Or\".join(getattr(m, \"__name__\", str(m)) for m in members)\n        if task_name is None:\n            task_name = str(subtask_class)\n\n    name = f\"Iterable{task_name}\"\n\n    list_tasks = (\n        list[subtask_class],  # type: ignore\n        Field(\n            default_factory=list,\n            repr=False,\n            description=f\"Correctly segmented list of `{task_name}` tasks\",\n        ),\n    )\n\n    base_models = cast(tuple[type[BaseModel], ...], (OpenAISchema, IterableBase))\n    new_cls = create_model(\n        name,\n        tasks=list_tasks,\n        __base__=base_models,\n    )\n    new_cls = cast(type[IterableBase], new_cls)\n\n    # set the class constructor BaseModel\n    new_cls.task_type = subtask_class\n\n    new_cls.__doc__ = (\n        f\"Correct segmentation of `{task_name}` tasks\"\n        if description is None\n        else description\n    )\n    assert issubclass(new_cls, OpenAISchema), (\n        \"The new class should be a subclass of OpenAISchema\"\n    )\n    return new_cls\n"
  },
  {
    "path": "instructor/dsl/json_tracker.py",
    "content": "\"\"\"\nJSON Completeness Tracker for Partial Streaming.\n\nTracks which parts of accumulated JSON are \"closed\" (complete) vs \"open\" (incomplete).\nUses jiter for parsing and a simple heuristic: if a value has a next sibling,\nit must be complete (because jiter had to finish parsing it to find the next one).\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom jiter import from_json\n\n\ndef is_json_complete(json_str: str) -> bool:\n    \"\"\"\n    Check if a JSON string represents a complete structure.\n\n    Uses jiter in strict mode - parsing fails if JSON is incomplete.\n    \"\"\"\n    if not json_str or not json_str.strip():\n        return False\n    try:\n        from_json(json_str.encode())  # No partial_mode = strict parsing\n        return True\n    except ValueError:\n        return False\n\n\nclass JsonCompleteness:\n    \"\"\"\n    Track completeness of JSON structures during streaming.\n\n    Uses a simple heuristic: if a value has a next sibling in the parsed\n    structure, it must be complete. For the last sibling, we don't know\n    until the parent completes - but that's fine because parent validation\n    will cover it.\n\n    Example:\n        tracker = JsonCompleteness()\n\n        # Incomplete - missing closing brace\n        tracker.analyze('{\"name\": \"Alice\", \"address\": {\"city\": \"NY')\n        tracker.is_path_complete(\"\")  # False - root incomplete\n        tracker.is_path_complete(\"name\")  # True - has next sibling \"address\"\n        tracker.is_path_complete(\"address\")  # False - last sibling, unknown\n\n        # Complete\n        tracker.analyze('{\"name\": \"Alice\"}')\n        tracker.is_path_complete(\"\")  # True - root complete\n    \"\"\"\n\n    def __init__(self) -> None:\n        self._complete_paths: set[str] = set()\n\n    def analyze(self, json_str: str) -> None:\n        \"\"\"Analyze a JSON string and determine completeness of each path.\"\"\"\n        self._complete_paths = set()\n\n        if not json_str or not json_str.strip():\n            return\n\n        # Try strict parsing first - if it succeeds, JSON is complete\n        try:\n            parsed = from_json(json_str.encode())\n            self._mark_all(parsed, \"\")\n            return\n        except ValueError:\n            pass  # JSON is incomplete, continue with partial parsing\n\n        # Root incomplete - use sibling heuristic\n        try:\n            parsed = from_json(json_str.encode(), partial_mode=\"trailing-strings\")\n        except ValueError:\n            return\n\n        self._check_siblings(parsed, \"\")\n\n    def _mark_all(self, data: Any, path: str) -> None:\n        \"\"\"Recursively mark path and all children as complete.\"\"\"\n        self._complete_paths.add(path)\n        if isinstance(data, dict):\n            for key, value in data.items():\n                child_path = f\"{path}.{key}\" if path else key\n                self._mark_all(value, child_path)\n        elif isinstance(data, list):\n            for i, item in enumerate(data):\n                self._mark_all(item, f\"{path}[{i}]\")\n\n    def _check_siblings(self, data: Any, path: str) -> None:\n        \"\"\"\n        Check completeness using sibling heuristic.\n\n        If a value has a next sibling, it's complete (jiter had to finish\n        parsing it to find the next sibling). Last sibling is unknown.\n        \"\"\"\n        if isinstance(data, dict):\n            keys = list(data.keys())\n            for i, key in enumerate(keys):\n                child_path = f\"{path}.{key}\" if path else key\n                if i < len(keys) - 1:\n                    # Has next sibling → complete\n                    self._mark_all(data[key], child_path)\n                else:\n                    # Last sibling → recurse to check children\n                    self._check_siblings(data[key], child_path)\n\n        elif isinstance(data, list):\n            for i, item in enumerate(data):\n                child_path = f\"{path}[{i}]\"\n                if i < len(data) - 1:\n                    # Has next sibling → complete\n                    self._mark_all(item, child_path)\n                else:\n                    # Last sibling → recurse\n                    self._check_siblings(item, child_path)\n\n    def is_path_complete(self, path: str) -> bool:\n        \"\"\"\n        Check if the sub-structure at the given path is complete.\n\n        Args:\n            path: Dot-separated path (e.g., \"user.address.city\", \"items[0]\")\n                  Use \"\" for root object.\n\n        Returns:\n            True if the structure at path is complete (closed), False otherwise.\n        \"\"\"\n        return path in self._complete_paths\n\n    def get_complete_paths(self) -> set[str]:\n        \"\"\"Return all paths that are complete.\"\"\"\n        return self._complete_paths.copy()\n\n    def is_root_complete(self) -> bool:\n        \"\"\"Check if the root JSON structure is complete.\"\"\"\n        return \"\" in self._complete_paths\n"
  },
  {
    "path": "instructor/dsl/maybe.py",
    "content": "from pydantic import BaseModel, Field, create_model\nfrom typing import Generic, Optional, TypeVar\n\nT = TypeVar(\"T\", bound=BaseModel)\n\n\nclass MaybeBase(BaseModel, Generic[T]):\n    \"\"\"\n    Extract a result from a model, if any, otherwise set the error and message fields.\n    \"\"\"\n\n    result: Optional[T]\n    error: bool = Field(default=False)\n    message: Optional[str]\n\n    def __bool__(self) -> bool:\n        return self.result is not None\n\n\ndef Maybe(model: type[T]) -> type[MaybeBase[T]]:\n    \"\"\"\n    Create a Maybe model for a given Pydantic model. This allows you to return a model that includes fields for `result`, `error`, and `message` for sitatations where the data may not be present in the context.\n\n    ## Usage\n\n    ```python\n    from pydantic import BaseModel, Field\n    from instructor import Maybe\n\n    class User(BaseModel):\n        name: str = Field(description=\"The name of the person\")\n        age: int = Field(description=\"The age of the person\")\n        role: str = Field(description=\"The role of the person\")\n\n    MaybeUser = Maybe(User)\n    ```\n\n    ## Result\n\n    ```python\n    class MaybeUser(BaseModel):\n        result: Optional[User]\n        error: bool = Field(default=False)\n        message: Optional[str]\n\n        def __bool__(self):\n            return self.result is not None\n    ```\n\n    Parameters:\n        model (Type[BaseModel]): The Pydantic model to wrap with Maybe.\n\n    Returns:\n        MaybeModel (Type[BaseModel]): A new Pydantic model that includes fields for `result`, `error`, and `message`.\n    \"\"\"\n    return create_model(\n        f\"Maybe{model.__name__}\",\n        __base__=MaybeBase,\n        result=(\n            Optional[model],\n            Field(\n                default=None,\n                description=\"Correctly extracted result from the model, if any, otherwise None\",\n            ),\n        ),\n        error=(bool, Field(default=False)),\n        message=(\n            Optional[str],\n            Field(\n                default=None,\n                description=\"Error message if no result was found, should be short and concise\",\n            ),\n        ),\n    )\n"
  },
  {
    "path": "instructor/dsl/parallel.py",
    "content": "import sys\nimport json\nfrom typing import (\n    Any,\n    Optional,\n    TypeVar,\n    Union,\n    get_args,\n    get_origin,\n    TYPE_CHECKING,\n)\nfrom collections.abc import Generator\nfrom pydantic import BaseModel\nfrom collections.abc import Iterable\n\nfrom ..mode import Mode\n\nif TYPE_CHECKING:\n    from ..processing.function_calls import OpenAISchema\n\n    T = TypeVar(\"T\", bound=OpenAISchema)\nelse:\n    # At runtime, we'll bind to BaseModel instead to avoid circular import\n    T = TypeVar(\"T\", bound=BaseModel)\n\n\nclass ParallelBase:\n    def __init__(self, *models: type[BaseModel]):\n        # Note that for everything else we've created a class, but for parallel base it is an instance\n        assert len(models) > 0, \"At least one model is required\"\n        self.models = models\n        self.registry = {\n            model.__name__ if hasattr(model, \"__name__\") else str(model): model\n            for model in models\n        }\n\n    def from_response(\n        self,\n        response: Any,\n        mode: Mode,\n        validation_context: Optional[Any] = None,\n        strict: Optional[bool] = None,\n    ) -> Generator[BaseModel, None, None]:\n        #! We expect this from the OpenAISchema class, We should address\n        #! this with a protocol or an abstract class... @jxnlco\n        assert mode == Mode.PARALLEL_TOOLS, \"Mode must be PARALLEL_TOOLS\"\n        for tool_call in response.choices[0].message.tool_calls:\n            name = tool_call.function.name\n            arguments = tool_call.function.arguments\n            yield self.registry[name].model_validate_json(\n                arguments, context=validation_context, strict=strict\n            )\n\n\nclass VertexAIParallelBase(ParallelBase):\n    def from_response(\n        self,\n        response: Any,\n        mode: Mode,\n        validation_context: Optional[Any] = None,\n        strict: Optional[bool] = None,\n    ) -> Generator[BaseModel, None, None]:\n        assert mode == Mode.VERTEXAI_PARALLEL_TOOLS, (\n            \"Mode must be VERTEXAI_PARALLEL_TOOLS\"\n        )\n\n        if not response or not response.candidates:\n            return\n\n        for candidate in response.candidates:\n            if not candidate.content or not candidate.content.parts:\n                continue\n\n            for part in candidate.content.parts:\n                if hasattr(part, \"function_call\") and part.function_call is not None:\n                    name = part.function_call.name\n                    arguments = part.function_call.args\n\n                    if name in self.registry:\n                        # Convert dict to JSON string before validation\n                        json_str = json.dumps(arguments)\n                        yield self.registry[name].model_validate_json(\n                            json_str, context=validation_context, strict=strict\n                        )\n\n\nif sys.version_info >= (3, 10):\n    from types import UnionType\n\n    def is_union_type(typehint: type[Iterable[T]]) -> bool:\n        return get_origin(get_args(typehint)[0]) in (Union, UnionType)\n\nelse:\n\n    def is_union_type(typehint: type[Iterable[T]]) -> bool:\n        return get_origin(get_args(typehint)[0]) is Union\n\n\ndef get_types_array(typehint: type[Iterable[T]]) -> tuple[type[T], ...]:\n    should_be_iterable = get_origin(typehint)\n\n    if should_be_iterable is not Iterable:\n        raise TypeError(f\"Model should be with Iterable instead of {typehint}\")\n\n    if is_union_type(typehint):\n        # works for Iterable[Union[int, str]], Iterable[int | str]\n        the_types = get_args(get_args(typehint)[0])\n        return the_types\n\n    # works for Iterable[int]\n    return get_args(typehint)\n\n\ndef handle_parallel_model(typehint: type[Iterable[T]]) -> list[dict[str, Any]]:\n    # Import at runtime to avoid circular import\n    from ..processing.function_calls import openai_schema\n\n    the_types = get_types_array(typehint)\n    return [\n        {\"type\": \"function\", \"function\": openai_schema(model).openai_schema}\n        for model in the_types\n    ]\n\n\ndef handle_anthropic_parallel_model(\n    typehint: type[Iterable[T]],\n) -> list[dict[str, Any]]:\n    # Import at runtime to avoid circular import\n    from ..processing.function_calls import openai_schema\n\n    the_types = get_types_array(typehint)\n    return [openai_schema(model).anthropic_schema for model in the_types]\n\n\ndef ParallelModel(typehint: type[Iterable[T]]) -> ParallelBase:\n    the_types = get_types_array(typehint)\n    return ParallelBase(*[model for model in the_types])\n\n\ndef VertexAIParallelModel(typehint: type[Iterable[T]]) -> VertexAIParallelBase:\n    the_types = get_types_array(typehint)\n    return VertexAIParallelBase(*[model for model in the_types])\n\n\nclass AnthropicParallelBase(ParallelBase):\n    def from_response(\n        self,\n        response: Any,\n        mode: Mode,\n        validation_context: Optional[Any] = None,\n        strict: Optional[bool] = None,\n    ) -> Generator[BaseModel, None, None]:\n        assert mode == Mode.ANTHROPIC_PARALLEL_TOOLS, (\n            \"Mode must be ANTHROPIC_PARALLEL_TOOLS\"\n        )\n\n        if not response or not hasattr(response, \"content\"):\n            return\n\n        for content in response.content:\n            if getattr(content, \"type\", None) == \"tool_use\":\n                name = content.name\n                arguments = content.input\n                if name in self.registry:\n                    json_str = json.dumps(arguments)\n                    yield self.registry[name].model_validate_json(\n                        json_str, context=validation_context, strict=strict\n                    )\n\n\ndef AnthropicParallelModel(typehint: type[Iterable[T]]) -> AnthropicParallelBase:\n    the_types = get_types_array(typehint)\n    return AnthropicParallelBase(*[model for model in the_types])\n"
  },
  {
    "path": "instructor/dsl/partial.py",
    "content": "# --------------------------------------------------------------------------------\n# The following code is adapted from a comment on GitHub in the pydantic/pydantic repository by silviumarcu.\n# Source: https://github.com/pydantic/pydantic/issues/6381#issuecomment-1831607091\n#\n# This code is used in accordance with the repository's license, and this reference\n# serves as an acknowledgment of the original author's contribution to this project.\n# --------------------------------------------------------------------------------\n\nfrom __future__ import annotations\n\nimport json\nimport re\nimport sys\nimport types\nimport warnings\nfrom collections.abc import AsyncGenerator, Generator, Iterable\nfrom copy import deepcopy\nfrom functools import cache\nfrom typing import (  # noqa: UP035\n    Any,\n    Generic,\n    List,  # needed for runtime check against typing.List annotations from user code\n    NoReturn,\n    Optional,\n    TypeVar,\n    Union,\n    get_args,\n    get_origin,\n)\n\nfrom jiter import from_json\nfrom pydantic import BaseModel, create_model\nfrom pydantic.fields import FieldInfo\n\nfrom instructor.mode import Mode\nfrom instructor.utils import extract_json_from_stream, extract_json_from_stream_async\nfrom instructor.dsl.json_tracker import JsonCompleteness, is_json_complete\n\nT_Model = TypeVar(\"T_Model\", bound=BaseModel)\n\nif sys.version_info >= (3, 10):\n    # types.UnionType is only available in Python 3.10 and above\n    UNION_ORIGINS = (Union, types.UnionType)\nelse:\n    UNION_ORIGINS = (Union,)\n\n# Track models currently being processed to prevent infinite recursion\n# with self-referential models (e.g., TreeNode with children: List[\"TreeNode\"])\n_processing_models: set[type] = set()\n\n\nclass MakeFieldsOptional:\n    pass\n\n\nclass PartialLiteralMixin:\n    \"\"\"DEPRECATED: This mixin is no longer necessary.\n\n    With completeness-based validation, Literal and Enum types are handled\n    automatically during streaming:\n    - Incomplete JSON: no validation runs, partial values are stored as-is\n    - Complete JSON: full validation against original model\n\n    You can safely remove this mixin from your models.\n    \"\"\"\n\n    def __init_subclass__(cls, **kwargs: Any) -> None:\n        super().__init_subclass__(**kwargs)\n        warnings.warn(\n            \"PartialLiteralMixin is deprecated and no longer necessary. \"\n            \"Completeness-based validation now handles Literal and Enum types \"\n            \"automatically during streaming. You can safely remove this mixin.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n\n\ndef remove_control_chars(s):\n    return re.sub(r\"[\\x00-\\x1F\\x7F-\\x9F]\", \"\", s)\n\n\ndef process_potential_object(potential_object, partial_mode, partial_model, **kwargs):\n    \"\"\"Process a potential JSON object using completeness-based validation.\n\n    - If JSON is complete (closed braces/brackets): validate against original model\n    - If JSON is incomplete: build partial object using model_construct (no validation)\n\n    Note: Pydantic v2.10+ has `experimental_allow_partial` but it doesn't support\n    BaseModel constraints during partial validation (only TypedDict). If Pydantic\n    adds BaseModel support in the future, this could potentially be simplified.\n    See: https://docs.pydantic.dev/latest/concepts/partial_validation/\n    \"\"\"\n    json_str = potential_object.strip() or \"{}\"\n    parsed = from_json(json_str.encode(), partial_mode=partial_mode)\n\n    tracker = JsonCompleteness()\n    tracker.analyze(json_str)\n\n    # Get original model for validation\n    original_model = getattr(partial_model, \"_original_model\", None)\n\n    # Check if root is complete AND has actual data (not just empty {})\n    root_complete = tracker.is_root_complete()\n    has_data = bool(parsed) if isinstance(parsed, dict) else True\n\n    if root_complete and has_data and original_model is not None:\n        # Root object is complete with data - validate against original model\n        return original_model.model_validate(parsed, **kwargs)\n    else:\n        # Object is incomplete or empty - build instance using model_construct (no validation)\n        model_for_construct = (\n            original_model if original_model is not None else partial_model\n        )\n        return _build_partial_object(parsed, model_for_construct, tracker, \"\", **kwargs)\n\n\ndef _build_partial_object(\n    data: Any,\n    model: type[BaseModel],\n    tracker: JsonCompleteness,\n    path: str,\n    **kwargs: Any,\n) -> Any:\n    \"\"\"Build a partial object using model_construct() to skip validation.\n\n    For each field:\n    - If the field's JSON is complete AND it's a nested BaseModel: validate it\n    - Otherwise: store without validation\n    \"\"\"\n    if data is None:\n        return None\n\n    if not isinstance(data, dict):\n        return data\n\n    result = {}\n\n    for field_name in data:\n        field_value = data[field_name]\n        field_path = f\"{path}.{field_name}\" if path else field_name\n\n        if field_value is None:\n            result[field_name] = None\n            continue\n\n        field_complete = tracker.is_path_complete(field_path)\n        field_info = model.model_fields.get(field_name)\n        field_type = field_info.annotation if field_info else None\n\n        if field_complete and field_type is not None:\n            if isinstance(field_type, type) and issubclass(field_type, BaseModel):\n                result[field_name] = field_type.model_validate(field_value, **kwargs)\n                continue\n\n        if isinstance(field_value, dict):\n            nested_model = None\n            if field_type is not None and isinstance(field_type, type):\n                if issubclass(field_type, BaseModel):\n                    nested_model = field_type\n\n            if nested_model:\n                result[field_name] = _build_partial_object(\n                    field_value, nested_model, tracker, field_path, **kwargs\n                )\n            else:\n                result[field_name] = field_value\n        elif isinstance(field_value, list):\n            result[field_name] = _build_partial_list(\n                field_value, model, field_name, tracker, field_path, **kwargs\n            )\n        else:\n            result[field_name] = field_value\n\n    # Set missing fields to None or empty nested models\n    for field_name, field_info in model.model_fields.items():\n        if field_name not in result:\n            field_type = field_info.annotation\n            if isinstance(field_type, type) and issubclass(field_type, BaseModel):\n                result[field_name] = _build_partial_object(\n                    {}, field_type, tracker, \"\", **kwargs\n                )\n            else:\n                result[field_name] = None\n\n    return model.model_construct(**result)\n\n\ndef _build_partial_list(\n    items: list,\n    original_model: type[BaseModel] | None,\n    field_name: str,\n    tracker: JsonCompleteness,\n    path: str,\n    **kwargs: Any,\n) -> list:\n    \"\"\"Build a partial list, validating complete items.\"\"\"\n    result = []\n\n    item_type = None\n    if original_model:\n        field_info = original_model.model_fields.get(field_name)\n        if field_info:\n            field_type = field_info.annotation\n            if get_origin(field_type) in (list, List):  # noqa: UP006\n                args = get_args(field_type)\n                if args:\n                    item_type = args[0]\n\n    for i, item in enumerate(items):\n        item_path = f\"{path}[{i}]\"\n        item_complete = tracker.is_path_complete(item_path)\n\n        if item_complete and item_type and isinstance(item_type, type):\n            if issubclass(item_type, BaseModel) and isinstance(item, dict):\n                result.append(item_type.model_validate(item, **kwargs))\n                continue\n\n        result.append(item)\n\n    return result\n\n\ndef _process_generic_arg(\n    arg: Any,\n    make_fields_optional: bool = False,\n) -> Any:\n    arg_origin = get_origin(arg)\n\n    if arg_origin is not None:\n        # Handle any nested generic type (Union, List, Dict, etc.)\n        nested_args = get_args(arg)\n        modified_nested_args = tuple(\n            _process_generic_arg(\n                t,\n                make_fields_optional=make_fields_optional,\n            )\n            for t in nested_args\n        )\n        # Special handling for Union types (types.UnionType isn't subscriptable)\n        if arg_origin in UNION_ORIGINS:\n            return Union[modified_nested_args]  # type: ignore\n\n        return arg_origin[modified_nested_args]\n    else:\n        if isinstance(arg, type) and issubclass(arg, BaseModel):\n            # Prevent infinite recursion for self-referential models\n            if arg in _processing_models:\n                return arg  # Already processing this model, return unwrapped\n            _processing_models.add(arg)\n            try:\n                return (\n                    Partial[arg, MakeFieldsOptional]  # type: ignore[valid-type]\n                    if make_fields_optional\n                    else Partial[arg]\n                )\n            finally:\n                _processing_models.discard(arg)\n        else:\n            return arg\n\n\ndef _make_field_optional(\n    field: FieldInfo,\n) -> tuple[Any, FieldInfo]:\n    tmp_field = deepcopy(field)\n\n    annotation = field.annotation\n\n    # Handle generics (like List, Dict, Union, Literal, etc.)\n    if get_origin(annotation) is not None:\n        # Get the generic base (like List, Dict) and its arguments (like User in List[User])\n        generic_base = get_origin(annotation)\n        generic_args = get_args(annotation)\n\n        modified_args = tuple(\n            _process_generic_arg(arg, make_fields_optional=True) for arg in generic_args\n        )\n\n        # Reconstruct the generic type with modified arguments\n        tmp_field.annotation = (\n            Optional[generic_base[modified_args]] if generic_base else None\n        )\n        tmp_field.default = None\n        tmp_field.default_factory = None\n    # If the field is a BaseModel, then recursively convert it's\n    # attributes to optionals.\n    elif isinstance(annotation, type) and issubclass(annotation, BaseModel):\n        tmp_field.annotation = Optional[Partial[annotation, MakeFieldsOptional]]  # type: ignore[assignment, valid-type]\n        tmp_field.default = {}\n        tmp_field.default_factory = None\n    else:\n        tmp_field.annotation = Optional[field.annotation]  # type:ignore\n        tmp_field.default = None\n        tmp_field.default_factory = None\n\n    return tmp_field.annotation, tmp_field  # type: ignore\n\n\nclass PartialBase(Generic[T_Model]):\n    @classmethod\n    @cache\n    def get_partial_model(cls) -> type[T_Model]:\n        \"\"\"Return a partial model for holding incomplete streaming data.\n\n        With completeness-based validation, we use model_construct() to build\n        partial objects without validation. This method creates a model with\n        all fields optional and stores a reference to the original model\n        for validation when JSON is complete.\n        \"\"\"\n        assert issubclass(cls, BaseModel), (\n            f\"{cls.__name__} must be a subclass of BaseModel\"\n        )\n\n        model_name = (\n            cls.__name__\n            if cls.__name__.startswith(\"Partial\")\n            else f\"Partial{cls.__name__}\"\n        )\n\n        # Create partial model with optional fields\n        partial_model = create_model(\n            model_name,\n            __base__=cls,\n            __module__=cls.__module__,\n            **{\n                field_name: _make_field_optional(field_info)\n                for field_name, field_info in cls.model_fields.items()\n            },  # type: ignore[all]\n        )\n\n        # Store reference to original model for validation of complete objects\n        original = getattr(cls, \"_original_model\", cls)\n        partial_model._original_model = original  # type: ignore[attr-defined]\n\n        return partial_model\n\n    @classmethod\n    def from_streaming_response(\n        cls, completion: Iterable[Any], mode: Mode, **kwargs: Any\n    ) -> Generator[T_Model, None, None]:\n        json_chunks = cls.extract_json(completion, mode)\n\n        if mode in {Mode.MD_JSON, Mode.GEMINI_TOOLS}:\n            json_chunks = extract_json_from_stream(json_chunks)\n\n        if mode == Mode.WRITER_TOOLS:\n            yield from cls.writer_model_from_chunks(json_chunks, **kwargs)\n        else:\n            yield from cls.model_from_chunks(json_chunks, **kwargs)\n\n    @classmethod\n    async def from_streaming_response_async(\n        cls, completion: AsyncGenerator[Any, None], mode: Mode, **kwargs: Any\n    ) -> AsyncGenerator[T_Model, None]:\n        json_chunks = cls.extract_json_async(completion, mode)\n\n        if mode in {Mode.MD_JSON, Mode.GEMINI_TOOLS}:\n            json_chunks = extract_json_from_stream_async(json_chunks)\n\n        if mode == Mode.WRITER_TOOLS:\n            async for item in cls.writer_model_from_chunks_async(json_chunks, **kwargs):\n                yield item\n        else:\n            async for item in cls.model_from_chunks_async(json_chunks, **kwargs):\n                yield item\n\n    @classmethod\n    def writer_model_from_chunks(\n        cls, json_chunks: Iterable[Any], **kwargs: Any\n    ) -> Generator[T_Model, None, None]:\n        potential_object = \"\"\n        partial_model = cls.get_partial_model()\n        # Always use trailing-strings mode to preserve incomplete data during streaming\n        # PartialLiteralMixin is deprecated - completeness-based validation handles Literals\n        partial_mode = \"trailing-strings\"\n        final_obj = None\n        for chunk in json_chunks:\n            # Writer mode special handling: chunk might be complete JSON replacing accumulated\n            if (\n                len(chunk) > len(potential_object)\n                and chunk.startswith(\"{\")\n                and chunk.endswith(\"}\")\n            ):\n                potential_object = chunk\n            else:\n                potential_object += chunk\n            obj = process_potential_object(\n                potential_object, partial_mode, partial_model, **kwargs\n            )\n            final_obj = obj\n            yield obj\n\n        # Final validation: only validate if the JSON is structurally complete\n        # If JSON is incomplete (stream ended mid-object), skip validation\n        if final_obj is not None:\n            original_model = getattr(cls, \"_original_model\", None)\n            if original_model is not None:\n                if is_json_complete(potential_object.strip() or \"{}\"):\n                    original_model.model_validate(\n                        final_obj.model_dump(exclude_none=True), **kwargs\n                    )\n\n    @classmethod\n    async def writer_model_from_chunks_async(\n        cls, json_chunks: AsyncGenerator[str, None], **kwargs: Any\n    ) -> AsyncGenerator[T_Model, None]:\n        potential_object = \"\"\n        partial_model = cls.get_partial_model()\n        # Always use trailing-strings mode to preserve incomplete data during streaming\n        # PartialLiteralMixin is deprecated - completeness-based validation handles Literals\n        partial_mode = \"trailing-strings\"\n        final_obj = None\n        async for chunk in json_chunks:\n            # Writer mode special handling: chunk might be complete JSON replacing accumulated\n            if (\n                len(chunk) > len(potential_object)\n                and chunk.startswith(\"{\")\n                and chunk.endswith(\"}\")\n            ):\n                potential_object = chunk\n            else:\n                potential_object += chunk\n            obj = process_potential_object(\n                potential_object, partial_mode, partial_model, **kwargs\n            )\n            final_obj = obj\n            yield obj\n\n        # Final validation: only validate if the JSON is structurally complete\n        # If JSON is incomplete (stream ended mid-object), skip validation\n        if final_obj is not None:\n            original_model = getattr(cls, \"_original_model\", None)\n            if original_model is not None:\n                if is_json_complete(potential_object.strip() or \"{}\"):\n                    original_model.model_validate(\n                        final_obj.model_dump(exclude_none=True), **kwargs\n                    )\n\n    @classmethod\n    def model_from_chunks(\n        cls, json_chunks: Iterable[Any], **kwargs: Any\n    ) -> Generator[T_Model, None, None]:\n        potential_object = \"\"\n        partial_model = cls.get_partial_model()\n        # Always use trailing-strings mode to preserve incomplete data during streaming\n        # PartialLiteralMixin is deprecated - completeness-based validation handles Literals\n        partial_mode = \"trailing-strings\"\n        final_obj = None\n        for chunk in json_chunks:\n            if chunk is None:\n                continue\n            if not isinstance(chunk, str):\n                try:\n                    chunk = str(chunk)\n                except Exception:\n                    continue\n            potential_object += remove_control_chars(chunk)\n            obj = process_potential_object(\n                potential_object, partial_mode, partial_model, **kwargs\n            )\n            final_obj = obj\n            yield obj\n\n        # Final validation: only validate if the JSON is structurally complete\n        # If JSON is incomplete (stream ended mid-object), skip validation\n        if final_obj is not None:\n            original_model = getattr(cls, \"_original_model\", None)\n            if original_model is not None:\n                if is_json_complete(potential_object.strip() or \"{}\"):\n                    original_model.model_validate(\n                        final_obj.model_dump(exclude_none=True), **kwargs\n                    )\n\n    @classmethod\n    async def model_from_chunks_async(\n        cls, json_chunks: AsyncGenerator[str, None], **kwargs: Any\n    ) -> AsyncGenerator[T_Model, None]:\n        potential_object = \"\"\n        partial_model = cls.get_partial_model()\n        # Always use trailing-strings mode to preserve incomplete data during streaming\n        # PartialLiteralMixin is deprecated - completeness-based validation handles Literals\n        partial_mode = \"trailing-strings\"\n        final_obj = None\n        async for chunk in json_chunks:\n            if chunk is None:\n                continue\n            if not isinstance(chunk, str):\n                try:\n                    chunk = str(chunk)\n                except Exception:\n                    continue\n            potential_object += remove_control_chars(chunk)\n            obj = process_potential_object(\n                potential_object, partial_mode, partial_model, **kwargs\n            )\n            final_obj = obj\n            yield obj\n\n        # Final validation: only validate if the JSON is structurally complete\n        # If JSON is incomplete (stream ended mid-object), skip validation\n        if final_obj is not None:\n            original_model = getattr(cls, \"_original_model\", None)\n            if original_model is not None:\n                if is_json_complete(potential_object.strip() or \"{}\"):\n                    original_model.model_validate(\n                        final_obj.model_dump(exclude_none=True), **kwargs\n                    )\n\n    @staticmethod\n    def extract_json(\n        completion: Iterable[Any], mode: Mode\n    ) -> Generator[str, None, None]:\n        \"\"\"Extract JSON chunks from various LLM provider streaming responses.\n\n        Each provider has a different structure for streaming responses that needs\n        specific handling to extract the relevant JSON data.\"\"\"\n        json_started = False\n        for chunk in completion:\n            try:\n                if mode in {Mode.COHERE_TOOLS, Mode.COHERE_JSON_SCHEMA}:\n                    event_type = getattr(chunk, \"event_type\", None)\n                    if event_type == \"text-generation\":\n                        if text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-chunk\":\n                        delta = getattr(chunk, \"tool_call_delta\", None)\n                        args = getattr(delta, \"parameters\", None) or getattr(\n                            delta, \"text\", None\n                        )\n                        if args:\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-generation\":\n                        tool_calls = getattr(chunk, \"tool_calls\", None)\n                        if tool_calls:\n                            args = json.dumps(tool_calls[0].parameters)\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    else:\n                        chunk_type = getattr(chunk, \"type\", None)\n                        if chunk_type == \"content-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            content = getattr(message, \"content\", None)\n                            if text := getattr(content, \"text\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                text.find(\"{\"),\n                                                text.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    text = text[json_start:]\n                                yield text\n                        elif chunk_type == \"tool-call-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            tool_calls = getattr(message, \"tool_calls\", None)\n                            function = getattr(tool_calls, \"function\", None)\n                            if args := getattr(function, \"arguments\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                args.find(\"{\"),\n                                                args.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    args = args[json_start:]\n                                yield args\n                if mode == Mode.MISTRAL_STRUCTURED_OUTPUTS:\n                    yield chunk.data.choices[0].delta.content\n                if mode == Mode.MISTRAL_TOOLS:\n                    if not chunk.data.choices[0].delta.tool_calls:\n                        continue\n                    yield chunk.data.choices[0].delta.tool_calls[0].function.arguments\n                if mode == Mode.ANTHROPIC_JSON:\n                    if json_chunk := chunk.delta.text:\n                        yield json_chunk\n                if mode == Mode.ANTHROPIC_TOOLS:\n                    yield chunk.delta.partial_json\n                if mode == Mode.VERTEXAI_JSON:\n                    yield chunk.candidates[0].content.parts[0].text\n                if mode == Mode.VERTEXAI_TOOLS:\n                    yield json.dumps(\n                        chunk.candidates[0].content.parts[0].function_call.args\n                    )\n\n                if mode == Mode.GENAI_STRUCTURED_OUTPUTS:\n                    try:\n                        yield chunk.text\n                    except ValueError as e:\n                        if \"valid `Part`\" in str(e):\n                            # Skip chunk with invalid Part (e.g., due to finish_reason=1 token limit)\n                            continue\n                        raise\n                if mode == Mode.GENAI_TOOLS:\n                    fc = chunk.candidates[0].content.parts[0].function_call.args\n                    yield json.dumps(fc)\n                if mode == Mode.GEMINI_JSON:\n                    try:\n                        yield chunk.text\n                    except ValueError as e:\n                        if \"valid `Part`\" in str(e):\n                            # Skip chunk with invalid Part (e.g., due to finish_reason=1 token limit)\n                            continue\n                        raise\n                if mode == Mode.GEMINI_TOOLS:\n                    resp = chunk.candidates[0].content.parts[0].function_call\n                    resp_dict = type(resp).to_dict(resp)  # type:ignore\n                    if \"args\" in resp_dict:\n                        yield json.dumps(resp_dict[\"args\"])\n                elif mode in {\n                    Mode.RESPONSES_TOOLS,\n                    Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n                }:\n                    from openai.types.responses import (\n                        ResponseFunctionCallArgumentsDeltaEvent,\n                    )\n\n                    if isinstance(chunk, ResponseFunctionCallArgumentsDeltaEvent):\n                        yield chunk.delta\n\n                elif chunk.choices:\n                    if mode == Mode.FUNCTIONS:\n                        Mode.warn_mode_functions_deprecation()\n                        if json_chunk := chunk.choices[0].delta.function_call.arguments:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.JSON,\n                        Mode.MD_JSON,\n                        Mode.JSON_SCHEMA,\n                        Mode.CEREBRAS_JSON,\n                        Mode.FIREWORKS_JSON,\n                        Mode.PERPLEXITY_JSON,\n                        Mode.WRITER_JSON,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.content:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.TOOLS,\n                        Mode.TOOLS_STRICT,\n                        Mode.FIREWORKS_TOOLS,\n                        Mode.WRITER_TOOLS,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.tool_calls:\n                            if json_chunk[0].function.arguments:\n                                yield json_chunk[0].function.arguments\n                    else:\n                        raise NotImplementedError(\n                            f\"Mode {mode} is not supported for MultiTask streaming\"\n                        )\n            except AttributeError:\n                pass\n\n    @staticmethod\n    async def extract_json_async(\n        completion: AsyncGenerator[Any, None], mode: Mode\n    ) -> AsyncGenerator[str, None]:\n        json_started = False\n        async for chunk in completion:\n            try:\n                if mode in {Mode.COHERE_TOOLS, Mode.COHERE_JSON_SCHEMA}:\n                    event_type = getattr(chunk, \"event_type\", None)\n                    if event_type == \"text-generation\":\n                        if text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-chunk\":\n                        delta = getattr(chunk, \"tool_call_delta\", None)\n                        args = getattr(delta, \"parameters\", None) or getattr(\n                            delta, \"text\", None\n                        )\n                        if args:\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    elif event_type == \"tool-calls-generation\":\n                        tool_calls = getattr(chunk, \"tool_calls\", None)\n                        if tool_calls:\n                            args = json.dumps(tool_calls[0].parameters)\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (args.find(\"{\"), args.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                args = args[json_start:]\n                            yield args\n                        elif text := getattr(chunk, \"text\", None):\n                            if not json_started:\n                                json_start = min(\n                                    (\n                                        pos\n                                        for pos in (text.find(\"{\"), text.find(\"[\"))\n                                        if pos != -1\n                                    ),\n                                    default=-1,\n                                )\n                                if json_start == -1:\n                                    continue\n                                json_started = True\n                                text = text[json_start:]\n                            yield text\n                    else:\n                        chunk_type = getattr(chunk, \"type\", None)\n                        if chunk_type == \"content-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            content = getattr(message, \"content\", None)\n                            if text := getattr(content, \"text\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                text.find(\"{\"),\n                                                text.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    text = text[json_start:]\n                                yield text\n                        elif chunk_type == \"tool-call-delta\":\n                            delta = getattr(chunk, \"delta\", None)\n                            message = getattr(delta, \"message\", None)\n                            tool_calls = getattr(message, \"tool_calls\", None)\n                            function = getattr(tool_calls, \"function\", None)\n                            if args := getattr(function, \"arguments\", None):\n                                if not json_started:\n                                    json_start = min(\n                                        (\n                                            pos\n                                            for pos in (\n                                                args.find(\"{\"),\n                                                args.find(\"[\"),\n                                            )\n                                            if pos != -1\n                                        ),\n                                        default=-1,\n                                    )\n                                    if json_start == -1:\n                                        continue\n                                    json_started = True\n                                    args = args[json_start:]\n                                yield args\n                if mode == Mode.ANTHROPIC_JSON:\n                    if json_chunk := chunk.delta.text:\n                        yield json_chunk\n                if mode == Mode.ANTHROPIC_TOOLS:\n                    yield chunk.delta.partial_json\n                if mode == Mode.MISTRAL_STRUCTURED_OUTPUTS:\n                    yield chunk.data.choices[0].delta.content\n                if mode == Mode.MISTRAL_TOOLS:\n                    if not chunk.data.choices[0].delta.tool_calls:\n                        continue\n                    yield chunk.data.choices[0].delta.tool_calls[0].function.arguments\n                if mode == Mode.VERTEXAI_JSON:\n                    yield chunk.candidates[0].content.parts[0].text\n                if mode == Mode.VERTEXAI_TOOLS:\n                    yield json.dumps(\n                        chunk.candidates[0].content.parts[0].function_call.args\n                    )\n                if mode == Mode.GENAI_STRUCTURED_OUTPUTS:\n                    try:\n                        yield chunk.text\n                    except ValueError as e:\n                        if \"valid `Part`\" in str(e):\n                            # Skip chunk with invalid Part (e.g., due to finish_reason=1 token limit)\n                            continue\n                        raise\n                if mode == Mode.GENAI_TOOLS:\n                    fc = chunk.candidates[0].content.parts[0].function_call.args\n                    yield json.dumps(fc)\n                if mode == Mode.GEMINI_JSON:\n                    try:\n                        yield chunk.text\n                    except ValueError as e:\n                        if \"valid `Part`\" in str(e):\n                            # Skip chunk with invalid Part (e.g., due to finish_reason=1 token limit)\n                            continue\n                        raise\n                if mode == Mode.GEMINI_TOOLS:\n                    resp = chunk.candidates[0].content.parts[0].function_call\n                    resp_dict = type(resp).to_dict(resp)  # type:ignore\n                    if \"args\" in resp_dict:\n                        yield json.dumps(resp_dict[\"args\"])\n\n                if mode in {\n                    Mode.RESPONSES_TOOLS,\n                    Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n                }:\n                    from openai.types.responses import (\n                        ResponseFunctionCallArgumentsDeltaEvent,\n                    )\n\n                    if isinstance(chunk, ResponseFunctionCallArgumentsDeltaEvent):\n                        yield chunk.delta\n                elif chunk.choices:\n                    if mode == Mode.FUNCTIONS:\n                        Mode.warn_mode_functions_deprecation()\n                        if json_chunk := chunk.choices[0].delta.function_call.arguments:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.JSON,\n                        Mode.MD_JSON,\n                        Mode.JSON_SCHEMA,\n                        Mode.CEREBRAS_JSON,\n                        Mode.FIREWORKS_JSON,\n                        Mode.PERPLEXITY_JSON,\n                        Mode.WRITER_JSON,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.content:\n                            yield json_chunk\n                    elif mode in {\n                        Mode.TOOLS,\n                        Mode.TOOLS_STRICT,\n                        Mode.FIREWORKS_TOOLS,\n                        Mode.WRITER_TOOLS,\n                    }:\n                        if json_chunk := chunk.choices[0].delta.tool_calls:\n                            if json_chunk[0].function.arguments:\n                                yield json_chunk[0].function.arguments\n                    else:\n                        raise NotImplementedError(\n                            f\"Mode {mode} is not supported for MultiTask streaming\"\n                        )\n            except AttributeError:\n                pass\n\n\nclass Partial(Generic[T_Model]):\n    \"\"\"Generate a new class which has PartialBase as a base class.\n\n    Notes:\n        This will enable partial validation of the model while streaming.\n\n    Example:\n        Partial[SomeModel]\n    \"\"\"\n\n    def __new__(\n        cls,\n        *args: object,  # noqa\n        **kwargs: object,  # noqa\n    ) -> Partial[T_Model]:\n        \"\"\"Cannot instantiate.\n\n        Raises:\n            TypeError: Direct instantiation not allowed.\n        \"\"\"\n        raise TypeError(\"Cannot instantiate abstract Partial class.\")\n\n    def __init_subclass__(\n        cls,\n        *args: object,\n        **kwargs: object,\n    ) -> NoReturn:\n        \"\"\"Cannot subclass.\n\n        Raises:\n           TypeError: Subclassing not allowed.\n        \"\"\"\n        raise TypeError(f\"Cannot subclass {cls.__module__}.Partial\")\n\n    def __class_getitem__(\n        cls,\n        wrapped_class: type[T_Model] | tuple[type[T_Model], type[MakeFieldsOptional]],\n    ) -> type[T_Model]:\n        \"\"\"Convert model to one that inherits from PartialBase.\n\n        We don't make the fields optional at this point, we just wrap them with `Partial` so the names of the nested models will be\n        `Partial{ModelName}`. We want the output of `model_json_schema()` to\n        reflect the name change, but everything else should be the same as the\n        original model. During validation, we'll generate a true partial model\n        to support partially defined fields.\n\n        \"\"\"\n\n        make_fields_optional = None\n        if isinstance(wrapped_class, tuple):\n            wrapped_class, make_fields_optional = wrapped_class\n\n        def _wrap_models(field: FieldInfo) -> tuple[object, FieldInfo]:\n            tmp_field = deepcopy(field)\n\n            annotation = field.annotation\n\n            # Handle generics (like List, Dict, etc.)\n            if get_origin(annotation) is not None:\n                # Get the generic base (like List, Dict) and its arguments (like User in List[User])\n                generic_base = get_origin(annotation)\n                generic_args = get_args(annotation)\n\n                modified_args = tuple(_process_generic_arg(arg) for arg in generic_args)\n\n                # Reconstruct the generic type with modified arguments\n                tmp_field.annotation = (\n                    generic_base[modified_args] if generic_base else None\n                )\n            # If the field is a BaseModel, then recursively convert it's\n            # attributes to optionals.\n            elif isinstance(annotation, type) and issubclass(annotation, BaseModel):\n                # Prevent infinite recursion for self-referential models\n                if annotation in _processing_models:\n                    tmp_field.annotation = (\n                        annotation  # Already processing, keep unwrapped\n                    )\n                else:\n                    _processing_models.add(annotation)\n                    try:\n                        tmp_field.annotation = Partial[annotation]\n                    finally:\n                        _processing_models.discard(annotation)\n            return tmp_field.annotation, tmp_field\n\n        model_name = (\n            wrapped_class.__name__\n            if wrapped_class.__name__.startswith(\"Partial\")\n            else f\"Partial{wrapped_class.__name__}\"\n        )\n\n        partial_model = create_model(\n            model_name,\n            __base__=(wrapped_class, PartialBase),  # type: ignore\n            __module__=wrapped_class.__module__,\n            **{\n                field_name: (\n                    _make_field_optional(field_info)\n                    if make_fields_optional is not None\n                    else _wrap_models(field_info)\n                )\n                for field_name, field_info in wrapped_class.model_fields.items()\n            },  # type: ignore\n        )\n\n        # Store reference to original model for final validation\n        partial_model._original_model = wrapped_class  # type: ignore[attr-defined]\n\n        return partial_model\n"
  },
  {
    "path": "instructor/dsl/response_list.py",
    "content": "\"\"\"List-like response wrapper.\n\nWhen a response model returns a list (for example `list[User]`), we still want to\nattach the provider's raw response so `create_with_completion()` can return it.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any, Generic, TypeVar\n\nT = TypeVar(\"T\")\n\n\nclass ListResponse(list[T], Generic[T]):\n    \"\"\"A list that preserves the underlying provider response.\n\n    This is used when a call returns a list of objects (e.g. `list[User]`), so\n    `create_with_completion()` can still return `(result, raw_response)` without\n    crashing on a plain `list`.\n    \"\"\"\n\n    _raw_response: Any | None\n\n    def __init__(self, iterable=(), _raw_response: Any | None = None):  # type: ignore[no-untyped-def]\n        super().__init__(iterable)\n        self._raw_response = _raw_response\n\n    @classmethod\n    def from_list(cls, items: list[T], *, raw_response: Any | None) -> ListResponse[T]:\n        return cls(items, _raw_response=raw_response)\n\n    def get_raw_response(self) -> Any | None:\n        return self._raw_response\n\n    def __getitem__(self, key):  # type: ignore[no-untyped-def]\n        value = super().__getitem__(key)\n        if isinstance(key, slice):\n            return type(self)(value, _raw_response=self._raw_response)\n        return value\n\n\n# Backwards-friendly alias\nResponseList = ListResponse\n"
  },
  {
    "path": "instructor/dsl/simple_type.py",
    "content": "from __future__ import annotations\nfrom inspect import isclass\nimport typing\nfrom pydantic import BaseModel, create_model\nfrom enum import Enum\nfrom typing import TYPE_CHECKING\n\nfrom instructor.dsl.partial import Partial\n\nif TYPE_CHECKING:\n    pass\n\n\nT = typing.TypeVar(\"T\")\n\n\nclass AdapterBase(BaseModel):\n    pass\n\n\nclass ModelAdapter(typing.Generic[T]):\n    \"\"\"\n    Accepts a response model and returns a BaseModel with the response model as the content.\n    \"\"\"\n\n    def __class_getitem__(cls, response_model: type[BaseModel]) -> type[BaseModel]:\n        # Import at runtime to avoid circular import\n        from ..processing.function_calls import OpenAISchema\n\n        assert is_simple_type(response_model), \"Only simple types are supported\"\n        return create_model(\n            \"Response\",\n            content=(response_model, ...),\n            __doc__=\"Correctly Formatted and Extracted Response.\",\n            __base__=(AdapterBase, OpenAISchema),\n        )\n\n\ndef validateIsSubClass(response_model: type):\n    \"\"\"\n    Temporary guard against issues with generics in Python 3.9\n    \"\"\"\n    import sys\n\n    if sys.version_info < (3, 10):\n        if len(typing.get_args(response_model)) == 0:\n            return False\n        return issubclass(typing.get_args(response_model)[0], BaseModel)\n    try:\n        # Add a guard here to prevent issues with GenericAlias\n        import types\n\n        if isinstance(response_model, types.GenericAlias):\n            return False\n    except Exception:\n        pass\n\n    return issubclass(response_model, BaseModel)\n\n\ndef is_simple_type(\n    response_model: type[BaseModel] | str | int | float | bool | typing.Any,\n) -> bool:\n    # ! we're getting mixes between classes and instances due to how we handle some\n    # ! response model types, we should fix this in later PRs\n\n    # Special case for Python 3.9: Directly handle list[Union[int, str]] pattern\n    import sys\n\n    if sys.version_info < (3, 10):\n        # Check if it's a list type with Union arguments using string representation\n        if str(response_model).startswith(\"list[typing.Union[\") or \"list[Union[\" in str(\n            response_model\n        ):\n            return True\n\n    try:\n        if isclass(response_model) and validateIsSubClass(response_model):\n            return False\n    except TypeError:\n        # ! In versions < 3.11, typing.Iterable is not a class, so we can't use isclass\n        # ! for now if `response_model` is an Iterable isclass and issubclass will raise\n        # ! TypeError, so we need to check if `response_model` is an Iterable\n        # ! This is a workaround for now, we should fix this in later PRs\n        return False\n\n    # Get the origin of the response model\n    origin = typing.get_origin(response_model)\n\n    # Handle special case for list[int | str], list[Union[int, str]] or similar type patterns\n    # Identify a list type by checking for various origins it might have\n    if origin in {typing.Iterable, Partial, list}:\n        # For list types, check the contents before deciding\n        if origin is list:\n            # Extract the inner types from the list\n            args = typing.get_args(response_model)\n            if args and len(args) == 1:\n                inner_arg = args[0]\n                # Special handling for Union types\n                inner_origin = typing.get_origin(inner_arg)\n\n                # Explicit check for Union types - try different patterns across Python versions\n                if (\n                    inner_origin is typing.Union\n                    or inner_origin == typing.Union\n                    or str(inner_origin) == \"typing.Union\"\n                    or str(type(inner_arg)) == \"<class 'typing._UnionGenericAlias'>\"\n                ):\n                    return True\n\n                # Check for Python 3.10+ pipe syntax\n                if hasattr(inner_arg, \"__or__\"):\n                    return True\n\n                # For simple list with basic types, also return True\n                if inner_arg in {str, int, float, bool}:\n                    return True\n\n                # Check if inner type is a BaseModel - if so, not a simple type\n                try:\n                    if isclass(inner_arg) and issubclass(inner_arg, BaseModel):\n                        return False\n                except TypeError:\n                    pass\n\n            # If no args or unknown pattern, treat as simple list\n            return len(args) == 0\n\n        # Extract the inner types from the list for other iterable types\n        args = typing.get_args(response_model)\n        if args and len(args) == 1:\n            inner_arg = args[0]\n            # Special handling for Union types\n            inner_origin = typing.get_origin(inner_arg)\n\n            # Explicit check for Union types - try different patterns across Python versions\n            if (\n                inner_origin is typing.Union\n                or inner_origin == typing.Union\n                or str(inner_origin) == \"typing.Union\"\n                or str(type(inner_arg)) == \"<class 'typing._UnionGenericAlias'>\"\n            ):\n                return True\n\n            # Check for Python 3.10+ pipe syntax\n            if hasattr(inner_arg, \"__or__\"):\n                return True\n\n            # For simple list with basic types, also return True\n            if inner_arg in {str, int, float, bool}:\n                return True\n\n        # For other iterable patterns, return False (e.g., streaming types)\n        return False\n\n    if response_model in {\n        str,\n        int,\n        float,\n        bool,\n    }:\n        return True\n\n    # If the response_model is a simple type like annotated\n    if origin in {\n        typing.Annotated,\n        typing.Literal,\n        typing.Union,\n        list,  # origin of List[T] is list\n    }:\n        return True\n\n    if isclass(response_model) and issubclass(response_model, Enum):\n        return True\n\n    return False\n"
  },
  {
    "path": "instructor/dsl/validators.py",
    "content": "\"\"\"Backwards compatibility module for instructor.dsl.validators.\n\nThis module provides lazy imports to avoid circular import issues.\n\"\"\"\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to avoid circular dependencies.\"\"\"\n    from ..processing import validators as processing_validators\n    from .. import validation\n\n    # Try processing.validators first\n    if hasattr(processing_validators, name):\n        return getattr(processing_validators, name)\n\n    # Then try validation module\n    if hasattr(validation, name):\n        return getattr(validation, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/exceptions.py",
    "content": "\"\"\"Backward compatibility module for instructor.exceptions imports.\n\n.. deprecated:: 1.11.0\n    This module is deprecated. Import exceptions from `instructor.core` instead.\n    For example: `from instructor.core import InstructorRetryException`\n\"\"\"\n\nimport warnings\n\n# Show deprecation warning when this module is imported\nwarnings.warn(\n    \"Importing from 'instructor.exceptions' is deprecated and will be removed in a future version. \"\n    \"Please import from 'instructor.core' instead. \"\n    \"For example: 'from instructor.core import InstructorRetryException'\",\n    DeprecationWarning,\n    stacklevel=2,\n)\n\n# Explicit re-exports for better IDE support and clarity\nfrom .core.exceptions import (\n    AsyncValidationError,\n    ClientError,\n    ConfigurationError,\n    FailedAttempt,\n    IncompleteOutputException,\n    InstructorError,\n    InstructorRetryException,\n    ModeError,\n    MultimodalError,\n    ProviderError,\n    ResponseParsingError,\n    ValidationError,\n)\n\n__all__ = [\n    \"AsyncValidationError\",\n    \"ClientError\",\n    \"ConfigurationError\",\n    \"FailedAttempt\",\n    \"IncompleteOutputException\",\n    \"InstructorError\",\n    \"InstructorRetryException\",\n    \"ModeError\",\n    \"MultimodalError\",\n    \"ProviderError\",\n    \"ResponseParsingError\",\n    \"ValidationError\",\n]\n"
  },
  {
    "path": "instructor/function_calls.py",
    "content": "\"\"\"Backwards compatibility module for instructor.function_calls.\n\nThis module re-exports everything from instructor.processing.function_calls\nfor backwards compatibility.\n\"\"\"\n\n# Re-export everything from the actual function_calls module\nfrom .processing.function_calls import *  # noqa: F401, F403\n"
  },
  {
    "path": "instructor/hooks.py",
    "content": "\"\"\"Backwards compatibility module for instructor.hooks.\n\nThis module provides lazy imports to maintain backwards compatibility.\n\"\"\"\n\nimport warnings\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to provide backward compatibility for hooks imports.\"\"\"\n    warnings.warn(\n        f\"Importing from 'instructor.hooks' is deprecated and will be removed in v2.0.0. \"\n        f\"Please update your imports to use 'instructor.core.hooks.{name}' instead:\\n\"\n        \"  from instructor.core.hooks import Hooks, HookName\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    from .core import hooks as core_hooks\n\n    # Try to get the attribute from the core.hooks module\n    if hasattr(core_hooks, name):\n        return getattr(core_hooks, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/mode.py",
    "content": "import enum\nimport warnings\n\n\n# Track if deprecation warning has been shown\n_functions_deprecation_shown = False\n\n\nclass Mode(enum.Enum):\n    \"\"\"\n    Mode enumeration for patching LLM API clients.\n\n    Each mode determines how the library formats and structures requests\n    to different provider APIs and how it processes their responses.\n    \"\"\"\n\n    # OpenAI modes\n    FUNCTIONS = \"function_call\"  # Deprecated\n    PARALLEL_TOOLS = \"parallel_tool_call\"\n    TOOLS = \"tool_call\"\n    TOOLS_STRICT = \"tools_strict\"\n    JSON = \"json_mode\"\n    JSON_O1 = \"json_o1\"\n    MD_JSON = \"markdown_json_mode\"\n    JSON_SCHEMA = \"json_schema_mode\"\n\n    # Add new modes to support responses api\n    RESPONSES_TOOLS = \"responses_tools\"\n    RESPONSES_TOOLS_WITH_INBUILT_TOOLS = \"responses_tools_with_inbuilt_tools\"\n\n    # XAI modes\n    XAI_JSON = \"xai_json\"\n    XAI_TOOLS = \"xai_tools\"\n\n    # Anthropic modes\n    ANTHROPIC_TOOLS = \"anthropic_tools\"\n    ANTHROPIC_REASONING_TOOLS = \"anthropic_reasoning_tools\"\n    ANTHROPIC_JSON = \"anthropic_json\"\n    ANTHROPIC_PARALLEL_TOOLS = \"anthropic_parallel_tools\"\n\n    # Mistral modes\n    MISTRAL_TOOLS = \"mistral_tools\"\n    MISTRAL_STRUCTURED_OUTPUTS = \"mistral_structured_outputs\"\n\n    # Vertex AI & Google modes\n    VERTEXAI_TOOLS = \"vertexai_tools\"\n    VERTEXAI_JSON = \"vertexai_json\"\n    VERTEXAI_PARALLEL_TOOLS = \"vertexai_parallel_tools\"\n    GEMINI_JSON = \"gemini_json\"\n    GEMINI_TOOLS = \"gemini_tools\"\n    GENAI_TOOLS = \"genai_tools\"\n    GENAI_STRUCTURED_OUTPUTS = \"genai_structured_outputs\"\n\n    # Cohere modes\n    COHERE_TOOLS = \"cohere_tools\"\n    COHERE_JSON_SCHEMA = \"json_object\"\n\n    # Cerebras modes\n    CEREBRAS_TOOLS = \"cerebras_tools\"\n    CEREBRAS_JSON = \"cerebras_json\"\n\n    # Fireworks modes\n    FIREWORKS_TOOLS = \"fireworks_tools\"\n    FIREWORKS_JSON = \"fireworks_json\"\n\n    # Other providers\n    WRITER_TOOLS = \"writer_tools\"\n    WRITER_JSON = \"writer_json\"\n    BEDROCK_TOOLS = \"bedrock_tools\"\n    BEDROCK_JSON = \"bedrock_json\"\n    PERPLEXITY_JSON = \"perplexity_json\"\n    OPENROUTER_STRUCTURED_OUTPUTS = \"openrouter_structured_outputs\"\n\n    # Classification helpers\n    @classmethod\n    def tool_modes(cls) -> set[\"Mode\"]:\n        \"\"\"Returns a set of all tool-based modes.\"\"\"\n        return {\n            cls.FUNCTIONS,\n            cls.PARALLEL_TOOLS,\n            cls.TOOLS,\n            cls.TOOLS_STRICT,\n            cls.ANTHROPIC_TOOLS,\n            cls.ANTHROPIC_REASONING_TOOLS,\n            cls.ANTHROPIC_PARALLEL_TOOLS,\n            cls.MISTRAL_TOOLS,\n            cls.VERTEXAI_TOOLS,\n            cls.VERTEXAI_PARALLEL_TOOLS,\n            cls.GEMINI_TOOLS,\n            cls.COHERE_TOOLS,\n            cls.CEREBRAS_TOOLS,\n            cls.FIREWORKS_TOOLS,\n            cls.WRITER_TOOLS,\n            cls.BEDROCK_TOOLS,\n            cls.OPENROUTER_STRUCTURED_OUTPUTS,\n            cls.MISTRAL_STRUCTURED_OUTPUTS,\n            cls.XAI_TOOLS,\n            cls.GENAI_TOOLS,\n            cls.RESPONSES_TOOLS,\n            cls.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n        }\n\n    @classmethod\n    def json_modes(cls) -> set[\"Mode\"]:\n        \"\"\"Returns a set of all JSON-based modes.\"\"\"\n        return {\n            cls.JSON,\n            cls.JSON_O1,\n            cls.MD_JSON,\n            cls.JSON_SCHEMA,\n            cls.ANTHROPIC_JSON,\n            cls.VERTEXAI_JSON,\n            cls.GEMINI_JSON,\n            cls.COHERE_JSON_SCHEMA,\n            cls.CEREBRAS_JSON,\n            cls.FIREWORKS_JSON,\n            cls.WRITER_JSON,\n            cls.BEDROCK_JSON,\n            cls.PERPLEXITY_JSON,\n            cls.OPENROUTER_STRUCTURED_OUTPUTS,\n            cls.MISTRAL_STRUCTURED_OUTPUTS,\n            cls.XAI_JSON,\n        }\n\n    @classmethod\n    def warn_mode_functions_deprecation(cls):\n        \"\"\"\n        Warn about FUNCTIONS mode deprecation.\n\n        Shows the warning only once per session to avoid spamming logs\n        with the same message.\n        \"\"\"\n        global _functions_deprecation_shown\n        if not _functions_deprecation_shown:\n            warnings.warn(\n                \"The FUNCTIONS mode is deprecated and will be removed in future versions\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            _functions_deprecation_shown = True\n"
  },
  {
    "path": "instructor/models.py",
    "content": "from typing_extensions import TypeAliasType\nfrom typing import Literal\n\n\nKnownModelName = TypeAliasType(\n    \"KnownModelName\",\n    Literal[\n        # Anthropic Models\n        \"anthropic/claude-3-7-sonnet-latest\",\n        \"anthropic/claude-3-7-sonnet-20250219\",\n        \"anthropic/claude-3-5-sonnet-latest\",\n        \"anthropic/claude-3-5-sonnet-20241022\",\n        \"anthropic/claude-3-5-sonnet-20240620\",\n        \"anthropic/claude-3-5-haiku-latest\",\n        \"anthropic/claude-3-5-haiku-20241022\",\n        \"anthropic/claude-3-opus-latest\",\n        \"anthropic/claude-3-opus-20240229\",\n        \"anthropic/claude-3-haiku-20240307\",\n        # Cohere Models - https://docs.cohere.com/docs/models\n        \"cohere/c4ai-aya-expanse-32b\",\n        \"cohere/c4ai-aya-expanse-8b\",\n        \"cohere/command\",\n        \"cohere/command-light\",\n        \"cohere/command-light-nightly\",\n        \"cohere/command-nightly\",\n        \"cohere/command-a-03-2025\",\n        \"cohere/command-r7b-12-2024\",\n        \"cohere/command-a-translate-08-2025\",\n        \"cohere/command-a-reasoning-08-2025\",\n        \"cohere/command-r\",  # deprecated 2025-09-15\n        \"cohere/command-r-03-2024\",  # deprecated 2025-09-15\n        \"cohere/command-r-08-2024\",\n        \"cohere/command-r-plus\",  # deprecated 2025-09-15\n        \"cohere/command-r-plus-04-2024\",  # deprecated 2025-09-15\n        \"cohere/command-r-plus-08-2024\",\n        \"cohere/command-r7b-12-2024\",\n        # OpenAI Models\n        \"openai/gpt-3.5-turbo\",\n        \"openai/gpt-3.5-turbo-0125\",\n        \"openai/gpt-3.5-turbo-1106\",\n        \"openai/gpt-3.5-turbo-16k\",\n        \"openai/gpt-4\",\n        \"openai/gpt-4-0125-preview\",\n        \"openai/gpt-4-0613\",\n        \"openai/gpt-4-1106-preview\",\n        \"openai/gpt-4-32k\",\n        \"openai/gpt-4-32k-0613\",\n        \"openai/gpt-4-turbo\",\n        \"openai/gpt-4-turbo-2024-04-09\",\n        \"openai/gpt-4-turbo-preview\",\n        \"openai/gpt-4.1\",\n        \"openai/gpt-4.1-2025-04-14\",\n        \"openai/gpt-4.1-mini\",\n        \"openai/gpt-4.1-mini-2025-04-14\",\n        \"openai/gpt-4.1-nano\",\n        \"openai/gpt-4.1-nano-2025-04-14\",\n        \"openai/gpt-4o\",\n        \"openai/gpt-4o-2024-05-13\",\n        \"openai/gpt-4o-2024-08-06\",\n        \"openai/gpt-4o-2024-11-20\",\n        \"openai/gpt-4o-audio-preview\",\n        \"openai/gpt-4o-audio-preview-2024-10-01\",\n        \"openai/gpt-4o-audio-preview-2024-12-17\",\n        \"openai/gpt-4o-mini\",\n        \"openai/gpt-4o-mini-2024-07-18\",\n        # Groq Models\n        \"groq/gemma2-9b-it\",\n        \"groq/llama-3.3-70b-versatile\",\n        \"groq/llama-3.1-8b-instant\",\n        \"groq/llama3-70b-8192\",\n        \"groq/llama3-8b-8192\",\n        \"groq/qwen-qwq-32b\",\n        # Mistral\n        \"mistral/codestral-latest\",\n        \"mistral/mistral-large-latest\",\n        \"mistral/mistral-small-latest\",\n        \"mistral/pixtral-large-latest\",\n        \"mistral/mistral-saba-latest\",\n        \"mistral/ministral-3b-latest\",\n        \"mistral/ministral-8b-latest\",\n        # Google Models\n        \"google/gemini-3-flash\",\n        \"google/gemini-3-flash-8b\",\n        \"google/gemini-1.5-pro\",\n        \"google/gemini-2.0-flash-exp\",\n        \"google/gemini-2.0-flash-thinking-exp-01-21\",\n        \"google/gemini-exp-1206\",\n        \"google/gemini-2.0-flash\",\n        \"google/gemini-2.0-flash-lite-preview-02-05\",\n        \"google/gemini-2.0-pro-exp-02-05\",\n        \"google/gemini-2.5-flash-preview-04-17\",\n        \"google/gemini-2.5-pro-exp-03-25\",\n        \"google/gemini-2.5-pro-preview-03-25\",\n        # VertexAI Models\n        \"vertexai/gemini-3-flash\",\n        \"vertexai/gemini-1.5-pro\",\n        \"vertexai/gemini-2.0-flash-exp\",\n        \"vertexai/gemini-2.0-flash-001\",\n        \"vertexai/gemini-2.0-flash-lite\",\n        \"vertexai/gemini-2.5-pro-preview-03-25\",\n        \"vertexai/gemini-2.5-pro-exp-03-25\",\n        \"vertexai/gemini-2.5-flash-preview-04-17\",\n        # Generative AI models\n        \"generative-ai/gemini-3-flash\",\n        \"generative-ai/gemini-3-flash-8b\",\n        \"generative-ai/gemini-1.5-pro\",\n        \"generative-ai/gemini-2.0-flash-exp\",\n        \"generative-ai/gemini-2.0-flash-thinking-exp-01-21\",\n        \"generative-ai/gemini-exp-1206\",\n        \"generative-ai/gemini-2.0-flash\",\n        \"generative-ai/gemini-2.0-flash-lite-preview-02-05\",\n        \"generative-ai/gemini-2.0-pro-exp-02-05\",\n        \"generative-ai/gemini-2.5-flash-preview-04-17\",\n        \"generative-ai/gemini-2.5-pro-exp-03-25\",\n        \"generative-ai/gemini-2.5-pro-preview-03-25\",\n        # Fireworks AI\n        \"fireworks/accounts/fireworks/models/llama4-maverick-instruct-basic\",\n        \"fireworks/accounts/fireworks/models/llama-v3p1-405b-instruct\",\n        \"fireworks/accounts/fireworks/models/llama4-scout-instruct-basic\",\n        \"fireworks/accounts/fireworks/models/qwen3-30b-a3b\",\n        \"fireworks/accounts/fireworks/models/qwen3-235b-a22b\",\n        \"fireworks/accounts/fireworks/models/deepseek-v3\",\n        \"fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct\",\n        \"fireworks/accounts/fireworks/models/llama-v3p3-70b-instruct\",\n        # Cerebras\n        \"cerebras/llama-4-scout-17b-16e-instruct\",\n        \"cerebras/llama3.1-8b\",\n        \"cerebras/llama-3.3-70b\",\n        # Writer\n        \"writer/palmyra-x5\",\n        \"writer/palmyra-x4\",\n        # Perplexity\n        \"perplexity/sonar-deep-research\",\n        \"perplexity/sonar-reasoning-pro\",\n        \"perplexity/sonar-pro\",\n        \"perplexity/sonar\",\n        \"perplexity/r1-1776\",\n    ],\n)\n"
  },
  {
    "path": "instructor/multimodal.py",
    "content": "\"\"\"Backwards compatibility module for instructor.multimodal.\n\nThis module provides lazy imports to maintain backwards compatibility.\n\"\"\"\n\nimport warnings\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to provide backward compatibility for multimodal imports.\"\"\"\n    # Issue deprecation warning when accessing multimodal imports\n    warnings.warn(\n        \"Importing from 'instructor.multimodal' is deprecated and will be removed in v2.0.0. \"\n        f\"Please update your imports to use 'instructor.processing.multimodal.{name}' instead:\\n\"\n        \"  from instructor.processing.multimodal import PDF, Image, Audio\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    from .processing import multimodal as processing_multimodal\n\n    # Try to get the attribute from the processing.multimodal module\n    if hasattr(processing_multimodal, name):\n        return getattr(processing_multimodal, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/patch.py",
    "content": "\"\"\"Backwards compatibility module for instructor.patch.\n\nThis module provides lazy imports to maintain backwards compatibility.\n\"\"\"\n\nimport warnings\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to provide backward compatibility for patch imports.\"\"\"\n    warnings.warn(\n        f\"Importing from 'instructor.patch' is deprecated and will be removed in v2.0.0. \"\n        f\"Please update your imports to use 'instructor.core.patch.{name}' instead:\\n\"\n        \"  from instructor.core.patch import patch, apatch\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    from .core import patch as core_patch\n\n    # Try to get the attribute from the core.patch module\n    if hasattr(core_patch, name):\n        return getattr(core_patch, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/process_response.py",
    "content": "\"\"\"Backwards compatibility module for instructor.process_response.\n\nThis module provides lazy imports to maintain backwards compatibility.\n\"\"\"\n\nimport warnings\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to provide backward compatibility for process_response imports.\"\"\"\n    warnings.warn(\n        f\"Importing from 'instructor.process_response' is deprecated and will be removed in v2.0.0. \"\n        f\"Please update your imports to use 'instructor.processing.response.{name}' instead:\\n\"\n        \"  from instructor.processing.response import process_response\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    from .processing import response as processing_response\n\n    # Try to get the attribute from the processing.response module\n    if hasattr(processing_response, name):\n        return getattr(processing_response, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/processing/__init__.py",
    "content": "\"\"\"Processing components for request/response handling.\"\"\"\n\nfrom .function_calls import OpenAISchema, openai_schema\nfrom .multimodal import convert_messages\nfrom .response import (\n    handle_response_model,\n    process_response,\n    process_response_async,\n    handle_reask_kwargs,\n)\nfrom .schema import (\n    generate_openai_schema,\n    generate_anthropic_schema,\n    generate_gemini_schema,\n)\nfrom .validators import Validator\n\n__all__ = [\n    \"OpenAISchema\",\n    \"openai_schema\",\n    \"convert_messages\",\n    \"handle_response_model\",\n    \"process_response\",\n    \"process_response_async\",\n    \"handle_reask_kwargs\",\n    \"generate_openai_schema\",\n    \"generate_anthropic_schema\",\n    \"generate_gemini_schema\",\n    \"Validator\",\n]\n"
  },
  {
    "path": "instructor/processing/function_calls.py",
    "content": "# type: ignore\nimport json\nimport logging\nimport re\nfrom functools import wraps\nfrom typing import Annotated, Any, Optional, TypeVar, cast\nfrom openai.types.chat import ChatCompletion\nfrom pydantic import (\n    BaseModel,\n    ConfigDict,\n    Field,\n    TypeAdapter,\n    create_model,\n)\n\nfrom ..core.exceptions import (\n    IncompleteOutputException,\n    ResponseParsingError,\n    ConfigurationError,\n)\nfrom ..mode import Mode\nfrom ..utils import (\n    classproperty,\n    extract_json_from_codeblock,\n)\nfrom .schema import (\n    generate_openai_schema,\n    generate_anthropic_schema,\n    generate_gemini_schema,\n)\n\n\nT = TypeVar(\"T\")\nModel = TypeVar(\"Model\", bound=BaseModel)\n\nlogger = logging.getLogger(\"instructor\")\n\n# No schema cache\n\n\n# Utility functions for common JSON parsing operations\ndef _handle_incomplete_output(completion: Any) -> None:\n    \"\"\"Check if a completion was incomplete and raise appropriate exception.\"\"\"\n    if (\n        hasattr(completion, \"choices\")\n        and completion.choices[0].finish_reason == \"length\"\n    ):\n        raise IncompleteOutputException(last_completion=completion)\n\n    # Handle Anthropic format\n    if hasattr(completion, \"stop_reason\") and completion.stop_reason == \"max_tokens\":\n        raise IncompleteOutputException(last_completion=completion)\n\n\ndef _extract_text_content(completion: Any) -> str:\n    \"\"\"Extract text content from various completion formats.\"\"\"\n    # OpenAI format\n    if hasattr(completion, \"choices\"):\n        return completion.choices[0].message.content or \"\"\n\n    # Simple text format\n    if hasattr(completion, \"text\"):\n        return completion.text\n\n    # Anthropic format\n    if hasattr(completion, \"content\"):\n        text_blocks = [c for c in completion.content if c.type == \"text\"]\n        if text_blocks:\n            return text_blocks[0].text\n\n    # Bedrock format\n    if isinstance(completion, dict) and \"output\" in completion:\n        try:\n            return completion.get(\"output\").get(\"message\").get(\"content\")[0].get(\"text\")\n        except (AttributeError, IndexError):\n            pass\n\n    return \"\"\n\n\ndef _validate_model_from_json(\n    cls: type[Any],\n    json_str: str,\n    validation_context: Optional[dict[str, Any]] = None,\n    strict: Optional[bool] = None,\n) -> Any:\n    \"\"\"Validate model from JSON string with appropriate error handling.\"\"\"\n    try:\n        if hasattr(cls, \"model_validate_json\"):\n            if strict:\n                return cls.model_validate_json(\n                    json_str, context=validation_context, strict=True\n                )\n            # Allow control characters\n            parsed = json.loads(json_str, strict=False)\n            return cls.model_validate(parsed, context=validation_context, strict=False)\n\n        adapter = TypeAdapter(cls)\n        if strict:\n            return adapter.validate_json(\n                json_str, context=validation_context, strict=True\n            )\n        parsed = json.loads(json_str, strict=False)\n        return adapter.validate_python(parsed, context=validation_context, strict=False)\n    except json.JSONDecodeError as e:\n        logger.debug(f\"JSON decode error: {e}\")\n        raise\n    except Exception as e:\n        logger.debug(f\"Model validation error: {e}\")\n        raise\n\n\nclass OpenAISchema(BaseModel):\n    # Ignore classproperty, since Pydantic doesn't understand it like it would a normal property.\n    model_config = ConfigDict(ignored_types=(classproperty,))\n\n    @classproperty\n    def openai_schema(cls) -> dict[str, Any]:\n        \"\"\"\n        Return the schema in the format of OpenAI's schema as jsonschema\n\n        Note:\n            Its important to add a docstring to describe how to best use this class, it will be included in the description attribute and be part of the prompt.\n\n        Returns:\n            model_json_schema (dict): A dictionary in the format of OpenAI's schema as jsonschema\n        \"\"\"\n        return generate_openai_schema(cls)\n\n    @classproperty\n    def anthropic_schema(cls) -> dict[str, Any]:\n        # Generate the Anthropic schema based on the OpenAI schema to avoid redundant schema generation\n        return generate_anthropic_schema(cls)\n\n    @classproperty\n    def gemini_schema(cls) -> Any:\n        # This is kept for backward compatibility but deprecated\n        return generate_gemini_schema(cls)\n\n    @classmethod\n    def from_response(\n        cls,\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n        mode: Mode = Mode.TOOLS,\n    ) -> BaseModel:\n        \"\"\"Execute the function from the response of an openai chat completion\n\n        Parameters:\n            completion (openai.ChatCompletion): The response from an openai chat completion\n            strict (bool): Whether to use strict json parsing\n            mode (Mode): The openai completion mode\n\n        Returns:\n            cls (OpenAISchema): An instance of the class\n        \"\"\"\n\n        if mode == Mode.ANTHROPIC_TOOLS:\n            return cls.parse_anthropic_tools(completion, validation_context, strict)\n\n        if mode == Mode.ANTHROPIC_TOOLS or mode == Mode.ANTHROPIC_REASONING_TOOLS:\n            return cls.parse_anthropic_tools(completion, validation_context, strict)\n\n        if mode == Mode.ANTHROPIC_JSON:\n            return cls.parse_anthropic_json(completion, validation_context, strict)\n\n        if mode == Mode.BEDROCK_JSON:\n            return cls.parse_bedrock_json(completion, validation_context, strict)\n\n        if mode == Mode.BEDROCK_TOOLS:\n            return cls.parse_bedrock_tools(completion, validation_context, strict)\n\n        if mode in {Mode.VERTEXAI_TOOLS, Mode.GEMINI_TOOLS}:\n            return cls.parse_vertexai_tools(completion, validation_context)\n\n        if mode == Mode.VERTEXAI_JSON:\n            return cls.parse_vertexai_json(completion, validation_context, strict)\n\n        if mode == Mode.COHERE_TOOLS:\n            return cls.parse_cohere_tools(completion, validation_context, strict)\n\n        if mode == Mode.GEMINI_JSON:\n            return cls.parse_gemini_json(completion, validation_context, strict)\n\n        if mode == Mode.GENAI_STRUCTURED_OUTPUTS:\n            return cls.parse_genai_structured_outputs(\n                completion, validation_context, strict\n            )\n\n        if mode == Mode.GEMINI_TOOLS:\n            return cls.parse_gemini_tools(completion, validation_context, strict)\n\n        if mode == Mode.GENAI_TOOLS:\n            return cls.parse_genai_tools(completion, validation_context, strict)\n\n        if mode == Mode.COHERE_JSON_SCHEMA:\n            return cls.parse_cohere_json_schema(completion, validation_context, strict)\n\n        if mode == Mode.WRITER_TOOLS:\n            return cls.parse_writer_tools(completion, validation_context, strict)\n\n        if mode == Mode.WRITER_JSON:\n            return cls.parse_writer_json(completion, validation_context, strict)\n\n        if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}:\n            return cls.parse_responses_tools(\n                completion,\n                validation_context,\n                strict,\n            )\n\n        if not completion.choices:\n            # This helps catch errors from OpenRouter\n            if hasattr(completion, \"error\"):\n                raise ResponseParsingError(\n                    f\"LLM provider returned error: {completion.error}\",\n                    mode=str(mode),\n                    raw_response=completion,\n                )\n\n            raise ResponseParsingError(\n                \"No completion choices found in LLM response\",\n                mode=str(mode),\n                raw_response=completion,\n            )\n\n        if completion.choices[0].finish_reason == \"length\":\n            raise IncompleteOutputException(last_completion=completion)\n\n        if mode == Mode.FUNCTIONS:\n            Mode.warn_mode_functions_deprecation()\n            return cls.parse_functions(completion, validation_context, strict)\n\n        if mode == Mode.MISTRAL_STRUCTURED_OUTPUTS:\n            return cls.parse_mistral_structured_outputs(\n                completion, validation_context, strict\n            )\n\n        if mode in {\n            Mode.TOOLS,\n            Mode.MISTRAL_TOOLS,\n            Mode.TOOLS_STRICT,\n            Mode.CEREBRAS_TOOLS,\n            Mode.FIREWORKS_TOOLS,\n        }:\n            return cls.parse_tools(completion, validation_context, strict)\n\n        if mode in {\n            Mode.JSON,\n            Mode.JSON_SCHEMA,\n            Mode.MD_JSON,\n            Mode.JSON_O1,\n            Mode.CEREBRAS_JSON,\n            Mode.FIREWORKS_JSON,\n            Mode.PERPLEXITY_JSON,\n            Mode.OPENROUTER_STRUCTURED_OUTPUTS,\n        }:\n            return cls.parse_json(completion, validation_context, strict)\n\n        raise ConfigurationError(\n            f\"Invalid or unsupported mode: {mode}. This mode may not be implemented for response parsing.\"\n        )\n\n    @classmethod\n    def parse_genai_structured_outputs(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        return cls.model_validate_json(\n            completion.text, context=validation_context, strict=strict\n        )\n\n    @classmethod\n    def parse_genai_tools(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        from google.genai import types\n\n        assert isinstance(completion, types.GenerateContentResponse)\n        assert len(completion.candidates) == 1\n\n        # Filter out thought parts (parts with thought: true)\n        parts = completion.candidates[0].content.parts\n        non_thought_parts = [\n            part for part in parts if not (hasattr(part, \"thought\") and part.thought)\n        ]\n\n        assert len(non_thought_parts) == 1, (\n            f\"Instructor does not support multiple function calls, use List[Model] instead\"\n        )\n        function_call = non_thought_parts[0].function_call\n        assert function_call is not None, (\n            f\"Please return your response as a function call with the schema {cls.openai_schema} and the name {cls.openai_schema['name']}\"\n        )\n\n        assert function_call.name == cls.openai_schema[\"name\"]\n        return cls.model_validate(\n            obj=function_call.args, context=validation_context, strict=strict\n        )\n\n    @classmethod\n    def parse_cohere_json_schema(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ):\n        # Handle both V1 and V2 response structures\n        if hasattr(completion, \"text\"):\n            # V1 format: direct text access\n            text = completion.text\n        elif hasattr(completion, \"message\") and hasattr(completion.message, \"content\"):\n            # V2 format: nested structure (message.content[].text)\n            # V2 responses may have multiple content items (thinking, text, etc.)\n            content_items = completion.message.content\n            if content_items and len(content_items) > 0:\n                # Find the text content item (skip thinking/other types)\n                # TODO handle these other content types\n                text = None\n                for item in content_items:\n                    if (\n                        hasattr(item, \"type\")\n                        and item.type == \"text\"\n                        and hasattr(item, \"text\")\n                    ):\n                        text = item.text\n                        break\n\n                if text is None:\n                    raise ResponseParsingError(\n                        \"Cohere V2 response has no text content item\",\n                        mode=\"COHERE_JSON_SCHEMA\",\n                        raw_response=completion,\n                    )\n            else:\n                raise ResponseParsingError(\n                    \"Cohere V2 response has no content\",\n                    mode=\"COHERE_JSON_SCHEMA\",\n                    raw_response=completion,\n                )\n        else:\n            raise ResponseParsingError(\n                f\"Unsupported Cohere response format. Expected 'text' (V1) or \"\n                f\"'message.content[].text' (V2), got: {type(completion)}\",\n                mode=\"COHERE_JSON_SCHEMA\",\n                raw_response=completion,\n            )\n\n        return cls.model_validate_json(text, context=validation_context, strict=strict)\n\n    @classmethod\n    def parse_anthropic_tools(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        from anthropic.types import Message\n\n        if isinstance(completion, Message) and completion.stop_reason == \"max_tokens\":\n            raise IncompleteOutputException(last_completion=completion)\n\n        # Anthropic returns arguments as a dict, dump to json for model validation below\n        tool_calls = [\n            json.dumps(c.input) for c in completion.content if c.type == \"tool_use\"\n        ]  # TODO update with anthropic specific types\n\n        tool_calls_validator = TypeAdapter(\n            Annotated[list[Any], Field(min_length=1, max_length=1)]\n        )\n        tool_call = tool_calls_validator.validate_python(tool_calls)[0]\n\n        return cls.model_validate_json(\n            tool_call, context=validation_context, strict=strict\n        )\n\n    @classmethod\n    def parse_anthropic_json(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        from anthropic.types import Message\n\n        last_block = None\n\n        if hasattr(completion, \"choices\"):\n            completion = completion.choices[0]\n            if completion.finish_reason == \"length\":\n                raise IncompleteOutputException(last_completion=completion)\n            text = completion.message.content\n        else:\n            assert isinstance(completion, Message)\n            if completion.stop_reason == \"max_tokens\":\n                raise IncompleteOutputException(last_completion=completion)\n            # Find the last text block in the completion\n            # this is because the completion is a list of blocks\n            # and the last block is the one that contains the text ideally\n            # this could happen due to things like multiple tool calls\n            # read: https://docs.anthropic.com/en/docs/build-with-claude/tool-use/web-search-tool#response\n            text_blocks = [c for c in completion.content if c.type == \"text\"]\n            last_block = text_blocks[-1]\n            text = last_block.text\n\n        extra_text = extract_json_from_codeblock(text)\n\n        if strict:\n            model = cls.model_validate_json(\n                extra_text, context=validation_context, strict=True\n            )\n        else:\n            # Allow control characters to pass through by using the non-strict JSON parser.\n            parsed = json.loads(extra_text, strict=False)\n            # Pydantic non-strict: https://docs.pydantic.dev/latest/concepts/strict_mode/\n            model = cls.model_validate(parsed, context=validation_context, strict=False)\n\n        return model\n\n    @classmethod\n    def parse_bedrock_json(\n        cls: type[BaseModel],\n        completion: Any,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        if isinstance(completion, dict):\n            # OpenAI will send the first content to be 'reasoningText', and then 'text'\n            content = completion[\"output\"][\"message\"][\"content\"]\n            text_content = next((c for c in content if \"text\" in c), None)\n            if not text_content:\n                raise ResponseParsingError(\n                    \"Unexpected format. No text content found in Bedrock response.\",\n                    mode=\"BEDROCK_JSON\",\n                    raw_response=completion,\n                )\n            text = text_content[\"text\"]\n            match = re.search(r\"```?json(.*?)```?\", text, re.DOTALL)\n            if match:\n                text = match.group(1).strip()\n\n            text = re.sub(r\"```?json|\\\\n\", \"\", text).strip()\n        else:\n            text = completion.text\n        return cls.model_validate_json(text, context=validation_context, strict=strict)\n\n    @classmethod\n    def parse_bedrock_tools(\n        cls: type[BaseModel],\n        completion: Any,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        if isinstance(completion, dict):\n            # Extract the tool use from Bedrock response\n            message = completion.get(\"output\", {}).get(\"message\", {})\n            content = message.get(\"content\", [])\n\n            # Find the tool use content block\n            for content_block in content:\n                if \"toolUse\" in content_block:\n                    tool_use = content_block[\"toolUse\"]\n                    assert tool_use.get(\"name\") == cls.__name__, (\n                        f\"Tool name mismatch: expected {cls.__name__}, got {tool_use.get('name')}\"\n                    )\n                    return cls.model_validate(\n                        tool_use.get(\"input\", {}),\n                        context=validation_context,\n                        strict=strict,\n                    )\n\n            raise ResponseParsingError(\n                \"No tool use found in Bedrock response\",\n                mode=\"BEDROCK_TOOLS\",\n                raw_response=completion,\n            )\n        else:\n            # Fallback for other response formats\n            return cls.model_validate_json(\n                completion.text, context=validation_context, strict=strict\n            )\n\n    @classmethod\n    def parse_gemini_json(\n        cls: type[BaseModel],\n        completion: Any,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        try:\n            text = completion.text\n        except ValueError:\n            logger.debug(\n                f\"Error response: {completion.result.candidates[0].finish_reason}\\n\\n{completion.result.candidates[0].safety_ratings}\"\n            )\n\n        try:\n            extra_text = extract_json_from_codeblock(text)  # type: ignore\n        except UnboundLocalError:\n            raise ResponseParsingError(\n                \"Unable to extract JSON from completion text. The response may have been blocked or empty.\",\n                mode=\"GEMINI_JSON\",\n                raw_response=completion,\n            ) from None\n\n        if strict:\n            return cls.model_validate_json(\n                extra_text, context=validation_context, strict=True\n            )\n        else:\n            # Allow control characters.\n            parsed = json.loads(extra_text, strict=False)\n            # Pydantic non-strict: https://docs.pydantic.dev/latest/concepts/strict_mode/\n            return cls.model_validate(parsed, context=validation_context, strict=False)\n\n    @classmethod\n    def parse_vertexai_tools(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n    ) -> BaseModel:\n        tool_call = completion.candidates[0].content.parts[0].function_call.args  # type: ignore\n        model = {}\n        for field in tool_call:  # type: ignore\n            model[field] = tool_call[field]\n        # We enable strict=False because the conversion from protobuf -> dict often results in types like ints being cast to floats, as a result in order for model.validate to work we need to disable strict mode.\n        return cls.model_validate(model, context=validation_context, strict=False)\n\n    @classmethod\n    def parse_vertexai_json(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        return cls.model_validate_json(\n            completion.text, context=validation_context, strict=strict\n        )\n\n    @classmethod\n    def parse_cohere_tools(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        \"\"\"\n        Parse Cohere tools response.\n\n        Supports:\n        - V1 native tool calls: completion.tool_calls[0].parameters\n        - V2 native tool calls: completion.message.tool_calls[0].function.arguments (JSON string)\n        - V1 text-based: completion.text (prompt-based approach)\n        - V2 text-based: completion.message.content[].text (prompt-based approach)\n        \"\"\"\n        # First, check for native Cohere tool calls (V1 and V2)\n        # V1: completion.tool_calls with tc.parameters (dict)\n        if hasattr(completion, \"tool_calls\") and completion.tool_calls:\n            # V1 tool call format\n            tool_call = completion.tool_calls[0]\n            # Parameters in V1 are already a dict\n            return cls.model_validate(\n                tool_call.parameters, context=validation_context, strict=strict\n            )\n\n        # V2: completion.message.tool_calls with tc.function.arguments (JSON string)\n        if (\n            hasattr(completion, \"message\")\n            and hasattr(completion.message, \"tool_calls\")\n            and completion.message.tool_calls\n        ):\n            # V2 tool call format\n            tool_call = completion.message.tool_calls[0]\n            # Arguments in V2 are a JSON string\n            import json\n\n            arguments = json.loads(tool_call.function.arguments)\n            return cls.model_validate(\n                arguments, context=validation_context, strict=strict\n            )\n\n        # Fallback to text-based extraction (current prompt-based approach)\n        # Handle both V1 and V2 text response structures\n        if hasattr(completion, \"text\"):\n            # V1 format: direct text access\n            text = completion.text\n        elif hasattr(completion, \"message\") and hasattr(completion.message, \"content\"):\n            # V2 format: nested structure (message.content[].text)\n            # V2 responses may have multiple content items (thinking, text, etc.)\n            content_items = completion.message.content\n            if content_items and len(content_items) > 0:\n                # Find the text content item (skip thinking/other types)\n                text = None\n                for item in content_items:\n                    if (\n                        hasattr(item, \"type\")\n                        and item.type == \"text\"\n                        and hasattr(item, \"text\")\n                    ):\n                        text = item.text\n                        break\n\n                if text is None:\n                    raise ResponseParsingError(\n                        \"Cohere V2 response has no text content item\",\n                        mode=\"COHERE_TOOLS\",\n                        raw_response=completion,\n                    )\n            else:\n                raise ResponseParsingError(\n                    \"Cohere V2 response has no content\",\n                    mode=\"COHERE_TOOLS\",\n                    raw_response=completion,\n                )\n        else:\n            raise ResponseParsingError(\n                f\"Unsupported Cohere response format. Expected tool_calls or text content. \"\n                f\"Got: {type(completion)}\",\n                mode=\"COHERE_TOOLS\",\n                raw_response=completion,\n            )\n\n        # Extract JSON from text (for prompt-based approach)\n        extra_text = extract_json_from_codeblock(text)\n        return cls.model_validate_json(\n            extra_text, context=validation_context, strict=strict\n        )\n\n    @classmethod\n    def parse_writer_tools(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        message = completion.choices[0].message\n        tool_calls = message.tool_calls if message.tool_calls else \"{}\"\n        assert len(tool_calls) == 1, (\n            \"Instructor does not support multiple tool calls, use List[Model] instead\"\n        )\n        assert tool_calls[0].function.name == cls.openai_schema[\"name\"], (\n            \"Tool name does not match\"\n        )\n        loaded_args = json.loads(tool_calls[0].function.arguments)\n        return cls.model_validate_json(\n            json.dumps(loaded_args) if isinstance(loaded_args, dict) else loaded_args,\n            context=validation_context,\n            strict=strict,\n        )\n\n    @classmethod\n    def parse_writer_json(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        _handle_incomplete_output(completion)\n\n        message = completion.choices[0].message.content or \"\"\n        json_content = extract_json_from_codeblock(message)\n\n        if strict:\n            return cls.model_validate_json(\n                json_content, context=validation_context, strict=True\n            )\n        else:\n            parsed = json.loads(json_content, strict=False)\n            return cls.model_validate(parsed, context=validation_context, strict=False)\n\n    @classmethod\n    def parse_functions(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        message = completion.choices[0].message\n        assert (\n            message.function_call.name == cls.openai_schema[\"name\"]  # type: ignore[index]\n        ), \"Function name does not match\"\n        return cls.model_validate_json(\n            message.function_call.arguments,  # type: ignore[attr-defined]\n            context=validation_context,\n            strict=strict,\n        )\n\n    @classmethod\n    def parse_responses_tools(\n        cls: type[BaseModel],\n        completion: Any,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        from openai.types.responses import ResponseFunctionToolCall\n\n        tool_call_message = None\n        for message in completion.output:\n            if isinstance(message, ResponseFunctionToolCall):\n                if message.name == cls.openai_schema[\"name\"]:\n                    tool_call_message = message\n                    break\n        if not tool_call_message:\n            raise ResponseParsingError(\n                f\"Required tool call '{cls.openai_schema['name']}' not found in response\",\n                mode=\"RESPONSES_TOOLS\",\n                raw_response=completion,\n            )\n\n        return cls.model_validate_json(\n            tool_call_message.arguments,  # type: ignore[attr-defined]\n            context=validation_context,\n            strict=strict,\n        )\n\n    @classmethod\n    def parse_tools(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        message = completion.choices[0].message\n        # this field seems to be missing when using instructor with some other tools (e.g. litellm)\n        # trying to fix this by adding a check\n\n        if hasattr(message, \"refusal\"):\n            assert message.refusal is None, (\n                f\"Unable to generate a response due to {message.refusal}\"\n            )\n        assert len(message.tool_calls or []) == 1, (\n            f\"Instructor does not support multiple tool calls, use List[Model] instead\"\n        )\n        tool_call = message.tool_calls[0]  # type: ignore\n        assert (\n            tool_call.function.name == cls.openai_schema[\"name\"]  # type: ignore[index]\n        ), \"Tool name does not match\"\n        return cls.model_validate_json(\n            tool_call.function.arguments,  # type: ignore\n            context=validation_context,\n            strict=strict,\n        )\n\n    @classmethod\n    def parse_mistral_structured_outputs(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        if not completion.choices or len(completion.choices) > 1:\n            raise ConfigurationError(\n                \"Instructor does not support multiple tool calls in MISTRAL_STRUCTURED_OUTPUTS mode. \"\n                \"Use list[Model] instead to handle multiple items.\"\n            )\n\n        message = completion.choices[0].message\n\n        return cls.model_validate_json(\n            message.content, context=validation_context, strict=strict\n        )\n\n    @classmethod\n    def parse_json(\n        cls: type[BaseModel],\n        completion: ChatCompletion,\n        validation_context: Optional[dict[str, Any]] = None,\n        strict: Optional[bool] = None,\n    ) -> BaseModel:\n        \"\"\"Parse JSON mode responses using the optimized extraction and validation.\"\"\"\n        # Check for incomplete output\n        _handle_incomplete_output(completion)\n\n        # Extract text from the response\n        message = _extract_text_content(completion)\n        if not message:\n            # Fallback for OpenAI format if _extract_text_content doesn't handle it\n            message = completion.choices[0].message.content or \"\"\n\n        # Extract JSON from the text\n        json_content = extract_json_from_codeblock(message)\n\n        # Validate the model from the JSON\n        return _validate_model_from_json(cls, json_content, validation_context, strict)\n\n\ndef openai_schema(cls: type[BaseModel]) -> OpenAISchema:\n    \"\"\"\n    Wrap a Pydantic model class to add OpenAISchema functionality.\n    \"\"\"\n    if not issubclass(cls, BaseModel):\n        raise ConfigurationError(\n            f\"response_model must be a Pydantic BaseModel subclass, got {type(cls).__name__}\"\n        )\n\n    # Create the wrapped model\n    schema = wraps(cls, updated=())(\n        create_model(\n            cls.__name__ if hasattr(cls, \"__name__\") else str(cls),\n            __base__=(cls, OpenAISchema),\n        )\n    )\n\n    return cast(OpenAISchema, schema)\n"
  },
  {
    "path": "instructor/processing/multimodal.py",
    "content": "from __future__ import annotations\nimport base64\nimport re\nfrom collections.abc import Mapping, Hashable\nfrom functools import lru_cache\nfrom typing import (\n    Any,\n    Callable,\n    Literal,\n    Optional,\n    Union,\n    TypedDict,\n    TypeVar,\n    cast,\n)\nfrom pathlib import Path\nfrom urllib.parse import urlparse\nimport mimetypes\nimport requests\nfrom pydantic import BaseModel, Field\n\nfrom ..core.exceptions import MultimodalError\nfrom ..mode import Mode\n\nF = TypeVar(\"F\", bound=Callable[..., Any])\nK = TypeVar(\"K\", bound=Hashable)\nV = TypeVar(\"V\")\n\n# OpenAI source: https://platform.openai.com/docs/guides/vision/what-type-of-files-can-i-upload\n# Anthropic source: https://docs.anthropic.com/en/docs/build-with-claude/vision#ensuring-image-quality\nVALID_MIME_TYPES = [\"image/jpeg\", \"image/png\", \"image/gif\", \"image/webp\"]\nVALID_AUDIO_MIME_TYPES = [\n    \"audio/aac\",\n    \"audio/flac\",\n    \"audio/mp3\",\n    \"audio/m4a\",\n    \"audio/mpeg\",\n    \"audio/mpga\",\n    \"audio/mp4\",\n    \"audio/opus\",\n    \"audio/pcm\",\n    \"audio/wav\",\n    \"audio/webm\",\n]\nVALID_PDF_MIME_TYPES = [\"application/pdf\"]\nCacheControlType = Mapping[str, str]\nOptionalCacheControlType = Optional[CacheControlType]\n\n\nclass ImageParamsBase(TypedDict):\n    type: Literal[\"image\"]\n    source: str\n\n\nclass ImageParams(ImageParamsBase, total=False):\n    cache_control: CacheControlType\n\n\nclass Image(BaseModel):\n    source: Union[str, Path] = Field(  # noqa: UP007\n        description=\"URL, file path, or base64 data of the image\"\n    )\n    media_type: str = Field(description=\"MIME type of the image\")\n    data: Union[str, None] = Field(  # noqa: UP007\n        None, description=\"Base64 encoded image data\", repr=False\n    )\n\n    @classmethod\n    def autodetect(cls, source: str | Path) -> Image:\n        \"\"\"Attempt to autodetect an image from a source string or Path.\"\"\"\n        if isinstance(source, str):\n            if cls.is_base64(source):\n                return cls.from_base64(source)\n            if source.startswith((\"http://\", \"https://\")):\n                return cls.from_url(source)\n            if source.startswith(\"gs://\"):\n                return cls.from_gs_url(source)\n            # Since detecting the max length of a file universally cross-platform is difficult,\n            # we'll just try/catch the Path conversion and file check\n            try:\n                path = Path(source)\n                if path.is_file():\n                    return cls.from_path(path)\n            except OSError:\n                pass  # Fall through to raw base64 attempt\n\n            return cls.from_raw_base64(source)\n\n        if isinstance(source, Path):\n            return cls.from_path(source)\n\n    @classmethod\n    def autodetect_safely(cls, source: Union[str, Path]) -> Union[Image, str]:  # noqa: UP007\n        \"\"\"Safely attempt to autodetect an image from a source string or path.\n\n        Args:\n            source (Union[str,path]): The source string or path.\n        Returns:\n            An Image if the source is detected to be a valid image, otherwise\n            the source itself as a string.\n        \"\"\"\n        try:\n            return cls.autodetect(source)\n        except ValueError:\n            return str(source)\n\n    @classmethod\n    def is_base64(cls, s: str) -> bool:\n        return bool(re.match(r\"^data:image/[a-zA-Z]+;base64,\", s))\n\n    @classmethod  # Caching likely unnecessary\n    def from_base64(cls, data_uri: str) -> Image:\n        header, encoded = data_uri.split(\",\", 1)\n        media_type = header.split(\":\")[1].split(\";\")[0]\n        if media_type not in VALID_MIME_TYPES:\n            raise MultimodalError(\n                f\"Unsupported image format: {media_type}. Supported formats: {', '.join(VALID_MIME_TYPES)}\",\n                content_type=\"image\",\n            )\n        return cls(\n            source=data_uri,\n            media_type=media_type,\n            data=encoded,\n        )\n\n    @classmethod\n    def from_gs_url(cls, data_uri: str, timeout: int = 30) -> Image:\n        \"\"\"\n        Create an Image instance from a Google Cloud Storage URL.\n\n        Args:\n            data_uri: GCS URL starting with gs://\n            timeout: Request timeout in seconds (default: 30)\n        \"\"\"\n        if not data_uri.startswith(\"gs://\"):\n            raise ValueError(\"URL must start with gs://\")\n\n        public_url = f\"https://storage.googleapis.com/{data_uri[5:]}\"\n\n        try:\n            response = requests.get(public_url, timeout=timeout)\n            response.raise_for_status()\n            media_type = response.headers.get(\"Content-Type\")\n            if media_type not in VALID_MIME_TYPES:\n                raise ValueError(f\"Unsupported image format: {media_type}\")\n\n            data = base64.b64encode(response.content).decode(\"utf-8\")\n\n            return cls(source=data_uri, media_type=media_type, data=data)\n        except requests.RequestException as e:\n            raise ValueError(\n                \"Failed to access GCS image (must be publicly readable)\"\n            ) from e\n\n    @classmethod  # Caching likely unnecessary\n    def from_raw_base64(cls, data: str) -> Image:\n        try:\n            decoded = base64.b64decode(data)\n\n            # Detect image type from file signature (magic bytes)\n            # This replaces imghdr which was removed in Python 3.13\n            img_type = None\n            if decoded.startswith(b\"\\xff\\xd8\\xff\"):\n                img_type = \"jpeg\"\n            elif decoded.startswith(b\"\\x89PNG\\r\\n\\x1a\\n\"):\n                img_type = \"png\"\n            elif decoded.startswith(b\"GIF87a\") or decoded.startswith(b\"GIF89a\"):\n                img_type = \"gif\"\n            elif decoded.startswith(b\"RIFF\") and decoded[8:12] == b\"WEBP\":\n                img_type = \"webp\"\n\n            if img_type:\n                media_type = f\"image/{img_type}\"\n                if media_type in VALID_MIME_TYPES:\n                    return cls(\n                        source=data,\n                        media_type=media_type,\n                        data=data,\n                    )\n            raise ValueError(f\"Unsupported image type: {img_type}\")\n        except Exception as e:\n            raise ValueError(f\"Invalid or unsupported base64 image data\") from e\n\n    @classmethod\n    @lru_cache\n    def from_url(cls, url: str) -> Image:\n        if url.startswith(\"gs://\"):\n            return cls.from_gs_url(url)\n        if cls.is_base64(url):\n            return cls.from_base64(url)\n\n        parsed_url = urlparse(url)\n        media_type, _ = mimetypes.guess_type(parsed_url.path)\n\n        if not media_type:\n            try:\n                response = requests.head(url, allow_redirects=True)\n                media_type = response.headers.get(\"Content-Type\")\n            except requests.RequestException as e:\n                raise ValueError(f\"Failed to fetch image from URL\") from e\n\n        if media_type not in VALID_MIME_TYPES:\n            raise ValueError(f\"Unsupported image format: {media_type}\")\n        return cls(source=url, media_type=media_type, data=None)\n\n    @classmethod\n    @lru_cache\n    def from_path(cls, path: Union[str, Path]) -> Image:  # noqa: UP007\n        path = Path(path)\n        if not path.is_file():\n            raise FileNotFoundError(f\"Image file not found: {path}\")\n\n        if path.stat().st_size == 0:\n            raise ValueError(\"Image file is empty\")\n\n        media_type, _ = mimetypes.guess_type(str(path))\n        if media_type not in VALID_MIME_TYPES:\n            raise ValueError(f\"Unsupported image format: {media_type}\")\n\n        data = base64.b64encode(path.read_bytes()).decode(\"utf-8\")\n        return cls(source=path, media_type=media_type, data=data)\n\n    @staticmethod\n    @lru_cache\n    def url_to_base64(url: str) -> str:\n        \"\"\"Cachable helper method for getting image url and encoding to base64.\"\"\"\n        response = requests.get(url)\n        response.raise_for_status()\n        data = base64.b64encode(response.content).decode(\"utf-8\")\n        return data\n\n    def to_anthropic(self) -> dict[str, Any]:\n        if (\n            isinstance(self.source, str)\n            and self.source.startswith((\"http://\", \"https://\"))\n            and not self.data\n        ):\n            self.data = self.url_to_base64(self.source)\n\n        return {\n            \"type\": \"image\",\n            \"source\": {\n                \"type\": \"base64\",\n                \"media_type\": self.media_type,\n                \"data\": self.data,\n            },\n        }\n\n    def to_openai(self, mode: Mode) -> dict[str, Any]:\n        image_type = (\n            \"input_image\"\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}\n            else \"image_url\"\n        )\n        if (\n            isinstance(self.source, str)\n            and self.source.startswith((\"http://\", \"https://\"))\n            and not self.is_base64(self.source)\n        ):\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}:\n                return {\"type\": \"input_image\", \"image_url\": self.source}\n            else:\n                return {\"type\": image_type, \"image_url\": {\"url\": self.source}}\n        elif self.data or self.is_base64(str(self.source)):\n            data = self.data or str(self.source).split(\",\", 1)[1]\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}:\n                return {\n                    \"type\": \"input_image\",\n                    \"image_url\": f\"data:{self.media_type};base64,{data}\",\n                }\n            else:\n                return {\n                    \"type\": image_type,\n                    \"image_url\": {\"url\": f\"data:{self.media_type};base64,{data}\"},\n                }\n        else:\n            raise ValueError(\"Image data is missing for base64 encoding.\")\n\n    def to_genai(self):\n        \"\"\"\n        Convert the Image instance to Google GenAI's API format.\n        \"\"\"\n        try:\n            from google.genai import types\n        except ImportError as err:\n            raise ImportError(\n                \"google-genai package is required for GenAI integration. Install with: pip install google-genai\"\n            ) from err\n\n        # Google Cloud Storage\n        if isinstance(self.source, str) and self.source.startswith(\"gs://\"):\n            return types.Part.from_bytes(\n                data=self.data,  # type: ignore\n                mime_type=self.media_type,\n            )\n\n        # URL\n        if isinstance(self.source, str) and self.source.startswith(\n            (\"http://\", \"https://\")\n        ):\n            return types.Part.from_bytes(\n                data=requests.get(self.source).content,\n                mime_type=self.media_type,\n            )\n\n        if self.data or self.is_base64(str(self.source)):\n            data = self.data or str(self.source).split(\",\", 1)[1]\n            return types.Part.from_bytes(\n                data=base64.b64decode(data), mime_type=self.media_type\n            )  # type: ignore\n\n        else:\n            raise ValueError(\"Image data is missing for base64 encoding.\")\n\n\nclass Audio(BaseModel):\n    \"\"\"Represents an audio that can be loaded from a URL or file path.\"\"\"\n\n    source: Union[str, Path] = Field(description=\"URL or file path of the audio\")  # noqa: UP007\n    data: Union[str, None] = Field(  # noqa: UP007\n        None, description=\"Base64 encoded audio data\", repr=False\n    )\n    media_type: str = Field(description=\"MIME type of the audio\")\n\n    @classmethod\n    def autodetect(cls, source: str | Path) -> Audio:\n        \"\"\"Attempt to autodetect an audio from a source string or Path.\"\"\"\n        if isinstance(source, str):\n            if cls.is_base64(source):\n                return cls.from_base64(source)\n            if source.startswith((\"http://\", \"https://\")):\n                return cls.from_url(source)\n            if source.startswith(\"gs://\"):\n                return cls.from_gs_url(source)\n            # Since detecting the max length of a file universally cross-platform is difficult,\n            # we'll just try/catch the Path conversion and file check\n            try:\n                path = Path(source)\n                if path.is_file():\n                    return cls.from_path(path)\n            except OSError:\n                pass  # Fall through to error\n\n            raise ValueError(\"Unable to determine audio source\")\n\n        if isinstance(source, Path):\n            return cls.from_path(source)\n\n    @classmethod\n    def autodetect_safely(cls, source: Union[str, Path]) -> Union[Audio, str]:  # noqa: UP007\n        \"\"\"Safely attempt to autodetect an audio from a source string or path.\n\n        Args:\n            source (Union[str,path]): The source string or path.\n        Returns:\n            An Audio if the source is detected to be a valid audio, otherwise\n            the source itself as a string.\n        \"\"\"\n        try:\n            return cls.autodetect(source)\n        except ValueError:\n            return str(source)\n\n    @classmethod\n    def is_base64(cls, s: str) -> bool:\n        return bool(re.match(r\"^data:audio/[a-zA-Z0-9+-]+;base64,\", s))\n\n    @classmethod\n    def from_base64(cls, data_uri: str) -> Audio:\n        header, encoded = data_uri.split(\",\", 1)\n        media_type = header.split(\":\")[1].split(\";\")[0]\n        if media_type not in VALID_AUDIO_MIME_TYPES:\n            raise ValueError(f\"Unsupported audio format: {media_type}\")\n        return cls(\n            source=data_uri,\n            media_type=media_type,\n            data=encoded,\n        )\n\n    @classmethod\n    def from_url(cls, url: str) -> Audio:\n        \"\"\"Create an Audio instance from a URL.\"\"\"\n        if url.startswith(\"gs://\"):\n            return cls.from_gs_url(url)\n        response = requests.get(url)\n        content_type = response.headers.get(\"content-type\")\n        assert content_type in VALID_AUDIO_MIME_TYPES, (\n            f\"Invalid audio format. Must be one of: {', '.join(VALID_AUDIO_MIME_TYPES)}\"\n        )\n\n        data = base64.b64encode(response.content).decode(\"utf-8\")\n        return cls(source=url, data=data, media_type=content_type)\n\n    @classmethod\n    def from_path(cls, path: Union[str, Path]) -> Audio:  # noqa: UP007\n        \"\"\"Create an Audio instance from a file path.\"\"\"\n        path = Path(path)\n        assert path.is_file(), f\"Audio file not found: {path}\"\n\n        mime_type = mimetypes.guess_type(str(path))[0]\n\n        if mime_type == \"audio/x-wav\":\n            mime_type = \"audio/wav\"\n\n        if (\n            mime_type == \"audio/vnd.dlna.adts\"\n        ):  # <--- this is the case for aac audio files in Windows\n            mime_type = \"audio/aac\"\n\n        assert mime_type in VALID_AUDIO_MIME_TYPES, (\n            f\"Invalid audio format. Must be one of: {', '.join(VALID_AUDIO_MIME_TYPES)}\"\n        )\n\n        data = base64.b64encode(path.read_bytes()).decode(\"utf-8\")\n        return cls(source=str(path), data=data, media_type=mime_type)\n\n    @classmethod\n    def from_gs_url(cls, data_uri: str, timeout: int = 30) -> Audio:\n        \"\"\"\n        Create an Audio instance from a Google Cloud Storage URL.\n\n        Args:\n            data_uri: GCS URL starting with gs://\n            timeout: Request timeout in seconds (default: 30)\n        \"\"\"\n        if not data_uri.startswith(\"gs://\"):\n            raise ValueError(\"URL must start with gs://\")\n\n        public_url = f\"https://storage.googleapis.com/{data_uri[5:]}\"\n\n        try:\n            response = requests.get(public_url, timeout=timeout)\n            response.raise_for_status()\n            media_type = response.headers.get(\"Content-Type\")\n            if media_type not in VALID_AUDIO_MIME_TYPES:\n                raise ValueError(f\"Unsupported audio format: {media_type}\")\n\n            data = base64.b64encode(response.content).decode(\"utf-8\")\n\n            return cls(source=data_uri, media_type=media_type, data=data)\n        except requests.RequestException as e:\n            raise ValueError(\n                \"Failed to access GCS audio (must be publicly readable)\"\n            ) from e\n\n    def to_openai(self, mode: Mode) -> dict[str, Any]:\n        \"\"\"Convert the Audio instance to OpenAI's API format.\"\"\"\n        if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}:\n            raise ValueError(\"OpenAI Responses doesn't support audio\")\n\n        return {\n            \"type\": \"input_audio\",\n            \"input_audio\": {\"data\": self.data, \"format\": \"wav\"},\n        }\n\n    def to_anthropic(self) -> dict[str, Any]:\n        raise NotImplementedError(\"Anthropic is not supported yet\")\n\n    def to_genai(self):\n        \"\"\"\n        Convert the Audio instance to Google GenAI's API format.\n        \"\"\"\n        try:\n            from google.genai import types\n        except ImportError as err:\n            raise ImportError(\n                \"google-genai package is required for GenAI integration. Install with: pip install google-genai\"\n            ) from err\n\n        return types.Part.from_bytes(\n            data=base64.b64decode(self.data),  # type: ignore\n            mime_type=self.media_type,\n        )\n\n\nclass ImageWithCacheControl(Image):\n    \"\"\"Image with Anthropic prompt caching support.\"\"\"\n\n    cache_control: OptionalCacheControlType = Field(\n        None, description=\"Optional Anthropic cache control image\"\n    )\n\n    @classmethod\n    def from_image_params(cls, image_params: ImageParams) -> Image:\n        source = image_params[\"source\"]\n        cache_control = image_params.get(\"cache_control\")\n        base_image = Image.autodetect(source)\n        return cls(\n            source=base_image.source,\n            media_type=base_image.media_type,\n            data=base_image.data,\n            cache_control=cache_control,\n        )\n\n    def to_anthropic(self) -> dict[str, Any]:\n        \"\"\"Override Anthropic return with cache_control.\"\"\"\n        result = super().to_anthropic()\n        if self.cache_control:\n            result[\"cache_control\"] = self.cache_control\n        return result\n\n\nclass PDF(BaseModel):\n    source: str | Path = Field(description=\"URL, file path, or base64 data of the PDF\")\n    media_type: str = Field(\n        description=\"MIME type of the PDF\", default=\"application/pdf\"\n    )\n    data: str | None = Field(None, description=\"Base64 encoded PDF data\", repr=False)\n\n    @classmethod\n    def autodetect(cls, source: str | Path) -> PDF:\n        \"\"\"Attempt to autodetect a PDF from a source string or Path.\n        Args:\n            source (Union[str,path]): The source string or path.\n        Returns:\n            A PDF if the source is detected to be a valid PDF.\n        Raises:\n            ValueError: If the source is not detected to be a valid PDF.\n        \"\"\"\n        if isinstance(source, str):\n            if cls.is_base64(source):\n                return cls.from_base64(source)\n            elif source.startswith((\"http://\", \"https://\")):\n                return cls.from_url(source)\n            elif source.startswith(\"gs://\"):\n                return cls.from_gs_url(source)\n\n            try:\n                if Path(source).is_file():\n                    return cls.from_path(source)\n            except FileNotFoundError as err:\n                raise MultimodalError(\n                    \"PDF file not found\",\n                    content_type=\"pdf\",\n                    file_path=str(source),\n                ) from err\n            except OSError as e:\n                if e.errno == 63:  # File name too long\n                    raise MultimodalError(\n                        \"PDF file name too long\",\n                        content_type=\"pdf\",\n                        file_path=str(source),\n                    ) from e\n                raise MultimodalError(\n                    \"Unable to read PDF file\",\n                    content_type=\"pdf\",\n                    file_path=str(source),\n                ) from e\n\n            return cls.from_raw_base64(source)\n        elif isinstance(source, Path):\n            return cls.from_path(source)\n\n    @classmethod\n    def autodetect_safely(cls, source: Union[str, Path]) -> Union[PDF, str]:  # noqa: UP007\n        \"\"\"Safely attempt to autodetect a PDF from a source string or path.\n\n        Args:\n            source (Union[str,path]): The source string or path.\n        Returns:\n            A PDF if the source is detected to be a valid PDF, otherwise\n            the source itself as a string.\n        \"\"\"\n        try:\n            return cls.autodetect(source)\n        except ValueError:\n            return str(source)\n\n    @classmethod\n    def is_base64(cls, s: str) -> bool:\n        return bool(re.match(r\"^data:application/pdf;base64,\", s))\n\n    @classmethod\n    def from_base64(cls, data_uri: str) -> PDF:\n        header, encoded = data_uri.split(\",\", 1)\n        media_type = header.split(\":\")[1].split(\";\")[0]\n        if media_type not in VALID_PDF_MIME_TYPES:\n            raise ValueError(f\"Unsupported PDF format: {media_type}\")\n        return cls(\n            source=data_uri,\n            media_type=media_type,\n            data=encoded,\n        )\n\n    @classmethod\n    @lru_cache\n    def from_path(cls, path: str | Path) -> PDF:\n        path = Path(path)\n        if not path.is_file():\n            raise FileNotFoundError(f\"PDF file not found: {path}\")\n\n        if path.stat().st_size == 0:\n            raise ValueError(\"PDF file is empty\")\n\n        media_type, _ = mimetypes.guess_type(str(path))\n        if media_type not in VALID_PDF_MIME_TYPES:\n            raise ValueError(f\"Unsupported PDF format: {media_type}\")\n\n        data = base64.b64encode(path.read_bytes()).decode(\"utf-8\")\n        return cls(source=path, media_type=media_type, data=data)\n\n    @classmethod\n    def from_raw_base64(cls, data: str) -> PDF:\n        try:\n            decoded = base64.b64decode(data)\n            # Check if it's a valid PDF by looking for the PDF header\n            if decoded.startswith(b\"%PDF-\"):\n                return cls(\n                    source=data,\n                    media_type=\"application/pdf\",\n                    data=data,\n                )\n            raise ValueError(\"Invalid PDF format\")\n        except Exception as e:\n            raise ValueError(\"Invalid or unsupported base64 PDF data\") from e\n\n    @classmethod\n    def from_gs_url(cls, data_uri: str, timeout: int = 30) -> PDF:\n        \"\"\"\n        Create a PDF instance from a Google Cloud Storage URL.\n\n        Args:\n            data_uri: GCS URL starting with gs://\n            timeout: Request timeout in seconds (default: 30)\n        \"\"\"\n        if not data_uri.startswith(\"gs://\"):\n            raise ValueError(\"URL must start with gs://\")\n\n        public_url = f\"https://storage.googleapis.com/{data_uri[5:]}\"\n\n        try:\n            response = requests.get(public_url, timeout=timeout)\n            response.raise_for_status()\n            media_type = response.headers.get(\"Content-Type\", \"application/pdf\")\n            if media_type not in VALID_PDF_MIME_TYPES:\n                raise ValueError(f\"Unsupported PDF format: {media_type}\")\n\n            data = base64.b64encode(response.content).decode(\"utf-8\")\n\n            return cls(source=data_uri, media_type=media_type, data=data)\n        except requests.RequestException as e:\n            raise ValueError(\n                \"Failed to access GCS PDF (must be publicly readable)\"\n            ) from e\n\n    @classmethod\n    @lru_cache\n    def from_url(cls, url: str) -> PDF:\n        if url.startswith(\"gs://\"):\n            return cls.from_gs_url(url)\n        parsed_url = urlparse(url)\n        media_type, _ = mimetypes.guess_type(parsed_url.path)\n\n        if not media_type:\n            try:\n                response = requests.head(url, allow_redirects=True)\n                media_type = response.headers.get(\"Content-Type\")\n            except requests.RequestException as e:\n                raise ValueError(\"Failed to fetch PDF from URL\") from e\n\n        if media_type not in VALID_PDF_MIME_TYPES:\n            raise ValueError(f\"Unsupported PDF format: {media_type}\")\n        return cls(source=url, media_type=media_type, data=None)\n\n    def to_mistral(self) -> dict[str, Any]:\n        if (\n            isinstance(self.source, str)\n            and self.source.startswith((\"http://\", \"https://\"))\n            and not self.data\n        ):\n            return {\n                \"type\": \"document_url\",\n                \"document_url\": self.source,\n            }\n        raise ValueError(\"Mistral only supports document URLs for now\")\n\n    def to_openai(self, mode: Mode) -> dict[str, Any]:\n        \"\"\"Convert to OpenAI's document format.\"\"\"\n        input_file_type = (\n            \"input_file\"\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}\n            else \"file\"\n        )\n\n        if (\n            isinstance(self.source, str)\n            and self.source.startswith((\"http://\", \"https://\"))\n            and not self.data\n        ):\n            # Fetch the file from URL and convert to base64\n            data = requests.get(self.source)\n            data = base64.b64encode(data.content).decode(\"utf-8\")\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}:\n                return {\n                    \"type\": input_file_type,\n                    \"filename\": self.source,\n                    \"file_data\": f\"data:{self.media_type};base64,{data}\",\n                }\n            else:\n                return {\n                    \"type\": input_file_type,\n                    \"file\": {\n                        \"filename\": self.source,\n                        \"file_data\": f\"data:{self.media_type};base64,{data}\",\n                    },\n                }\n        elif self.data or self.is_base64(str(self.source)):\n            data = self.data or str(self.source).split(\",\", 1)[1]\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}:\n                return {\n                    \"type\": input_file_type,\n                    \"filename\": (\n                        self.source\n                        if isinstance(self.source, str)\n                        else str(self.source)\n                    ),\n                    \"file_data\": f\"data:{self.media_type};base64,{data}\",\n                }\n            else:\n                return {\n                    \"type\": input_file_type,\n                    \"file\": {\n                        \"filename\": (\n                            self.source\n                            if isinstance(self.source, str)\n                            else str(self.source)\n                        ),\n                        \"file_data\": f\"data:{self.media_type};base64,{data}\",\n                    },\n                }\n        else:\n            raise ValueError(\"PDF data is missing for base64 encoding.\")\n\n    def to_anthropic(self) -> dict[str, Any]:\n        \"\"\"Convert to Anthropic's document format.\"\"\"\n        if (\n            isinstance(self.source, str)\n            and self.source.startswith((\"http://\", \"https://\"))\n            and not self.data\n        ):\n            return {\n                \"type\": \"document\",\n                \"source\": {\n                    \"type\": \"url\",\n                    \"url\": self.source,\n                },\n            }\n        else:\n            if not self.data:\n                self.data = requests.get(str(self.source)).content  # type: ignore\n                self.data = base64.b64encode(self.data).decode(\"utf-8\")  # type: ignore\n\n            return {\n                \"type\": \"document\",\n                \"source\": {\n                    \"type\": \"base64\",\n                    \"media_type\": self.media_type,\n                    \"data\": self.data,\n                },\n            }\n\n    def to_genai(self):\n        try:\n            from google.genai import types\n        except ImportError as err:\n            raise ImportError(\n                \"google-genai package is required for GenAI integration. Install with: pip install google-genai\"\n            ) from err\n\n        if (\n            isinstance(self.source, str)\n            and self.source.startswith((\"http://\", \"https://\"))\n            and not self.data\n        ):\n            # Fetch the file from URL and convert to base64\n            data = requests.get(self.source).content\n            data = base64.b64encode(data).decode(\"utf-8\")\n            return types.Part.from_bytes(\n                data=base64.b64decode(data),\n                mime_type=self.media_type,\n            )\n\n        if self.data:\n            return types.Part.from_bytes(\n                data=base64.b64decode(self.data),\n                mime_type=self.media_type,\n            )\n\n        raise ValueError(\"Unsupported PDF format\")\n\n    def to_bedrock(self, name: str | None = None) -> dict[str, Any]:\n        \"\"\"Convert to Bedrock's document format.\"\"\"\n        # Determine the document name\n        if name is None:\n            if isinstance(self.source, Path):\n                name = self.source.name\n            elif isinstance(self.source, str):\n                # Try to extract filename from path or URL\n                if self.source.startswith((\"http://\", \"https://\", \"gs://\")):\n                    name = Path(urlparse(self.source).path).name or \"document\"\n                else:\n                    name = (\n                        Path(self.source).name\n                        if Path(self.source).exists()\n                        else \"document\"\n                    )\n            else:\n                name = \"document\"\n\n        # Sanitize name according to Bedrock requirements\n        # Only allow alphanumeric, whitespace (max one in row), hyphens, parentheses, square brackets\n        name = re.sub(r\"[^\\w\\s\\-\\(\\)\\[\\]]\", \"\", name)\n        name = re.sub(r\"\\s+\", \" \", name)  # Consolidate whitespace\n        name = name.strip()\n\n        # Handle S3 URIs\n        if isinstance(self.source, str) and self.source.startswith(\"s3://\"):\n            # Parse S3 URI: s3://bucket/key\n            s3_match = re.match(r\"s3://([^/]+)/(.*)\", self.source)\n            if not s3_match:\n                raise ValueError(f\"Invalid S3 URI format: {self.source}\")\n\n            bucket = s3_match.group(1)\n            key = s3_match.group(2)\n\n            # Note: bucketOwner is optional but recommended for cross-account access\n            return {\n                \"document\": {\n                    \"format\": \"pdf\",\n                    \"name\": name,\n                    \"source\": {\n                        \"s3Location\": {\n                            \"uri\": self.source\n                            # \"bucketOwner\": \"account-id\"  # Optional, can be added by user\n                        }\n                    },\n                }\n            }\n\n        # Handle bytes-based sources (URLs, paths, base64)\n        if not self.data:\n            # Need to fetch/load the data\n            if isinstance(self.source, str) and self.source.startswith(\n                (\"http://\", \"https://\")\n            ):\n                response = requests.get(self.source)\n                response.raise_for_status()\n                pdf_bytes = response.content\n            elif isinstance(self.source, Path) or (\n                isinstance(self.source, str) and Path(self.source).exists()\n            ):\n                pdf_bytes = Path(self.source).read_bytes()\n            else:\n                raise ValueError(\"PDF data is missing and source cannot be loaded\")\n        else:\n            # Decode base64 data to bytes\n            pdf_bytes = base64.b64decode(self.data)\n\n        return {\n            \"document\": {\"format\": \"pdf\", \"name\": name, \"source\": {\"bytes\": pdf_bytes}}\n        }\n\n\nclass PDFWithCacheControl(PDF):\n    \"\"\"PDF with Anthropic prompt caching support.\"\"\"\n\n    def to_anthropic(self) -> dict[str, Any]:\n        \"\"\"Override Anthropic return with cache_control.\"\"\"\n        result = super().to_anthropic()\n        result[\"cache_control\"] = {\"type\": \"ephemeral\"}\n        return result\n\n\nclass PDFWithGenaiFile(PDF):\n    @classmethod\n    def from_new_genai_file(\n        cls, file_path: str, retry_delay: int = 10, max_retries: int = 20\n    ) -> PDFWithGenaiFile:\n        \"\"\"Create a new PDFWithGenaiFile from a file path.\"\"\"\n        from google.genai.types import FileState\n        import time\n        from google.genai import Client\n\n        client = Client()\n        file = client.files.upload(file=file_path)\n        while file.state != FileState.ACTIVE:\n            time.sleep(retry_delay)\n            file = client.files.get(name=file.name)  # type: ignore\n            if max_retries > 0:\n                max_retries -= 1\n            else:\n                raise Exception(\n                    \"Max retries reached. File upload has been started but is still pending\"\n                )\n\n        return cls(source=file.uri, media_type=file.mime_type, data=None)  # type: ignore\n\n    @classmethod\n    def from_existing_genai_file(cls, file_name: str) -> PDFWithGenaiFile:\n        \"\"\"Create a new PDFWithGenaiFile from a file URL.\"\"\"\n        from google.genai import types\n        from google.genai.types import FileState\n        from google.genai import Client\n\n        client = Client()\n        file = client.files.get(name=file_name)\n        if file.source == types.FileSource.UPLOADED and file.state == FileState.ACTIVE:\n            return cls(\n                source=file.uri,  # type: ignore\n                media_type=file.mime_type,  # type: ignore\n                data=None,\n            )\n        else:\n            raise ValueError(\"We only support uploaded PDFs for now\")\n\n    def to_genai(self):\n        try:\n            from google.genai import types\n        except ImportError as err:\n            raise ImportError(\n                \"google-genai package is required for GenAI integration. Install with: pip install google-genai\"\n            ) from err\n\n        if (\n            self.source\n            and isinstance(self.source, str)\n            and \"https://generativelanguage.googleapis.com/v1beta/files/\" in self.source\n        ):\n            return types.Part.from_uri(\n                file_uri=self.source,\n                mime_type=self.media_type,\n            )\n\n        return super().to_genai()\n\n\ndef convert_contents(\n    contents: Union[  # noqa: UP007\n        str,\n        dict[str, Any],\n        Image,\n        Audio,\n        list[Union[str, dict[str, Any], Image, Audio]],  # noqa: UP007\n    ],\n    mode: Mode,\n) -> Union[str, list[dict[str, Any]]]:  # noqa: UP007\n    \"\"\"Convert content items to the appropriate format based on the specified mode.\"\"\"\n    if isinstance(contents, str):\n        return contents\n    if isinstance(contents, (Image, Audio, PDF)) or isinstance(contents, dict):\n        contents = [contents]\n\n    converted_contents: list[dict[str, Union[str, Image]]] = []  # noqa: UP007\n    text_file_type = (\n        \"input_text\"\n        if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}\n        else \"text\"\n    )\n    for content in contents:\n        if isinstance(content, str):\n            converted_contents.append({\"type\": text_file_type, \"text\": content})\n        elif isinstance(content, dict):\n            converted_contents.append(content)\n        elif isinstance(content, (Image, Audio, PDF)):\n            if mode in {\n                Mode.ANTHROPIC_JSON,\n                Mode.ANTHROPIC_TOOLS,\n                Mode.ANTHROPIC_REASONING_TOOLS,\n            }:\n                converted_contents.append(content.to_anthropic())\n            elif mode in {Mode.GEMINI_JSON, Mode.GEMINI_TOOLS}:\n                raise NotImplementedError(\"Gemini is not supported yet\")\n            elif mode in {\n                Mode.MISTRAL_STRUCTURED_OUTPUTS,\n                Mode.MISTRAL_TOOLS,\n            } and isinstance(content, (PDF)):\n                converted_contents.append(content.to_mistral())  # type: ignore\n            else:\n                converted_contents.append(content.to_openai(mode))\n        else:\n            raise ValueError(f\"Unsupported content type: {type(content)}\")\n    return converted_contents\n\n\ndef autodetect_media(\n    source: str | Path | Image | Audio | PDF,\n) -> Image | Audio | PDF | str:\n    \"\"\"Autodetect images, audio, or PDFs from a given source.\n\n    Args:\n        source: URL, file path, Path, or data URI to inspect.\n\n    Returns:\n        The detected :class:`Image`, :class:`Audio`, or :class:`PDF` instance.\n        If detection fails, the original source is returned.\n    \"\"\"\n    if isinstance(source, (Image, Audio, PDF)):\n        return source\n\n    # Normalize once for cheap checks and mimetype guess\n    source = str(source)\n\n    if source.startswith(\"data:image/\"):\n        return Image.autodetect_safely(source)\n    if source.startswith(\"data:audio/\"):\n        return Audio.autodetect_safely(source)\n    if source.startswith(\"data:application/pdf\"):\n        return PDF.autodetect_safely(source)\n\n    media_type, _ = mimetypes.guess_type(source)\n    if media_type in VALID_MIME_TYPES:\n        return Image.autodetect_safely(source)\n    if media_type in VALID_AUDIO_MIME_TYPES:\n        return Audio.autodetect_safely(source)\n    if media_type in VALID_PDF_MIME_TYPES:\n        return PDF.autodetect_safely(source)\n\n    for cls in (Image, Audio, PDF):\n        item = cls.autodetect_safely(source)  # type: ignore[arg-type]\n        if not isinstance(item, str):\n            return item\n    return source\n\n\ndef convert_messages(\n    messages: list[\n        dict[\n            str,\n            Union[  # noqa: UP007\n                str,\n                dict[str, Any],\n                Image,\n                Audio,\n                PDF,\n                list[Union[str, dict[str, Any], Image, Audio, PDF]],  # noqa: UP007\n            ],\n        ]\n    ],\n    mode: Mode,\n    autodetect_images: bool = False,\n) -> list[dict[str, Any]]:\n    \"\"\"Convert messages to the appropriate format based on the specified mode.\"\"\"\n    converted_messages = []\n\n    def is_image_params(x: Any) -> bool:\n        return isinstance(x, dict) and x.get(\"type\") == \"image\" and \"source\" in x  # type: ignore\n\n    for message in messages:\n        if \"type\" in message:\n            if message[\"type\"] in {\"audio\", \"image\"}:\n                converted_messages.append(message)  # type: ignore\n            else:\n                raise ValueError(f\"Unsupported message type: {message['type']}\")\n        role = message[\"role\"]\n        content = message[\"content\"] or []\n        other_kwargs = {\n            k: v for k, v in message.items() if k not in [\"role\", \"content\", \"type\"]\n        }\n        if autodetect_images:\n            if isinstance(content, list):\n                new_content: list[str | dict[str, Any] | Image | Audio | PDF] = []  # noqa: UP007\n                for item in content:\n                    if isinstance(item, str):\n                        new_content.append(autodetect_media(item))\n                    elif is_image_params(item):\n                        new_content.append(\n                            ImageWithCacheControl.from_image_params(\n                                cast(ImageParams, item)\n                            )\n                        )\n                    else:\n                        new_content.append(item)\n                content = new_content\n            elif isinstance(content, str):\n                content = autodetect_media(content)\n            elif is_image_params(content):\n                content = ImageWithCacheControl.from_image_params(\n                    cast(ImageParams, content)\n                )\n        if isinstance(content, str):\n            converted_messages.append(  # type: ignore\n                {\"role\": role, \"content\": content, **other_kwargs}\n            )\n        else:\n            # At this point content is narrowed to non-str types accepted by convert_contents\n            converted_content = convert_contents(content, mode)  # type: ignore\n            converted_messages.append(  # type: ignore\n                {\"role\": role, \"content\": converted_content, **other_kwargs}\n            )\n    return converted_messages  # type: ignore\n\n\ndef extract_genai_multimodal_content(\n    contents: list[Any],\n    autodetect_images: bool = True,\n):\n    \"\"\"\n    Convert Typed Contents to the appropriate format for Google GenAI.\n    \"\"\"\n    from google.genai import types\n\n    result: list[Union[types.Content, types.File]] = []  # noqa: UP007\n    for content in contents:\n        # Check for Files\n        if isinstance(content, types.File):\n            result.append(content)\n            continue\n\n        # We only want to do the conversion for the Image type\n        if not isinstance(content, types.Content):\n            raise ValueError(\n                f\"Unsupported content type: {type(content)}. This should only be used for the Google types\"\n            )\n        # Cast to list of Parts\n        content = cast(types.Content, content)\n        converted_contents: list[types.Part] = []\n\n        if not content.parts:\n            raise ValueError(\"Content parts are empty\")\n\n        # Now we need to support a few cases\n        for content_part in content.parts:\n            if content_part.text and autodetect_images:\n                converted_item = autodetect_media(content_part.text)\n\n                if isinstance(converted_item, (Image, Audio, PDF)):\n                    converted_contents.append(converted_item.to_genai())\n                    continue\n\n                converted_contents.append(content_part)\n            else:\n                converted_contents.append(content_part)\n\n        result.append(types.Content(parts=converted_contents, role=content.role))\n\n    return result\n"
  },
  {
    "path": "instructor/processing/response.py",
    "content": "\"\"\"\nThis module serves as the central dispatcher for processing responses from various LLM providers\n(OpenAI, Anthropic, Google, Cohere, etc.) and transforming them into structured Pydantic models.\nIt handles different response formats, streaming responses, validation, and error recovery.\n\nThe module supports 40+ different modes across providers, each with specific handling logic\nfor request formatting and response parsing. It also provides retry mechanisms (reask) for\nhandling validation errors gracefully.\n\nKey Components:\n    - Response processing functions for sync/async operations\n    - Mode-based response model handlers for different providers\n    - Error recovery and retry logic for validation failures\n    - Support for streaming, partial, parallel, and iterable response models\n\nExample:\n    ```python\n    from instructor.process_response import process_response\n    from ..mode import Mode\n    from pydantic import BaseModel\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    # Process an OpenAI response\n    processed = process_response(\n        response=openai_response,\n        response_model=User,\n        mode=Mode.TOOLS,\n        stream=False\n    )\n    ```\n\"\"\"\n\nfrom __future__ import annotations\n\nimport inspect\nimport logging\nfrom typing import Any, TypeVar, TYPE_CHECKING, cast\nfrom collections.abc import AsyncGenerator\n\nfrom openai.types.chat import ChatCompletion\nfrom pydantic import BaseModel\nfrom typing_extensions import ParamSpec\n\nfrom instructor.core.exceptions import InstructorError, ConfigurationError\n\nfrom ..dsl.iterable import IterableBase\nfrom ..dsl.parallel import ParallelBase\nfrom ..dsl.partial import PartialBase\nfrom ..dsl.response_list import ListResponse\nfrom ..dsl.simple_type import AdapterBase\n\nif TYPE_CHECKING:\n    from .function_calls import OpenAISchema\nfrom ..mode import Mode\nfrom .multimodal import convert_messages\nfrom ..utils.core import prepare_response_model\n\n# Anthropic utils\nfrom ..providers.anthropic.utils import (\n    handle_anthropic_json,\n    handle_anthropic_parallel_tools,\n    handle_anthropic_reasoning_tools,\n    handle_anthropic_tools,\n    reask_anthropic_json,\n    reask_anthropic_tools,\n)\n\n# Bedrock utils\nfrom ..providers.bedrock.utils import (\n    handle_bedrock_json,\n    handle_bedrock_tools,\n    reask_bedrock_json,\n    reask_bedrock_tools,\n)\n\n# Cerebras utils\nfrom ..providers.cerebras.utils import (\n    handle_cerebras_json,\n    handle_cerebras_tools,\n    reask_cerebras_tools,\n)\n\n# Cohere utils\nfrom ..providers.cohere.utils import (\n    handle_cohere_json_schema,\n    handle_cohere_tools,\n    reask_cohere_tools,\n)\n\n# Fireworks utils\nfrom ..providers.fireworks.utils import (\n    handle_fireworks_json,\n    handle_fireworks_tools,\n    reask_fireworks_json,\n    reask_fireworks_tools,\n)\n\n# Google/Gemini/VertexAI utils\nfrom ..providers.gemini.utils import (\n    handle_gemini_json,\n    handle_gemini_tools,\n    handle_genai_structured_outputs,\n    handle_genai_tools,\n    handle_vertexai_json,\n    handle_vertexai_parallel_tools,\n    handle_vertexai_tools,\n    reask_gemini_json,\n    reask_gemini_tools,\n    reask_genai_structured_outputs,\n    reask_genai_tools,\n    reask_vertexai_json,\n    reask_vertexai_tools,\n)\n\n# Mistral utils\nfrom ..providers.mistral.utils import (\n    handle_mistral_structured_outputs,\n    handle_mistral_tools,\n    reask_mistral_structured_outputs,\n    reask_mistral_tools,\n)\n\n# OpenAI utils\nfrom ..providers.openai.utils import (\n    handle_functions,\n    handle_json_modes,\n    handle_json_o1,\n    handle_openrouter_structured_outputs,\n    handle_parallel_tools,\n    handle_responses_tools,\n    handle_responses_tools_with_inbuilt_tools,\n    handle_tools,\n    handle_tools_strict,\n    reask_default,\n    reask_md_json,\n    reask_responses_tools,\n    reask_tools,\n)\n\n# Perplexity utils\nfrom ..providers.perplexity.utils import (\n    handle_perplexity_json,\n    reask_perplexity_json,\n)\n\n# Writer utils\nfrom ..providers.writer.utils import (\n    handle_writer_json,\n    handle_writer_tools,\n    reask_writer_json,\n    reask_writer_tools,\n)\n\n# XAI utils\nfrom ..providers.xai.utils import (\n    handle_xai_json,\n    handle_xai_tools,\n    reask_xai_json,\n    reask_xai_tools,\n)\n\nlogger = logging.getLogger(\"instructor\")\n\nT_Model = TypeVar(\"T_Model\", bound=BaseModel)\nT_Retval = TypeVar(\"T_Retval\")\nT_ParamSpec = ParamSpec(\"T_ParamSpec\")\nT = TypeVar(\"T\")\n\n\nasync def process_response_async(\n    response: ChatCompletion,\n    *,\n    response_model: type[T_Model | OpenAISchema | BaseModel] | None,\n    stream: bool = False,\n    validation_context: dict[str, Any] | None = None,\n    strict: bool | None = None,\n    mode: Mode = Mode.TOOLS,\n) -> Any:\n    \"\"\"Asynchronously process and transform LLM responses into structured models.\n\n    This function is the async entry point for converting raw LLM responses into validated\n    Pydantic models. It handles various response formats from different providers and\n    supports special response types like streaming, partial objects, and parallel tool calls.\n\n    Args:\n        response (ChatCompletion or Similar API Response): The raw response from the LLM API. Despite the type hint,\n            this can be responses from any supported provider (OpenAI, Anthropic, Google, etc.)\n        response_model (type[T_Model | BaseModel] | None): The target Pydantic\n            model to parse the response into. If None, returns the raw response unchanged.\n            Can also be special DSL types like ParallelBase for parallel tool calls, or IterableBase and PartialBase for streaming.\n        stream (bool): Whether this is a streaming response. Required for proper handling\n            of IterableBase and PartialBase models. Defaults to False.\n        validation_context (dict[str, Any] | None): Additional context passed to Pydantic\n            validators during model validation. Useful for dynamic validation logic. The context\n            is also used to format templated responses. Defaults to None.\n        strict (bool | None): Whether to enforce strict JSON parsing. When True, the response\n            must exactly match the model schema. When False, allows minor deviations.\n        mode (Mode): The provider/format mode that determines how to parse the response.\n            Examples: Mode.TOOLS (OpenAI), Mode.ANTHROPIC_JSON, Mode.GEMINI_TOOLS.\n            Defaults to Mode.TOOLS.\n\n    Returns:\n        T_Model | ChatCompletion: The processed response. Return type depends on inputs:\n            - If response_model is None: returns raw response unchanged\n            - If response_model is IterableBase with stream=True: returns list of models\n            - If response_model is AdapterBase: returns the adapted content\n            - Otherwise: returns instance of response_model with _raw_response attached\n\n    Raises:\n        ValidationError: If the response doesn't match the expected model schema\n        IncompleteOutputException: If the response was truncated due to token limits\n        ValueError: If an invalid mode is specified\n\n    Note:\n        The function automatically detects special response model types (Iterable, Partial,\n        Parallel, Adapter) and applies appropriate processing logic for each.\n    \"\"\"\n\n    logger.debug(\n        f\"Instructor Raw Response: {response}\",\n    )\n    if response_model is None:\n        return response\n\n    if (\n        inspect.isclass(response_model)\n        and issubclass(response_model, IterableBase)\n        and stream\n    ):\n        # Preserve streaming behavior for `create_iterable()` (async for).\n        return response_model.from_streaming_response_async(  # type: ignore[return-value,arg-type]\n            cast(AsyncGenerator[Any, None], response),\n            mode=mode,\n        )\n\n    if (\n        inspect.isclass(response_model)\n        and issubclass(response_model, PartialBase)\n        and stream\n    ):\n        # Return the AsyncGenerator directly for streaming Partial responses.\n        return response_model.from_streaming_response_async(  # type: ignore[return-value,arg-type]\n            cast(AsyncGenerator[Any, None], response),\n            mode=mode,\n        )\n\n    model = response_model.from_response(  # type: ignore\n        response,\n        validation_context=validation_context,\n        strict=strict,\n        mode=mode,\n    )\n\n    # ? This really hints at the fact that we need a better way of\n    # ? attaching usage data and the raw response to the model we return.\n    if isinstance(model, IterableBase):\n        logger.debug(f\"Returning takes from IterableBase\")\n        return ListResponse.from_list(  # type: ignore[return-value]\n            [task for task in model.tasks],\n            raw_response=response,\n        )\n\n    if isinstance(response_model, ParallelBase):\n        logger.debug(f\"Returning model from ParallelBase\")\n        model._raw_response = response\n        return model\n\n    if isinstance(model, AdapterBase):\n        logger.debug(f\"Returning model from AdapterBase\")\n        return model.content\n\n    model._raw_response = response\n    return model\n\n\ndef process_response(\n    response: T_Model,\n    *,\n    response_model: type[OpenAISchema | BaseModel] | None = None,\n    stream: bool,\n    validation_context: dict[str, Any] | None = None,\n    strict=None,\n    mode: Mode = Mode.TOOLS,\n) -> Any:\n    \"\"\"Process and transform LLM responses into structured models (synchronous).\n\n    This is the main entry point for converting raw LLM responses into validated Pydantic\n    models. It acts as a dispatcher that handles various response formats from 40+ different\n    provider modes and transforms them according to the specified response model type.\n\n    Args:\n        response (T_Model): The raw response from the LLM API. The actual type varies by\n            provider (ChatCompletion for OpenAI, Message for Anthropic, etc.)\n        response_model (type[OpenAISchema | BaseModel] | None): The target Pydantic model\n            class to parse the response into. Special DSL types supported:\n            - IterableBase: For streaming multiple objects from a single response\n            - PartialBase: For incomplete/streaming partial objects\n            - ParallelBase: For parallel tool/function calls\n            - AdapterBase: For simple type adaptations (e.g., str, int)\n            If None, returns the raw response unchanged.\n        stream (bool): Whether this is a streaming response. Required to be True for\n            proper handling of IterableBase and PartialBase models.\n        validation_context (dict[str, Any] | None): Additional context passed to Pydantic\n            validators. Useful for runtime validation logic based on external state.\n        strict (bool | None): Controls JSON parsing strictness:\n            - True: Enforce exact schema matching (no extra fields)\n            - False/None: Allow minor deviations and extra fields\n        mode (Mode): The provider/format mode that determines parsing strategy.\n            Each mode corresponds to a specific provider and format combination:\n            - Tool modes: TOOLS, ANTHROPIC_TOOLS, GEMINI_TOOLS, etc.\n            - JSON modes: JSON, ANTHROPIC_JSON, VERTEXAI_JSON, etc.\n            - Special modes: PARALLEL_TOOLS, MD_JSON, JSON_SCHEMA, etc.\n\n    Returns:\n        T_Model | list[T_Model] | None: The processed response:\n            - If response_model is None: Original response unchanged\n            - If IterableBase: List of extracted model instances\n            - If ParallelBase: Special parallel response object\n            - If AdapterBase: The adapted simple type (str, int, etc.)\n            - Otherwise: Single instance of response_model with _raw_response attached\n\n    Raises:\n        ValidationError: Response doesn't match the expected model schema\n        IncompleteOutputException: Response truncated due to token limits\n        ValueError: Invalid mode specified or mode not supported\n        JSONDecodeError: Malformed JSON in response (for JSON modes)\n\n    Note:\n        The function preserves the raw response by attaching it to the parsed model\n        as `_raw_response`. This allows access to metadata like token usage, model\n        info, and other provider-specific fields after parsing.\n    \"\"\"\n    logger.debug(\n        f\"Instructor Raw Response: {response}\",\n    )\n\n    if response_model is None:\n        logger.debug(\"No response model, returning response as is\")\n        return response\n\n    if (\n        inspect.isclass(response_model)\n        and issubclass(response_model, IterableBase)\n        and stream\n    ):\n        # Preserve streaming behavior for `create_iterable()` (for/async for).\n        return response_model.from_streaming_response(  # type: ignore[return-value]\n            response,\n            mode=mode,\n        )\n\n    if (\n        inspect.isclass(response_model)\n        and issubclass(response_model, PartialBase)\n        and stream\n    ):\n        # Collect partial stream to surface validation errors inside retry logic.\n        return list(\n            response_model.from_streaming_response(  # type: ignore\n                response,\n                mode=mode,\n            )\n        )\n\n    model = response_model.from_response(  # type: ignore\n        response,\n        validation_context=validation_context,\n        strict=strict,\n        mode=mode,\n    )\n\n    # ? This really hints at the fact that we need a better way of\n    # ? attaching usage data and the raw response to the model we return.\n    if isinstance(model, IterableBase):\n        logger.debug(f\"Returning takes from IterableBase\")\n        return ListResponse.from_list(  # type: ignore[return-value]\n            [task for task in model.tasks],\n            raw_response=response,\n        )\n\n    if isinstance(response_model, ParallelBase):\n        logger.debug(f\"Returning model from ParallelBase\")\n        model._raw_response = response\n        return model\n\n    if isinstance(model, AdapterBase):\n        logger.debug(f\"Returning model from AdapterBase\")\n        return model.content\n\n    model._raw_response = response\n    return model\n\n\ndef is_typed_dict(cls) -> bool:\n    return (\n        isinstance(cls, type)\n        and issubclass(cls, dict)\n        and hasattr(cls, \"__annotations__\")\n    )\n\n\ndef handle_response_model(\n    response_model: type[T] | None, mode: Mode = Mode.TOOLS, **kwargs: Any\n) -> tuple[type[T] | None, dict[str, Any]]:\n    \"\"\"\n    Handles the response model based on the specified mode and prepares the kwargs for the API call.\n    This really should be named 'prepare_create_kwargs' as its job is to map the openai create kwargs\n    to the correct format for the API call based on the mode.\n\n    Args:\n        response_model (type[T] | None): The response model to be used for parsing the API response.\n        mode (Mode): The mode to use for handling the response model. Defaults to Mode.TOOLS.\n        **kwargs: Additional keyword arguments to be passed to the API call.\n\n    Returns:\n        tuple[type[T] | None, dict[str, Any]]: A tuple containing the processed response model and the updated kwargs.\n\n    This function prepares the response model and modifies the kwargs based on the specified mode.\n    It handles various modes like TOOLS, JSON, FUNCTIONS, etc., and applies the appropriate\n    transformations to the response model and kwargs.\n    \"\"\"\n\n    new_kwargs = kwargs.copy()\n    # Extract autodetect_images for message conversion\n    autodetect_images = new_kwargs.pop(\"autodetect_images\", False)\n\n    PARALLEL_MODES = {\n        Mode.PARALLEL_TOOLS: handle_parallel_tools,\n        Mode.VERTEXAI_PARALLEL_TOOLS: handle_vertexai_parallel_tools,\n        Mode.ANTHROPIC_PARALLEL_TOOLS: handle_anthropic_parallel_tools,\n    }\n\n    if mode in PARALLEL_MODES:\n        response_model, new_kwargs = PARALLEL_MODES[mode](response_model, new_kwargs)  # type: ignore\n        logger.debug(\n            f\"Instructor Request: {mode.value=}, {response_model=}, {new_kwargs=}\",\n            extra={\n                \"mode\": mode.value,\n                \"response_model\": (\n                    response_model.__name__\n                    if response_model is not None\n                    and hasattr(response_model, \"__name__\")\n                    else str(response_model)\n                ),\n                \"new_kwargs\": new_kwargs,\n            },\n        )\n        return response_model, new_kwargs\n\n    # Only prepare response_model if it's not None\n    if response_model is not None:\n        response_model = prepare_response_model(response_model)\n\n    mode_handlers = {  # type: ignore\n        Mode.FUNCTIONS: handle_functions,\n        Mode.TOOLS_STRICT: handle_tools_strict,\n        Mode.TOOLS: handle_tools,\n        Mode.MISTRAL_TOOLS: handle_mistral_tools,\n        Mode.MISTRAL_STRUCTURED_OUTPUTS: handle_mistral_structured_outputs,\n        Mode.JSON_O1: handle_json_o1,\n        Mode.JSON: lambda rm, nk: handle_json_modes(rm, nk, Mode.JSON),  # type: ignore\n        Mode.MD_JSON: lambda rm, nk: handle_json_modes(rm, nk, Mode.MD_JSON),  # type: ignore\n        Mode.JSON_SCHEMA: lambda rm, nk: handle_json_modes(rm, nk, Mode.JSON_SCHEMA),  # type: ignore\n        Mode.ANTHROPIC_TOOLS: handle_anthropic_tools,\n        Mode.ANTHROPIC_REASONING_TOOLS: handle_anthropic_reasoning_tools,\n        Mode.ANTHROPIC_JSON: handle_anthropic_json,\n        Mode.COHERE_JSON_SCHEMA: handle_cohere_json_schema,\n        Mode.COHERE_TOOLS: handle_cohere_tools,\n        Mode.GEMINI_JSON: handle_gemini_json,\n        Mode.GEMINI_TOOLS: handle_gemini_tools,\n        Mode.GENAI_TOOLS: lambda rm, nk: handle_genai_tools(rm, nk, autodetect_images),\n        Mode.GENAI_STRUCTURED_OUTPUTS: lambda rm, nk: handle_genai_structured_outputs(\n            rm, nk, autodetect_images\n        ),\n        Mode.VERTEXAI_TOOLS: handle_vertexai_tools,\n        Mode.VERTEXAI_JSON: handle_vertexai_json,\n        Mode.CEREBRAS_JSON: handle_cerebras_json,\n        Mode.CEREBRAS_TOOLS: handle_cerebras_tools,\n        Mode.FIREWORKS_JSON: handle_fireworks_json,\n        Mode.FIREWORKS_TOOLS: handle_fireworks_tools,\n        Mode.WRITER_TOOLS: handle_writer_tools,\n        Mode.WRITER_JSON: handle_writer_json,\n        Mode.BEDROCK_JSON: handle_bedrock_json,\n        Mode.BEDROCK_TOOLS: handle_bedrock_tools,\n        Mode.PERPLEXITY_JSON: handle_perplexity_json,\n        Mode.OPENROUTER_STRUCTURED_OUTPUTS: handle_openrouter_structured_outputs,\n        Mode.RESPONSES_TOOLS: handle_responses_tools,\n        Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS: handle_responses_tools_with_inbuilt_tools,\n        Mode.XAI_JSON: handle_xai_json,\n        Mode.XAI_TOOLS: handle_xai_tools,\n    }\n\n    if mode in mode_handlers:\n        response_model, new_kwargs = mode_handlers[mode](response_model, new_kwargs)  # type: ignore\n    else:\n        raise ConfigurationError(\n            f\"Invalid or unsupported mode: {mode}. \"\n            f\"This mode may not be implemented. \"\n            f\"Available modes: {', '.join(str(m) for m in mode_handlers.keys())}\"\n        )\n\n    # Handle message conversion for modes that don't already handle it\n    if \"messages\" in new_kwargs:\n        new_kwargs[\"messages\"] = convert_messages(\n            new_kwargs[\"messages\"],\n            mode,\n            autodetect_images=autodetect_images,\n        )\n\n    logger.debug(\n        f\"Instructor Request: {mode.value=}, {response_model=}, {new_kwargs=}\",\n        extra={\n            \"mode\": mode.value,\n            \"response_model\": (\n                response_model.__name__\n                if response_model is not None and hasattr(response_model, \"__name__\")\n                else str(response_model)\n            ),\n            \"new_kwargs\": new_kwargs,\n        },\n    )\n    return response_model, new_kwargs\n\n\ndef handle_reask_kwargs(\n    kwargs: dict[str, Any],\n    mode: Mode,\n    response: Any,\n    exception: Exception,\n    failed_attempts: list[Any] | None = None,\n) -> dict[str, Any]:\n    \"\"\"Handle validation errors by reformatting the request for retry (reask).\n\n    This function serves as the central dispatcher for handling validation failures\n    across all supported LLM providers. When a response fails validation, it prepares\n    a new request that includes detailed error information and retry context, allowing\n    the LLM to understand what went wrong and generate a corrected response.\n\n    The reask process involves:\n    1. Analyzing the validation error and failed response\n    2. Selecting the appropriate provider-specific reask handler\n    3. Enriching the exception with retry history (failed_attempts)\n    4. Formatting error feedback in the provider's expected message format\n    5. Preserving original request parameters while adding retry context\n\n    Args:\n        kwargs (dict[str, Any]): The original request parameters that resulted in\n            a validation error. Contains all parameters passed to the LLM API:\n            - messages: conversation history\n            - tools/functions: available function definitions\n            - temperature, max_tokens: generation parameters\n            - model, provider-specific settings\n        mode (Mode): The provider/format mode that determines which reask handler\n            to use. Each mode implements a specific strategy for formatting error\n            feedback and retry messages. Examples:\n            - Mode.TOOLS: OpenAI function calling\n            - Mode.ANTHROPIC_TOOLS: Anthropic tool use\n            - Mode.JSON: JSON-only responses\n        response (Any): The raw response from the LLM that failed validation.\n            Type and structure varies by provider:\n            - OpenAI: ChatCompletion with tool_calls or content\n            - Anthropic: Message with tool_use blocks or text content\n            - Google: GenerateContentResponse with function calls\n            - Cohere: NonStreamedChatResponse with tool calls\n        exception (Exception): The validation error that occurred, typically:\n            - Pydantic ValidationError: field validation failures\n            - JSONDecodeError: malformed JSON responses\n            - Custom validation errors from response processors\n            The exception will be enriched with failed_attempts data.\n        failed_attempts (list[FailedAttempt] | None): Historical record of previous\n            retry attempts for this request. Each FailedAttempt contains:\n            - attempt_number: sequential attempt counter\n            - exception: the validation error for that attempt\n            - completion: the raw LLM response that failed\n            Used to provide retry context and prevent repeated mistakes.\n\n    Returns:\n        dict[str, Any]: Modified kwargs for the retry request with:\n            - Updated messages including error feedback\n            - Original tool/function definitions preserved\n            - Generation parameters maintained (temperature, etc.)\n            - Provider-specific error formatting applied\n            - Retry context embedded in appropriate message format\n\n    Provider-Specific Reask Strategies:\n        **OpenAI Modes:**\n        - TOOLS/FUNCTIONS: Adds tool response messages with validation errors\n        - JSON modes: Appends user message with correction instructions\n        - Preserves function schemas and conversation context\n\n        **Anthropic Modes:**\n        - TOOLS: Creates tool_result blocks with error details\n        - JSON: Adds user message with structured error feedback\n        - Maintains conversation flow with proper message roles\n\n        **Google/Gemini Modes:**\n        - TOOLS: Formats as function response with error content\n        - JSON: Appends user message with validation feedback\n\n        **Other Providers (Cohere, Mistral, etc.):**\n        - Provider-specific message formatting\n        - Consistent error reporting patterns\n        - Maintained conversation context\n\n    Error Enrichment:\n        The exception parameter is enriched with retry metadata:\n        - exception.failed_attempts: list of previous failures\n        - exception.retry_attempt_number: current attempt number\n        This allows downstream handlers to access full retry context.\n\n    Example:\n        ```python\n        # After a ValidationError occurs during retry attempt #2\n        new_kwargs = handle_reask_kwargs(\n            kwargs=original_request,\n            mode=Mode.TOOLS,\n            response=failed_completion,\n            exception=validation_error,  # Will be enriched with failed_attempts\n            failed_attempts=[attempt1, attempt2]  # Previous failures\n        )\n        # new_kwargs now contains retry messages with error context\n        ```\n\n    Note:\n        This function is called internally by retry_sync() and retry_async()\n        when max_retries > 1. It ensures each retry includes progressively\n        more context about previous failures, helping the LLM learn from\n        mistakes and avoid repeating the same errors.\n    \"\"\"\n    # Create a shallow copy of kwargs to avoid modifying the original\n    kwargs_copy = kwargs.copy()\n\n    exception = InstructorError.from_exception(\n        exception, failed_attempts=failed_attempts\n    )\n\n    # Organized by provider (matching process_response.py structure)\n    REASK_HANDLERS = {\n        # OpenAI modes\n        Mode.FUNCTIONS: reask_default,\n        Mode.TOOLS_STRICT: reask_tools,\n        Mode.TOOLS: reask_tools,\n        Mode.JSON_O1: reask_default,\n        Mode.JSON: reask_md_json,\n        Mode.MD_JSON: reask_md_json,\n        Mode.JSON_SCHEMA: reask_md_json,\n        Mode.PARALLEL_TOOLS: reask_tools,\n        Mode.RESPONSES_TOOLS: reask_responses_tools,\n        Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS: reask_responses_tools,\n        # Mistral modes\n        Mode.MISTRAL_TOOLS: reask_mistral_tools,\n        Mode.MISTRAL_STRUCTURED_OUTPUTS: reask_mistral_structured_outputs,\n        # Anthropic modes\n        Mode.ANTHROPIC_TOOLS: reask_anthropic_tools,\n        Mode.ANTHROPIC_REASONING_TOOLS: reask_anthropic_tools,\n        Mode.ANTHROPIC_JSON: reask_anthropic_json,\n        Mode.ANTHROPIC_PARALLEL_TOOLS: reask_anthropic_tools,\n        # Cohere modes\n        Mode.COHERE_TOOLS: reask_cohere_tools,\n        Mode.COHERE_JSON_SCHEMA: reask_cohere_tools,\n        # Gemini/Google modes\n        Mode.GEMINI_TOOLS: reask_gemini_tools,\n        Mode.GEMINI_JSON: reask_gemini_json,\n        Mode.GENAI_TOOLS: reask_genai_tools,\n        Mode.GENAI_STRUCTURED_OUTPUTS: reask_genai_structured_outputs,\n        # VertexAI modes\n        Mode.VERTEXAI_TOOLS: reask_vertexai_tools,\n        Mode.VERTEXAI_JSON: reask_vertexai_json,\n        Mode.VERTEXAI_PARALLEL_TOOLS: reask_vertexai_tools,\n        # Cerebras modes\n        Mode.CEREBRAS_TOOLS: reask_cerebras_tools,\n        Mode.CEREBRAS_JSON: reask_default,\n        # Fireworks modes\n        Mode.FIREWORKS_TOOLS: reask_fireworks_tools,\n        Mode.FIREWORKS_JSON: reask_fireworks_json,\n        # Writer modes\n        Mode.WRITER_TOOLS: reask_writer_tools,\n        Mode.WRITER_JSON: reask_writer_json,\n        # Bedrock modes\n        Mode.BEDROCK_TOOLS: reask_bedrock_tools,\n        Mode.BEDROCK_JSON: reask_bedrock_json,\n        # Perplexity modes\n        Mode.PERPLEXITY_JSON: reask_perplexity_json,\n        # OpenRouter modes\n        Mode.OPENROUTER_STRUCTURED_OUTPUTS: reask_default,\n        # XAI modes\n        Mode.XAI_JSON: reask_xai_json,\n        Mode.XAI_TOOLS: reask_xai_tools,\n    }\n\n    if mode in REASK_HANDLERS:\n        return REASK_HANDLERS[mode](kwargs_copy, response, exception)\n    else:\n        return reask_default(kwargs_copy, response, exception)\n"
  },
  {
    "path": "instructor/processing/schema.py",
    "content": "\"\"\"\nStandalone schema generation utilities for different LLM providers.\n\nThis module provides provider-agnostic functions to generate schemas from Pydantic models\nwithout requiring inheritance from OpenAISchema or use of decorators.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport functools\nimport warnings\nfrom typing import Any, cast\n\nfrom docstring_parser import parse\nfrom pydantic import BaseModel\n\nfrom ..providers.gemini.utils import map_to_gemini_function_schema\n\n__all__ = [\n    \"generate_openai_schema\",\n    \"generate_anthropic_schema\",\n    \"generate_gemini_schema\",\n]\n\n\n@functools.lru_cache(maxsize=256)\ndef generate_openai_schema(model: type[BaseModel]) -> dict[str, Any]:\n    \"\"\"\n    Generate OpenAI function schema from a Pydantic model.\n\n    Args:\n        model: A Pydantic BaseModel subclass\n\n    Returns:\n        A dictionary in the format of OpenAI's function schema\n\n    Note:\n        The model's docstring will be used for the function description.\n        Parameter descriptions from the docstring will enrich field descriptions.\n    \"\"\"\n    schema = model.model_json_schema()\n    docstring = parse(model.__doc__ or \"\")\n    parameters = {k: v for k, v in schema.items() if k not in (\"title\", \"description\")}\n\n    # Enrich parameter descriptions from docstring\n    for param in docstring.params:\n        if (name := param.arg_name) in parameters[\"properties\"] and (\n            description := param.description\n        ):\n            if \"description\" not in parameters[\"properties\"][name]:\n                parameters[\"properties\"][name][\"description\"] = description\n\n    parameters[\"required\"] = sorted(\n        k for k, v in parameters[\"properties\"].items() if \"default\" not in v\n    )\n\n    if \"description\" not in schema:\n        if docstring.short_description:\n            schema[\"description\"] = docstring.short_description\n        else:\n            schema[\"description\"] = (\n                f\"Correctly extracted `{model.__name__}` with all \"\n                f\"the required parameters with correct types\"\n            )\n\n    return {\n        \"name\": schema[\"title\"],\n        \"description\": schema[\"description\"],\n        \"parameters\": parameters,\n    }\n\n\n@functools.lru_cache(maxsize=256)\ndef generate_anthropic_schema(model: type[BaseModel]) -> dict[str, Any]:\n    \"\"\"\n    Generate Anthropic tool schema from a Pydantic model.\n\n    Args:\n        model: A Pydantic BaseModel subclass\n\n    Returns:\n        A dictionary in the format of Anthropic's tool schema\n    \"\"\"\n    # Generate the Anthropic schema based on the OpenAI schema to avoid redundant schema generation\n    openai_schema = generate_openai_schema(model)\n    return {\n        \"name\": openai_schema[\"name\"],\n        \"description\": openai_schema[\"description\"],\n        \"input_schema\": model.model_json_schema(),\n    }\n\n\n@functools.lru_cache(maxsize=256)\ndef generate_gemini_schema(model: type[BaseModel]) -> Any:\n    \"\"\"\n    Generate Gemini function schema from a Pydantic model.\n\n    Args:\n        model: A Pydantic BaseModel subclass\n\n    Returns:\n        A Gemini FunctionDeclaration object\n\n    Note:\n        This function is deprecated. The google-generativeai library is being replaced by google-genai.\n    \"\"\"\n    # This is kept for backward compatibility but deprecated\n    warnings.warn(\n        \"generate_gemini_schema is deprecated. The google-generativeai library is being replaced by google-genai.\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    try:\n        import importlib\n\n        genai_types = cast(Any, importlib.import_module(\"google.generativeai.types\"))\n\n        # Use OpenAI schema\n        openai_schema = generate_openai_schema(model)\n\n        # Transform to Gemini format\n        function = genai_types.FunctionDeclaration(\n            name=openai_schema[\"name\"],\n            description=openai_schema[\"description\"],\n            parameters=map_to_gemini_function_schema(openai_schema[\"parameters\"]),\n        )\n\n        return function\n    except ImportError as e:\n        raise ImportError(\n            \"google-generativeai is deprecated. Please install google-genai instead: pip install google-genai\"\n        ) from e\n"
  },
  {
    "path": "instructor/processing/validators.py",
    "content": "\"\"\"Validators that extend OpenAISchema for structured outputs.\"\"\"\n\nfrom typing import Optional\n\nfrom pydantic import Field\n\nfrom .function_calls import OpenAISchema\n\n\nclass Validator(OpenAISchema):\n    \"\"\"\n    Validate if an attribute is correct and if not,\n    return a new value with an error message\n    \"\"\"\n\n    is_valid: bool = Field(\n        default=True,\n        description=\"Whether the attribute is valid based on the requirements\",\n    )\n    reason: Optional[str] = Field(\n        default=None,\n        description=\"The error message if the attribute is not valid, otherwise None\",\n    )\n    fixed_value: Optional[str] = Field(\n        default=None,\n        description=\"If the attribute is not valid, suggest a new value for the attribute\",\n    )\n"
  },
  {
    "path": "instructor/providers/README.md",
    "content": "# Providers Directory Structure\n\nThis directory contains implementations for all supported LLM providers in the instructor library.\n\n## Provider Organization\n\nEach provider is organized in its own subdirectory with the following structure:\n\n```\nproviders/\n├── provider_name/\n│   ├── __init__.py\n│   ├── client.py      # Provider-specific client factory (optional)\n│   └── utils.py       # Provider-specific utilities (optional)\n```\n\n## File Structure Patterns\n\n### Providers with both `client.py` and `utils.py`\n- **anthropic**, **bedrock**, **cerebras**, **cohere**, **fireworks**, **gemini**, **mistral**, **perplexity**, **writer**, **xai**\n- These providers require custom response handling logic and utility functions\n- `client.py`: Contains the `from_<provider>()` factory function\n- `utils.py`: Contains provider-specific response handlers, reask functions, and message formatting\n\n### Providers with only `client.py`\n- **genai**, **groq**, **vertexai**\n- These are simpler providers that use standard response handling from the core\n- They don't require custom utility functions\n\n### Special Case: OpenAI (only `utils.py`)\n- OpenAI doesn't have a `client.py` because `from_openai()` is defined in `core/client.py`\n- This is because OpenAI is the reference implementation that other providers are based on\n- OpenAI utilities are still needed by the core processing logic for standard handling\n\n## Adding a New Provider\n\nWhen adding a new provider:\n\n1. Create a new subdirectory under `providers/`\n2. Add an `__init__.py` file (can be minimal)\n3. Create `client.py` with a `from_<provider>()` function if needed\n4. Create `utils.py` only if you need custom:\n   - Response handlers (e.g., `handle_<provider>_json()`)\n   - Reask functions (e.g., `reask_<provider>_tools()`)\n   - Message formatting (e.g., `convert_to_<provider>_messages()`)\n5. Update `providers/__init__.py` to conditionally import your provider\n6. Update the main `instructor/__init__.py` to export the factory function\n\n## Import Structure\n\n- Provider modules use relative imports with `...` to access parent modules\n- Example: `from ...core.exceptions import ProviderError`\n- This maintains clean separation between provider implementations and core functionality"
  },
  {
    "path": "instructor/providers/__init__.py",
    "content": "\"\"\"Provider implementations for instructor.\"\"\"\n\nimport importlib.util\n\n__all__ = []\n\n# Conditional imports based on installed packages\nif importlib.util.find_spec(\"anthropic\") is not None:\n    from .anthropic.client import from_anthropic  # noqa: F401\n\n    __all__.append(\"from_anthropic\")\n\nif importlib.util.find_spec(\"boto3\") is not None:\n    from .bedrock.client import from_bedrock  # noqa: F401\n\n    __all__.append(\"from_bedrock\")\n\nif importlib.util.find_spec(\"cerebras\") is not None:\n    from .cerebras.client import from_cerebras  # noqa: F401\n\n    __all__.append(\"from_cerebras\")\n\nif importlib.util.find_spec(\"cohere\") is not None:\n    from .cohere.client import from_cohere  # noqa: F401\n\n    __all__.append(\"from_cohere\")\n\nif importlib.util.find_spec(\"fireworks\") is not None:\n    from .fireworks.client import from_fireworks  # noqa: F401\n\n    __all__.append(\"from_fireworks\")\n\nif (\n    importlib.util.find_spec(\"google\")\n    and importlib.util.find_spec(\"google.generativeai\") is not None\n):\n    from .gemini.client import from_gemini  # noqa: F401\n\n    __all__.append(\"from_gemini\")\n\nif (\n    importlib.util.find_spec(\"google\")\n    and importlib.util.find_spec(\"google.genai\") is not None\n):\n    from .genai.client import from_genai  # noqa: F401\n\n    __all__.append(\"from_genai\")\n\nif importlib.util.find_spec(\"groq\") is not None:\n    from .groq.client import from_groq  # noqa: F401\n\n    __all__.append(\"from_groq\")\n\nif importlib.util.find_spec(\"mistralai\") is not None:\n    from .mistral.client import from_mistral  # noqa: F401\n\n    __all__.append(\"from_mistral\")\n\nif importlib.util.find_spec(\"openai\") is not None:\n    from .perplexity.client import from_perplexity  # noqa: F401\n\n    __all__.append(\"from_perplexity\")\n\nif all(importlib.util.find_spec(pkg) for pkg in (\"vertexai\", \"jsonref\")):\n    try:\n        from .vertexai.client import from_vertexai  # noqa: F401\n    except Exception:\n        # Optional dependency may be present but broken/misconfigured at import time.\n        # Avoid failing `import instructor` in that case.\n        pass\n    else:\n        __all__.append(\"from_vertexai\")\n\nif importlib.util.find_spec(\"writerai\") is not None:\n    from .writer.client import from_writer  # noqa: F401\n\n    __all__.append(\"from_writer\")\n\nif importlib.util.find_spec(\"xai_sdk\") is not None:\n    from .xai.client import from_xai  # noqa: F401\n\n    __all__.append(\"from_xai\")\n"
  },
  {
    "path": "instructor/providers/anthropic/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/anthropic/client.py",
    "content": "from __future__ import annotations\n\nimport anthropic\nimport instructor\n\nfrom typing import overload, Any\n\n\n@overload\ndef from_anthropic(\n    client: (\n        anthropic.Anthropic | anthropic.AnthropicBedrock | anthropic.AnthropicVertex\n    ),\n    mode: instructor.Mode = instructor.Mode.ANTHROPIC_TOOLS,\n    beta: bool = False,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_anthropic(\n    client: (\n        anthropic.AsyncAnthropic\n        | anthropic.AsyncAnthropicBedrock\n        | anthropic.AsyncAnthropicVertex\n    ),\n    mode: instructor.Mode = instructor.Mode.ANTHROPIC_TOOLS,\n    beta: bool = False,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\ndef from_anthropic(\n    client: (\n        anthropic.Anthropic\n        | anthropic.AsyncAnthropic\n        | anthropic.AnthropicBedrock\n        | anthropic.AsyncAnthropicBedrock\n        | anthropic.AsyncAnthropicVertex\n        | anthropic.AnthropicVertex\n    ),\n    mode: instructor.Mode = instructor.Mode.ANTHROPIC_TOOLS,\n    beta: bool = False,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    \"\"\"Create an Instructor instance from an Anthropic client.\n\n    Args:\n        client: An instance of Anthropic client (sync or async)\n        mode: The mode to use for the client (ANTHROPIC_JSON or ANTHROPIC_TOOLS)\n        beta: Whether to use beta API features (uses client.beta.messages.create)\n        **kwargs: Additional keyword arguments to pass to the Instructor constructor\n\n    Returns:\n        An Instructor instance (sync or async depending on the client type)\n\n    Raises:\n        ModeError: If mode is not one of the valid Anthropic modes\n        ClientError: If client is not a valid Anthropic client instance\n    \"\"\"\n    valid_modes = {\n        instructor.Mode.ANTHROPIC_JSON,\n        instructor.Mode.ANTHROPIC_TOOLS,\n        instructor.Mode.ANTHROPIC_REASONING_TOOLS,\n        instructor.Mode.ANTHROPIC_PARALLEL_TOOLS,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Anthropic\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    valid_client_types = (\n        anthropic.Anthropic,\n        anthropic.AsyncAnthropic,\n        anthropic.AnthropicBedrock,\n        anthropic.AnthropicVertex,\n        anthropic.AsyncAnthropicBedrock,\n        anthropic.AsyncAnthropicVertex,\n    )\n\n    if not isinstance(client, valid_client_types):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of one of: {', '.join(t.__name__ for t in valid_client_types)}. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if beta:\n        create = client.beta.messages.create\n    else:\n        create = client.messages.create\n\n    if isinstance(\n        client,\n        (anthropic.Anthropic, anthropic.AnthropicBedrock, anthropic.AnthropicVertex),\n    ):\n        return instructor.Instructor(\n            client=client,\n            create=instructor.patch(create=create, mode=mode),\n            provider=instructor.Provider.ANTHROPIC,\n            mode=mode,\n            **kwargs,\n        )\n\n    else:\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=create, mode=mode),\n            provider=instructor.Provider.ANTHROPIC,\n            mode=mode,\n            **kwargs,\n        )\n"
  },
  {
    "path": "instructor/providers/anthropic/utils.py",
    "content": "\"\"\"Anthropic-specific utilities.\n\nThis module contains utilities specific to the Anthropic provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom textwrap import dedent\nfrom typing import Any, TypedDict, Union\n\n\nfrom ...mode import Mode\nfrom ...processing.schema import generate_anthropic_schema\n\n\nclass SystemMessage(TypedDict, total=False):\n    type: str\n    text: str\n    cache_control: dict[str, str]\n\n\ndef combine_system_messages(\n    existing_system: Union[str, list[SystemMessage], None],  # noqa: UP007\n    new_system: Union[str, list[SystemMessage]],  # noqa: UP007\n) -> Union[str, list[SystemMessage]]:  # noqa: UP007\n    \"\"\"\n    Combine existing and new system messages.\n\n    This optimized version uses a more direct approach with fewer branches.\n\n    Args:\n        existing_system: Existing system message(s) or None\n        new_system: New system message(s) to add\n\n    Returns:\n        Combined system message(s)\n    \"\"\"\n    # Fast path for None existing_system (avoid unnecessary operations)\n    if existing_system is None:\n        return new_system\n\n    # Validate input types\n    if not isinstance(existing_system, (str, list)) or not isinstance(\n        new_system, (str, list)\n    ):\n        raise ValueError(\n            f\"System messages must be strings or lists, got {type(existing_system)} and {type(new_system)}\"\n        )\n\n    # Use direct type comparison instead of isinstance for better performance\n    if isinstance(existing_system, str) and isinstance(new_system, str):\n        # Both are strings, join with newlines\n        # Avoid creating intermediate strings by joining only once\n        return f\"{existing_system}\\n\\n{new_system}\"\n    elif isinstance(existing_system, list) and isinstance(new_system, list):\n        # Both are lists, use list extension in place to avoid creating intermediate lists\n        # First create a new list to avoid modifying the original\n        result = list(existing_system)\n        result.extend(new_system)\n        return result\n    elif isinstance(existing_system, str) and isinstance(new_system, list):\n        # existing is string, new is list\n        # Create a pre-sized list to avoid resizing\n        result = [SystemMessage(type=\"text\", text=existing_system)]\n        result.extend(new_system)\n        return result\n    elif isinstance(existing_system, list) and isinstance(new_system, str):\n        # existing is list, new is string\n        # Create message once and add to existing\n        new_message = SystemMessage(type=\"text\", text=new_system)\n        result = list(existing_system)\n        result.append(new_message)\n        return result\n\n    # This should never happen due to validation above\n    return existing_system\n\n\ndef extract_system_messages(messages: list[dict[str, Any]]) -> list[SystemMessage]:\n    \"\"\"\n    Extract system messages from a list of messages.\n\n    This optimized version pre-allocates the result list and\n    reduces function call overhead.\n\n    Args:\n        messages: List of messages to extract system messages from\n\n    Returns:\n        List of system messages\n    \"\"\"\n    # Fast path for empty messages\n    if not messages:\n        return []\n\n    # First count system messages to pre-allocate result list\n    system_count = sum(1 for m in messages if m.get(\"role\") == \"system\")\n\n    # If no system messages, return empty list\n    if system_count == 0:\n        return []\n\n    # Helper function to convert a message content to SystemMessage\n    def convert_message(content: Any) -> SystemMessage:\n        if isinstance(content, str):\n            return SystemMessage(type=\"text\", text=content)\n        elif isinstance(content, dict):\n            return SystemMessage(**content)\n        else:\n            raise ValueError(f\"Unsupported content type: {type(content)}\")\n\n    # Process system messages\n    result: list[SystemMessage] = []\n\n    for message in messages:\n        if message.get(\"role\") == \"system\":\n            content = message.get(\"content\")\n\n            # Skip empty content\n            if not content:\n                continue\n\n            # Handle list or single content\n            if isinstance(content, list):\n                # Process each item in the list\n                for item in content:\n                    if item:  # Skip empty items\n                        result.append(convert_message(item))\n            else:\n                # Process single content\n                result.append(convert_message(content))\n\n    return result\n\n\ndef reask_anthropic_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Anthropic tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (tool result messages indicating validation errors)\n    \"\"\"\n    kwargs = kwargs.copy()\n    from anthropic.types import Message\n\n    # Handle Stream objects which are not Message instances\n    # This happens when streaming mode is used with retries\n    if not isinstance(response, Message):\n        kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\n\"\n                    \"Recall the function correctly, fix the errors\"\n                ),\n            }\n        )\n        return kwargs\n\n    assistant_content = []\n    tool_use_id = None\n    for content in response.content:\n        assistant_content.append(content.model_dump())  # type: ignore\n        if content.type == \"tool_use\":\n            tool_use_id = content.id\n\n    reask_msgs = [{\"role\": \"assistant\", \"content\": assistant_content}]  # type: ignore\n    if tool_use_id is not None:\n        reask_msgs.append(  # type: ignore\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"type\": \"tool_result\",\n                        \"tool_use_id\": tool_use_id,\n                        \"content\": f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\",\n                        \"is_error\": True,\n                    }\n                ],\n            }\n        )\n    else:\n        reask_msgs.append(  # type: ignore\n            {\n                \"role\": \"user\",\n                \"content\": f\"Validation Error due to no tool invocation:\\n{exception}\\nRecall the function correctly, fix the errors\",\n            }\n        )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_anthropic_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Anthropic JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n    from anthropic.types import Message\n\n    # Handle Stream objects which are not Message instances\n    # This happens when streaming mode is used with retries\n    if not isinstance(response, Message):\n        kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Validation Errors found:\\n{exception}\\n\"\n                    \"Recall the function correctly, fix the errors\"\n                ),\n            }\n        )\n        return kwargs\n\n    # Filter for text blocks to handle ThinkingBlock and other non-text content\n    text_blocks = [c for c in response.content if c.type == \"text\"]\n    if not text_blocks:\n        # Fallback if no text blocks found\n        text_content = \"No text content found in response\"\n    else:\n        # Use the last text block, similar to function_calls.py:396-397\n        text_content = text_blocks[-1].text\n\n    reask_msg = {\n        \"role\": \"user\",\n        \"content\": f\"\"\"Validation Errors found:\\n{exception}\\nRecall the function correctly, fix the errors found in the following attempt:\\n{text_content}\"\"\",\n    }\n    kwargs[\"messages\"].append(reask_msg)\n    return kwargs\n\n\ndef handle_anthropic_message_conversion(new_kwargs: dict[str, Any]) -> dict[str, Any]:\n    \"\"\"\n    Handle message conversion for Anthropic modes when response_model is None.\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (removes system messages)\n    - Adds/Modifies: \"system\" (if system messages found in messages)\n    \"\"\"\n    messages = new_kwargs.get(\"messages\", [])\n\n    # Handle Anthropic style messages\n    new_kwargs[\"messages\"] = [m for m in messages if m[\"role\"] != \"system\"]\n\n    if \"system\" not in new_kwargs:\n        system_messages = extract_system_messages(messages)\n        if system_messages:\n            new_kwargs[\"system\"] = system_messages\n\n    return new_kwargs\n\n\ndef handle_anthropic_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Anthropic tools mode.\n\n    When response_model is None:\n        - Extracts system messages from the messages list and moves them to the 'system' parameter\n        - Filters out system messages from the messages list\n        - No tools are configured\n        - Allows for unstructured responses from Claude\n\n    When response_model is provided:\n        - Generates Anthropic tool schema from the response model\n        - Sets up forced tool use with the specific tool name\n        - Extracts and combines system messages\n        - Filters system messages from the messages list\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (removes system messages)\n    - Adds/Modifies: \"system\" (combines existing with extracted system messages)\n    - Adds: \"tools\" (list with tool schema) - only when response_model provided\n    - Adds: \"tool_choice\" (forced tool use) - only when response_model provided\n    \"\"\"\n    if response_model is None:\n        # Just handle message conversion\n        new_kwargs = handle_anthropic_message_conversion(new_kwargs)\n        return None, new_kwargs\n\n    tool_descriptions = generate_anthropic_schema(response_model)\n    new_kwargs[\"tools\"] = [tool_descriptions]\n    new_kwargs[\"tool_choice\"] = {\n        \"type\": \"tool\",\n        \"name\": response_model.__name__,\n    }\n\n    system_messages = extract_system_messages(new_kwargs.get(\"messages\", []))\n\n    if system_messages:\n        new_kwargs[\"system\"] = combine_system_messages(\n            new_kwargs.get(\"system\"), system_messages\n        )\n\n    new_kwargs[\"messages\"] = [\n        m for m in new_kwargs.get(\"messages\", []) if m[\"role\"] != \"system\"\n    ]\n\n    return response_model, new_kwargs\n\n\ndef handle_anthropic_reasoning_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Anthropic reasoning tools mode.\n\n    This mode is similar to regular tools mode but with reasoning enabled:\n    - Uses \"auto\" tool choice instead of forced tool use\n    - Adds a system message encouraging tool use only when relevant\n    - Allows Claude to reason about whether to use tools\n\n    When response_model is None:\n        - Performs the same message conversion as handle_anthropic_tools\n        - No tools are configured\n\n    When response_model is provided:\n        - Sets up tools as in regular tools mode\n        - Changes tool_choice to \"auto\" to allow reasoning\n        - Adds system message to guide tool usage\n\n    Kwargs modifications:\n    - All modifications from handle_anthropic_tools, plus:\n    - Modifies: \"tool_choice\" (changes to {\"type\": \"auto\"}) - only when response_model provided\n    - Modifies: \"system\" (adds implicit forced tool message)\n    \"\"\"\n    # https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview#forcing-tool-use\n\n    response_model, new_kwargs = handle_anthropic_tools(response_model, new_kwargs)\n\n    if response_model is None:\n        # Just handle message conversion - already done by handle_anthropic_tools\n        return None, new_kwargs\n\n    # https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview#forcing-tool-use\n    # Reasoning does not allow forced tool use\n    new_kwargs[\"tool_choice\"] = {\"type\": \"auto\"}\n\n    # But add a message recommending only to use the tools if they are relevant\n    implict_forced_tool_message = dedent(\n        f\"\"\"\n        Return only the tool call and no additional text.\n        \"\"\"\n    )\n    new_kwargs[\"system\"] = combine_system_messages(\n        new_kwargs.get(\"system\"),\n        [{\"type\": \"text\", \"text\": implict_forced_tool_message}],\n    )\n    return response_model, new_kwargs\n\n\ndef handle_anthropic_json(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Anthropic JSON mode.\n\n    This mode instructs Claude to return JSON responses:\n    - System messages are extracted and combined\n    - A JSON schema message is added to guide the response format\n\n    When response_model is None:\n        - Extracts and moves system messages to the 'system' parameter\n        - Filters system messages from the messages list\n        - No JSON schema is added\n\n    When response_model is provided:\n        - Performs system message handling as above\n        - Adds a system message with the JSON schema\n        - Instructs Claude to return an instance matching the schema\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (removes system messages)\n    - Adds/Modifies: \"system\" (combines existing with extracted system messages)\n    - Modifies: \"system\" (adds JSON schema message) - only when response_model provided\n    \"\"\"\n    import json\n\n    system_messages = extract_system_messages(new_kwargs.get(\"messages\", []))\n\n    if system_messages:\n        new_kwargs[\"system\"] = combine_system_messages(\n            new_kwargs.get(\"system\"), system_messages\n        )\n\n    new_kwargs[\"messages\"] = [\n        m for m in new_kwargs.get(\"messages\", []) if m[\"role\"] != \"system\"\n    ]\n\n    if response_model is None:\n        # Just handle message conversion - already done above\n        return None, new_kwargs\n\n    json_schema_message = dedent(\n        f\"\"\"\n        As a genius expert, your task is to understand the content and provide\n        the parsed objects in json that match the following json_schema:\\n\n\n        {json.dumps(response_model.model_json_schema(), indent=2, ensure_ascii=False)}\n\n        Make sure to return an instance of the JSON, not the schema itself\n        \"\"\"\n    )\n\n    new_kwargs[\"system\"] = combine_system_messages(\n        new_kwargs.get(\"system\"),\n        [{\"type\": \"text\", \"text\": json_schema_message}],\n    )\n\n    return response_model, new_kwargs\n\n\ndef handle_anthropic_parallel_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[Any, dict[str, Any]]:\n    \"\"\"\n    Handle Anthropic parallel tools mode.\n\n    Kwargs modifications:\n    - Adds: \"tools\" (multiple function schemas from parallel model)\n    - Adds: \"tool_choice\" (\"auto\" to allow model to choose which tools to call)\n    - Modifies: \"system\" (moves system messages into system parameter)\n    - Removes: \"system\" messages from \"messages\" list\n    - Validates: stream=False\n    \"\"\"\n    from ...dsl.parallel import (\n        AnthropicParallelModel,\n        handle_anthropic_parallel_model,\n    )\n    from ...core.exceptions import ConfigurationError\n\n    if new_kwargs.get(\"stream\", False):\n        raise ConfigurationError(\n            \"stream=True is not supported when using ANTHROPIC_PARALLEL_TOOLS mode\"\n        )\n\n    new_kwargs[\"tools\"] = handle_anthropic_parallel_model(response_model)\n    new_kwargs[\"tool_choice\"] = {\"type\": \"auto\"}\n\n    system_messages = extract_system_messages(new_kwargs.get(\"messages\", []))\n\n    if system_messages:\n        new_kwargs[\"system\"] = combine_system_messages(\n            new_kwargs.get(\"system\"), system_messages\n        )\n\n    new_kwargs[\"messages\"] = [\n        m for m in new_kwargs.get(\"messages\", []) if m[\"role\"] != \"system\"\n    ]\n\n    return AnthropicParallelModel(typehint=response_model), new_kwargs\n\n\n# Handler registry for Anthropic\nANTHROPIC_HANDLERS = {\n    Mode.ANTHROPIC_TOOLS: {\n        \"reask\": reask_anthropic_tools,\n        \"response\": handle_anthropic_tools,\n    },\n    Mode.ANTHROPIC_JSON: {\n        \"reask\": reask_anthropic_json,\n        \"response\": handle_anthropic_json,\n    },\n    Mode.ANTHROPIC_REASONING_TOOLS: {\n        \"reask\": reask_anthropic_tools,\n        \"response\": handle_anthropic_reasoning_tools,\n    },\n    Mode.ANTHROPIC_PARALLEL_TOOLS: {\n        \"reask\": reask_anthropic_tools,\n        \"response\": handle_anthropic_parallel_tools,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/bedrock/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/bedrock/client.py",
    "content": "from __future__ import annotations  # type: ignore\n\nfrom typing import Any, Literal, overload\nimport warnings\n\nfrom botocore.client import BaseClient\n\nimport instructor\nfrom ...core.client import AsyncInstructor, Instructor\n\n\n@overload  # type: ignore\ndef from_bedrock(\n    client: BaseClient,\n    mode: instructor.Mode = instructor.Mode.BEDROCK_TOOLS,\n    async_client: Literal[False] = False,\n    **kwargs: Any,\n) -> Instructor: ...\n\n\n@overload  # type: ignore\ndef from_bedrock(\n    client: BaseClient,\n    mode: instructor.Mode = instructor.Mode.BEDROCK_TOOLS,\n    async_client: Literal[True] = True,\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\ndef handle_bedrock_json(\n    response_model: Any,\n    new_kwargs: Any,\n) -> tuple[Any, Any]:\n    \"\"\"\n    This function is deprecated and no longer used.\n    Bedrock JSON handling is now done in process_response.py via handle_bedrock_json().\n    \"\"\"\n    return response_model, new_kwargs\n\n\ndef from_bedrock(\n    client: BaseClient,\n    mode: instructor.Mode = instructor.Mode.BEDROCK_JSON,\n    async_client: bool = False,\n    _async: bool | None = None,  # Deprecated, use async_client\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    \"\"\"\n    Accepts both 'async_client' (preferred) and '_async' (deprecated) for async mode.\n    \"\"\"\n    valid_modes = {\n        instructor.Mode.BEDROCK_TOOLS,\n        instructor.Mode.BEDROCK_JSON,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Bedrock\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, BaseClient):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of boto3.client (BaseClient). \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    # Deprecation warning for _async usage\n    if _async is not None and not async_client:\n        warnings.warn(\n            \"The '_async' argument to from_bedrock is deprecated. Use 'async_client' instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n\n    # Prefer async_client, fallback to _async for backward compatibility\n    use_async = async_client or (_async is not None and _async is True)\n\n    async def async_wrapper(**kwargs: Any):\n        return client.converse(**kwargs)\n\n    create = client.converse\n\n    if use_async:\n        return AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.BEDROCK,\n            mode=mode,\n            **kwargs,\n        )\n    else:\n        return Instructor(\n            client=client,\n            create=instructor.patch(create=create, mode=mode),\n            provider=instructor.Provider.BEDROCK,\n            mode=mode,\n            **kwargs,\n        )\n"
  },
  {
    "path": "instructor/providers/bedrock/utils.py",
    "content": "\"\"\"AWS Bedrock-specific utilities.\n\nThis module contains utilities specific to the AWS Bedrock provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport base64\nimport json\nimport mimetypes\nimport requests\nfrom textwrap import dedent\nfrom typing import Any\n\nfrom ...mode import Mode\n\n\ndef generate_bedrock_schema(response_model: type[Any]) -> dict[str, Any]:\n    \"\"\"\n    Generate Bedrock tool schema from a Pydantic model.\n\n    Bedrock Converse API expects tools in this format:\n    {\n        \"toolSpec\": {\n            \"name\": \"tool_name\",\n            \"description\": \"tool description\",\n            \"inputSchema\": {\n                \"json\": { JSON Schema }\n            }\n        }\n    }\n    \"\"\"\n    schema = response_model.model_json_schema()\n\n    return {\n        \"toolSpec\": {\n            \"name\": response_model.__name__,\n            \"description\": response_model.__doc__\n            or f\"Correctly extracted `{response_model.__name__}` with all the required parameters with correct types\",\n            \"inputSchema\": {\"json\": schema},\n        }\n    }\n\n\ndef reask_bedrock_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Bedrock JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [response[\"output\"][\"message\"]]\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\n                    \"text\": f\"Correct your JSON ONLY RESPONSE, based on the following errors:\\n{exception}\"\n                },\n            ],\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_bedrock_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Bedrock tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (assistant message with tool use, then user message with tool result error)\n    \"\"\"\n    kwargs = kwargs.copy()\n\n    # Add the assistant's response message\n    assistant_message = response[\"output\"][\"message\"]\n    reask_msgs = [assistant_message]\n\n    # Find the tool use ID from the assistant's response to reference in the error\n    tool_use_id = None\n    if \"content\" in assistant_message:\n        for content_block in assistant_message[\"content\"]:\n            if \"toolUse\" in content_block:\n                tool_use_id = content_block[\"toolUse\"][\"toolUseId\"]\n                break\n\n    # Add a user message with tool result indicating validation error\n    if tool_use_id:\n        reask_msgs.append(\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"toolResult\": {\n                            \"toolUseId\": tool_use_id,\n                            \"content\": [\n                                {\n                                    \"text\": f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\"\n                                }\n                            ],\n                            \"status\": \"error\",\n                        }\n                    }\n                ],\n            }\n        )\n    else:\n        # Fallback if no tool use ID found\n        reask_msgs.append(\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\n                        \"text\": f\"Validation Error due to no tool invocation:\\n{exception}\\nRecall the function correctly, fix the errors\"\n                    }\n                ],\n            }\n        )\n\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef _normalize_bedrock_image_format(mime_or_ext: str) -> str:\n    \"\"\"\n    Map common/variant image types to Bedrock's required image.format enum:\n    one of {'gif','jpeg','png','webp'}.\n    \"\"\"\n    if not mime_or_ext:\n        return \"jpeg\"\n    val = mime_or_ext.strip().lower()\n    if \"/\" in val:\n        val = val.split(\"/\", 1)[1]  # take subtype, e.g., 'image/jpeg' -> 'jpeg'\n    if val in (\"jpg\", \"pjpeg\", \"x-jpeg\", \"x-jpg\"):\n        return \"jpeg\"\n    if val in (\"png\", \"x-png\"):\n        return \"png\"\n    if val in (\"gif\", \"x-gif\"):\n        return \"gif\"\n    if val in (\"webp\", \"image/webp\"):\n        return \"webp\"\n    return \"jpeg\"\n\n\ndef _openai_image_part_to_bedrock(part: dict[str, Any]) -> dict[str, Any]:\n    \"\"\"\n    Convert OpenAI-style image part:\n      {\"type\":\"image_url\",\"image_url\":{\"url\": \"<data:... or http(s):...>\"}}\n    into Bedrock Converse image content:\n      {\"image\":{\"format\": \"<fmt>\",\"source\":{\"bytes\": <raw-bytes>}}}\n    \"\"\"\n    image_url = (part.get(\"image_url\") or {}).get(\"url\")\n    if not image_url:\n        raise ValueError(\"image_url.url is required for OpenAI-style image parts\")\n\n    guessed_mime = mimetypes.guess_type(image_url)[0] or \"image/jpeg\"\n    fmt = _normalize_bedrock_image_format(guessed_mime)\n\n    # data URL to bytes\n    if image_url.startswith(\"data:\"):\n        try:\n            header, b64 = image_url.split(\",\", 1)\n        except ValueError as e:\n            raise ValueError(\"Invalid data URL in image_url.url\") from e\n        if \";base64\" not in header:\n            raise ValueError(\"Only base64 data URLs are supported for Bedrock\")\n        return {\"image\": {\"format\": fmt, \"source\": {\"bytes\": base64.b64decode(b64)}}}\n\n    # http(s) URL to bytes\n    elif image_url.startswith((\"http://\", \"https://\")):\n        try:\n            resp = requests.get(image_url, timeout=15)\n            resp.raise_for_status()\n            ctype = resp.headers.get(\"Content-Type\")\n            if ctype and \"/\" in ctype:\n                fmt = _normalize_bedrock_image_format(ctype)\n            return {\"image\": {\"format\": fmt, \"source\": {\"bytes\": resp.content}}}\n        except requests.exceptions.Timeout as e:  # type: ignore[attr-defined]\n            raise ValueError(f\"Timed out while fetching image from {image_url}\") from e\n        except requests.exceptions.ConnectionError as e:  # type: ignore[attr-defined]\n            raise ValueError(\n                f\"Connection error while fetching image from {image_url}: {e}\"\n            ) from e\n        except requests.exceptions.HTTPError as e:  # type: ignore[attr-defined]\n            raise ValueError(\n                f\"HTTP error while fetching image from {image_url}: {e}\"\n            ) from e\n        except requests.exceptions.RequestException as e:  # type: ignore[attr-defined]\n            raise ValueError(\n                f\"Request error while fetching image from {image_url}: {e}\"\n            ) from e\n        except Exception as e:\n            raise ValueError(\n                f\"Unexpected error while fetching image from {image_url}: {e}\"\n            ) from e\n    else:\n        raise ValueError(\n            \"Unsupported image_url scheme. Use http(s) or data:image/...;base64,...\"\n        )\n\n\ndef _to_bedrock_content_items(content: Any) -> list[dict[str, Any]]:\n    \"\"\"\n    Normalize content into Bedrock Converse content list.\n\n    Allowed inputs:\n      - string -> [{\"text\": \"...\"}]\n      - list of parts:\n          OpenAI-style:\n            {\"type\":\"text\",\"text\":\"...\"}\n            {\"type\":\"input_text\",\"text\":\"...\"}\n            {\"type\":\"image_url\",\"image_url\":{\"url\":\"<data:... or https:...>\"}}\n          Bedrock-native (passed through as-is):\n            {\"text\":\"...\"}\n            {\"image\":{\"format\":\"jpeg|png|gif|webp\",\"source\":{\"bytes\": <raw bytes>}}}\n            {\"document\":{\"format\":\"pdf|csv|doc|docx|xls|xlsx|html|txt|md\",\"name\":\"...\",\"source\":{\"bytes\": <raw bytes>}}}\n\n    Note:\n      - We do not validate or normalize Bedrock-native image/document blocks here.\n        Caller is responsible for providing valid 'format' and raw 'bytes'.\n    \"\"\"\n    # Plain string\n    if isinstance(content, str):\n        return [{\"text\": content}]\n\n    # List of parts\n    if isinstance(content, list):\n        items: list[dict[str, Any]] = []\n        for p in content:\n            # OpenAI-style parts (have \"type\")\n            if isinstance(p, dict) and \"type\" in p:\n                t = p.get(\"type\")\n                if t in (\"text\", \"input_text\"):\n                    txt = p.get(\"text\") or p.get(\"input_text\") or \"\"\n                    items.append({\"text\": txt})\n                    continue\n                if t == \"image_url\":\n                    items.append(_openai_image_part_to_bedrock(p))\n                    continue\n                raise ValueError(f\"Unsupported OpenAI-style part type for Bedrock: {t}\")\n\n            # Bedrock-native pass-throughs (no \"type\")\n            if isinstance(p, dict):\n                # Pass-through pure text\n                if (\n                    \"text\" in p\n                    and isinstance(p[\"text\"], str)\n                    and set(p.keys()) == {\"text\"}\n                ):\n                    items.append(p)\n                    continue\n                # Pass-through Bedrock-native image as-is (assumes correct format and raw bytes)\n                if \"image\" in p and isinstance(p[\"image\"], dict):\n                    items.append(p)\n                    continue\n                # Pass-through Bedrock-native document as-is (assumes correct format and raw bytes)\n                if \"document\" in p and isinstance(p[\"document\"], dict):\n                    items.append(p)\n                    continue\n\n                raise ValueError(f\"Unsupported dict content for Bedrock: {p}\")\n\n            # Plain string elements inside list\n            if isinstance(p, str):\n                items.append({\"text\": p})\n                continue\n\n            raise ValueError(f\"Unsupported content part for Bedrock: {type(p)}\")\n        return items\n\n    raise ValueError(f\"Unsupported message content type for Bedrock: {type(content)}\")\n\n\ndef _prepare_bedrock_converse_kwargs_internal(\n    call_kwargs: dict[str, Any],\n) -> dict[str, Any]:\n    \"\"\"\n    Prepare kwargs for the Bedrock Converse API.\n\n    Kwargs modifications:\n    - Moves: system list to messages as a system role\n    - Renames: \"model\" -> \"modelId\"\n    - Collects: temperature, max_tokens, top_p, stop into inferenceConfig\n    - Converts: messages content to Bedrock format\n    \"\"\"\n    # Handle Bedrock-native system parameter format: system=[{'text': '...'}]\n    # Convert to OpenAI format by adding to messages as system role\n    if \"system\" in call_kwargs and isinstance(call_kwargs[\"system\"], list):\n        system_content = call_kwargs.pop(\"system\")\n        if (\n            system_content\n            and isinstance(system_content[0], dict)\n            and \"text\" in system_content[0]\n        ):\n            # Convert system=[{'text': '...'}] to OpenAI format\n            system_text = system_content[0][\"text\"]\n            if \"messages\" not in call_kwargs:\n                call_kwargs[\"messages\"] = []\n            # Insert system message at beginning\n            call_kwargs[\"messages\"].insert(\n                0, {\"role\": \"system\", \"content\": system_text}\n            )\n\n    # Bedrock expects 'modelId' over 'model'\n    if \"model\" in call_kwargs and \"modelId\" not in call_kwargs:\n        call_kwargs[\"modelId\"] = call_kwargs.pop(\"model\")\n\n    # Prepare inferenceConfig for parameters like temperature, maxTokens, etc.\n    inference_config_params = {}\n\n    # Temperature\n    if \"temperature\" in call_kwargs:\n        inference_config_params[\"temperature\"] = call_kwargs.pop(\"temperature\")\n\n    # Max Tokens (OpenAI uses max_tokens)\n    if \"max_tokens\" in call_kwargs:\n        inference_config_params[\"maxTokens\"] = call_kwargs.pop(\"max_tokens\")\n    elif \"maxTokens\" in call_kwargs:  # If Bedrock-style maxTokens is already top-level\n        inference_config_params[\"maxTokens\"] = call_kwargs.pop(\"maxTokens\")\n\n    # Top P (OpenAI uses top_p)\n    if \"top_p\" in call_kwargs:\n        inference_config_params[\"topP\"] = call_kwargs.pop(\"top_p\")\n    elif \"topP\" in call_kwargs:  # If Bedrock-style topP is already top-level\n        inference_config_params[\"topP\"] = call_kwargs.pop(\"topP\")\n\n    # Stop Sequences (OpenAI uses 'stop')\n    # Bedrock 'Converse' API expects 'stopSequences'\n    if \"stop\" in call_kwargs:\n        stop_val = call_kwargs.pop(\"stop\")\n        if isinstance(stop_val, str):\n            inference_config_params[\"stopSequences\"] = [stop_val]\n        elif isinstance(stop_val, list):\n            inference_config_params[\"stopSequences\"] = stop_val\n    elif \"stop_sequences\" in call_kwargs:\n        inference_config_params[\"stopSequences\"] = call_kwargs.pop(\"stop_sequences\")\n    elif (\n        \"stopSequences\" in call_kwargs\n    ):  # If Bedrock-style stopSequences is already top-level\n        inference_config_params[\"stopSequences\"] = call_kwargs.pop(\"stopSequences\")\n\n    # If any inference parameters were collected, add them to inferenceConfig\n    # Merge with existing inferenceConfig if user provided one.\n    # User-provided inferenceConfig keys take precedence over top-level params if conflicts.\n    if inference_config_params:\n        if \"inferenceConfig\" in call_kwargs:\n            # Merge, giving precedence to what's already in call_kwargs[\"inferenceConfig\"]\n            # This could be more sophisticated, but for now, if inferenceConfig is set, assume it's intentional.\n            existing_inference_config = call_kwargs[\"inferenceConfig\"]\n            for key, value in inference_config_params.items():\n                if key not in existing_inference_config:\n                    existing_inference_config[key] = value\n        else:\n            call_kwargs[\"inferenceConfig\"] = inference_config_params\n\n    # Process messages for Bedrock: separate system prompts and format text content.\n    if \"messages\" in call_kwargs and isinstance(call_kwargs[\"messages\"], list):\n        original_input_messages = call_kwargs.pop(\"messages\")\n\n        bedrock_system_list: list[dict[str, Any]] = []\n        bedrock_user_assistant_messages_list: list[dict[str, Any]] = []\n\n        for msg_dict in original_input_messages:\n            if not isinstance(msg_dict, dict):\n                # If an item in the messages list is not a dictionary,\n                # pass it through to the user/assistant messages list as is.\n                # This allows non-standard message items to be handled by subsequent Boto3 validation\n                # or if they represent something other than standard role/content messages.\n                bedrock_user_assistant_messages_list.append(msg_dict)\n                continue\n\n            # Make a copy to avoid modifying the original dict if it's part of a larger structure\n            # or if the original list/dicts are expected to remain unchanged by the caller.\n            current_message_for_api = msg_dict.copy()\n            role = current_message_for_api.get(\"role\")\n            content = current_message_for_api.get(\n                \"content\"\n            )  # content can be None or other types\n\n            if role == \"system\":\n                if isinstance(content, str):\n                    bedrock_system_list.append({\"text\": content})\n                else:  # System message content is not a string (could be None, list, int, etc.)\n                    raise ValueError(\n                        \"System message content must be a string for Bedrock processing by this handler. \"\n                        f\"Found type: {type(content)}.\"\n                    )\n            else:  # For user, assistant, or other roles that go into Bedrock's 'messages' list\n                if \"content\" in current_message_for_api:\n                    # Sort out the content from the messages\n                    current_message_for_api[\"content\"] = _to_bedrock_content_items(\n                        content\n                    )\n                bedrock_user_assistant_messages_list.append(current_message_for_api)\n\n        if bedrock_system_list:\n            call_kwargs[\"system\"] = bedrock_system_list\n\n        # Always re-assign the 'messages' key with the processed list.\n        # If original_input_messages was empty or only contained system messages that were extracted,\n        # bedrock_user_assistant_messages_list will be empty, correctly resulting in `messages: []`.\n        call_kwargs[\"messages\"] = bedrock_user_assistant_messages_list\n    return call_kwargs\n\n\ndef handle_bedrock_json(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Bedrock JSON mode.\n\n    Kwargs modifications:\n    - Adds: \"response_format\" with json_schema\n    - Adds/Modifies: \"system\" (prepends JSON instructions)\n    - Applies: _prepare_bedrock_converse_kwargs_internal transformations\n    \"\"\"\n    new_kwargs = _prepare_bedrock_converse_kwargs_internal(new_kwargs)\n    json_message = dedent(\n        f\"\"\"\n        As a genius expert, your task is to understand the content and provide\n        the parsed objects in json that match the following json_schema:\\n\n\n        {json.dumps(response_model.model_json_schema(), indent=2, ensure_ascii=False)}\n\n        Make sure to return an instance of the JSON, not the schema itself\n        and don't include any other text in the response apart from the json\n        \"\"\"\n    )\n    system_message = new_kwargs.pop(\"system\", None)\n    if not system_message:\n        new_kwargs[\"system\"] = [{\"text\": json_message}]\n    else:\n        if not isinstance(system_message, list):\n            raise ValueError(\n                \"\"\"system must be a list of SystemMessage, refer to:\n                https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html\n                \"\"\"\n            )\n        system_message.append({\"text\": json_message})\n        new_kwargs[\"system\"] = system_message\n\n    return response_model, new_kwargs\n\n\ndef handle_bedrock_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Bedrock tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: Only applies _prepare_bedrock_converse_kwargs_internal transformations\n    - When response_model is provided:\n      - Adds: \"toolConfig\" with tools list and toolChoice configuration\n      - Applies: _prepare_bedrock_converse_kwargs_internal transformations\n    \"\"\"\n    new_kwargs = _prepare_bedrock_converse_kwargs_internal(new_kwargs)\n\n    if response_model is None:\n        return None, new_kwargs\n\n    # Generate Bedrock tool schema\n    tool_schema = generate_bedrock_schema(response_model)\n\n    # Set up tools configuration for Bedrock Converse API\n    new_kwargs[\"toolConfig\"] = {\n        \"tools\": [tool_schema],\n        \"toolChoice\": {\"tool\": {\"name\": response_model.__name__}},\n    }\n\n    return response_model, new_kwargs\n\n\n# Handler registry for Bedrock\nBEDROCK_HANDLERS = {\n    Mode.BEDROCK_JSON: {\n        \"reask\": reask_bedrock_json,\n        \"response\": handle_bedrock_json,\n    },\n    Mode.BEDROCK_TOOLS: {\n        \"reask\": reask_bedrock_tools,\n        \"response\": handle_bedrock_tools,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/cerebras/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/cerebras/client.py",
    "content": "from __future__ import annotations  # type: ignore\n\nfrom typing import Any, overload\n\nimport instructor\nfrom ...core.client import AsyncInstructor, Instructor\n\n\nfrom cerebras.cloud.sdk import Cerebras, AsyncCerebras\n\n\n@overload\ndef from_cerebras(\n    client: Cerebras,\n    mode: instructor.Mode = instructor.Mode.CEREBRAS_TOOLS,\n    **kwargs: Any,\n) -> Instructor: ...\n\n\n@overload\ndef from_cerebras(\n    client: AsyncCerebras,\n    mode: instructor.Mode = instructor.Mode.CEREBRAS_TOOLS,\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\ndef from_cerebras(\n    client: Cerebras | AsyncCerebras,\n    mode: instructor.Mode = instructor.Mode.CEREBRAS_TOOLS,\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    valid_modes = {\n        instructor.Mode.CEREBRAS_TOOLS,\n        instructor.Mode.CEREBRAS_JSON,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Cerebras\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, (Cerebras, AsyncCerebras)):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of Cerebras or AsyncCerebras. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if isinstance(client, AsyncCerebras):\n        create = client.chat.completions.create\n        return AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=create, mode=mode),\n            provider=instructor.Provider.CEREBRAS,\n            mode=mode,\n            **kwargs,\n        )\n\n    create = client.chat.completions.create\n    return Instructor(\n        client=client,\n        create=instructor.patch(create=create, mode=mode),\n        provider=instructor.Provider.CEREBRAS,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/cerebras/utils.py",
    "content": "\"\"\"Cerebras-specific utilities.\n\nThis module contains utilities specific to the Cerebras provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom ...mode import Mode\nfrom ...utils.core import dump_message\nfrom ...processing.schema import generate_openai_schema\n\n\ndef reask_cerebras_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Cerebras tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (tool response messages indicating validation errors)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    for tool_call in response.choices[0].message.tool_calls:\n        reask_msgs.append(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\nRecall the function correctly, \"\n                    f\"fix the errors and call the tool {tool_call.function.name} again, \"\n                    f\"taking into account the problems with {tool_call.function.arguments} that was previously generated.\"\n                ),\n            }\n        )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef handle_cerebras_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Cerebras tools mode.\n\n    Kwargs modifications:\n    - Adds: \"tools\" (list with function schema)\n    - Adds: \"tool_choice\" (forced function call)\n    - Validates: stream=False\n    \"\"\"\n    if new_kwargs.get(\"stream\", False):\n        raise ValueError(\"Stream is not supported for Cerebras Tool Calling\")\n    new_kwargs[\"tools\"] = [\n        {\n            \"type\": \"function\",\n            \"function\": generate_openai_schema(response_model),\n        }\n    ]\n    new_kwargs[\"tool_choice\"] = {\n        \"type\": \"function\",\n        \"function\": {\"name\": generate_openai_schema(response_model)[\"name\"]},\n    }\n    return response_model, new_kwargs\n\n\ndef handle_cerebras_json(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Cerebras JSON mode.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (system instruction with JSON schema)\n    \"\"\"\n    instruction = f\"\"\"\nYou are a helpful assistant that excels at following instructions.Your task is to understand the content and provide the parsed objects in json that match the following json_schema:\\n\n\nHere is the relevant JSON schema to adhere to\n\n<schema>\n{response_model.model_json_schema()}\n</schema>\n\nYour response should consist only of a valid JSON object that `{response_model.__name__}.model_validate_json()` can successfully parse.\n\"\"\"\n\n    new_kwargs[\"messages\"] = [{\"role\": \"system\", \"content\": instruction}] + new_kwargs[\n        \"messages\"\n    ]\n    return response_model, new_kwargs\n\n\n# Handler registry for Cerebras\nCEREBRAS_HANDLERS = {\n    Mode.CEREBRAS_TOOLS: {\n        \"reask\": reask_cerebras_tools,\n        \"response\": handle_cerebras_tools,\n    },\n    Mode.CEREBRAS_JSON: {\n        \"reask\": reask_cerebras_tools,  # Uses same reask as tools\n        \"response\": handle_cerebras_json,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/cohere/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/cohere/client.py",
    "content": "from __future__ import annotations\n\nimport inspect\nfrom collections.abc import Awaitable\nfrom typing import Any, TypeVar, cast, overload\n\nimport cohere\nimport instructor\nfrom pydantic import BaseModel\nfrom typing_extensions import ParamSpec\n\n\nT_Model = TypeVar(\"T_Model\", bound=BaseModel)\nT_ParamSpec = ParamSpec(\"T_ParamSpec\")\n\n\n@overload\ndef from_cohere(\n    client: cohere.Client,\n    mode: instructor.Mode = instructor.Mode.COHERE_TOOLS,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_cohere(\n    client: cohere.ClientV2,\n    mode: instructor.Mode = instructor.Mode.COHERE_TOOLS,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_cohere(\n    client: cohere.AsyncClient,\n    mode: instructor.Mode = instructor.Mode.COHERE_JSON_SCHEMA,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\n@overload\ndef from_cohere(\n    client: cohere.AsyncClientV2,\n    mode: instructor.Mode = instructor.Mode.COHERE_JSON_SCHEMA,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\ndef from_cohere(\n    client: cohere.Client | cohere.AsyncClient | cohere.ClientV2 | cohere.AsyncClientV2,\n    mode: instructor.Mode = instructor.Mode.COHERE_TOOLS,\n    **kwargs: Any,\n):\n    valid_modes = {\n        instructor.Mode.COHERE_TOOLS,\n        instructor.Mode.COHERE_JSON_SCHEMA,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode), provider=\"Cohere\", valid_modes=[str(m) for m in valid_modes]\n        )\n\n    # Determine if we're dealing with an async client\n    is_async = isinstance(client, (cohere.AsyncClient, cohere.AsyncClientV2))\n\n    if isinstance(client, (cohere.ClientV2, cohere.AsyncClientV2)):\n        client_version = \"v2\"\n    elif isinstance(client, (cohere.Client, cohere.AsyncClient)):\n        client_version = \"v1\"\n    else:\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of cohere.Client or cohere.AsyncClient or cohere.ClientV2 or cohere.AsyncClientV2. \"\n            f\"Got: {type(client).__name__}\"\n        )\n    kwargs[\"_cohere_client_version\"] = client_version\n\n    if is_async:\n\n        async def async_wrapper(*args: Any, **call_kwargs: Any):\n            if call_kwargs.pop(\"stream\", False):\n                return client.chat_stream(*args, **call_kwargs)\n            result = client.chat(*args, **call_kwargs)\n            if inspect.isawaitable(result):\n                return await cast(Awaitable[Any], result)\n            return result\n\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.COHERE,\n            mode=mode,\n            **kwargs,\n        )\n    else:\n\n        def sync_wrapper(*args: Any, **call_kwargs: Any):\n            if call_kwargs.pop(\"stream\", False):\n                return client.chat_stream(*args, **call_kwargs)\n            return client.chat(*args, **call_kwargs)\n\n        return instructor.Instructor(\n            client=client,\n            create=instructor.patch(create=sync_wrapper, mode=mode),\n            provider=instructor.Provider.COHERE,\n            mode=mode,\n            **kwargs,\n        )\n"
  },
  {
    "path": "instructor/providers/cohere/utils.py",
    "content": "\"\"\"Cohere-specific utilities.\n\nThis module contains utilities specific to the Cohere provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom ...mode import Mode\n\n\ndef reask_cohere_tools(\n    kwargs: dict[str, Any],\n    response: Any,  # Replace with actual response type for Cohere\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Cohere tools and JSON schema modes.\n    Supports both V1 and V2 formats.\n\n    V1 kwargs modifications:\n    - Adds/Modifies: \"chat_history\" (appends prior message)\n    - Modifies: \"message\" (user prompt describing validation errors)\n\n    V2 kwargs modifications:\n    - Modifies: \"messages\" (appends error correction message)\n    \"\"\"\n    # Default to marker stored on kwargs (set during client initialization)\n    client_version = kwargs.get(\"_cohere_client_version\")\n\n    # Detect V1 vs V2 response structure and extract text\n    if hasattr(response, \"text\"):\n        client_version = \"v1\"\n        response_text = response.text\n    elif hasattr(response, \"message\") and hasattr(response.message, \"content\"):\n        client_version = \"v2\"\n        content_items = response.message.content\n        response_text = \"\"\n        if content_items:\n            # Find the text content item (skip thinking/other types)\n            for item in content_items:\n                if (\n                    hasattr(item, \"type\")\n                    and item.type == \"text\"\n                    and hasattr(item, \"text\")\n                ):\n                    response_text = item.text\n                    break\n        if not response_text:\n            response_text = str(response)\n    else:\n        # Fallback to string representation\n        response_text = str(response)\n        if client_version is None:\n            if \"messages\" in kwargs:\n                client_version = \"v2\"\n            elif \"chat_history\" in kwargs or \"message\" in kwargs:\n                client_version = \"v1\"\n\n    # Create the correction message\n    correction_msg = (\n        \"Correct the following JSON response, based on the errors given below:\\n\\n\"\n        f\"JSON:\\n{response_text}\\n\\nExceptions:\\n{exception}\"\n    )\n\n    if client_version == \"v2\":\n        # V2 format: append to messages list\n        kwargs[\"messages\"].append({\"role\": \"user\", \"content\": correction_msg})\n    elif client_version == \"v1\":\n        # V1 format: use chat_history and message\n        message = kwargs.get(\"message\", \"\")\n\n        # Fetch or initialize chat_history in one operation\n        if \"chat_history\" in kwargs:\n            kwargs[\"chat_history\"].append({\"role\": \"user\", \"message\": message})\n        else:\n            kwargs[\"chat_history\"] = [{\"role\": \"user\", \"message\": message}]\n\n        kwargs[\"message\"] = correction_msg\n    else:\n        # Unknown version - raise error for future compatibility\n        raise ValueError(\n            f\"Unsupported Cohere client version: {client_version}. \"\n            f\"Expected 'v1' or 'v2'.\"\n        )\n\n    return kwargs\n\n\ndef handle_cohere_modes(new_kwargs: dict[str, Any]) -> tuple[None, dict[str, Any]]:\n    \"\"\"\n    Convert OpenAI-style messages to Cohere format.\n    Handles both V1 and V2 client formats.\n\n    V1 format:\n    - Removes: \"messages\"\n    - Adds: \"message\" (last user message)\n    - Adds: \"chat_history\" (prior messages)\n\n    V2 format:\n    - Keeps: \"messages\" (compatible with OpenAI format)\n\n    Both versions:\n    - Renames: \"model_name\" -> \"model\"\n    - Removes: \"strict\"\n    - Removes: \"_cohere_client_version\" (internal marker)\n    \"\"\"\n    new_kwargs = new_kwargs.copy()\n    client_version = new_kwargs.pop(\"_cohere_client_version\")\n\n    if client_version == \"v2\":\n        # V2 uses OpenAI-style messages directly - no conversion needed\n        # Just clean up incompatible fields\n        if \"model_name\" in new_kwargs and \"model\" not in new_kwargs:\n            new_kwargs[\"model\"] = new_kwargs.pop(\"model_name\")\n        new_kwargs.pop(\"strict\", None)\n    elif client_version == \"v1\":\n        # V1 needs conversion from OpenAI format to Cohere V1 format\n        messages = new_kwargs.pop(\"messages\", [])\n        chat_history = []\n        for message in messages[:-1]:\n            chat_history.append(  # type: ignore[arg-type]\n                {\n                    \"role\": message[\"role\"],\n                    \"message\": message[\"content\"],\n                }\n            )\n        new_kwargs[\"message\"] = messages[-1][\"content\"]\n        new_kwargs[\"chat_history\"] = chat_history\n        if \"model_name\" in new_kwargs and \"model\" not in new_kwargs:\n            new_kwargs[\"model\"] = new_kwargs.pop(\"model_name\")\n        new_kwargs.pop(\"strict\", None)\n    else:\n        # Unknown version - raise error for future compatibility\n        raise ValueError(\n            f\"Unsupported Cohere client version: {client_version}. \"\n            f\"Expected 'v1' or 'v2'.\"\n        )\n\n    return None, new_kwargs\n\n\ndef handle_cohere_json_schema(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Cohere JSON schema mode.\n\n    When response_model is None:\n        - Converts messages from OpenAI format to Cohere format (message + chat_history)\n        - No schema is added to the request\n\n    When response_model is provided:\n        - Converts messages from OpenAI format to Cohere format\n        - Adds the model's JSON schema to response_format\n\n    Kwargs modifications:\n    - Removes: \"messages\" (converted to message + chat_history)\n    - Adds: \"message\" (last message content)\n    - Adds: \"chat_history\" (all messages except last)\n    - Modifies: \"model\" (if \"model_name\" exists, renames to \"model\")\n    - Removes: \"strict\"\n    - Adds: \"response_format\" (with JSON schema) - only when response_model provided\n    \"\"\"\n    if response_model is None:\n        # Just handle message conversion\n        return handle_cohere_modes(new_kwargs)\n\n    new_kwargs[\"response_format\"] = {\n        \"type\": \"json_object\",\n        \"schema\": response_model.model_json_schema(),\n    }\n    _, new_kwargs = handle_cohere_modes(new_kwargs)\n\n    return response_model, new_kwargs\n\n\ndef handle_cohere_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Cohere tools mode.\n\n    When response_model is None:\n        - Converts messages from OpenAI format to Cohere format (message + chat_history for V1, messages for V2)\n        - No tools or schema instructions are added\n        - Allows for unstructured responses from Cohere\n\n    When response_model is provided:\n        - Converts messages from OpenAI format to Cohere format\n        - Prepends extraction instructions to the chat history (V1) or messages (V2)\n        - Includes the model's JSON schema in the instructions\n        - The model is instructed to extract a valid object matching the schema\n\n    Kwargs modifications:\n    - All modifications from handle_cohere_modes (message format conversion)\n    - Modifies: \"chat_history\" (V1) or \"messages\" (V2) to prepend extraction instruction - only when response_model provided\n    \"\"\"\n    if response_model is None:\n        # Just handle message conversion\n        return handle_cohere_modes(new_kwargs)\n\n    _, new_kwargs = handle_cohere_modes(new_kwargs)\n\n    instruction = f\"\"\"\\\nExtract a valid {response_model.__name__} object based on the chat history and the json schema below.\n{response_model.model_json_schema()}\nThe JSON schema was obtained by running:\n```python\nschema = {response_model.__name__}.model_json_schema()\n```\n\nThe output must be a valid JSON object that `{response_model.__name__}.model_validate_json()` can successfully parse.\nRespond with JSON only. Do not include code fences, markdown, or extra text.\n\"\"\"\n    # Check client version explicitly (marker already removed by handle_cohere_modes)\n    # Use presence of messages vs chat_history as indicator since marker is already consumed\n    if \"messages\" in new_kwargs:\n        # V2 format: prepend to messages\n        new_kwargs[\"messages\"].insert(0, {\"role\": \"user\", \"content\": instruction})\n    else:\n        # V1 format: prepend to chat_history\n        new_kwargs[\"chat_history\"] = [\n            {\"role\": \"user\", \"message\": instruction}\n        ] + new_kwargs[\"chat_history\"]\n\n    return response_model, new_kwargs\n\n\n# Handler registry for Cohere\nCOHERE_HANDLERS = {\n    Mode.COHERE_TOOLS: {\n        \"reask\": reask_cohere_tools,\n        \"response\": handle_cohere_tools,\n    },\n    Mode.COHERE_JSON_SCHEMA: {\n        \"reask\": reask_cohere_tools,\n        \"response\": handle_cohere_json_schema,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/fireworks/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/fireworks/client.py",
    "content": "from __future__ import annotations\n\nfrom typing import TYPE_CHECKING, Any, overload\n\nimport instructor\nfrom ...core.client import AsyncInstructor, Instructor\n\nif TYPE_CHECKING:\n    from fireworks.client import AsyncFireworks, Fireworks\nelse:\n    try:\n        from fireworks.client import AsyncFireworks, Fireworks\n    except ImportError:\n        AsyncFireworks = None  # type:ignore\n        Fireworks = None  # type:ignore\n\n\n@overload\ndef from_fireworks(\n    client: Fireworks,\n    mode: instructor.Mode = instructor.Mode.FIREWORKS_JSON,\n    **kwargs: Any,\n) -> Instructor: ...\n\n\n@overload\ndef from_fireworks(\n    client: AsyncFireworks,\n    mode: instructor.Mode = instructor.Mode.FIREWORKS_JSON,\n    **kwargs: Any,\n) -> AsyncInstructor: ...\n\n\ndef from_fireworks(\n    client: Fireworks | AsyncFireworks,  # type: ignore\n    mode: instructor.Mode = instructor.Mode.FIREWORKS_JSON,\n    **kwargs: Any,\n) -> Instructor | AsyncInstructor:\n    valid_modes = {\n        instructor.Mode.FIREWORKS_TOOLS,\n        instructor.Mode.FIREWORKS_JSON,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Fireworks\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, (AsyncFireworks, Fireworks)):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of Fireworks or AsyncFireworks. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if isinstance(client, AsyncFireworks):\n\n        async def async_wrapper(*args: Any, **kwargs: Any):  # type:ignore\n            if \"stream\" in kwargs and kwargs[\"stream\"] is True:\n                return client.chat.completions.acreate(*args, **kwargs)  # type:ignore\n            return await client.chat.completions.acreate(*args, **kwargs)  # type:ignore\n\n        return AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.FIREWORKS,\n            mode=mode,\n            **kwargs,\n        )\n\n    if isinstance(client, Fireworks):\n        return Instructor(\n            client=client,\n            create=instructor.patch(create=client.chat.completions.create, mode=mode),  # type: ignore\n            provider=instructor.Provider.FIREWORKS,\n            mode=mode,\n            **kwargs,\n        )\n\n    # Should never reach here due to earlier validation, but needed for type checker\n    raise AssertionError(\"Client must be AsyncFireworks or Fireworks\")\n"
  },
  {
    "path": "instructor/providers/fireworks/utils.py",
    "content": "\"\"\"Fireworks-specific utilities.\n\nThis module contains utilities specific to the Fireworks provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom ...mode import Mode\nfrom ...processing.schema import generate_openai_schema\nfrom ...utils.core import dump_message\n\n\ndef reask_fireworks_tools(kwargs: dict[str, Any], response: Any, exception: Exception):\n    \"\"\"\n    Handle reask for Fireworks tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (tool response messages indicating validation errors)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    for tool_call in response.choices[0].message.tool_calls:\n        reask_msgs.append(\n            {\n                \"role\": \"tool\",  # type: ignore\n                \"tool_call_id\": tool_call.id,\n                \"name\": tool_call.function.name,\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\"\n                ),\n            }\n        )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_fireworks_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Fireworks JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": f\"Correct your JSON ONLY RESPONSE, based on the following errors:\\n{exception}\",\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef handle_fireworks_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Fireworks tools mode.\n\n    Kwargs modifications:\n    - Adds: \"tools\" (list with function schema)\n    - Adds: \"tool_choice\" (forced function call)\n    - Sets default: stream=False\n    \"\"\"\n    if \"stream\" not in new_kwargs:\n        new_kwargs[\"stream\"] = False\n    new_kwargs[\"tools\"] = [\n        {\n            \"type\": \"function\",\n            \"function\": generate_openai_schema(response_model),\n        }\n    ]\n    new_kwargs[\"tool_choice\"] = {\n        \"type\": \"function\",\n        \"function\": {\"name\": generate_openai_schema(response_model)[\"name\"]},\n    }\n    return response_model, new_kwargs\n\n\ndef handle_fireworks_json(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Fireworks JSON mode.\n\n    Kwargs modifications:\n    - Adds: \"response_format\" with json_schema\n    - Sets default: stream=False\n    \"\"\"\n    if \"stream\" not in new_kwargs:\n        new_kwargs[\"stream\"] = False\n\n    new_kwargs[\"response_format\"] = {\n        \"type\": \"json_object\",\n        \"schema\": response_model.model_json_schema(),\n    }\n    return response_model, new_kwargs\n\n\n# Handler registry for Fireworks\nFIREWORKS_HANDLERS = {\n    Mode.FIREWORKS_TOOLS: {\n        \"reask\": reask_fireworks_tools,\n        \"response\": handle_fireworks_tools,\n    },\n    Mode.FIREWORKS_JSON: {\n        \"reask\": reask_fireworks_json,\n        \"response\": handle_fireworks_json,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/gemini/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/gemini/client.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any, Literal, overload\n\nimport google.generativeai as genai  # type: ignore[import-not-found]\n\nimport instructor\n\n\n@overload\ndef from_gemini(\n    client: genai.GenerativeModel,\n    mode: instructor.Mode = instructor.Mode.GEMINI_JSON,\n    use_async: Literal[True] = True,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\n@overload\ndef from_gemini(\n    client: genai.GenerativeModel,\n    mode: instructor.Mode = instructor.Mode.GEMINI_JSON,\n    use_async: Literal[False] = False,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\ndef from_gemini(\n    client: genai.GenerativeModel,\n    mode: instructor.Mode = instructor.Mode.GEMINI_JSON,\n    use_async: bool = False,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    import warnings\n\n    warnings.warn(\n        \"from_gemini is deprecated and will be removed in a future version. \"\n        \"Please use from_genai or from_provider instead. \"\n        \"Install google-genai with: pip install google-genai\\n\"\n        \"Example migration:\\n\"\n        \"  # Old way\\n\"\n        \"  from instructor import from_gemini\\n\"\n        \"  import google.generativeai as genai\\n\"\n        \"  client = from_gemini(genai.GenerativeModel('gemini-3-flash'))\\n\\n\"\n        \"  # New way\\n\"\n        \"  from instructor import from_genai\\n\"\n        \"  from google import genai\\n\"\n        \"  client = from_genai(genai.Client())\\n\"\n        \"  # OR use from_provider\\n\"\n        \"  client = instructor.from_provider('google/gemini-3-flash')\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    valid_modes = {\n        instructor.Mode.GEMINI_JSON,\n        instructor.Mode.GEMINI_TOOLS,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode), provider=\"Gemini\", valid_modes=[str(m) for m in valid_modes]\n        )\n\n    if not isinstance(client, genai.GenerativeModel):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of genai.GenerativeModel. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if use_async:\n        create = client.generate_content_async\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=create, mode=mode),\n            provider=instructor.Provider.GEMINI,\n            mode=mode,\n            **kwargs,\n        )\n\n    create = client.generate_content\n    return instructor.Instructor(\n        client=client,\n        create=instructor.patch(create=create, mode=mode),\n        provider=instructor.Provider.GEMINI,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/gemini/utils.py",
    "content": "\"\"\"Google-specific utilities (Gemini, GenAI, VertexAI).\n\nThis module contains utilities specific to Google providers,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport json\nimport re\nfrom textwrap import dedent\nfrom typing import TYPE_CHECKING, Any, Union\n\nfrom openai.types.chat import ChatCompletionMessageParam\nfrom pydantic import BaseModel\n\nfrom ...dsl.partial import Partial, PartialBase\nfrom ...core.exceptions import ConfigurationError\nfrom ...mode import Mode\nfrom ...processing.multimodal import Audio, Image, PDF\nfrom ...utils.core import get_message_content\n\nif TYPE_CHECKING:\n    from google.genai import types\n\n\ndef _get_model_schema(response_model: Any) -> dict[str, Any]:\n    \"\"\"\n    Safely get the JSON schema from a response model.\n\n    Handles both regular models and Partial-wrapped models by using hasattr\n    to check for the model_json_schema method.\n\n    Args:\n        response_model: The response model (may be regular or Partial-wrapped)\n\n    Returns:\n        The JSON schema dictionary\n    \"\"\"\n    if hasattr(response_model, \"model_json_schema\") and callable(\n        response_model.model_json_schema\n    ):\n        return response_model.model_json_schema()\n    # Fallback for wrapped types\n    return getattr(response_model, \"model_json_schema\", {})  # type: ignore[return-value]\n\n\ndef _get_model_name(response_model: Any) -> str:\n    \"\"\"\n    Safely get the name of a response model.\n\n    Handles both regular models and Partial-wrapped models by using getattr\n    with a fallback to 'Model'.\n\n    Args:\n        response_model: The response model (may be regular or Partial-wrapped)\n\n    Returns:\n        The model name\n    \"\"\"\n    return getattr(response_model, \"__name__\", \"Model\")\n\n\ndef transform_to_gemini_prompt(\n    messages_chatgpt: list[ChatCompletionMessageParam],\n) -> list[dict[str, Any]]:\n    \"\"\"\n    Transform messages from OpenAI format to Gemini format.\n\n    This optimized version reduces redundant processing and improves\n    handling of system messages.\n\n    Args:\n        messages_chatgpt: Messages in OpenAI format\n\n    Returns:\n        Messages in Gemini format\n    \"\"\"\n    # Fast path for empty messages\n    if not messages_chatgpt:\n        return []\n\n    # Process system messages first (collect all system messages)\n    system_prompts = []\n    for message in messages_chatgpt:\n        if message.get(\"role\") == \"system\":\n            content = message.get(\"content\", \"\")\n            if content:  # Only add non-empty system prompts\n                system_prompts.append(content)\n\n    # Format system prompt if we have any\n    system_prompt = \"\"\n    if system_prompts:\n        # Handle multiple system prompts by joining them\n        system_prompt = \"\\n\\n\".join(filter(None, system_prompts))\n\n    # Count non-system messages to pre-allocate result list\n    message_count = sum(1 for m in messages_chatgpt if m.get(\"role\") != \"system\")\n    messages_gemini = []\n\n    # Role mapping for faster lookups\n    role_map = {\n        \"user\": \"user\",\n        \"assistant\": \"model\",\n    }\n\n    # Process non-system messages in one pass\n    for message in messages_chatgpt:\n        role = message.get(\"role\", \"\")\n        if role in role_map:\n            gemini_role = role_map[role]\n            messages_gemini.append(\n                {\"role\": gemini_role, \"parts\": get_message_content(message)}\n            )\n\n    # Add system prompt if we have one\n    if system_prompt:\n        if messages_gemini:\n            # Add to the first message (most likely user message)\n            first_message = messages_gemini[0]\n            # Only insert if parts is a list\n            if isinstance(first_message.get(\"parts\"), list):\n                first_message[\"parts\"].insert(0, f\"*{system_prompt}*\")\n        else:\n            # Create a new user message just for the system prompt\n            messages_gemini.append({\"role\": \"user\", \"parts\": [f\"*{system_prompt}*\"]})\n\n    return messages_gemini\n\n\ndef verify_no_unions(obj: dict[str, Any]) -> bool:  # noqa: ARG001\n    \"\"\"\n    Verify that the object does not contain any Union types (except Optional and Decimal).\n    Optional[T] is allowed as it becomes Union[T, None].\n    Decimal types are allowed as Union[str, float] or Union[float, str].\n\n    Note: As of December 2024, Google GenAI now supports Union types\n    (see https://github.com/googleapis/python-genai/issues/447).\n    This function is kept for backward compatibility but now returns True\n    for all schemas. The validation is no longer necessary.\n\n    Args:\n        obj: The schema object to verify (kept for backward compatibility).\n\n    Returns:\n        Always returns True since Union types are now supported.\n    \"\"\"\n    # Google GenAI now supports Union types, so we no longer need to validate.\n    # See: https://github.com/instructor-ai/instructor/issues/1964\n    return True\n\n\ndef map_to_gemini_function_schema(obj: dict[str, Any]) -> dict[str, Any]:\n    \"\"\"\n    Map OpenAPI schema to Gemini function call schema.\n\n    Transforms a standard JSON schema to Gemini's expected format:\n    - Adds 'format': 'enum' for enum fields\n    - Converts Optional[T] (anyOf with null) to nullable fields\n    - Preserves Union types (anyOf) as they are now supported by GenAI SDK\n\n    Ref: https://ai.google.dev/api/python/google/generativeai/protos/Schema\n    \"\"\"\n    import jsonref\n\n    class FunctionSchema(BaseModel):\n        description: str | None = None\n        enum: list[str] | None = None\n        example: Any | None = None\n        format: str | None = None\n        nullable: bool | None = None\n        items: FunctionSchema | None = None\n        required: list[str] | None = None\n        type: str | None = None\n        anyOf: list[dict[str, Any]] | None = None\n        properties: dict[str, FunctionSchema] | None = None\n\n    # Resolve any $ref references in the schema\n    schema: dict[str, Any] = jsonref.replace_refs(obj, lazy_load=False)  # type: ignore\n    schema.pop(\"$defs\", None)\n\n    def transform_schema_node(node: Any) -> Any:\n        \"\"\"Transform a single schema node recursively.\"\"\"\n        if isinstance(node, list):\n            return [transform_schema_node(item) for item in node]\n\n        if not isinstance(node, dict):\n            return node\n\n        transformed = {}\n\n        for key, value in node.items():\n            if key == \"enum\":\n                # Gemini requires 'format': 'enum' for enum fields\n                transformed[key] = value\n                transformed[\"format\"] = \"enum\"\n            elif key == \"anyOf\" and isinstance(value, list) and len(value) == 2:\n                # Handle Optional[T] which becomes Union[T, None] in JSON schema\n                non_null_items = [\n                    item\n                    for item in value\n                    if not (isinstance(item, dict) and item.get(\"type\") == \"null\")\n                ]\n\n                if len(non_null_items) == 1:\n                    # This is Optional[T] - merge the actual type and mark as nullable\n                    actual_type = transform_schema_node(non_null_items[0])\n                    transformed.update(actual_type)\n                    transformed[\"nullable\"] = True\n                else:\n                    # Check if this is a Decimal type (string | number)\n                    types_in_union = []\n                    for item in value:\n                        if isinstance(item, dict) and \"type\" in item:\n                            types_in_union.append(item[\"type\"])\n\n                    if set(types_in_union) == {\"string\", \"number\"}:\n                        # This is a Decimal type - keep the anyOf structure\n                        transformed[key] = transform_schema_node(value)\n                    else:\n                        # This is a true Union type - keep as is and let validation catch it\n                        transformed[key] = transform_schema_node(value)\n            else:\n                transformed[key] = transform_schema_node(value)\n\n        return transformed\n\n    schema = transform_schema_node(schema)\n\n    # Validate that no unsupported Union types remain\n    if not verify_no_unions(schema):\n        raise ValueError(\n            \"Gemini does not support Union types (except Optional). Please change your function schema\"\n        )\n\n    return FunctionSchema(**schema).model_dump(exclude_none=True, exclude_unset=True)\n\n\nif TYPE_CHECKING:\n    from google.genai import types as genai_types\n\n\ndef map_to_genai_schema(obj: dict[str, Any]) -> genai_types.Schema:\n    from google.genai import types\n\n    schema = map_to_gemini_function_schema(obj)\n\n    def normalize(node: Any) -> Any:\n        if isinstance(node, list):\n            return [normalize(item) for item in node]\n\n        if not isinstance(node, dict):\n            return node\n\n        key_map = {\n            \"anyOf\": \"any_of\",\n            \"$ref\": \"ref\",\n            \"$defs\": \"defs\",\n            \"maxItems\": \"max_items\",\n            \"minItems\": \"min_items\",\n            \"maxLength\": \"max_length\",\n            \"minLength\": \"min_length\",\n            \"maxProperties\": \"max_properties\",\n            \"minProperties\": \"min_properties\",\n        }\n\n        normalized: dict[str, Any] = {}\n        for key, value in node.items():\n            normalized[key_map.get(key, key)] = normalize(value)\n        return normalized\n\n    return types.Schema.model_validate(normalize(schema))\n\n\ndef update_genai_kwargs(\n    kwargs: dict[str, Any], base_config: dict[str, Any]\n) -> dict[str, Any]:\n    \"\"\"\n    Update keyword arguments for google.genai package from OpenAI format.\n\n    Handles merging of user-provided config with instructor's base config,\n    including special handling for thinking_config and other config fields.\n    \"\"\"\n    from google.genai.types import HarmBlockThreshold, HarmCategory\n\n    new_kwargs = kwargs.copy()\n\n    OPENAI_TO_GEMINI_MAP = {\n        \"max_tokens\": \"max_output_tokens\",\n        \"temperature\": \"temperature\",\n        \"n\": \"candidate_count\",\n        \"top_p\": \"top_p\",\n        \"stop\": \"stop_sequences\",\n        \"seed\": \"seed\",\n        \"presence_penalty\": \"presence_penalty\",\n        \"frequency_penalty\": \"frequency_penalty\",\n    }\n\n    generation_config = new_kwargs.pop(\"generation_config\", {})\n\n    for openai_key, gemini_key in OPENAI_TO_GEMINI_MAP.items():\n        if openai_key in generation_config:\n            val = generation_config.pop(openai_key)\n            if val is not None:  # Only set if value is not None\n                base_config[gemini_key] = val\n\n    def _genai_kwargs_has_image_content(genai_kwargs: dict[str, Any]) -> bool:\n        \"\"\"\n        Best-effort check for image content in a GenAI request.\n\n        We use this to decide whether to send text vs image harm categories in\n        `safety_settings`. The google-genai SDK has separate image categories\n        (e.g., `HARM_CATEGORY_IMAGE_HATE`) which are required for image content.\n        \"\"\"\n        # Prefer typed GenAI contents if present (works with autodetect_images)\n        contents = genai_kwargs.get(\"contents\")\n        if isinstance(contents, list):\n            for content in contents:\n                parts = getattr(content, \"parts\", None)\n                if not parts:\n                    continue\n                for part in parts:\n                    inline_data = getattr(part, \"inline_data\", None)\n                    if inline_data is not None:\n                        mime_type = getattr(inline_data, \"mime_type\", None)\n                        if isinstance(mime_type, str) and mime_type.startswith(\n                            \"image/\"\n                        ):\n                            return True\n\n                    file_data = getattr(part, \"file_data\", None)\n                    if file_data is not None:\n                        mime_type = getattr(file_data, \"mime_type\", None)\n                        if isinstance(mime_type, str) and mime_type.startswith(\n                            \"image/\"\n                        ):\n                            return True\n\n        # Fall back to OpenAI-style messages if present\n        messages = genai_kwargs.get(\"messages\")\n        if isinstance(messages, list):\n            for message in messages:\n                if not isinstance(message, dict):\n                    continue\n                content = message.get(\"content\")\n                if isinstance(content, Image):\n                    return True\n                if isinstance(content, list):\n                    for item in content:\n                        if isinstance(item, Image):\n                            return True\n                        if isinstance(item, dict) and item.get(\"type\") in {\n                            \"image\",\n                            \"image_url\",\n                            \"input_image\",\n                        }:\n                            return True\n                if isinstance(content, dict) and content.get(\"type\") in {\n                    \"image\",\n                    \"image_url\",\n                    \"input_image\",\n                }:\n                    return True\n\n        return False\n\n    safety_settings = new_kwargs.pop(\"safety_settings\", {})\n    base_config[\"safety_settings\"] = []\n\n    # If users pass a list of settings, assume it's already in SDK format.\n    # This preserves compatibility with advanced usage.\n    if isinstance(safety_settings, list):\n        base_config[\"safety_settings\"] = safety_settings\n        safety_settings = None\n\n    # Filter out image related harm categories which are not\n    # supported for text based models\n    # Exclude JAILBREAK category as it's only for Vertex AI, not google.genai\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    if safety_settings is not None:\n        # google-genai has separate categories for image content.\n        has_image = _genai_kwargs_has_image_content(new_kwargs)\n        image_categories = [\n            c\n            for c in HarmCategory\n            if c not in excluded_categories\n            and c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n        ]\n        text_categories = [\n            c\n            for c in HarmCategory\n            if c not in excluded_categories\n            and not c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n        ]\n\n        supported_categories = (\n            image_categories if (has_image and image_categories) else text_categories\n        )\n\n        def _map_text_to_image_category_name(image_category_name: str) -> str | None:\n            suffix = image_category_name.removeprefix(\"HARM_CATEGORY_IMAGE_\")\n            # google-genai uses IMAGE_HATE while text uses HATE_SPEECH\n            if suffix == \"HATE\":\n                return \"HARM_CATEGORY_HATE_SPEECH\"\n            return f\"HARM_CATEGORY_{suffix}\"\n\n        for category in supported_categories:\n            threshold = HarmBlockThreshold.OFF\n            if isinstance(safety_settings, dict):\n                if category in safety_settings:\n                    threshold = safety_settings[category]\n                # If we are using image categories, try to honor thresholds passed via text categories.\n                elif has_image and category.name.startswith(\"HARM_CATEGORY_IMAGE_\"):\n                    mapped_name = _map_text_to_image_category_name(category.name)\n                    if mapped_name is not None and hasattr(HarmCategory, mapped_name):\n                        mapped_category = getattr(HarmCategory, mapped_name)\n                        if mapped_category in safety_settings:\n                            threshold = safety_settings[mapped_category]\n\n            base_config[\"safety_settings\"].append(\n                {\n                    \"category\": category,\n                    \"threshold\": threshold,\n                }\n            )\n\n    # Extract thinking_config from user's config if provided (dict or object)\n    # This ensures thinking_config inside config parameter is not ignored.\n    user_config = new_kwargs.get(\"config\")\n    user_thinking_config = None\n    if isinstance(user_config, dict):\n        user_thinking_config = user_config.get(\"thinking_config\")\n    elif user_config is not None and hasattr(user_config, \"thinking_config\"):\n        user_thinking_config = user_config.thinking_config\n\n    # Handle thinking_config parameter - prioritize kwarg over config.thinking_config\n    thinking_config = new_kwargs.pop(\"thinking_config\", None)\n    if thinking_config is None:\n        thinking_config = user_thinking_config\n\n    if thinking_config is not None:\n        base_config[\"thinking_config\"] = thinking_config\n\n    # Extract other relevant fields from user's config (dict or object).\n    # This ensures fields like automatic_function_calling / labels / cached_content\n    # are not ignored when config is passed as a dict.\n    if user_config is not None:\n        config_fields_to_merge = [\n            \"automatic_function_calling\",\n            \"labels\",\n            \"cached_content\",\n        ]\n        for field in config_fields_to_merge:\n            if isinstance(user_config, dict):\n                field_value = user_config.get(field)\n            elif hasattr(user_config, field):\n                field_value = getattr(user_config, field)\n            else:\n                field_value = None\n\n            if field_value is not None and field not in base_config:\n                base_config[field] = field_value\n\n    return base_config\n\n\ndef update_gemini_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:\n    \"\"\"\n    Update keyword arguments for Gemini API from OpenAI format.\n\n    This optimized version reduces redundant operations and uses\n    efficient data transformations.\n\n    Args:\n        kwargs: Dictionary of keyword arguments to update\n\n    Returns:\n        Updated dictionary of keyword arguments\n    \"\"\"\n    # Make a copy of kwargs to avoid modifying the original\n    result = kwargs.copy()\n\n    # Mapping of OpenAI args to Gemini args - defined as constant\n    # for quicker lookup without recreating the dictionary on each call\n    OPENAI_TO_GEMINI_MAP = {\n        \"max_tokens\": \"max_output_tokens\",\n        \"temperature\": \"temperature\",\n        \"n\": \"candidate_count\",\n        \"top_p\": \"top_p\",\n        \"stop\": \"stop_sequences\",\n    }\n\n    # Update generation_config if present\n    if \"generation_config\" in result:\n        gen_config = result[\"generation_config\"]\n\n        # Bulk process the mapping with fewer conditionals\n        for openai_key, gemini_key in OPENAI_TO_GEMINI_MAP.items():\n            if openai_key in gen_config:\n                val = gen_config.pop(openai_key)\n                if val is not None:  # Only set if value is not None\n                    gen_config[gemini_key] = val\n\n    # Transform messages format if messages key exists\n    if \"messages\" in result:\n        # Transform messages and store them under \"contents\" key\n        result[\"contents\"] = transform_to_gemini_prompt(result.pop(\"messages\"))\n\n    # Handle safety settings - import here to avoid circular imports\n    try:\n        from google.genai.types import HarmBlockThreshold, HarmCategory  # type: ignore\n    except ImportError:\n        # Fallback for backward compatibility\n        from google.generativeai.types import (  # type: ignore\n            HarmBlockThreshold,\n            HarmCategory,\n        )\n\n    # Create or get existing safety settings\n    safety_settings = result.get(\"safety_settings\", {})\n    result[\"safety_settings\"] = safety_settings\n\n    # Define default safety thresholds - these are static and can be\n    # defined once rather than recreating the dict on each call\n    DEFAULT_SAFETY_THRESHOLDS = {\n        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_ONLY_HIGH,\n        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,\n        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_ONLY_HIGH,\n    }\n\n    # Update safety settings with defaults if needed (more efficient loop)\n    for category, threshold in DEFAULT_SAFETY_THRESHOLDS.items():\n        current = safety_settings.get(category)\n        # Only update if not set or less restrictive than default\n        # Note: Lower values are more restrictive in HarmBlockThreshold\n        # BLOCK_NONE = 0, BLOCK_LOW_AND_ABOVE = 1, BLOCK_MEDIUM_AND_ABOVE = 2, BLOCK_ONLY_HIGH = 3\n        if current is None or current > threshold:\n            safety_settings[category] = threshold\n\n    return result\n\n\ndef extract_genai_system_message(\n    messages: list[dict[str, Any]],\n) -> str:\n    \"\"\"\n    Extract system messages from a list of messages.\n\n    We expect an explicit system messsage for this provider.\n    \"\"\"\n    system_messages = \"\"\n\n    for message in messages:\n        if isinstance(message, str):\n            continue\n        elif isinstance(message, dict):\n            if message.get(\"role\") == \"system\":\n                if isinstance(message.get(\"content\"), str):\n                    system_messages += message.get(\"content\", \"\") + \"\\n\\n\"\n                elif isinstance(message.get(\"content\"), list):\n                    for item in message.get(\"content\", []):\n                        if isinstance(item, str):\n                            system_messages += item + \"\\n\\n\"\n\n    if system_messages and len(messages) == 1:\n        raise ValueError(\n            \"At least one user message must be included. A system message alone is not sufficient.\"\n        )\n\n    if re.search(r\"{{.*?}}|{%.*?%}\", system_messages):\n        raise ValueError(\n            \"Jinja templating is not supported in system messages with Google GenAI, only user messages.\"\n        )\n\n    return system_messages\n\n\ndef convert_to_genai_messages(\n    messages: list[Union[str, dict[str, Any], list[dict[str, Any]]]],  # noqa: UP007\n) -> list[Any]:\n    \"\"\"\n    Convert a list of messages to a list of dictionaries in the format expected by the Gemini API.\n\n    This optimized version pre-allocates the result list and\n    reduces function call overhead.\n    \"\"\"\n    from google.genai import types\n\n    result: list[Union[types.Content, types.File]] = []  # noqa: UP007\n\n    for message in messages:\n        # We assume this is the user's message and we don't need to convert it\n        if isinstance(message, str):\n            result.append(\n                types.Content(\n                    role=\"user\",\n                    parts=[types.Part.from_text(text=message)],\n                )\n            )\n        elif isinstance(message, types.Content):\n            result.append(message)\n        elif isinstance(message, types.File):\n            result.append(message)\n        elif isinstance(message, dict):\n            assert \"role\" in message\n            assert \"content\" in message\n\n            if message[\"role\"] == \"system\":\n                continue\n\n            if message[\"role\"] not in {\"user\", \"model\"}:\n                raise ValueError(f\"Unsupported role: {message['role']}\")\n\n            if isinstance(message[\"content\"], str):\n                result.append(\n                    types.Content(\n                        role=message[\"role\"],\n                        parts=[types.Part.from_text(text=message[\"content\"])],\n                    )\n                )\n\n            elif isinstance(message[\"content\"], list):\n                content_parts = []\n\n                for content_item in message[\"content\"]:\n                    if isinstance(content_item, str):\n                        content_parts.append(types.Part.from_text(text=content_item))\n                    elif isinstance(content_item, (Image, Audio, PDF)):\n                        content_parts.append(content_item.to_genai())\n                    else:\n                        raise ValueError(\n                            f\"Unsupported content item type: {type(content_item)}\"\n                        )\n\n                result.append(\n                    types.Content(\n                        role=message[\"role\"],\n                        parts=content_parts,\n                    )\n                )\n        else:\n            raise ValueError(f\"Unsupported message type: {type(message)}\")\n\n    return result\n\n\n# Reask functions\ndef reask_gemini_tools(\n    kwargs: dict[str, Any],\n    response: Any,  # Replace with actual response type for Gemini\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Gemini tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"contents\" (tool response messages indicating validation errors)\n    \"\"\"\n    from google.ai import generativelanguage as glm  # type: ignore\n\n    reask_msgs = [\n        {\n            \"role\": \"model\",\n            \"parts\": [\n                glm.FunctionCall(\n                    name=response.parts[0].function_call.name,\n                    args=response.parts[0].function_call.args,\n                )\n            ],\n        },\n        {\n            \"role\": \"function\",\n            \"parts\": [\n                glm.Part(\n                    function_response=glm.FunctionResponse(\n                        name=response.parts[0].function_call.name,\n                        response={\"error\": f\"Validation Error(s) found:\\n{exception}\"},\n                    )\n                ),\n            ],\n        },\n        {\n            \"role\": \"user\",\n            \"parts\": [\"Recall the function arguments correctly and fix the errors\"],\n        },\n    ]\n    kwargs[\"contents\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_gemini_json(\n    kwargs: dict[str, Any],\n    response: Any,  # Replace with actual response type for Gemini\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Gemini JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"contents\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs[\"contents\"].append(\n        {\n            \"role\": \"user\",\n            \"parts\": [\n                f\"Correct the following JSON response, based on the errors given below:\\n\\n\"\n                f\"JSON:\\n{response.text}\\n\\nExceptions:\\n{exception}\"\n            ],\n        }\n    )\n    return kwargs\n\n\ndef reask_vertexai_tools(\n    kwargs: dict[str, Any],\n    response: Any,  # Replace with actual response type for Vertex AI\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Vertex AI tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"contents\" (tool response messages indicating validation errors)\n    \"\"\"\n    from ..vertexai.client import vertexai_function_response_parser\n\n    kwargs = kwargs.copy()\n    reask_msgs = [\n        response.candidates[0].content,\n        vertexai_function_response_parser(response, exception),\n    ]\n    kwargs[\"contents\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_vertexai_json(\n    kwargs: dict[str, Any],\n    response: Any,  # Replace with actual response type for Vertex AI\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Vertex AI JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"contents\" (user message requesting JSON correction)\n    \"\"\"\n    from ..vertexai.client import vertexai_message_parser\n\n    kwargs = kwargs.copy()\n\n    reask_msgs = [\n        response.candidates[0].content,\n        vertexai_message_parser(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Validation Errors found:\\n{exception}\\nRecall the function correctly, \"\n                    f\"fix the errors found in the following attempt:\\n{response.text}\"\n                ),\n            }\n        ),\n    ]\n    kwargs[\"contents\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_genai_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Google GenAI tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"contents\" (model response preserved for thought_signature,\n                        tool response with validation errors)\n    \"\"\"\n    from google.genai import types\n\n    kwargs = kwargs.copy()\n\n    existing_contents = kwargs.get(\"contents\")\n    if isinstance(existing_contents, list):\n        kwargs[\"contents\"] = existing_contents.copy()\n    elif existing_contents is None:\n        kwargs[\"contents\"] = []\n    else:\n        kwargs[\"contents\"] = list(existing_contents)\n\n    model_content = None\n    function_call_content = None\n    function_call = None\n\n    candidates = getattr(response, \"candidates\", None) if response is not None else None\n    if isinstance(candidates, list):\n        for candidate in candidates:\n            content = getattr(candidate, \"content\", None)\n            if content is None:\n                continue\n\n            if model_content is None:\n                model_content = content\n\n            parts = getattr(content, \"parts\", None) or []\n            for part in parts:\n                function_call = getattr(part, \"function_call\", None)\n                if function_call is not None:\n                    function_call_content = content\n                    break\n\n            if function_call is not None:\n                break\n\n    error_msg = (\n        f\"Validation Error found:\\n{exception}\\n\"\n        \"Recall the function correctly, fix the errors\"\n    )\n\n    if function_call is None:\n        if model_content is not None:\n            kwargs[\"contents\"].append(model_content)\n\n        kwargs[\"contents\"].append(\n            types.Content(\n                role=\"user\",\n                parts=[types.Part.from_text(text=error_msg)],\n            )\n        )\n        return kwargs\n\n    function_response_part = types.Part.from_function_response(\n        name=function_call.name,\n        response={\"error\": error_msg},\n    )\n\n    kwargs[\"contents\"].append(function_call_content)\n    kwargs[\"contents\"].append(\n        types.Content(role=\"tool\", parts=[function_response_part])\n    )\n    return kwargs\n\n\ndef reask_genai_structured_outputs(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Google GenAI structured outputs mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"contents\" (user message describing validation errors)\n    \"\"\"\n    from google.genai import types\n\n    kwargs = kwargs.copy()\n\n    genai_response = (\n        response.text\n        if response and hasattr(response, \"text\")\n        else \"You must generate a response to the user's request that is consistent with the response model\"\n    )\n\n    kwargs[\"contents\"].append(\n        types.ModelContent(\n            parts=[\n                types.Part.from_text(\n                    text=f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors in the following attempt:\\n{genai_response}\"\n                ),\n            ]\n        ),\n    )\n    return kwargs\n\n\n# Response handlers\ndef handle_genai_message_conversion(\n    new_kwargs: dict[str, Any], autodetect_images: bool = False\n) -> dict[str, Any]:\n    \"\"\"\n    Convert OpenAI-style messages to GenAI contents.\n\n    Kwargs modifications:\n    - Removes: \"messages\"\n    - Adds: \"contents\" (GenAI-style messages)\n    - Adds: \"config\" (system instruction) when system not provided\n    \"\"\"\n    from google.genai import types\n\n    messages = new_kwargs.get(\"messages\", [])\n\n    # Convert OpenAI-style messages to GenAI-style contents\n    new_kwargs[\"contents\"] = convert_to_genai_messages(messages)\n\n    # Extract multimodal content for GenAI\n    from ...processing.multimodal import extract_genai_multimodal_content\n\n    new_kwargs[\"contents\"] = extract_genai_multimodal_content(\n        new_kwargs[\"contents\"], autodetect_images\n    )\n\n    # Handle system message for GenAI\n    if \"system\" not in new_kwargs:\n        system_message = extract_genai_system_message(messages)\n        if system_message:\n            new_kwargs[\"config\"] = types.GenerateContentConfig(\n                system_instruction=system_message\n            )\n\n    # Remove messages since we converted to contents\n    new_kwargs.pop(\"messages\", None)\n\n    return new_kwargs\n\n\ndef handle_gemini_json(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Gemini JSON mode.\n\n    When response_model is None:\n        - Updates kwargs for Gemini compatibility (converts messages format)\n        - No JSON schema or response format is configured\n\n    When response_model is provided:\n        - Adds/modifies system message with JSON schema instructions\n        - Sets response_mime_type to \"application/json\"\n        - Updates kwargs for Gemini compatibility\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (adds/modifies system message with JSON schema) - only when response_model provided\n    - Adds/Modifies: \"generation_config\" (sets response_mime_type to \"application/json\") - only when response_model provided\n    - All modifications from update_gemini_kwargs (converts messages to Gemini format)\n    \"\"\"\n    if \"model\" in new_kwargs:\n        raise ConfigurationError(\n            \"Gemini `model` must be set while patching the client, not passed as a parameter to the create method\"\n        )\n\n    if response_model is None:\n        # Just handle message conversion\n        new_kwargs = update_gemini_kwargs(new_kwargs)\n        return None, new_kwargs\n\n    message = dedent(\n        f\"\"\"\n        As a genius expert, your task is to understand the content and provide\n        the parsed objects in json that match the following json_schema:\\n\n\n        {json.dumps(_get_model_schema(response_model), indent=2, ensure_ascii=False)}\n\n        Make sure to return an instance of the JSON, not the schema itself\n        \"\"\"\n    )\n\n    if new_kwargs[\"messages\"][0][\"role\"] != \"system\":\n        new_kwargs[\"messages\"].insert(0, {\"role\": \"system\", \"content\": message})\n    else:\n        new_kwargs[\"messages\"][0][\"content\"] += f\"\\n\\n{message}\"\n\n    new_kwargs[\"generation_config\"] = new_kwargs.get(\"generation_config\", {}) | {\n        \"response_mime_type\": \"application/json\"\n    }\n\n    new_kwargs = update_gemini_kwargs(new_kwargs)\n    return response_model, new_kwargs\n\n\ndef handle_gemini_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Gemini tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: Only applies update_gemini_kwargs transformations\n    - When response_model is provided:\n      - Adds: \"tools\" (list with gemini schema)\n      - Adds: \"tool_config\" (function calling config with mode and allowed functions)\n      - All modifications from update_gemini_kwargs\n    \"\"\"\n    if \"model\" in new_kwargs:\n        raise ConfigurationError(\n            \"Gemini `model` must be set while patching the client, not passed as a parameter to the create method\"\n        )\n\n    if response_model is None:\n        # Just handle message conversion\n        new_kwargs = update_gemini_kwargs(new_kwargs)\n        return None, new_kwargs\n\n    new_kwargs[\"tools\"] = [response_model.gemini_schema]\n    new_kwargs[\"tool_config\"] = {\n        \"function_calling_config\": {\n            \"mode\": \"ANY\",\n            \"allowed_function_names\": [_get_model_name(response_model)],\n        },\n    }\n\n    new_kwargs = update_gemini_kwargs(new_kwargs)\n    return response_model, new_kwargs\n\n\ndef handle_genai_structured_outputs(\n    response_model: type[Any] | None,\n    new_kwargs: dict[str, Any],\n    autodetect_images: bool = False,\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Google GenAI structured outputs mode.\n\n    Kwargs modifications:\n    - When response_model is None: Applies handle_genai_message_conversion\n    - When response_model is provided:\n      - Removes: \"messages\", \"response_model\", \"generation_config\", \"safety_settings\"\n      - Adds: \"contents\" (GenAI-style messages)\n      - Adds: \"config\" (GenerateContentConfig with system_instruction, response_mime_type, response_schema)\n      - Handles multimodal content extraction\n    \"\"\"\n    from google.genai import types\n\n    if response_model is None:\n        # Just handle message conversion\n        new_kwargs = handle_genai_message_conversion(new_kwargs, autodetect_images)\n        return None, new_kwargs\n\n    # Automatically wrap regular models with Partial when streaming is enabled\n    if new_kwargs.get(\"stream\", False) and not issubclass(response_model, PartialBase):\n        response_model = Partial[response_model]\n\n    # Extract thinking_config and cached_content from user-provided config (dict or object).\n    # This fixes issue #1966 (thinking_config ignored) and ensures cached_content\n    # is detected even when config is provided as a dict.\n    user_config = new_kwargs.get(\"config\")\n    user_thinking_config = None\n    user_cached_content = None\n    if isinstance(user_config, dict):\n        user_thinking_config = user_config.get(\"thinking_config\")\n        user_cached_content = user_config.get(\"cached_content\")\n    elif user_config is not None:\n        if hasattr(user_config, \"thinking_config\"):\n            user_thinking_config = user_config.thinking_config\n        if hasattr(user_config, \"cached_content\"):\n            user_cached_content = user_config.cached_content\n\n    # Prioritize kwarg thinking_config over config.thinking_config\n    if \"thinking_config\" not in new_kwargs and user_thinking_config is not None:\n        new_kwargs[\"thinking_config\"] = user_thinking_config\n\n    if new_kwargs.get(\"system\"):\n        system_message = new_kwargs.pop(\"system\")\n    elif new_kwargs.get(\"messages\"):\n        system_message = extract_genai_system_message(new_kwargs[\"messages\"])\n    else:\n        system_message = None\n\n    new_kwargs[\"contents\"] = convert_to_genai_messages(new_kwargs[\"messages\"])\n\n    # Extract multimodal content for GenAI\n    from ...processing.multimodal import extract_genai_multimodal_content\n\n    new_kwargs[\"contents\"] = extract_genai_multimodal_content(\n        new_kwargs[\"contents\"], autodetect_images\n    )\n\n    # We validate that the schema doesn't contain any Union fields\n    map_to_gemini_function_schema(_get_model_schema(response_model))\n\n    base_config = {\n        \"response_mime_type\": \"application/json\",\n        \"response_schema\": response_model,\n    }\n\n    # Only set system_instruction if NOT using cached_content\n    # When cached_content is used, the system instruction is already part of the cache\n    if user_cached_content is None:\n        base_config[\"system_instruction\"] = system_message\n\n    generation_config = update_genai_kwargs(new_kwargs, base_config)\n\n    new_kwargs[\"config\"] = types.GenerateContentConfig(**generation_config)\n    new_kwargs.pop(\"response_model\", None)\n    new_kwargs.pop(\"messages\", None)\n    new_kwargs.pop(\"generation_config\", None)\n    new_kwargs.pop(\"safety_settings\", None)\n    new_kwargs.pop(\"thinking_config\", None)\n\n    return response_model, new_kwargs\n\n\ndef handle_genai_tools(\n    response_model: type[Any] | None,\n    new_kwargs: dict[str, Any],\n    autodetect_images: bool = False,\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle Google GenAI tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: Applies handle_genai_message_conversion\n    - When response_model is provided:\n      - Removes: \"messages\", \"response_model\", \"generation_config\", \"safety_settings\"\n      - Adds: \"contents\" (GenAI-style messages)\n      - Adds: \"config\" (GenerateContentConfig with tools and tool_config)\n      - Handles multimodal content extraction\n    \"\"\"\n    from google.genai import types\n\n    if response_model is None:\n        # Just handle message conversion\n        new_kwargs = handle_genai_message_conversion(new_kwargs, autodetect_images)\n        return None, new_kwargs\n\n    # Automatically wrap regular models with Partial when streaming is enabled\n    if new_kwargs.get(\"stream\", False) and not issubclass(response_model, PartialBase):\n        response_model = Partial[response_model]\n\n    # Extract thinking_config and cached_content from user-provided config (dict or object).\n    # This fixes issue #1966 (thinking_config ignored) and ensures cached_content\n    # is detected even when config is provided as a dict.\n    user_config = new_kwargs.get(\"config\")\n    user_thinking_config = None\n    user_cached_content = None\n    if isinstance(user_config, dict):\n        user_thinking_config = user_config.get(\"thinking_config\")\n        user_cached_content = user_config.get(\"cached_content\")\n    elif user_config is not None:\n        if hasattr(user_config, \"thinking_config\"):\n            user_thinking_config = user_config.thinking_config\n        if hasattr(user_config, \"cached_content\"):\n            user_cached_content = user_config.cached_content\n\n    # Prioritize kwarg thinking_config over config.thinking_config\n    if \"thinking_config\" not in new_kwargs and user_thinking_config is not None:\n        new_kwargs[\"thinking_config\"] = user_thinking_config\n\n    schema = map_to_genai_schema(_get_model_schema(response_model))\n    function_definition = types.FunctionDeclaration(\n        name=_get_model_name(response_model),\n        description=getattr(response_model, \"__doc__\", None),\n        parameters=schema,\n    )\n\n    # We support the system message if you declare a system kwarg or if you pass a system message in the messages\n    if new_kwargs.get(\"system\"):\n        system_message = new_kwargs.pop(\"system\")\n    elif new_kwargs.get(\"messages\"):\n        system_message = extract_genai_system_message(new_kwargs[\"messages\"])\n    else:\n        system_message = None\n\n    base_config: dict[str, Any] = {}\n\n    # When cached_content is used, do NOT add tools, tool_config, or system_instruction\n    # These should already be part of the cache. Adding them causes 400 INVALID_ARGUMENT.\n    # See: https://ai.google.dev/gemini-api/docs/caching\n    if user_cached_content is None:\n        base_config[\"system_instruction\"] = system_message\n        base_config[\"tools\"] = [types.Tool(function_declarations=[function_definition])]\n        base_config[\"tool_config\"] = types.ToolConfig(\n            function_calling_config=types.FunctionCallingConfig(\n                mode=types.FunctionCallingConfigMode.ANY,\n                allowed_function_names=[_get_model_name(response_model)],\n            ),\n        )\n\n    # Convert messages before building config so we can correctly infer whether\n    # this request includes image content (which affects safety_settings).\n    new_kwargs[\"contents\"] = convert_to_genai_messages(new_kwargs[\"messages\"])\n\n    # Extract multimodal content for GenAI (autodetect_images may turn URLs into images)\n    from ...processing.multimodal import extract_genai_multimodal_content\n\n    new_kwargs[\"contents\"] = extract_genai_multimodal_content(\n        new_kwargs[\"contents\"], autodetect_images\n    )\n\n    generation_config = update_genai_kwargs(new_kwargs, base_config)\n\n    new_kwargs[\"config\"] = types.GenerateContentConfig(**generation_config)\n\n    new_kwargs.pop(\"response_model\", None)\n    new_kwargs.pop(\"messages\", None)\n    new_kwargs.pop(\"generation_config\", None)\n    new_kwargs.pop(\"safety_settings\", None)\n    new_kwargs.pop(\"thinking_config\", None)\n\n    return response_model, new_kwargs\n\n\ndef handle_vertexai_parallel_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[Any, dict[str, Any]]:\n    \"\"\"\n    Handle Vertex AI parallel tools mode.\n\n    Kwargs modifications:\n    - Adds: \"contents\", \"tools\", \"tool_config\" via vertexai_process_response\n    - Validates: stream=False\n    \"\"\"\n    from typing import get_args\n\n    from ..vertexai.client import vertexai_process_response\n    from instructor.dsl.parallel import VertexAIParallelModel\n\n    if new_kwargs.get(\"stream\", False):\n        raise ConfigurationError(\n            \"stream=True is not supported when using VERTEXAI_PARALLEL_TOOLS mode\"\n        )\n\n    # Extract concrete types before passing to vertexai_process_response\n    model_types = list(get_args(response_model))\n    contents, tools, tool_config = vertexai_process_response(new_kwargs, model_types)\n    new_kwargs[\"contents\"] = contents\n    new_kwargs[\"tools\"] = tools\n    new_kwargs[\"tool_config\"] = tool_config\n\n    return VertexAIParallelModel(typehint=response_model), new_kwargs\n\n\ndef handle_vertexai_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    from ..vertexai.client import vertexai_process_response\n\n    \"\"\"\n    Handle Vertex AI tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"contents\", \"tools\", \"tool_config\" via vertexai_process_response\n    \"\"\"\n\n    if response_model is None:\n        # Just handle message conversion - keep the messages as they are\n        return None, new_kwargs\n\n    contents, tools, tool_config = vertexai_process_response(new_kwargs, response_model)\n\n    new_kwargs[\"contents\"] = contents\n    new_kwargs[\"tools\"] = tools\n    new_kwargs[\"tool_config\"] = tool_config\n    return response_model, new_kwargs\n\n\ndef handle_vertexai_json(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    from instructor.providers.vertexai.client import vertexai_process_json_response\n\n    \"\"\"\n    Handle Vertex AI JSON mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"contents\" and \"generation_config\" via vertexai_process_json_response\n    \"\"\"\n\n    if response_model is None:\n        # Just handle message conversion - keep the messages as they are\n        return None, new_kwargs\n\n    contents, generation_config = vertexai_process_json_response(\n        new_kwargs, response_model\n    )\n\n    new_kwargs[\"contents\"] = contents\n    new_kwargs[\"generation_config\"] = generation_config\n    return response_model, new_kwargs\n\n\n# Handler registry for Google providers\nGOOGLE_HANDLERS = {\n    Mode.GEMINI_TOOLS: {\n        \"reask\": reask_gemini_tools,\n        \"response\": handle_gemini_tools,\n    },\n    Mode.GEMINI_JSON: {\n        \"reask\": reask_gemini_json,\n        \"response\": handle_gemini_json,\n    },\n    Mode.GENAI_TOOLS: {\n        \"reask\": reask_genai_tools,\n        \"response\": handle_genai_tools,\n    },\n    Mode.GENAI_STRUCTURED_OUTPUTS: {\n        \"reask\": reask_genai_structured_outputs,\n        \"response\": handle_genai_structured_outputs,\n    },\n    Mode.VERTEXAI_TOOLS: {\n        \"reask\": reask_vertexai_tools,\n        \"response\": handle_vertexai_tools,\n    },\n    Mode.VERTEXAI_JSON: {\n        \"reask\": reask_vertexai_json,\n        \"response\": handle_vertexai_json,\n    },\n    Mode.VERTEXAI_PARALLEL_TOOLS: {\n        \"reask\": reask_vertexai_tools,\n        \"response\": handle_vertexai_parallel_tools,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/genai/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/genai/client.py",
    "content": "# type: ignore\nfrom __future__ import annotations\n\nfrom typing import Any, Literal, overload\n\nfrom google.genai import Client\n\nimport instructor\n\n\n@overload\ndef from_genai(\n    client: Client,\n    mode: instructor.Mode = instructor.Mode.GENAI_TOOLS,\n    use_async: Literal[True] = True,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\n@overload\ndef from_genai(\n    client: Client,\n    mode: instructor.Mode = instructor.Mode.GENAI_TOOLS,\n    use_async: Literal[False] = False,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\ndef from_genai(\n    client: Client,\n    mode: instructor.Mode = instructor.Mode.GENAI_TOOLS,\n    use_async: bool = False,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    valid_modes = {\n        instructor.Mode.GENAI_TOOLS,\n        instructor.Mode.GENAI_STRUCTURED_OUTPUTS,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode), provider=\"GenAI\", valid_modes=[str(m) for m in valid_modes]\n        )\n\n    if not isinstance(client, Client):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of google.genai.Client. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if use_async:\n\n        async def async_wrapper(*args: Any, **kwargs: Any):  # type:ignore\n            if kwargs.pop(\"stream\", False):\n                return await client.aio.models.generate_content_stream(*args, **kwargs)  # type:ignore\n            return await client.aio.models.generate_content(*args, **kwargs)  # type:ignore\n\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.GENAI,\n            mode=mode,\n            **kwargs,\n        )\n\n    def sync_wrapper(*args: Any, **kwargs: Any):  # type:ignore\n        if kwargs.pop(\"stream\", False):\n            return client.models.generate_content_stream(*args, **kwargs)  # type:ignore\n\n        return client.models.generate_content(*args, **kwargs)  # type:ignore\n\n    return instructor.Instructor(\n        client=client,\n        create=instructor.patch(create=sync_wrapper, mode=mode),\n        provider=instructor.Provider.GENAI,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/groq/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/groq/client.py",
    "content": "from __future__ import annotations\n\nfrom typing import overload, Any\n\nimport groq\nimport instructor\n\n\n@overload\ndef from_groq(\n    client: groq.Groq,\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_groq(\n    client: groq.AsyncGroq,\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\ndef from_groq(\n    client: groq.Groq | groq.AsyncGroq,\n    mode: instructor.Mode = instructor.Mode.TOOLS,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    valid_modes = {\n        instructor.Mode.JSON,\n        instructor.Mode.TOOLS,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode), provider=\"Groq\", valid_modes=[str(m) for m in valid_modes]\n        )\n\n    if not isinstance(client, (groq.Groq, groq.AsyncGroq)):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of groq.Groq or groq.AsyncGroq. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if isinstance(client, groq.Groq):\n        return instructor.Instructor(\n            client=client,\n            create=instructor.patch(create=client.chat.completions.create, mode=mode),\n            provider=instructor.Provider.GROQ,\n            mode=mode,\n            **kwargs,\n        )\n\n    else:\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=client.chat.completions.create, mode=mode),\n            provider=instructor.Provider.GROQ,\n            mode=mode,\n            **kwargs,\n        )\n"
  },
  {
    "path": "instructor/providers/mistral/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/mistral/client.py",
    "content": "# Future imports to ensure compatibility with Python 3.9\nfrom __future__ import annotations\n\n\nfrom mistralai import Mistral\nimport instructor\nfrom typing import overload, Any, Literal\n\n\n@overload\ndef from_mistral(\n    client: Mistral,\n    mode: instructor.Mode = instructor.Mode.MISTRAL_TOOLS,\n    use_async: Literal[True] = True,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\n@overload\ndef from_mistral(\n    client: Mistral,\n    mode: instructor.Mode = instructor.Mode.MISTRAL_TOOLS,\n    use_async: Literal[False] = False,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\ndef from_mistral(\n    client: Mistral,\n    mode: instructor.Mode = instructor.Mode.MISTRAL_TOOLS,\n    use_async: bool = False,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    valid_modes = {\n        instructor.Mode.MISTRAL_TOOLS,\n        instructor.Mode.MISTRAL_STRUCTURED_OUTPUTS,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Mistral\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, Mistral):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of mistralai.Mistral. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if use_async:\n\n        async def async_wrapper(\n            *args: Any, **kwargs: Any\n        ):  # Handler for async streaming\n            if kwargs.pop(\"stream\", False):\n                return await client.chat.stream_async(*args, **kwargs)\n            return await client.chat.complete_async(*args, **kwargs)\n\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=async_wrapper, mode=mode),\n            provider=instructor.Provider.MISTRAL,\n            mode=mode,\n            **kwargs,\n        )\n\n    def sync_wrapper(*args: Any, **kwargs: Any):  # Handler for sync streaming\n        if kwargs.pop(\"stream\", False):\n            return client.chat.stream(*args, **kwargs)\n        return client.chat.complete(*args, **kwargs)\n\n    return instructor.Instructor(\n        client=client,\n        create=instructor.patch(create=sync_wrapper, mode=mode),\n        provider=instructor.Provider.MISTRAL,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/mistral/utils.py",
    "content": "\"\"\"Mistral-specific utilities.\n\nThis module contains utilities specific to the Mistral provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom ...mode import Mode\nfrom ...processing.schema import generate_openai_schema\nfrom ...utils.core import dump_message\n\n\ndef reask_mistral_structured_outputs(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Mistral structured outputs mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (assistant content and user correction request)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [\n        {\n            \"role\": \"assistant\",\n            \"content\": response.choices[0].message.content,\n        }\n    ]\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": (\n                f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\"\n            ),\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_mistral_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Mistral tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (tool response messages indicating validation errors)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    for tool_call in response.choices[0].message.tool_calls:\n        reask_msgs.append(\n            {\n                \"role\": \"tool\",  # type: ignore\n                \"tool_call_id\": tool_call.id,\n                \"name\": tool_call.function.name,\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\"\n                ),\n            }\n        )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef handle_mistral_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Mistral tools mode.\n\n    Kwargs modifications:\n    - Adds: \"tools\" (list with function schema)\n    - Adds: \"tool_choice\" set to \"any\"\n    \"\"\"\n    new_kwargs[\"tools\"] = [\n        {\n            \"type\": \"function\",\n            \"function\": generate_openai_schema(response_model),\n        }\n    ]\n    new_kwargs[\"tool_choice\"] = \"any\"\n    return response_model, new_kwargs\n\n\ndef handle_mistral_structured_outputs(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Mistral structured outputs mode.\n\n    Kwargs modifications:\n    - Adds: \"response_format\" derived from the response model\n    - Removes: \"tools\" and \"response_model\" from kwargs\n    \"\"\"\n    from mistralai.extra import response_format_from_pydantic_model\n\n    new_kwargs[\"response_format\"] = response_format_from_pydantic_model(response_model)\n    new_kwargs.pop(\"tools\", None)\n    new_kwargs.pop(\"response_model\", None)\n    return response_model, new_kwargs\n\n\n# Handler registry for Mistral\nMISTRAL_HANDLERS = {\n    Mode.MISTRAL_TOOLS: {\n        \"reask\": reask_mistral_tools,\n        \"response\": handle_mistral_tools,\n    },\n    Mode.MISTRAL_STRUCTURED_OUTPUTS: {\n        \"reask\": reask_mistral_structured_outputs,\n        \"response\": handle_mistral_structured_outputs,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/openai/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/openai/utils.py",
    "content": "\"\"\"OpenAI-specific utilities.\n\nThis module contains utilities specific to the OpenAI provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport json\nfrom textwrap import dedent\nfrom typing import Any, cast\n\nfrom openai import pydantic_function_tool\n\nfrom ...dsl.parallel import ParallelModel, handle_parallel_model\nfrom ...core.exceptions import ConfigurationError\nfrom ...mode import Mode\nfrom ...utils.core import dump_message, merge_consecutive_messages\nfrom ...processing.schema import generate_openai_schema\n\n\ndef _is_stream_response(response: Any) -> bool:\n    \"\"\"Check if response is a Stream object rather than a ChatCompletion.\n\n    Stream objects don't have 'choices' attribute and can't be used\n    for detailed reask messages that reference the response content.\n    \"\"\"\n    return response is None or not hasattr(response, \"choices\")\n\n\ndef _filter_responses_tool_calls(output_items: list[Any]) -> list[Any]:\n    \"\"\"Return response output items that represent tool calls.\"\"\"\n    tool_calls: list[Any] = []\n    for item in output_items:\n        item_type = getattr(item, \"type\", None)\n        if item_type in {\"function_call\", \"tool_call\"}:\n            tool_calls.append(item)\n            continue\n        if item_type is None and hasattr(item, \"arguments\"):\n            tool_calls.append(item)\n    return tool_calls\n\n\ndef _format_responses_tool_call_details(tool_call: Any) -> str:\n    \"\"\"Format tool call name/id details for reask messages.\"\"\"\n    tool_name = getattr(tool_call, \"name\", None)\n    tool_id = (\n        getattr(tool_call, \"id\", None)\n        or getattr(tool_call, \"call_id\", None)\n        or getattr(tool_call, \"tool_call_id\", None)\n    )\n    details: list[str] = []\n    if tool_name:\n        details.append(f\"name={tool_name}\")\n    if tool_id:\n        details.append(f\"id={tool_id}\")\n    if not details:\n        return \"\"\n    return f\" (tool call {', '.join(details)})\"\n\n\ndef reask_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n    failed_attempts: list[Any] | None = None,  # noqa: ARG001\n):\n    \"\"\"\n    Handle reask for OpenAI tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (tool response messages indicating validation errors)\n    \"\"\"\n    kwargs = kwargs.copy()\n\n    # Handle Stream objects which don't have choices attribute\n    # This happens when streaming mode is used with retries\n    if _is_stream_response(response):\n        kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\n\"\n                    \"Recall the function correctly, fix the errors\"\n                ),\n            }\n        )\n        return kwargs\n\n    reask_msgs = [dump_message(response.choices[0].message)]\n    for tool_call in response.choices[0].message.tool_calls:\n        reask_msgs.append(\n            {\n                \"role\": \"tool\",  # type: ignore\n                \"tool_call_id\": tool_call.id,\n                \"name\": tool_call.function.name,\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\"\n                ),\n            }\n        )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_responses_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n    failed_attempts: list[Any] | None = None,  # noqa: ARG001\n):\n    \"\"\"\n    Handle reask for OpenAI responses tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user messages with validation errors)\n    \"\"\"\n    kwargs = kwargs.copy()\n\n    # Handle Stream objects which don't have output attribute\n    if response is None or not hasattr(response, \"output\"):\n        kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\n\"\n                    \"Recall the function correctly, fix the errors\"\n                ),\n            }\n        )\n        return kwargs\n\n    reask_messages = []\n    for tool_call in _filter_responses_tool_calls(response.output):\n        details = _format_responses_tool_call_details(tool_call)\n        reask_messages.append(\n            {\n                \"role\": \"user\",  # type: ignore\n                \"content\": (\n                    f\"Validation Error found:\\n{exception}\\n\"\n                    \"Recall the function correctly, fix the errors with \"\n                    f\"{tool_call.arguments}{details}\"\n                ),\n            }\n        )\n\n    kwargs[\"messages\"].extend(reask_messages)\n    return kwargs\n\n\ndef reask_md_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n    failed_attempts: list[Any] | None = None,  # noqa: ARG001\n):\n    \"\"\"\n    Handle reask for OpenAI JSON modes when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n\n    # Handle Stream objects which don't have choices attribute\n    if _is_stream_response(response):\n        kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": f\"Correct your JSON ONLY RESPONSE, based on the following errors:\\n{exception}\",\n            }\n        )\n        return kwargs\n\n    reask_msgs = [dump_message(response.choices[0].message)]\n\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": f\"Correct your JSON ONLY RESPONSE, based on the following errors:\\n{exception}\",\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_default(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n    failed_attempts: list[Any] | None = None,  # noqa: ARG001\n):\n    \"\"\"\n    Handle reask for OpenAI default mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting function correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n\n    # Handle Stream objects which don't have choices attribute\n    if _is_stream_response(response):\n        kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": (\n                    f\"Recall the function correctly, fix the errors, exceptions found\\n{exception}\"\n                ),\n            }\n        )\n        return kwargs\n\n    reask_msgs = [dump_message(response.choices[0].message)]\n\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": (\n                f\"Recall the function correctly, fix the errors, exceptions found\\n{exception}\"\n            ),\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\n# Response handlers\ndef handle_parallel_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI parallel tools mode for concurrent function calls.\n\n    This mode enables making multiple independent function calls in a single request,\n    useful for batch processing or when you need to extract multiple structured outputs\n    simultaneously. The response_model should be a list/iterable type or use the\n    ParallelModel wrapper.\n\n    Example usage:\n        # Define models for parallel extraction\n        class PersonInfo(BaseModel):\n            name: str\n            age: int\n\n        class EventInfo(BaseModel):\n            date: str\n            location: str\n\n        # Use with PARALLEL_TOOLS mode\n        result = client.chat.completions.create(\n            model=\"gpt-4\",\n            response_model=[PersonInfo, EventInfo],\n            mode=instructor.Mode.PARALLEL_TOOLS,\n            messages=[{\"role\": \"user\", \"content\": \"Extract person and event info...\"}]\n        )\n\n    Kwargs modifications:\n    - Adds: \"tools\" (multiple function schemas from parallel model)\n    - Adds: \"tool_choice\" (\"auto\" to allow model to choose which tools to call)\n    - Validates: stream=False (streaming not supported in parallel mode)\n    \"\"\"\n    if new_kwargs.get(\"stream\", False):\n        raise ConfigurationError(\n            \"stream=True is not supported when using PARALLEL_TOOLS mode\"\n        )\n    new_kwargs[\"tools\"] = handle_parallel_model(response_model)\n    new_kwargs[\"tool_choice\"] = \"auto\"\n    return cast(type[Any], ParallelModel(typehint=response_model)), new_kwargs\n\n\ndef handle_functions(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI functions mode (deprecated).\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"functions\" (list with function schema)\n      - Adds: \"function_call\" (forced function call)\n    \"\"\"\n    Mode.warn_mode_functions_deprecation()\n\n    if response_model is None:\n        return None, new_kwargs\n\n    new_kwargs[\"functions\"] = [generate_openai_schema(response_model)]\n    new_kwargs[\"function_call\"] = {\n        \"name\": generate_openai_schema(response_model)[\"name\"]\n    }\n    return response_model, new_kwargs\n\n\ndef handle_tools_strict(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI strict tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"tools\" (list with strict function schema)\n      - Adds: \"tool_choice\" (forced function call)\n    \"\"\"\n    if response_model is None:\n        return None, new_kwargs\n\n    response_model_schema = pydantic_function_tool(response_model)\n    response_model_schema[\"function\"][\"strict\"] = True\n    new_kwargs[\"tools\"] = [response_model_schema]\n    new_kwargs[\"tool_choice\"] = {\n        \"type\": \"function\",\n        \"function\": {\"name\": response_model_schema[\"function\"][\"name\"]},\n    }\n    return response_model, new_kwargs\n\n\ndef handle_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"tools\" (list with function schema)\n      - Adds: \"tool_choice\" (forced function call)\n    \"\"\"\n    if response_model is None:\n        return None, new_kwargs\n\n    new_kwargs[\"tools\"] = [\n        {\n            \"type\": \"function\",\n            \"function\": generate_openai_schema(response_model),\n        }\n    ]\n    new_kwargs[\"tool_choice\"] = {\n        \"type\": \"function\",\n        \"function\": {\"name\": generate_openai_schema(response_model)[\"name\"]},\n    }\n    return response_model, new_kwargs\n\n\ndef handle_responses_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI responses tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"tools\" (list with function schema)\n      - Adds: \"tool_choice\" (forced function call)\n      - Adds: \"max_output_tokens\" (converted from max_tokens)\n    \"\"\"\n    # Handle max_tokens to max_output_tokens conversion for RESPONSES_TOOLS modes\n    if new_kwargs.get(\"max_tokens\") is not None:\n        new_kwargs[\"max_output_tokens\"] = new_kwargs.pop(\"max_tokens\")\n\n    # If response_model is None, just return without setting up tools\n    if response_model is None:\n        return None, new_kwargs\n\n    schema = pydantic_function_tool(response_model)\n    del schema[\"function\"][\"strict\"]\n\n    tool_definition = {\n        \"type\": \"function\",\n        \"name\": schema[\"function\"][\"name\"],\n        \"parameters\": schema[\"function\"][\"parameters\"],\n    }\n\n    if \"description\" in schema[\"function\"]:\n        tool_definition[\"description\"] = schema[\"function\"][\"description\"]\n    else:\n        tool_definition[\"description\"] = (\n            f\"Correctly extracted `{response_model.__name__}` with all \"\n            f\"the required parameters with correct types\"\n        )\n\n    new_kwargs[\"tools\"] = [\n        {\n            \"type\": \"function\",\n            \"name\": schema[\"function\"][\"name\"],\n            \"parameters\": schema[\"function\"][\"parameters\"],\n        }\n    ]\n\n    new_kwargs[\"tool_choice\"] = {\n        \"type\": \"function\",\n        \"name\": generate_openai_schema(response_model)[\"name\"],\n    }\n\n    return response_model, new_kwargs\n\n\ndef handle_responses_tools_with_inbuilt_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI responses tools with inbuilt tools mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Adds: \"tools\" (list with function schema)\n      - Adds: \"tool_choice\" (forced function call)\n      - Adds: \"max_output_tokens\" (converted from max_tokens)\n    \"\"\"\n    # Handle max_tokens to max_output_tokens conversion for RESPONSES_TOOLS modes\n    if new_kwargs.get(\"max_tokens\") is not None:\n        new_kwargs[\"max_output_tokens\"] = new_kwargs.pop(\"max_tokens\")\n\n    # If response_model is None, just return without setting up tools\n    if response_model is None:\n        return None, new_kwargs\n\n    schema = pydantic_function_tool(response_model)\n    del schema[\"function\"][\"strict\"]\n\n    tool_definition = {\n        \"type\": \"function\",\n        \"name\": schema[\"function\"][\"name\"],\n        \"parameters\": schema[\"function\"][\"parameters\"],\n    }\n\n    if \"description\" in schema[\"function\"]:\n        tool_definition[\"description\"] = schema[\"function\"][\"description\"]\n    else:\n        tool_definition[\"description\"] = (\n            f\"Correctly extracted `{response_model.__name__}` with all \"\n            f\"the required parameters with correct types\"\n        )\n\n    if not new_kwargs.get(\"tools\"):\n        new_kwargs[\"tools\"] = [tool_definition]\n        new_kwargs[\"tool_choice\"] = {\n            \"type\": \"function\",\n            \"name\": generate_openai_schema(response_model)[\"name\"],\n        }\n    else:\n        new_kwargs[\"tools\"].append(tool_definition)\n\n    return response_model, new_kwargs\n\n\ndef handle_json_o1(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI o1 JSON mode.\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Modifies: \"messages\" (appends user message with JSON schema)\n      - Validates: No system messages allowed for O1 models\n    \"\"\"\n    roles = [message[\"role\"] for message in new_kwargs.get(\"messages\", [])]\n    if \"system\" in roles:\n        raise ValueError(\"System messages are not supported For the O1 models\")\n\n    if response_model is None:\n        return None, new_kwargs\n\n    message = dedent(\n        f\"\"\"\n        Understand the content and provide\n        the parsed objects in json that match the following json_schema:\\n\n\n        {json.dumps(response_model.model_json_schema(), indent=2, ensure_ascii=False)}\n\n        Make sure to return an instance of the JSON, not the schema itself\n        \"\"\"\n    )\n\n    new_kwargs[\"messages\"].append(\n        {\n            \"role\": \"user\",\n            \"content\": message,\n        },\n    )\n    return response_model, new_kwargs\n\n\ndef handle_json_modes(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any], mode: Mode\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle OpenAI JSON modes (JSON, MD_JSON, JSON_SCHEMA).\n\n    Kwargs modifications:\n    - When response_model is None: No modifications\n    - When response_model is provided:\n      - Mode.JSON_SCHEMA: Adds \"response_format\" with json_schema\n      - Mode.JSON: Adds \"response_format\" with type=\"json_object\", modifies system message\n      - Mode.MD_JSON: Appends user message for markdown JSON response\n    \"\"\"\n    if response_model is None:\n        return None, new_kwargs\n\n    # Use a neutral prompt that doesn't impose a persona\n    # This allows the JSON mode to work with character-based applications\n    # See: https://github.com/instructor-ai/instructor/issues/1514\n    message = dedent(\n        f\"\"\"\n        Parse the content and return a JSON object matching this schema:\n\n        {json.dumps(response_model.model_json_schema(), indent=2, ensure_ascii=False)}\n\n        Return a valid JSON instance, not the schema definition.\"\"\"\n    )\n\n    if mode == Mode.JSON:\n        new_kwargs[\"response_format\"] = {\"type\": \"json_object\"}\n    elif mode == Mode.JSON_SCHEMA:\n        new_kwargs[\"response_format\"] = {\n            \"type\": \"json_schema\",\n            \"json_schema\": {\n                \"name\": response_model.__name__,\n                \"schema\": response_model.model_json_schema(),\n            },\n        }\n    elif mode == Mode.MD_JSON:\n        new_kwargs[\"messages\"].append(\n            {\n                \"role\": \"user\",\n                \"content\": \"Return the correct JSON response within a ```json codeblock. not the JSON_SCHEMA\",\n            },\n        )\n        new_kwargs[\"messages\"] = merge_consecutive_messages(new_kwargs[\"messages\"])\n\n    if mode != Mode.JSON_SCHEMA:\n        if new_kwargs[\"messages\"][0][\"role\"] != \"system\":\n            new_kwargs[\"messages\"].insert(\n                0,\n                {\n                    \"role\": \"system\",\n                    \"content\": message,\n                },\n            )\n        elif isinstance(new_kwargs[\"messages\"][0][\"content\"], str):\n            new_kwargs[\"messages\"][0][\"content\"] += f\"\\n\\n{message}\"\n        elif isinstance(new_kwargs[\"messages\"][0][\"content\"], list):\n            new_kwargs[\"messages\"][0][\"content\"][0][\"text\"] += f\"\\n\\n{message}\"\n        else:\n            raise ValueError(\n                \"Invalid message format, must be a string or a list of messages\"\n            )\n\n    return response_model, new_kwargs\n\n\ndef handle_openrouter_structured_outputs(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle OpenRouter structured outputs mode.\n\n    Kwargs modifications:\n    - Adds: \"response_format\" (json_schema with strict mode enabled)\n    \"\"\"\n    schema = response_model.model_json_schema()\n    schema[\"additionalProperties\"] = False\n    new_kwargs[\"response_format\"] = {\n        \"type\": \"json_schema\",\n        \"json_schema\": {\n            \"name\": response_model.__name__,\n            \"schema\": schema,\n            \"strict\": True,\n        },\n    }\n    return response_model, new_kwargs\n\n\n# Handler registry for OpenAI\nOPENAI_HANDLERS = {\n    Mode.TOOLS: {\n        \"reask\": reask_tools,\n        \"response\": handle_tools,\n    },\n    Mode.TOOLS_STRICT: {\n        \"reask\": reask_tools,\n        \"response\": handle_tools_strict,\n    },\n    Mode.FUNCTIONS: {\n        \"reask\": reask_default,\n        \"response\": handle_functions,\n    },\n    Mode.JSON: {\n        \"reask\": reask_md_json,\n        \"response\": lambda rm, nk: handle_json_modes(rm, nk, Mode.JSON),\n    },\n    Mode.MD_JSON: {\n        \"reask\": reask_md_json,\n        \"response\": lambda rm, nk: handle_json_modes(rm, nk, Mode.MD_JSON),\n    },\n    Mode.JSON_SCHEMA: {\n        \"reask\": reask_md_json,\n        \"response\": lambda rm, nk: handle_json_modes(rm, nk, Mode.JSON_SCHEMA),\n    },\n    Mode.JSON_O1: {\n        \"reask\": reask_md_json,\n        \"response\": handle_json_o1,\n    },\n    Mode.PARALLEL_TOOLS: {\n        \"reask\": reask_tools,\n        \"response\": handle_parallel_tools,\n    },\n    Mode.RESPONSES_TOOLS: {\n        \"reask\": reask_responses_tools,\n        \"response\": handle_responses_tools,\n    },\n    Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS: {\n        \"reask\": reask_responses_tools,\n        \"response\": handle_responses_tools_with_inbuilt_tools,\n    },\n    Mode.OPENROUTER_STRUCTURED_OUTPUTS: {\n        \"reask\": reask_md_json,\n        \"response\": handle_openrouter_structured_outputs,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/perplexity/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/perplexity/client.py",
    "content": "from __future__ import annotations\n\nimport openai\nimport instructor\nfrom typing import overload, Any\n\n\n@overload\ndef from_perplexity(\n    client: openai.OpenAI,\n    mode: instructor.Mode = instructor.Mode.PERPLEXITY_JSON,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_perplexity(\n    client: openai.AsyncOpenAI,\n    mode: instructor.Mode = instructor.Mode.PERPLEXITY_JSON,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\ndef from_perplexity(\n    client: openai.OpenAI | openai.AsyncOpenAI,\n    mode: instructor.Mode = instructor.Mode.PERPLEXITY_JSON,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    \"\"\"Create an Instructor client from a Perplexity client.\n\n    Args:\n        client: A Perplexity client (sync or async)\n        mode: The mode to use for the client (must be PERPLEXITY_JSON)\n        **kwargs: Additional arguments to pass to the client\n\n    Returns:\n        An Instructor client\n    \"\"\"\n    valid_modes = {instructor.Mode.PERPLEXITY_JSON}\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"Perplexity\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, (openai.OpenAI, openai.AsyncOpenAI)):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of openai.OpenAI or openai.AsyncOpenAI. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if isinstance(client, openai.AsyncOpenAI):\n        create = client.chat.completions.create\n        return instructor.AsyncInstructor(\n            client=client,\n            create=instructor.patch(create=create, mode=mode),\n            provider=instructor.Provider.PERPLEXITY,\n            mode=mode,\n            **kwargs,\n        )\n\n    create = client.chat.completions.create\n    return instructor.Instructor(\n        client=client,\n        create=instructor.patch(create=create, mode=mode),\n        provider=instructor.Provider.PERPLEXITY,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/perplexity/utils.py",
    "content": "\"\"\"Perplexity-specific utilities.\n\nThis module contains utilities specific to the Perplexity provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom ...mode import Mode\nfrom ...utils.core import dump_message\n\n\ndef reask_perplexity_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Perplexity JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": f\"Correct your JSON ONLY RESPONSE, based on the following errors:\\n{exception}\",\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef handle_perplexity_json(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Perplexity JSON mode.\n\n    Kwargs modifications:\n    - Adds: \"response_format\" with json_schema\n    \"\"\"\n    new_kwargs[\"response_format\"] = {\n        \"type\": \"json_schema\",\n        \"json_schema\": {\"schema\": response_model.model_json_schema()},\n    }\n\n    return response_model, new_kwargs\n\n\n# Handler registry for Perplexity\nPERPLEXITY_HANDLERS = {\n    Mode.PERPLEXITY_JSON: {\n        \"reask\": reask_perplexity_json,\n        \"response\": handle_perplexity_json,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/vertexai/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/vertexai/client.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any, Union, get_origin\n\nfrom vertexai.preview.generative_models import ToolConfig  # type: ignore[import-not-found]\nimport vertexai.generative_models as gm  # type: ignore[import-not-found]\nfrom pydantic import BaseModel\nimport instructor\nfrom ...dsl.parallel import get_types_array\nimport jsonref\n\n\ndef _create_gemini_json_schema(model: type[BaseModel]) -> dict[str, Any]:\n    # Add type check to ensure we have a concrete model class\n    if get_origin(model) is not None:\n        raise TypeError(f\"Expected concrete model class, got type hint {model}\")\n\n    schema = model.model_json_schema()\n    schema_without_refs: dict[str, Any] = jsonref.replace_refs(schema)  # type: ignore[assignment]\n    gemini_schema: dict[Any, Any] = {\n        \"type\": schema_without_refs[\"type\"],\n        \"properties\": schema_without_refs[\"properties\"],\n        \"required\": (\n            schema_without_refs[\"required\"] if \"required\" in schema_without_refs else []\n        ),  # TODO: Temporary Fix for Iterables which throw an error when their tasks field is specified in the required field\n    }\n    return gemini_schema\n\n\ndef _create_vertexai_tool(\n    models: type[BaseModel] | list[type[BaseModel]] | Any,\n) -> gm.Tool:  # noqa: UP007\n    \"\"\"Creates a tool with function declarations for single model or list of models\"\"\"\n    # Handle Iterable case first\n    if get_origin(models) is not None:\n        model_list = list(get_types_array(models))\n    else:\n        # Handle both single model and list of models\n        model_list = models if isinstance(models, list) else [models]\n\n    declarations = []\n    for model in model_list:\n        parameters = _create_gemini_json_schema(model)\n        declaration = gm.FunctionDeclaration(\n            name=model.__name__,\n            description=model.__doc__,\n            parameters=parameters,\n        )\n        declarations.append(declaration)\n\n    return gm.Tool(function_declarations=declarations)\n\n\ndef vertexai_message_parser(\n    message: dict[str, str | gm.Part | list[str | gm.Part]],\n) -> gm.Content:\n    if isinstance(message[\"content\"], str):\n        return gm.Content(\n            role=message[\"role\"],  # type:ignore\n            parts=[gm.Part.from_text(message[\"content\"])],\n        )\n    elif isinstance(message[\"content\"], list):\n        parts: list[gm.Part] = []\n        for item in message[\"content\"]:\n            if isinstance(item, str):\n                parts.append(gm.Part.from_text(item))\n            elif isinstance(item, gm.Part):\n                parts.append(item)\n            else:\n                raise ValueError(f\"Unsupported content type in list: {type(item)}\")\n        return gm.Content(\n            role=message[\"role\"],  # type:ignore\n            parts=parts,\n        )\n    else:\n        raise ValueError(\"Unsupported message content type\")\n\n\ndef _vertexai_message_list_parser(\n    messages: list[dict[str, str | gm.Part | list[str | gm.Part]]],\n) -> list[gm.Content]:\n    contents = [\n        vertexai_message_parser(message) if isinstance(message, dict) else message\n        for message in messages\n    ]\n    return contents\n\n\ndef vertexai_function_response_parser(\n    response: gm.GenerationResponse, exception: Exception\n) -> gm.Content:\n    return gm.Content(\n        parts=[\n            gm.Part.from_function_response(\n                name=response.candidates[0].content.parts[0].function_call.name,\n                response={\n                    \"content\": f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\"\n                },\n            )\n        ]\n    )\n\n\ndef vertexai_process_response(\n    _kwargs: dict[str, Any],\n    model: Union[type[BaseModel], list[type[BaseModel]], Any],  # noqa: UP007\n):\n    messages: list[dict[str, str]] = _kwargs.pop(\"messages\")\n    contents = _vertexai_message_list_parser(messages)  # type: ignore[arg-type]\n\n    tool = _create_vertexai_tool(models=model)\n\n    tool_config = ToolConfig(\n        function_calling_config=ToolConfig.FunctionCallingConfig(\n            mode=ToolConfig.FunctionCallingConfig.Mode.ANY,\n        )\n    )\n    return contents, [tool], tool_config\n\n\ndef vertexai_process_json_response(_kwargs: dict[str, Any], model: type[BaseModel]):\n    messages: list[dict[str, str]] = _kwargs.pop(\"messages\")\n    contents = _vertexai_message_list_parser(messages)  # type: ignore[arg-type]\n\n    config: dict[str, Any] | None = _kwargs.pop(\"generation_config\", None)\n\n    response_schema = _create_gemini_json_schema(model)\n\n    generation_config = gm.GenerationConfig(\n        response_mime_type=\"application/json\",\n        response_schema=response_schema,\n        **(config if config else {}),\n    )\n\n    return contents, generation_config\n\n\ndef from_vertexai(\n    client: gm.GenerativeModel,\n    mode: instructor.Mode = instructor.Mode.VERTEXAI_TOOLS,\n    _async: bool = False,\n    use_async: bool | None = None,\n    **kwargs: Any,\n) -> instructor.Instructor:\n    import warnings\n\n    warnings.warn(\n        \"from_vertexai is deprecated and will be removed in a future version. \"\n        \"Please use from_genai with vertexai=True or from_provider instead. \"\n        \"Install google-genai with: pip install google-genai\\n\"\n        \"Example migration:\\n\"\n        \"  # Old way\\n\"\n        \"  from instructor import from_vertexai\\n\"\n        \"  import vertexai.generative_models as gm\\n\"\n        \"  client = from_vertexai(gm.GenerativeModel('gemini-3-flash'))\\n\\n\"\n        \"  # New way\\n\"\n        \"  from instructor import from_genai\\n\"\n        \"  from google import genai\\n\"\n        \"  client = from_genai(genai.Client(vertexai=True, project='your-project', location='us-central1'))\\n\"\n        \"  # OR use from_provider\\n\"\n        \"  client = instructor.from_provider('vertexai/gemini-3-flash')\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    valid_modes = {\n        instructor.Mode.VERTEXAI_PARALLEL_TOOLS,\n        instructor.Mode.VERTEXAI_TOOLS,\n        instructor.Mode.VERTEXAI_JSON,\n    }\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode),\n            provider=\"VertexAI\",\n            valid_modes=[str(m) for m in valid_modes],\n        )\n\n    if not isinstance(client, gm.GenerativeModel):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of vertexai.generative_models.GenerativeModel. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if use_async is not None and _async != False:\n        from ...core.exceptions import ConfigurationError\n\n        raise ConfigurationError(\n            \"Cannot provide both '_async' and 'use_async'. Use 'use_async' instead.\"\n        )\n\n    if _async and use_async is None:\n        import warnings\n\n        warnings.warn(\n            \"'_async' is deprecated. Use 'use_async' instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        use_async = _async\n\n    is_async = use_async if use_async is not None else _async\n\n    create = client.generate_content_async if is_async else client.generate_content\n\n    return instructor.Instructor(\n        client=client,\n        create=instructor.patch(create=create, mode=mode),\n        provider=instructor.Provider.VERTEXAI,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/writer/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/writer/client.py",
    "content": "# Future imports to ensure compatibility with Python 3.9\nfrom __future__ import annotations\n\n\nimport instructor\nfrom writerai import AsyncWriter, Writer\nfrom typing import overload, Any\n\n\n@overload\ndef from_writer(\n    client: Writer,\n    mode: instructor.Mode = instructor.Mode.WRITER_TOOLS,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_writer(\n    client: AsyncWriter,\n    mode: instructor.Mode = instructor.Mode.WRITER_TOOLS,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\ndef from_writer(\n    client: Writer | AsyncWriter,\n    mode: instructor.Mode = instructor.Mode.WRITER_TOOLS,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    valid_modes = {instructor.Mode.WRITER_TOOLS, instructor.Mode.WRITER_JSON}\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode), provider=\"Writer\", valid_modes=[str(m) for m in valid_modes]\n        )\n\n    if not isinstance(client, (Writer, AsyncWriter)):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            f\"Client must be an instance of Writer or AsyncWriter. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    if isinstance(client, Writer):\n        return instructor.Instructor(\n            client=client,\n            create=instructor.patch(create=client.chat.chat, mode=mode),\n            provider=instructor.Provider.WRITER,\n            mode=mode,\n            **kwargs,\n        )\n\n    return instructor.AsyncInstructor(\n        client=client,\n        create=instructor.patch(create=client.chat.chat, mode=mode),\n        provider=instructor.Provider.WRITER,\n        mode=mode,\n        **kwargs,\n    )\n"
  },
  {
    "path": "instructor/providers/writer/utils.py",
    "content": "\"\"\"Writer-specific utilities.\n\nThis module contains utilities specific to the Writer provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any\n\nfrom ...mode import Mode\nfrom ...processing.schema import generate_openai_schema\nfrom ...utils.core import dump_message\n\n\ndef reask_writer_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Writer tools mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user instructions to correct tool call)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": (\n                f\"Validation Error found:\\n{exception}\\n Fix errors and fill tool call arguments/name \"\n                f\"correctly. Just update arguments dict values or update name. Don't change the structure \"\n                f\"of them. You have to call function by passing desired \"\n                f\"functions name/args as part of special attribute with name tools_calls, \"\n                f\"not as text in attribute with name content. IT'S IMPORTANT!\"\n            ),\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef reask_writer_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for Writer JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Adds: \"messages\" (user message requesting JSON correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msgs = [dump_message(response.choices[0].message)]\n    reask_msgs.append(\n        {\n            \"role\": \"user\",\n            \"content\": f\"Correct your JSON response: {response.choices[0].message.content}, \"\n            f\"based on the following errors:\\n{exception}\",\n        }\n    )\n    kwargs[\"messages\"].extend(reask_msgs)\n    return kwargs\n\n\ndef handle_writer_tools(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Writer tools mode.\n\n    Kwargs modifications:\n    - Adds: \"tools\" (list with function schema)\n    - Sets: \"tool_choice\" to \"auto\"\n    \"\"\"\n    new_kwargs[\"tools\"] = [\n        {\n            \"type\": \"function\",\n            \"function\": generate_openai_schema(response_model),\n        }\n    ]\n    new_kwargs[\"tool_choice\"] = \"auto\"\n    return response_model, new_kwargs\n\n\ndef handle_writer_json(\n    response_model: type[Any], new_kwargs: dict[str, Any]\n) -> tuple[type[Any], dict[str, Any]]:\n    \"\"\"\n    Handle Writer JSON mode.\n\n    Kwargs modifications:\n    - Adds: \"response_format\" with json_schema\n    \"\"\"\n    new_kwargs[\"response_format\"] = {\n        \"type\": \"json_schema\",\n        \"json_schema\": {\"schema\": response_model.model_json_schema()},\n    }\n\n    return response_model, new_kwargs\n\n\n# Handler registry for Writer\nWRITER_HANDLERS = {\n    Mode.WRITER_TOOLS: {\n        \"reask\": reask_writer_tools,\n        \"response\": handle_writer_tools,\n    },\n    Mode.WRITER_JSON: {\n        \"reask\": reask_writer_json,\n        \"response\": handle_writer_json,\n    },\n}\n"
  },
  {
    "path": "instructor/providers/xai/__init__.py",
    "content": "\"\"\"Provider implementation.\"\"\"\n"
  },
  {
    "path": "instructor/providers/xai/client.py",
    "content": "from __future__ import annotations\n\nfrom typing import Any, TYPE_CHECKING, cast, overload\nimport json\n\nfrom instructor.dsl.iterable import IterableBase\nfrom instructor.dsl.partial import PartialBase\nfrom instructor.dsl.simple_type import AdapterBase\n\nfrom instructor.utils.core import prepare_response_model\nfrom pydantic import BaseModel\n\nimport instructor\nfrom .utils import _convert_messages\n\n\ndef _raise_xai_sdk_missing() -> None:\n    from ...core.exceptions import ConfigurationError\n\n    raise ConfigurationError(\n        \"The xAI provider needs the optional dependency `xai-sdk`. \"\n        'Install it with `uv pip install \"instructor[xai]\"` (or `pip install \"instructor[xai]\"`). '\n        \"Note: xai-sdk requires Python 3.10+.\"\n    ) from None\n\n\ndef _get_model_schema(response_model: Any) -> dict[str, Any]:\n    \"\"\"\n    Safely get JSON schema from a response model.\n\n    Handles both regular models and wrapped types by checking for the\n    model_json_schema method with hasattr.\n\n    Args:\n        response_model: The response model (may be regular or wrapped)\n\n    Returns:\n        The JSON schema dictionary\n    \"\"\"\n    if hasattr(response_model, \"model_json_schema\") and callable(\n        response_model.model_json_schema\n    ):\n        schema_method = response_model.model_json_schema\n        return schema_method()\n    return {}\n\n\ndef _get_model_name(response_model: Any) -> str:\n    \"\"\"\n    Safely get the name of a response model.\n\n    Args:\n        response_model: The response model\n\n    Returns:\n        The model name or 'Model' as fallback\n    \"\"\"\n    return getattr(response_model, \"__name__\", \"Model\")\n\n\ndef _finalize_parsed_response(parsed: Any, raw_response: Any) -> Any:\n    if isinstance(parsed, BaseModel):\n        parsed._raw_response = raw_response\n    if isinstance(parsed, IterableBase):\n        return [task for task in parsed.tasks]\n    if isinstance(parsed, AdapterBase):\n        return parsed.content\n    return parsed\n\n\nif TYPE_CHECKING:\n    from xai_sdk.sync.client import Client as SyncClient\n    from xai_sdk.aio.client import Client as AsyncClient\n    from xai_sdk import chat as xchat\nelse:\n    try:\n        from xai_sdk.sync.client import Client as SyncClient\n        from xai_sdk.aio.client import Client as AsyncClient\n        from xai_sdk import chat as xchat\n    except ImportError:\n        SyncClient = None\n        AsyncClient = None\n        xchat = None\n\n\n@overload\ndef from_xai(\n    client: SyncClient,\n    mode: instructor.Mode = instructor.Mode.XAI_JSON,\n    **kwargs: Any,\n) -> instructor.Instructor: ...\n\n\n@overload\ndef from_xai(\n    client: AsyncClient,\n    mode: instructor.Mode = instructor.Mode.XAI_JSON,\n    **kwargs: Any,\n) -> instructor.AsyncInstructor: ...\n\n\ndef from_xai(\n    client: SyncClient | AsyncClient,\n    mode: instructor.Mode = instructor.Mode.XAI_JSON,\n    **kwargs: Any,\n) -> instructor.Instructor | instructor.AsyncInstructor:\n    if SyncClient is None or AsyncClient is None or xchat is None:\n        _raise_xai_sdk_missing()\n\n    valid_modes = {instructor.Mode.XAI_JSON, instructor.Mode.XAI_TOOLS}\n\n    if mode not in valid_modes:\n        from ...core.exceptions import ModeError\n\n        raise ModeError(\n            mode=str(mode), provider=\"xAI\", valid_modes=[str(m) for m in valid_modes]\n        )\n\n    if not isinstance(client, (SyncClient, AsyncClient)):\n        from ...core.exceptions import ClientError\n\n        raise ClientError(\n            \"Client must be an instance of xai_sdk.sync.client.Client or xai_sdk.aio.client.Client. \"\n            f\"Got: {type(client).__name__}\"\n        )\n\n    async def acreate(\n        response_model: type[BaseModel] | None,\n        messages: list[dict[str, Any]],\n        strict: bool = True,\n        **call_kwargs: Any,\n    ):\n        x_messages = _convert_messages(messages)\n        model = call_kwargs.pop(\"model\")\n        # Remove instructor-specific kwargs that xAI doesn't support\n        call_kwargs.pop(\"max_retries\", None)\n        call_kwargs.pop(\"validation_context\", None)\n        call_kwargs.pop(\"context\", None)\n        call_kwargs.pop(\"hooks\", None)\n        is_stream = call_kwargs.pop(\"stream\", False)\n\n        chat = client.chat.create(model=model, messages=x_messages, **call_kwargs)\n\n        if response_model is None:\n            resp = await chat.sample()  # type: ignore[misc]\n            return resp\n\n        assert response_model is not None\n\n        prepared_model = response_model\n        if mode == instructor.Mode.XAI_TOOLS or is_stream:\n            prepared_model = prepare_response_model(response_model)\n        assert prepared_model is not None\n\n        if mode == instructor.Mode.XAI_JSON:\n            if is_stream:\n                # code from xai_sdk.chat.parse\n                chat.proto.response_format.CopyFrom(\n                    xchat.chat_pb2.ResponseFormat(\n                        format_type=xchat.chat_pb2.FormatType.FORMAT_TYPE_JSON_SCHEMA,\n                        schema=json.dumps(_get_model_schema(prepared_model)),\n                    )\n                )\n                json_chunks = (chunk.content async for _, chunk in chat.stream())  # type: ignore[misc]\n                # response_model is guaranteed to be a type[BaseModel] at this point due to earlier assertion\n                rm = cast(type[BaseModel], prepared_model)\n                if issubclass(rm, IterableBase):\n                    return rm.tasks_from_chunks_async(json_chunks)  # type: ignore\n                elif issubclass(rm, PartialBase):\n                    return rm.model_from_chunks_async(json_chunks)  # type: ignore\n                else:\n                    raise ValueError(\n                        f\"Unsupported response model type for streaming: {_get_model_name(response_model)}\"\n                    )\n            else:\n                raw, parsed = await chat.parse(response_model)  # type: ignore[misc]\n                parsed._raw_response = raw\n                return parsed\n        else:\n            tool_obj = xchat.tool(\n                name=_get_model_name(prepared_model),\n                description=prepared_model.__doc__ or \"\",\n                parameters=_get_model_schema(prepared_model),\n            )\n            chat.proto.tools.append(tool_obj)  # type: ignore[arg-type]\n            tool_name = tool_obj.function.name  # type: ignore[attr-defined]\n            chat.proto.tool_choice.CopyFrom(xchat.required_tool(tool_name))\n            if is_stream:\n                stream_iter = chat.stream()  # type: ignore[misc]\n                args = (\n                    resp.tool_calls[0].function.arguments  # type: ignore[index,attr-defined]\n                    async for resp, _ in stream_iter  # type: ignore[assignment]\n                    if resp.tool_calls and resp.finish_reason == \"REASON_INVALID\"  # type: ignore[attr-defined]\n                )\n                rm = cast(type[BaseModel], prepared_model)\n                if issubclass(rm, IterableBase):\n                    return rm.tasks_from_chunks_async(args)  # type: ignore\n                elif issubclass(rm, PartialBase):\n                    return rm.model_from_chunks_async(args)  # type: ignore\n                else:\n                    raise ValueError(\n                        f\"Unsupported response model type for streaming: {_get_model_name(response_model)}\"\n                    )\n            else:\n                resp = await chat.sample()  # type: ignore[misc]\n                if not resp.tool_calls:  # type: ignore[attr-defined]\n                    # If no tool calls, try to extract from text content\n                    from ...processing.function_calls import _validate_model_from_json\n                    from ...utils import extract_json_from_codeblock\n\n                    # Try to extract JSON from text content\n                    text_content: str = \"\"\n                    if hasattr(resp, \"text\") and resp.text:  # type: ignore[attr-defined]\n                        text_content = str(resp.text)  # type: ignore[attr-defined]\n                    elif hasattr(resp, \"content\") and resp.content:  # type: ignore[attr-defined]\n                        content = resp.content  # type: ignore[attr-defined]\n                        if isinstance(content, str):\n                            text_content = content\n                        elif isinstance(content, list) and content:\n                            text_content = str(content[0])\n\n                    if text_content:\n                        json_str = extract_json_from_codeblock(text_content)\n                        model_for_validation = cast(type[Any], prepared_model)\n                        parsed = _validate_model_from_json(\n                            model_for_validation, json_str, None, strict\n                        )\n                        return _finalize_parsed_response(parsed, resp)\n\n                    raise ValueError(\n                        f\"No tool calls returned from xAI and no text content available. \"\n                        f\"Response: {resp}\"\n                    )\n\n                args = resp.tool_calls[0].function.arguments  # type: ignore[index,attr-defined]\n                from ...processing.function_calls import _validate_model_from_json\n\n                model_for_validation = cast(type[Any], prepared_model)\n                parsed = _validate_model_from_json(\n                    model_for_validation, args, None, strict\n                )\n                return _finalize_parsed_response(parsed, resp)\n\n    def create(\n        response_model: type[BaseModel] | None,\n        messages: list[dict[str, Any]],\n        strict: bool = True,\n        **call_kwargs: Any,\n    ):\n        x_messages = _convert_messages(messages)\n        model = call_kwargs.pop(\"model\")\n        # Remove instructor-specific kwargs that xAI doesn't support\n        call_kwargs.pop(\"max_retries\", None)\n        call_kwargs.pop(\"validation_context\", None)\n        call_kwargs.pop(\"context\", None)\n        call_kwargs.pop(\"hooks\", None)\n        # Check if streaming is requested\n        is_stream = call_kwargs.pop(\"stream\", False)\n\n        chat = client.chat.create(model=model, messages=x_messages, **call_kwargs)\n\n        if response_model is None:\n            resp = chat.sample()  # type: ignore[misc]\n            return resp\n\n        assert response_model is not None\n\n        prepared_model = response_model\n        if mode == instructor.Mode.XAI_TOOLS or is_stream:\n            prepared_model = prepare_response_model(response_model)\n        assert prepared_model is not None\n\n        if mode == instructor.Mode.XAI_JSON:\n            if is_stream:\n                # code from xai_sdk.chat.parse\n                chat.proto.response_format.CopyFrom(\n                    xchat.chat_pb2.ResponseFormat(\n                        format_type=xchat.chat_pb2.FormatType.FORMAT_TYPE_JSON_SCHEMA,\n                        schema=json.dumps(_get_model_schema(prepared_model)),\n                    )\n                )\n                json_chunks = (chunk.content for _, chunk in chat.stream())  # type: ignore[misc]\n                rm = cast(type[BaseModel], prepared_model)\n                if issubclass(rm, IterableBase):\n                    return rm.tasks_from_chunks(json_chunks)\n                elif issubclass(rm, PartialBase):\n                    return rm.model_from_chunks(json_chunks)\n                else:\n                    raise ValueError(\n                        f\"Unsupported response model type for streaming: {_get_model_name(response_model)}\"\n                    )\n            else:\n                raw, parsed = chat.parse(response_model)  # type: ignore[misc]\n                parsed._raw_response = raw\n                return parsed\n        else:\n            tool_obj = xchat.tool(\n                name=_get_model_name(prepared_model),\n                description=prepared_model.__doc__ or \"\",\n                parameters=_get_model_schema(prepared_model),\n            )\n            chat.proto.tools.append(tool_obj)  # type: ignore[arg-type]\n            tool_name = tool_obj.function.name  # type: ignore[attr-defined]\n            chat.proto.tool_choice.CopyFrom(xchat.required_tool(tool_name))\n            if is_stream:\n                stream_iter = chat.stream()  # type: ignore[misc]\n                for resp, _ in stream_iter:  # type: ignore[assignment]\n                    # For xAI, tool_calls are returned at the end of the response.\n                    # Effectively, it is not a streaming response.\n                    # See: https://docs.x.ai/docs/guides/function-calling\n                    if resp.tool_calls:  # type: ignore[attr-defined]\n                        args = resp.tool_calls[0].function.arguments  # type: ignore[index,attr-defined]\n                        rm = cast(type[BaseModel], prepared_model)\n                        if issubclass(rm, IterableBase):\n                            return rm.tasks_from_chunks(args)\n                        elif issubclass(rm, PartialBase):\n                            return rm.model_from_chunks(args)\n                        else:\n                            raise ValueError(\n                                f\"Unsupported response model type for streaming: {_get_model_name(response_model)}\"\n                            )\n            else:\n                resp = chat.sample()  # type: ignore[misc]\n                if not resp.tool_calls:  # type: ignore[attr-defined]\n                    # If no tool calls, try to extract from text content\n                    from ...processing.function_calls import _validate_model_from_json\n                    from ...utils import extract_json_from_codeblock\n\n                    # Try to extract JSON from text content\n                    text_content: str = \"\"\n                    if hasattr(resp, \"text\") and resp.text:  # type: ignore[attr-defined]\n                        text_content = str(resp.text)  # type: ignore[attr-defined]\n                    elif hasattr(resp, \"content\") and resp.content:  # type: ignore[attr-defined]\n                        content = resp.content  # type: ignore[attr-defined]\n                        if isinstance(content, str):\n                            text_content = content\n                        elif isinstance(content, list) and content:\n                            text_content = str(content[0])\n\n                    if text_content:\n                        json_str = extract_json_from_codeblock(text_content)\n                        model_for_validation = cast(type[Any], prepared_model)\n                        parsed = _validate_model_from_json(\n                            model_for_validation, json_str, None, strict\n                        )\n                        return _finalize_parsed_response(parsed, resp)\n\n                    raise ValueError(\n                        f\"No tool calls returned from xAI and no text content available. \"\n                        f\"Response: {resp}\"\n                    )\n\n                args = resp.tool_calls[0].function.arguments  # type: ignore[index,attr-defined]\n                from ...processing.function_calls import _validate_model_from_json\n\n                model_for_validation = cast(type[Any], prepared_model)\n                parsed = _validate_model_from_json(\n                    model_for_validation, args, None, strict\n                )\n                return _finalize_parsed_response(parsed, resp)\n\n    if isinstance(client, AsyncClient):\n        return instructor.AsyncInstructor(\n            client=client,\n            create=acreate,\n            provider=instructor.Provider.XAI,\n            mode=mode,\n            **kwargs,\n        )\n    else:\n        return instructor.Instructor(\n            client=client,\n            create=create,\n            provider=instructor.Provider.XAI,\n            mode=mode,\n            **kwargs,\n        )\n"
  },
  {
    "path": "instructor/providers/xai/utils.py",
    "content": "\"\"\"xAI-specific utilities.\n\nThis module contains utilities specific to the xAI provider,\nincluding reask functions, response handlers, and message formatting.\n\"\"\"\n\nfrom __future__ import annotations\n\nfrom typing import Any, TYPE_CHECKING\n\nfrom ...mode import Mode\n\nif TYPE_CHECKING:\n    from xai_sdk import chat as xchat\nelse:\n    try:\n        from xai_sdk import chat as xchat\n    except ImportError:\n        xchat = None\n\n\ndef _convert_messages(messages: list[dict[str, Any]]):\n    \"\"\"Convert OpenAI-style messages to xAI format.\"\"\"\n    if xchat is None:\n        from ...core.exceptions import ConfigurationError\n\n        raise ConfigurationError(\n            \"The xAI provider needs the optional dependency `xai-sdk`. \"\n            'Install it with `uv pip install \"instructor[xai]\"` (or `pip install \"instructor[xai]\"`). '\n            \"Note: xai-sdk requires Python 3.10+.\"\n        ) from None\n\n    converted = []\n    for m in messages:\n        role = m[\"role\"]\n        content = m.get(\"content\", \"\")\n        if isinstance(content, str):\n            c = xchat.text(content)\n        else:\n            raise ValueError(\"Only string content supported for xAI provider\")\n        if role == \"user\":\n            converted.append(xchat.user(c))\n        elif role == \"assistant\":\n            converted.append(xchat.assistant(c))\n        elif role == \"system\":\n            converted.append(xchat.system(c))\n        elif role == \"tool\":\n            converted.append(xchat.tool_result(content))\n        else:\n            raise ValueError(f\"Unsupported role: {role}\")\n    return converted\n\n\ndef reask_xai_json(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for xAI JSON mode when validation fails.\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (appends user message requesting correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n    reask_msg = {\n        \"role\": \"user\",\n        \"content\": f\"Validation Errors found:\\n{exception}\\nRecall the function correctly, fix the errors found in the following attempt:\\n{response}\",\n    }\n    kwargs[\"messages\"].append(reask_msg)\n    return kwargs\n\n\ndef reask_xai_tools(\n    kwargs: dict[str, Any],\n    response: Any,\n    exception: Exception,\n):\n    \"\"\"\n    Handle reask for xAI tools mode when validation fails.\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (appends assistant and user messages for tool correction)\n    \"\"\"\n    kwargs = kwargs.copy()\n\n    # Add assistant response to conversation history\n    assistant_msg = {\n        \"role\": \"assistant\",\n        \"content\": str(response),\n    }\n    kwargs[\"messages\"].append(assistant_msg)\n\n    # Add user correction request\n    reask_msg = {\n        \"role\": \"user\",\n        \"content\": f\"Validation Error found:\\n{exception}\\nRecall the function correctly, fix the errors\",\n    }\n    kwargs[\"messages\"].append(reask_msg)\n    return kwargs\n\n\ndef handle_xai_json(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle xAI JSON mode.\n\n    When response_model is None:\n        - Converts messages from OpenAI format to xAI format\n        - No schema is added to the request\n\n    When response_model is provided:\n        - Converts messages from OpenAI format to xAI format\n        - Sets up the model for JSON parsing mode\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (converts from OpenAI to xAI format)\n    - Removes: instructor-specific kwargs (max_retries, validation_context, context, hooks)\n    \"\"\"\n    # Convert messages to xAI format\n    messages = new_kwargs.get(\"messages\", [])\n    new_kwargs[\"x_messages\"] = _convert_messages(messages)\n\n    # Remove instructor-specific kwargs that xAI doesn't support\n    new_kwargs.pop(\"max_retries\", None)\n    new_kwargs.pop(\"validation_context\", None)\n    new_kwargs.pop(\"context\", None)\n    new_kwargs.pop(\"hooks\", None)\n\n    return response_model, new_kwargs\n\n\ndef handle_xai_tools(\n    response_model: type[Any] | None, new_kwargs: dict[str, Any]\n) -> tuple[type[Any] | None, dict[str, Any]]:\n    \"\"\"\n    Handle xAI tools mode.\n\n    When response_model is None:\n        - Converts messages from OpenAI format to xAI format\n        - No tools are configured\n\n    When response_model is provided:\n        - Converts messages from OpenAI format to xAI format\n        - Sets up tool schema from the response model\n        - Configures tool choice for automatic tool selection\n\n    Kwargs modifications:\n    - Modifies: \"messages\" (converts from OpenAI to xAI format)\n    - Adds: \"tool\" (xAI tool schema) - only when response_model provided\n    - Removes: instructor-specific kwargs (max_retries, validation_context, context, hooks)\n    \"\"\"\n    # Convert messages to xAI format\n    messages = new_kwargs.get(\"messages\", [])\n    new_kwargs[\"x_messages\"] = _convert_messages(messages)\n\n    # Remove instructor-specific kwargs that xAI doesn't support\n    new_kwargs.pop(\"max_retries\", None)\n    new_kwargs.pop(\"validation_context\", None)\n    new_kwargs.pop(\"context\", None)\n    new_kwargs.pop(\"hooks\", None)\n\n    if response_model is not None and xchat is not None:\n        # Set up tool schema for structured output\n        new_kwargs[\"tool\"] = xchat.tool(\n            name=response_model.__name__,\n            description=response_model.__doc__ or \"\",\n            parameters=response_model.model_json_schema(),\n        )\n\n    return response_model, new_kwargs\n\n\n# Handler registry for xAI\nXAI_HANDLERS = {\n    Mode.XAI_JSON: {\n        \"reask\": reask_xai_json,\n        \"response\": handle_xai_json,\n    },\n    Mode.XAI_TOOLS: {\n        \"reask\": reask_xai_tools,\n        \"response\": handle_xai_tools,\n    },\n}\n"
  },
  {
    "path": "instructor/py.typed",
    "content": ""
  },
  {
    "path": "instructor/templating.py",
    "content": "# type: ignore[all]\nfrom __future__ import annotations\nfrom typing import Any\nfrom textwrap import dedent\nfrom instructor.mode import Mode\nfrom jinja2.sandbox import SandboxedEnvironment\n\n\ndef apply_template(text: str, context: dict[str, Any]) -> str:\n    \"\"\"Apply Jinja2 template to the given text.\"\"\"\n    return dedent(SandboxedEnvironment().from_string(text).render(**context))\n\n\ndef process_message(\n    message: dict[str, Any], context: dict[str, Any], mode: Mode\n) -> dict[str, Any]:\n    \"\"\"Process a single message, applying templates to its content.\"\"\"\n    if mode in {Mode.GENAI_TOOLS, Mode.GENAI_STRUCTURED_OUTPUTS}:\n        from google.genai import types\n\n        return types.Content(\n            role=message.role,\n            parts=[\n                (\n                    types.Part.from_text(text=apply_template(part.text, context))\n                    if hasattr(part, \"text\")\n                    else part\n                )\n                for part in message.parts\n            ],\n        )\n\n    # VertexAI Support\n    if (\n        hasattr(message, \"parts\")\n        and isinstance(message.parts, list)\n        and len(message.parts) > 0\n        and not isinstance(message.parts[0], str)\n    ):\n        import vertexai.generative_models as gm\n\n        return gm.Content(\n            role=message.role,\n            parts=[\n                (\n                    gm.Part.from_text(apply_template(part.text, context))\n                    if hasattr(part, \"text\")\n                    else part\n                )\n                for part in message.parts\n            ],\n        )\n\n    # OpenAI format\n    if isinstance(message.get(\"content\"), str):\n        message[\"content\"] = apply_template(message[\"content\"], context)\n        return message\n\n    # Anthropic format\n    if isinstance(message.get(\"content\"), list):\n        for part in message[\"content\"]:\n            if (\n                isinstance(part, dict)\n                and part.get(\"type\") == \"text\"\n                and isinstance(part.get(\"text\"), str)\n            ):\n                part[\"text\"] = apply_template(part[\"text\"], context)\n        return message\n\n    # Gemini Support\n    if isinstance(message.get(\"parts\"), list):\n        message[\"parts\"] = [\n            apply_template(part, context) if isinstance(part, str) else part\n            for part in message[\"parts\"]\n        ]\n        return message\n\n    # Cohere format\n    if isinstance(message.get(\"message\"), str):\n        message[\"message\"] = apply_template(message[\"message\"], context)\n        return message\n\n\ndef handle_templating(\n    kwargs: dict[str, Any], mode: Mode, context: dict[str, Any] | None = None\n) -> dict[str, Any]:\n    \"\"\"\n    Handle templating for messages using the provided context.\n\n    This function processes messages, applying Jinja2 templating to their content\n    using the provided context. It supports various message formats including\n    OpenAI, Anthropic, Cohere, VertexAI, and Gemini.\n\n    Args:\n        kwargs (Dict[str, Any]): Keyword arguments being passed to the create method.\n        context (Dict[str, Any] | None, optional): Variables to use in templating. Defaults to None.\n\n    Returns:\n        Dict[str, Any]: The processed kwargs with templated content.\n\n    Raises:\n        ValueError: If no recognized message format is found in kwargs.\n    \"\"\"\n    if not context:\n        return kwargs\n\n    new_kwargs = kwargs.copy()\n\n    # Handle Cohere's message field\n    if \"message\" in new_kwargs:\n        new_kwargs[\"message\"] = apply_template(new_kwargs[\"message\"], context)\n        new_kwargs[\"chat_history\"] = [\n            process_message(message, context, mode)\n            for message in new_kwargs[\"chat_history\"]\n        ]\n\n        return new_kwargs\n\n    if isinstance(new_kwargs, list):\n        messages = new_kwargs\n        if not messages:\n            return\n    elif isinstance(new_kwargs, dict):\n        messages = new_kwargs.get(\"messages\") or new_kwargs.get(\"contents\")\n\n    if not messages:\n        return\n\n    if \"messages\" in new_kwargs:\n        new_kwargs[\"messages\"] = [\n            process_message(message, context, mode) for message in messages\n        ]\n\n    elif \"contents\" in new_kwargs:\n        new_kwargs[\"contents\"] = [\n            process_message(content, context, mode)\n            for content in new_kwargs[\"contents\"]\n        ]\n\n    return new_kwargs\n"
  },
  {
    "path": "instructor/utils/__init__.py",
    "content": "\"\"\"Utility modules for instructor library.\n\nThis package contains utility functions organized by provider and functionality.\n\"\"\"\n\n# Re-export everything from core\nfrom .core import (\n    extract_json_from_codeblock,\n    extract_json_from_stream,\n    extract_json_from_stream_async,\n    update_total_usage,\n    dump_message,\n    is_async,\n    merge_consecutive_messages,\n    classproperty,\n    get_message_content,\n    disable_pydantic_error_url,\n    is_typed_dict,\n    is_simple_type,\n    prepare_response_model,\n)\n\n# Re-export from providers\nfrom .providers import Provider, get_provider\n\n__all__ = [\n    # Core functions\n    \"extract_json_from_codeblock\",\n    \"extract_json_from_stream\",\n    \"extract_json_from_stream_async\",\n    \"update_total_usage\",\n    \"dump_message\",\n    \"is_async\",\n    \"merge_consecutive_messages\",\n    \"classproperty\",\n    \"get_message_content\",\n    \"disable_pydantic_error_url\",\n    \"is_typed_dict\",\n    \"is_simple_type\",\n    \"prepare_response_model\",\n    # Provider functions\n    \"Provider\",\n    \"get_provider\",\n    # Gemini utils\n    \"transform_to_gemini_prompt\",\n    \"verify_no_unions\",\n    \"map_to_gemini_function_schema\",\n    \"update_genai_kwargs\",\n    \"update_gemini_kwargs\",\n    \"extract_genai_system_message\",\n    \"convert_to_genai_messages\",\n    # Anthropic utils\n    \"SystemMessage\",\n    \"combine_system_messages\",\n    \"extract_system_messages\",\n]\n\n\n# Lazy imports for backward compatibility to avoid circular imports\ndef __getattr__(name):\n    # Gemini utils\n    if name in [\n        \"transform_to_gemini_prompt\",\n        \"verify_no_unions\",\n        \"map_to_gemini_function_schema\",\n        \"update_genai_kwargs\",\n        \"update_gemini_kwargs\",\n        \"extract_genai_system_message\",\n        \"convert_to_genai_messages\",\n    ]:\n        from ..providers.gemini import utils as gemini_utils\n\n        return getattr(gemini_utils, name)\n\n    # Anthropic utils\n    if name in [\n        \"SystemMessage\",\n        \"combine_system_messages\",\n        \"extract_system_messages\",\n    ]:\n        from ..providers.anthropic import utils as anthropic_utils\n\n        return getattr(anthropic_utils, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "instructor/utils/core.py",
    "content": "\"\"\"Core utilities for instructor library.\n\nThis module contains generic utility functions that are not provider-specific.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport inspect\nimport json\nimport logging\nfrom collections.abc import AsyncGenerator, Generator, Iterable\nfrom typing import (\n    TYPE_CHECKING,\n    Any,\n    Callable,\n    Generic,\n    Union,\n    TypeVar,\n    cast,\n    get_args,\n    get_origin,\n)\n\nfrom openai.types import CompletionUsage as OpenAIUsage\nfrom openai.types.chat import (\n    ChatCompletion,\n    ChatCompletionMessage,\n    ChatCompletionMessageParam,\n)\nfrom pydantic import BaseModel, ValidationError, create_model\n\n# Avoid circular import - these will be imported where needed\n\nif TYPE_CHECKING:\n    from anthropic.types import Usage as AnthropicUsage\n\nlogger = logging.getLogger(\"instructor\")\nR_co = TypeVar(\"R_co\", covariant=True)\nT_Model = TypeVar(\"T_Model\", bound=BaseModel)\nT = TypeVar(\"T\")\n\n\ndef extract_json_from_codeblock(content: str) -> str:\n    \"\"\"\n    Extract JSON from a string that may contain extra text.\n\n    The function looks for the first '{' and the last '}' in the string and\n    returns the content between them, inclusive. If no braces are found,\n    the original string is returned.\n\n    Args:\n        content: The string that may contain JSON\n\n    Returns:\n        The extracted JSON string\n    \"\"\"\n\n    first_brace = content.find(\"{\")\n    last_brace = content.rfind(\"}\")\n    if first_brace != -1 and last_brace != -1:\n        json_content = content[first_brace : last_brace + 1]\n    else:\n        json_content = content  # Return as is if no JSON-like content found\n\n    return json_content\n\n\ndef extract_json_from_stream(\n    chunks: Iterable[str],\n) -> Generator[str, None, None]:\n    \"\"\"\n    Extract JSON from a stream of chunks, handling JSON in code blocks.\n\n    This optimized version extracts JSON from markdown code blocks or plain JSON\n    by implementing a state machine approach.\n\n    The state machine tracks several states:\n    - Whether we're inside a code block (```json ... ```)\n    - Whether we've started tracking a JSON object\n    - Whether we're inside a string literal\n    - The stack of open braces to properly identify the JSON structure\n\n    Args:\n        chunks: An iterable of string chunks\n\n    Yields:\n        Characters within the JSON object\n    \"\"\"\n    # State flags\n    in_codeblock = False\n    codeblock_delimiter_count = 0\n    json_started = False\n    in_string = False\n    escape_next = False\n    brace_stack = []\n    buffer = []\n\n    # Track potential codeblock start/end\n    codeblock_buffer = []\n\n    for chunk in chunks:\n        for char in chunk:\n            # Track codeblock delimiters (```)\n            if not in_codeblock and char == \"`\":\n                codeblock_buffer.append(char)\n                if len(codeblock_buffer) == 3:\n                    in_codeblock = True\n                    codeblock_delimiter_count = 0\n                    codeblock_buffer = []\n                continue\n            elif len(codeblock_buffer) > 0 and char != \"`\":\n                # Reset if we see something other than backticks\n                codeblock_buffer = []\n\n            # If we're in a codeblock but haven't started JSON yet\n            if in_codeblock and not json_started:\n                # Track end of codeblock\n                if char == \"`\":\n                    codeblock_delimiter_count += 1\n                    if codeblock_delimiter_count == 3:\n                        in_codeblock = False\n                        codeblock_delimiter_count = 0\n                    continue\n                elif codeblock_delimiter_count > 0:\n                    codeblock_delimiter_count = (\n                        0  # Reset if we see something other than backticks\n                    )\n\n                # Look for the start of JSON\n                if char == \"{\":\n                    json_started = True\n                    brace_stack.append(\"{\")\n                    buffer.append(char)\n                # Skip other characters until we find the start of JSON\n                continue\n\n            # If we've started tracking JSON\n            if json_started:\n                # Handle string literals and escaped characters\n                if char == '\"' and not escape_next:\n                    in_string = not in_string\n                elif char == \"\\\\\" and in_string:\n                    escape_next = True\n                    buffer.append(char)\n                    continue\n                else:\n                    escape_next = False\n\n                # Track end of codeblock if we're in one\n                if in_codeblock and not in_string:\n                    if char == \"`\":\n                        codeblock_delimiter_count += 1\n                        if codeblock_delimiter_count == 3:\n                            # End of codeblock means end of JSON\n                            in_codeblock = False\n                            # Yield the buffer without the closing backticks\n                            for c in buffer:\n                                yield c\n                            buffer = []\n                            json_started = False\n                            break\n                        continue\n                    elif codeblock_delimiter_count > 0:\n                        codeblock_delimiter_count = 0\n\n                # Track braces when not in a string\n                if not in_string:\n                    if char == \"{\":\n                        brace_stack.append(\"{\")\n                    elif char == \"}\" and brace_stack:\n                        brace_stack.pop()\n                        # If we've completed a JSON object, yield its characters\n                        if not brace_stack:\n                            buffer.append(char)\n                            for c in buffer:\n                                yield c\n                            buffer = []\n                            json_started = False\n                            break\n\n                # Add character to buffer\n                buffer.append(char)\n                continue\n\n            # If we're not in a codeblock and haven't started JSON, look for standalone JSON\n            if not in_codeblock and not json_started and char == \"{\":\n                json_started = True\n                brace_stack.append(\"{\")\n                buffer.append(char)\n\n    # Yield any remaining buffer content if we have valid JSON\n    if json_started and buffer:\n        for c in buffer:\n            yield c\n\n\nasync def extract_json_from_stream_async(\n    chunks: AsyncGenerator[str, None],\n) -> AsyncGenerator[str, None]:\n    \"\"\"\n    Extract JSON from an async stream of chunks, handling JSON in code blocks.\n\n    This optimized version extracts JSON from markdown code blocks or plain JSON\n    by implementing a state machine approach.\n\n    The state machine tracks several states:\n    - Whether we're inside a code block (```json ... ```)\n    - Whether we've started tracking a JSON object\n    - Whether we're inside a string literal\n    - The stack of open braces to properly identify the JSON structure\n\n    Args:\n        chunks: An async generator yielding string chunks\n\n    Yields:\n        Characters within the JSON object\n    \"\"\"\n    # State flags\n    in_codeblock = False\n    codeblock_delimiter_count = 0\n    json_started = False\n    in_string = False\n    escape_next = False\n    brace_stack = []\n    buffer = []\n\n    # Track potential codeblock start/end\n    codeblock_buffer = []\n\n    async for chunk in chunks:\n        for char in chunk:\n            # Track codeblock delimiters (```)\n            if not in_codeblock and char == \"`\":\n                codeblock_buffer.append(char)\n                if len(codeblock_buffer) == 3:\n                    in_codeblock = True\n                    codeblock_delimiter_count = 0\n                    codeblock_buffer = []\n                continue\n            elif len(codeblock_buffer) > 0 and char != \"`\":\n                # Reset if we see something other than backticks\n                codeblock_buffer = []\n\n            # If we're in a codeblock but haven't started JSON yet\n            if in_codeblock and not json_started:\n                # Track end of codeblock\n                if char == \"`\":\n                    codeblock_delimiter_count += 1\n                    if codeblock_delimiter_count == 3:\n                        in_codeblock = False\n                        codeblock_delimiter_count = 0\n                    continue\n                elif codeblock_delimiter_count > 0:\n                    codeblock_delimiter_count = (\n                        0  # Reset if we see something other than backticks\n                    )\n\n                # Look for the start of JSON\n                if char == \"{\":\n                    json_started = True\n                    brace_stack.append(\"{\")\n                    buffer.append(char)\n                # Skip other characters until we find the start of JSON\n                continue\n\n            # If we've started tracking JSON\n            if json_started:\n                # Handle string literals and escaped characters\n                if char == '\"' and not escape_next:\n                    in_string = not in_string\n                elif char == \"\\\\\" and in_string:\n                    escape_next = True\n                    buffer.append(char)\n                    continue\n                else:\n                    escape_next = False\n\n                # Track end of codeblock if we're in one\n                if in_codeblock and not in_string:\n                    if char == \"`\":\n                        codeblock_delimiter_count += 1\n                        if codeblock_delimiter_count == 3:\n                            # End of codeblock means end of JSON\n                            in_codeblock = False\n                            # Yield the buffer without the closing backticks\n                            for c in buffer:\n                                yield c\n                            buffer = []\n                            json_started = False\n                            break\n                        continue\n                    elif codeblock_delimiter_count > 0:\n                        codeblock_delimiter_count = 0\n\n                # Track braces when not in a string\n                if not in_string:\n                    if char == \"{\":\n                        brace_stack.append(\"{\")\n                    elif char == \"}\" and brace_stack:\n                        brace_stack.pop()\n                        # If we've completed a JSON object, yield its characters\n                        if not brace_stack:\n                            buffer.append(char)\n                            for c in buffer:\n                                yield c\n                            buffer = []\n                            json_started = False\n                            break\n\n                # Add character to buffer\n                buffer.append(char)\n                continue\n\n            # If we're not in a codeblock and haven't started JSON, look for standalone JSON\n            if not in_codeblock and not json_started and char == \"{\":\n                json_started = True\n                brace_stack.append(\"{\")\n                buffer.append(char)\n\n    # Yield any remaining buffer content if we have valid JSON\n    if json_started and buffer:\n        for c in buffer:\n            yield c\n\n\ndef update_total_usage(\n    response: T_Model | None,\n    total_usage: OpenAIUsage | AnthropicUsage,\n) -> T_Model | ChatCompletion | None:\n    if response is None:\n        return None\n\n    response_usage = getattr(response, \"usage\", None)\n    if isinstance(response_usage, OpenAIUsage) and isinstance(total_usage, OpenAIUsage):\n        total_usage.completion_tokens += response_usage.completion_tokens or 0\n        total_usage.prompt_tokens += response_usage.prompt_tokens or 0\n        total_usage.total_tokens += response_usage.total_tokens or 0\n        if (rtd := response_usage.completion_tokens_details) and (\n            ttd := total_usage.completion_tokens_details\n        ):\n            ttd.audio_tokens = (ttd.audio_tokens or 0) + (rtd.audio_tokens or 0)\n            ttd.reasoning_tokens = (ttd.reasoning_tokens or 0) + (\n                rtd.reasoning_tokens or 0\n            )\n        if (rpd := response_usage.prompt_tokens_details) and (\n            tpd := total_usage.prompt_tokens_details\n        ):\n            tpd.audio_tokens = (tpd.audio_tokens or 0) + (rpd.audio_tokens or 0)\n            tpd.cached_tokens = (tpd.cached_tokens or 0) + (rpd.cached_tokens or 0)\n        response.usage = total_usage  # type: ignore  # Replace each response usage with the total usage\n        return response\n\n    # Anthropic usage.\n    try:\n        from anthropic.types import Usage as AnthropicUsage\n\n        if isinstance(response_usage, AnthropicUsage) and isinstance(\n            total_usage, AnthropicUsage\n        ):\n            if not total_usage.cache_creation_input_tokens:\n                total_usage.cache_creation_input_tokens = 0\n\n            if not total_usage.cache_read_input_tokens:\n                total_usage.cache_read_input_tokens = 0\n\n            total_usage.input_tokens += response_usage.input_tokens or 0\n            total_usage.output_tokens += response_usage.output_tokens or 0\n            total_usage.cache_creation_input_tokens += (\n                response_usage.cache_creation_input_tokens or 0\n            )\n            total_usage.cache_read_input_tokens += (\n                response_usage.cache_read_input_tokens or 0\n            )\n            response.usage = total_usage  # type: ignore\n            return response\n    except ImportError:\n        pass\n\n    logger.debug(\"No compatible response.usage found, token usage not updated.\")\n    return response\n\n\ndef dump_message(message: ChatCompletionMessage) -> ChatCompletionMessageParam:\n    \"\"\"Dumps a message to a dict, to be returned to the OpenAI API.\n    Workaround for an issue with the OpenAI API, where the `tool_calls` field isn't allowed to be present in requests\n    if it isn't used.\n    \"\"\"\n    ret: ChatCompletionMessageParam = {\n        \"role\": message.role,\n        \"content\": message.content or \"\",\n    }\n    if hasattr(message, \"tool_calls\") and message.tool_calls is not None:\n        ret[\"tool_calls\"] = message.model_dump()[\"tool_calls\"]\n    if (\n        hasattr(message, \"function_call\")\n        and message.function_call is not None\n        and ret[\"content\"]\n    ):\n        if not isinstance(ret[\"content\"], str):\n            response_message: str = \"\"\n            for content_message in ret[\"content\"]:\n                if isinstance(content_message, dict):\n                    # Use get() to safely access values\n                    message_type = content_message.get(\"type\")\n                    if message_type == \"text\":\n                        text_content = content_message.get(\"text\", \"\")\n                        response_message += text_content\n                    elif message_type == \"refusal\":\n                        refusal_content = content_message.get(\"refusal\", \"\")\n                        response_message += refusal_content\n            ret[\"content\"] = response_message\n        ret[\"content\"] += json.dumps(message.model_dump()[\"function_call\"])\n    return ret\n\n\ndef is_async(func: Callable[..., Any]) -> bool:\n    \"\"\"Returns true if the callable is async, accounting for wrapped callables\"\"\"\n    is_coroutine = inspect.iscoroutinefunction(func)\n    while hasattr(func, \"__wrapped__\"):\n        func = func.__wrapped__  # type: ignore - dynamic\n        is_coroutine = is_coroutine or inspect.iscoroutinefunction(func)\n    return is_coroutine\n\n\ndef merge_consecutive_messages(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:\n    \"\"\"\n    Merge consecutive messages from the same role into a single message.\n\n    This optimized version pre-allocates the result list and minimizes operations.\n\n    Args:\n        messages: List of message dictionaries to merge\n\n    Returns:\n        List of merged message dictionaries\n    \"\"\"\n    if not messages:\n        return []\n\n    # Pre-allocate result list with estimated size (worst case: no merges happen)\n    message_count = len(messages)\n    new_messages = []\n\n    # Detect whether all messages have a flat content (i.e. all string)\n    # Some providers require content to be a string, so we need to check that and behave accordingly\n    # Fast path: avoid checking all messages if the first few have mixed content types\n    flat_string = True\n    for _i, m in enumerate(messages[: min(10, message_count)]):\n        if not isinstance(m.get(\"content\", \"\"), str):\n            flat_string = False\n            break\n\n    # Only check all messages if we haven't determined it's not flat_string\n    if flat_string and message_count > 10:\n        flat_string = all(isinstance(m.get(\"content\", \"\"), str) for m in messages[10:])\n\n    # Process messages with a single loop\n    for message in messages:\n        role = message.get(\"role\", \"user\")\n        new_content = message.get(\"content\", \"\")\n\n        # Transform string content to list if needed\n        if not flat_string and isinstance(new_content, str):\n            new_content = [{\"type\": \"text\", \"text\": new_content}]\n\n        # Check if we can merge with previous message\n        if new_messages and role == new_messages[-1][\"role\"]:\n            if flat_string:\n                # Fast path for string content\n                new_messages[-1][\"content\"] += f\"\\n\\n{new_content}\"\n            else:\n                # Fast path for list content\n                if isinstance(new_content, list):\n                    new_messages[-1][\"content\"].extend(new_content)\n                else:\n                    # Fallback for unexpected content type\n                    new_messages[-1][\"content\"].append(new_content)\n        else:\n            # Add new message\n            new_messages.append({\"role\": role, \"content\": new_content})\n\n    return new_messages\n\n\nclass classproperty(Generic[R_co]):\n    \"\"\"Descriptor for class-level properties.\n\n    Examples:\n        >>> from instructor.utils import classproperty\n\n        >>> class MyClass:\n        ...     @classproperty\n        ...     def my_property(cls):\n        ...         return cls\n\n        >>> assert MyClass.my_property\n    \"\"\"\n\n    def __init__(self, method: Callable[[Any], R_co]) -> None:\n        self.cproperty = method\n\n    def __get__(self, instance: object, cls: type[Any]) -> R_co:\n        return self.cproperty(cls)\n\n\ndef get_message_content(message: ChatCompletionMessageParam) -> list[Any]:\n    \"\"\"\n    Extract content from a message and ensure it's returned as a list.\n\n    This optimized version handles different message formats more efficiently.\n\n    Args:\n        message: A message in ChatCompletionMessageParam format\n\n    Returns:\n        The message content as a list\n    \"\"\"\n    # Fast path for empty message\n    if not message:\n        return [\"\"]\n\n    # Get content with default empty string\n    content = message.get(\"content\", \"\")\n\n    # Fast path for common content types\n    if isinstance(content, list):\n        return content if content else [\"\"]\n\n    # Return single item list with content (could be string, None, or other)\n    return [content if content is not None else \"\"]\n\n\ndef disable_pydantic_error_url():\n    \"\"\"Disable URLs in Pydantic ValidationError messages.\n\n    This function monkey-patches Pydantic's ValidationError.__str__ method\n    to prevent URLs from being included in error messages. This is necessary\n    because Pydantic reads the PYDANTIC_ERRORS_INCLUDE_URL environment variable\n    at import time, not at validation time, so setting it later has no effect.\n\n    The function works by storing the original __str__ method and replacing it\n    with a version that filters out URLs from the error message.\n    \"\"\"\n    # Store the original __str__ method if not already stored\n    if not hasattr(ValidationError, \"_original_str\"):\n        ValidationError._original_str = ValidationError.__str__  # type: ignore\n\n    # Create a new __str__ method that excludes URLs\n    def __str__(self):  # type: ignore\n        output = ValidationError._original_str(self)  # type: ignore\n        # Remove error_url from the error details to prevent URL inclusion\n        # This removes the (error_code=..., input=..., ctx={...}) parts that include URLs\n        lines = []\n        for line in output.split(\"\\n\"):\n            # Skip lines that contain URLs or error documentation links\n            if \"https://errors.pydantic.dev\" not in line:\n                lines.append(line)\n        return \"\\n\".join(lines)\n\n    # Replace the __str__ method\n    ValidationError.__str__ = __str__  # type: ignore\n\n\ndef is_typed_dict(cls) -> bool:\n    return (\n        isinstance(cls, type)\n        and issubclass(cls, dict)\n        and hasattr(cls, \"__annotations__\")\n    )\n\n\ndef is_simple_type(typehint: type[T]) -> bool:\n    \"\"\"Check if a type is a simple type that can be adapted.\"\"\"\n    from instructor.dsl.simple_type import is_simple_type as _is_simple_type\n\n    return _is_simple_type(typehint)\n\n\ndef prepare_response_model(response_model: type[T] | None) -> type[T] | None:\n    \"\"\"\n    Prepares the response model for use in the API call.\n\n    This function performs several transformations on the input response_model:\n    1. If the response_model is None, it returns None.\n    2. If it's a simple type, it wraps it in a ModelAdapter.\n    3. If it's a TypedDict, it converts it to a Pydantic BaseModel.\n    4. If it's an Iterable, it wraps the element type in an IterableModel.\n    5. If it's not already a subclass of OpenAISchema, it applies the openai_schema decorator.\n\n    Args:\n        response_model (type[T] | None): The input response model to be prepared.\n\n    Returns:\n        type[T] | None: The prepared response model, or None if the input was None.\n    \"\"\"\n    if response_model is None:\n        return None\n\n    origin = get_origin(response_model)\n\n    # For `list[int | str]` and other scalar lists, keep the simple-type adapter path.\n    # However, for `list[User]` (or `list[Union[User, Other]]`) we want IterableModel.\n    if origin is list and is_simple_type(response_model):\n        args = get_args(response_model)\n        inner = args[0] if args else None\n\n        def _is_model_type(t: Any) -> bool:\n            if inspect.isclass(t) and issubclass(t, BaseModel):\n                return True\n            return get_origin(t) is Union and all(\n                inspect.isclass(m) and issubclass(m, BaseModel) for m in get_args(t)\n            )\n\n        if inner is not None and _is_model_type(inner):\n            # Treat as structured iterable extraction.\n            origin = list\n        else:\n            from instructor.dsl.simple_type import ModelAdapter\n\n            # Avoid `ModelAdapter[response_model]` so type checkers don't treat this\n            # as a type expression. This is a runtime wrapper.\n            response_model = ModelAdapter.__class_getitem__(response_model)  # type: ignore[arg-type]\n            origin = get_origin(response_model)\n\n    # Convert TypedDict -> BaseModel\n    if is_typed_dict(response_model):\n        model_name = getattr(response_model, \"__name__\", \"TypedDictModel\")\n        annotations = getattr(response_model, \"__annotations__\", {})\n        response_model = cast(\n            type[BaseModel],\n            create_model(\n                model_name,\n                **{k: (v, ...) for k, v in annotations.items()},\n            ),\n        )\n\n    # Convert Iterable[T] or list[T] (where T is a model) -> IterableModel(T)\n    origin = get_origin(response_model)\n    if origin in {Iterable, list}:\n        from instructor.dsl.iterable import IterableModel\n\n        args = get_args(response_model)\n        if not args or args[0] is None:\n            raise ValueError(\n                \"response_model must be parameterized, e.g. list[User] or Iterable[User]\"\n            )\n        iterable_element_class = args[0]\n        if is_typed_dict(iterable_element_class):\n            iterable_element_class = cast(\n                type[BaseModel],\n                create_model(\n                    getattr(iterable_element_class, \"__name__\", \"TypedDictModel\"),\n                    **{\n                        k: (v, ...)\n                        for k, v in getattr(\n                            iterable_element_class, \"__annotations__\", {}\n                        ).items()\n                    },\n                ),\n            )\n        response_model = IterableModel(cast(type[BaseModel], iterable_element_class))\n\n    if is_simple_type(response_model):\n        from instructor.dsl.simple_type import ModelAdapter\n\n        # Avoid `ModelAdapter[response_model]` so type checkers don't treat this as\n        # a type expression. This is a runtime wrapper.\n        response_model = ModelAdapter.__class_getitem__(response_model)  # type: ignore[arg-type]\n\n    # Import here to avoid circular dependency\n    from ..processing.function_calls import OpenAISchema, openai_schema\n\n    # response_model is guaranteed to be a type at this point due to earlier checks\n    if inspect.isclass(response_model) and not issubclass(response_model, OpenAISchema):\n        response_model = openai_schema(response_model)  # type: ignore\n    elif not inspect.isclass(response_model):\n        response_model = openai_schema(response_model)  # type: ignore\n\n    return response_model\n"
  },
  {
    "path": "instructor/utils/providers.py",
    "content": "\"\"\"Provider detection and registry utilities.\n\nThis module contains provider-related enums and detection logic.\n\"\"\"\n\nfrom enum import Enum\n\n\nclass Provider(Enum):\n    OPENAI = \"openai\"\n    VERTEXAI = \"vertexai\"\n    ANTHROPIC = \"anthropic\"\n    ANYSCALE = \"anyscale\"\n    TOGETHER = \"together\"\n    GROQ = \"groq\"\n    MISTRAL = \"mistral\"\n    COHERE = \"cohere\"\n    GEMINI = \"gemini\"\n    GENAI = \"genai\"\n    DATABRICKS = \"databricks\"\n    CEREBRAS = \"cerebras\"\n    DEEPSEEK = \"deepseek\"\n    FIREWORKS = \"fireworks\"\n    WRITER = \"writer\"\n    XAI = \"xai\"\n    UNKNOWN = \"unknown\"\n    BEDROCK = \"bedrock\"\n    PERPLEXITY = \"perplexity\"\n    OPENROUTER = \"openrouter\"\n\n\ndef get_provider(base_url: str) -> Provider:\n    \"\"\"\n    Detect the provider based on the base URL.\n\n    Args:\n        base_url: The base URL to analyze\n\n    Returns:\n        Provider: The detected provider enum value\n    \"\"\"\n    if \"anyscale\" in str(base_url):\n        return Provider.ANYSCALE\n    elif \"together\" in str(base_url):\n        return Provider.TOGETHER\n    elif \"anthropic\" in str(base_url):\n        return Provider.ANTHROPIC\n    elif \"cerebras\" in str(base_url):\n        return Provider.CEREBRAS\n    elif \"fireworks\" in str(base_url):\n        return Provider.FIREWORKS\n    elif \"groq\" in str(base_url):\n        return Provider.GROQ\n    elif \"openai\" in str(base_url):\n        return Provider.OPENAI\n    elif \"mistral\" in str(base_url):\n        return Provider.MISTRAL\n    elif \"cohere\" in str(base_url):\n        return Provider.COHERE\n    elif \"gemini\" in str(base_url):\n        return Provider.GEMINI\n    elif \"databricks\" in str(base_url):\n        return Provider.DATABRICKS\n    elif \"deepseek\" in str(base_url):\n        return Provider.DEEPSEEK\n    elif \"vertexai\" in str(base_url):\n        return Provider.VERTEXAI\n    elif \"writer\" in str(base_url):\n        return Provider.WRITER\n    elif \"perplexity\" in str(base_url):\n        return Provider.PERPLEXITY\n    elif \"x.ai\" in str(base_url) or \"xai\" in str(base_url):\n        return Provider.XAI\n    elif \"openrouter\" in str(base_url):\n        return Provider.OPENROUTER\n    return Provider.UNKNOWN\n"
  },
  {
    "path": "instructor/validation/__init__.py",
    "content": "\"\"\"Validation components for instructor.\"\"\"\n\nfrom .async_validators import (\n    AsyncValidationContext,\n    async_field_validator,\n    async_model_validator,\n    ASYNC_VALIDATOR_KEY,\n    ASYNC_MODEL_VALIDATOR_KEY,\n)\nfrom ..core.exceptions import AsyncValidationError\nfrom .llm_validators import Validator, llm_validator, openai_moderation\n\n__all__ = [\n    \"AsyncValidationContext\",\n    \"AsyncValidationError\",\n    \"async_field_validator\",\n    \"async_model_validator\",\n    \"ASYNC_VALIDATOR_KEY\",\n    \"ASYNC_MODEL_VALIDATOR_KEY\",\n    \"Validator\",\n    \"llm_validator\",\n    \"openai_moderation\",\n]\n"
  },
  {
    "path": "instructor/validation/async_validators.py",
    "content": "from typing import Callable, Any, TypeVar\nfrom inspect import signature\nfrom pydantic import ValidationInfo\n\n\nASYNC_VALIDATOR_KEY = \"__async_validator__\"\nASYNC_MODEL_VALIDATOR_KEY = \"__async_model_validator__\"\nT = TypeVar(\"T\", bound=Callable[..., Any])\n\n\nclass AsyncValidationContext:\n    context: dict[str, Any]\n\n    def __init__(self, context: dict[str, Any]):\n        self.context = context\n\n\ndef async_field_validator(field: str, *fields: str) -> Callable[[T], T]:\n    field_names = field, *fields\n\n    def decorator(func: T) -> T:\n        params = signature(func).parameters\n        requires_validation_context = False\n        if len(params) == 3:\n            if \"info\" not in params:\n                raise ValueError(\n                    \"Async validator can only have a value parameter and an optional info parameter\"\n                )\n            if params[\"info\"].annotation != ValidationInfo:\n                raise ValueError(\n                    \"Async validator info parameter must be of type ValidationInfo\"\n                )\n            requires_validation_context = True\n\n        setattr(\n            func, ASYNC_VALIDATOR_KEY, (field_names, func, requires_validation_context)\n        )\n        return func\n\n    return decorator\n\n\ndef async_model_validator() -> Callable[[T], T]:\n    def decorator(func: T) -> T:\n        params = signature(func).parameters\n        requires_validation_context = False\n        if len(params) > 2:\n            raise ValueError(\"Invalid Parameter Count!\")\n\n        if len(params) == 2:\n            if \"info\" not in params:\n                raise ValueError(\n                    \"Async validator can only have a value parameter and an optional info parameter\"\n                )\n            if params[\"info\"].annotation != ValidationInfo:\n                raise ValueError(\n                    \"Async validator info parameter must be of type ValidationInfo\"\n                )\n            requires_validation_context = True\n\n        setattr(\n            func,\n            ASYNC_MODEL_VALIDATOR_KEY,\n            (func, requires_validation_context),\n        )\n        return func\n\n    return decorator\n"
  },
  {
    "path": "instructor/validation/llm_validators.py",
    "content": "from typing import Callable\n\nfrom openai import OpenAI\n\nfrom ..processing.validators import Validator\nfrom ..core.client import Instructor\n\n\ndef llm_validator(\n    statement: str,\n    client: Instructor,\n    allow_override: bool = False,\n    model: str = \"gpt-3.5-turbo\",\n    temperature: float = 0,\n) -> Callable[[str], str]:\n    \"\"\"\n    Create a validator that uses the LLM to validate an attribute\n\n    ## Usage\n\n    ```python\n    from instructor import llm_validator\n    from pydantic import BaseModel, Field, field_validator\n\n    class User(BaseModel):\n        name: str = Annotated[str, llm_validator(\"The name must be a full name all lowercase\")\n        age: int = Field(description=\"The age of the person\")\n\n    try:\n        user = User(name=\"Jason Liu\", age=20)\n    except ValidationError as e:\n        print(e)\n    ```\n\n    ```\n    1 validation error for User\n    name\n        The name is valid but not all lowercase (type=value_error.llm_validator)\n    ```\n\n    Note that there, the error message is written by the LLM, and the error type is `value_error.llm_validator`.\n\n    Parameters:\n        statement (str): The statement to validate\n        model (str): The LLM to use for validation (default: \"gpt-4o-mini\")\n        temperature (float): The temperature to use for the LLM (default: 0)\n        client (OpenAI): The OpenAI client to use (default: None)\n    \"\"\"\n\n    def llm(v: str) -> str:\n        resp = client.chat.completions.create(\n            response_model=Validator,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a world class validation model. Capable to determine if the following value is valid for the statement, if it is not, explain why and suggest a new value.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": f\"Does `{v}` follow the rules: {statement}\",\n                },\n            ],\n            model=model,\n            temperature=temperature,\n        )\n\n        # If the response is  not valid, return the reason, this could be used in\n        # the future to generate a better response, via reasking mechanism.\n        assert resp.is_valid, resp.reason\n\n        if allow_override and not resp.is_valid and resp.fixed_value is not None:\n            # If the value is not valid, but we allow override, return the fixed value\n            return resp.fixed_value\n        return v\n\n    return llm\n\n\ndef openai_moderation(client: OpenAI) -> Callable[[str], str]:\n    \"\"\"\n    Validates a message using OpenAI moderation model.\n\n    Should only be used for monitoring inputs and outputs of OpenAI APIs\n    Other use cases are disallowed as per:\n    https://platform.openai.com/docs/guides/moderation/overview\n\n    Example:\n    ```python\n    from instructor import OpenAIModeration\n\n    class Response(BaseModel):\n        message: Annotated[str, AfterValidator(OpenAIModeration(openai_client=client))]\n\n    Response(message=\"I hate you\")\n    ```\n\n    ```\n     ValidationError: 1 validation error for Response\n     message\n    Value error, `I hate you.` was flagged for ['harassment'] [type=value_error, input_value='I hate you.', input_type=str]\n    ```\n\n    client (OpenAI): The OpenAI client to use, must be sync (default: None)\n    \"\"\"\n\n    def validate_message_with_openai_mod(v: str) -> str:\n        response = client.moderations.create(input=v)\n        out = response.results[0]\n        cats = out.categories.model_dump()\n        if out.flagged:\n            raise ValueError(\n                f\"`{v}` was flagged for {', '.join(cat for cat in cats if cats[cat])}\"\n            )\n\n        return v\n\n    return validate_message_with_openai_mod\n"
  },
  {
    "path": "instructor/validators.py",
    "content": "\"\"\"Backwards compatibility module for instructor.validators.\n\nThis module provides lazy imports to maintain backwards compatibility.\n\"\"\"\n\nimport warnings\n\n\ndef __getattr__(name: str):\n    \"\"\"Lazy import to provide backward compatibility for validators imports.\"\"\"\n    warnings.warn(\n        f\"Importing from 'instructor.validators' is deprecated and will be removed in v2.0.0. \"\n        f\"Please update your imports to use the new location:\\n\"\n        \"  from instructor.validation import llm_validator, openai_moderation\",\n        DeprecationWarning,\n        stacklevel=2,\n    )\n\n    from . import validation\n    from .processing import validators as processing_validators\n\n    # Try validation module first\n    if hasattr(validation, name):\n        return getattr(validation, name)\n\n    # Then try processing.validators\n    if hasattr(processing_validators, name):\n        return getattr(processing_validators, name)\n\n    raise AttributeError(f\"module '{__name__}' has no attribute '{name}'\")\n"
  },
  {
    "path": "mkdocs.yml",
    "content": "site_name: Instructor\nsite_author: Jason Liu\nsite_description: A lightweight library for structured outputs with LLMs.\nrepo_name: instructor\nrepo_url: https://github.com/jxnl/instructor/\nsite_url: https://python.useinstructor.com/\nedit_uri: edit/main/docs/\ncopyright: Copyright &copy; 2024 Jason Liu\ntheme:\n  name: material\n  icon:\n    repo: fontawesome/brands/github\n    edit: material/pencil\n    view: material/eye\n    theme:\n    admonition:\n      note: octicons/tag-16\n      abstract: octicons/checklist-16\n      info: octicons/info-16\n      tip: octicons/squirrel-16\n      success: octicons/check-16\n      question: octicons/question-16\n      warning: octicons/alert-16\n      failure: octicons/x-circle-16\n      danger: octicons/zap-16\n      bug: octicons/bug-16\n      example: octicons/beaker-16\n      quote: octicons/quote-16\n  features:\n    - announce.dismiss\n    - content.action.edit\n    - content.action.view\n    - content.code.annotate\n    - content.code.copy\n    - content.code.select\n    - content.tabs.link\n    - content.tooltips\n    - header.autohide\n    - navigation.expand\n    - navigation.footer\n    - navigation.indexes\n    - navigation.instant\n    - navigation.instant.prefetch\n    - navigation.instant.progress\n    - navigation.prune\n    - navigation.sections\n    - navigation.tabs\n    # - navigation.tabs.sticky\n    - navigation.top\n    - navigation.tracking\n    - search.highlight\n    - search.share\n    - search.suggest\n    - toc.follow\n    # - toc.integrate\n  palette:\n      - scheme: default\n        primary: black\n        accent: indigo\n        toggle:\n          icon: material/brightness-7\n          name: Switch to dark mode\n      - scheme: slate\n        primary: black\n        accent: indigo\n        toggle:\n          icon: material/brightness-4\n          name: Switch to light mode\n  font:\n    text: Roboto\n    code: Roboto Mono\n  custom_dir: docs/overrides\n# Extensions\nmarkdown_extensions:\n  - abbr\n  - admonition\n  - pymdownx.details\n  - attr_list\n  - def_list\n  - footnotes\n  - md_in_html\n  - toc:\n      permalink: true\n  - pymdownx.arithmatex:\n      generic: true\n  - pymdownx.betterem:\n      smart_enable: all\n  - pymdownx.caret\n  - pymdownx.details\n  - pymdownx.emoji:\n      emoji_generator: !!python/name:material.extensions.emoji.to_svg\n      emoji_index: !!python/name:material.extensions.emoji.twemoji\n  - pymdownx.highlight:\n      anchor_linenums: true\n      line_spans: __span\n      pygments_lang_class: true\n  - pymdownx.inlinehilite\n  - pymdownx.keys\n  - pymdownx.magiclink:\n      normalize_issue_symbols: true\n      repo_url_shorthand: true\n      user: jxnl\n      repo: instructor\n  - pymdownx.mark\n  - pymdownx.smartsymbols\n  - pymdownx.snippets:\n      auto_append:\n        - includes/mkdocs.md\n  - pymdownx.superfences:\n      custom_fences:\n        - name: mermaid\n          class: mermaid\n          format: !!python/name:pymdownx.superfences.fence_code_format\n  - pymdownx.tabbed:\n      alternate_style: true\n      combine_header_slug: true\n  - pymdownx.tasklist:\n      custom_checkbox: true\n  - pymdownx.arithmatex:\n      generic: true\n\nextra_javascript:\n  - javascripts/katex.js\n  - https://unpkg.com/katex@0/dist/katex.min.js\n  - https://unpkg.com/katex@0/dist/contrib/auto-render.min.js\n\nextra_css:\n  - https://unpkg.com/katex@0/dist/katex.min.css\nnav:\n  - Introduction:\n    - Structured Outputs for LLMs: 'index.md'\n    - Start Here (Beginners): 'start-here.md'\n    - Getting Started: 'getting-started.md'\n    - Installation: 'installation.md'\n    - Why use Instructor?: 'why.md'\n    - Architecture: 'architecture.md'\n    - Debugging: 'debugging.md'\n    - Repository Overview: 'repository-overview.md'\n    - Mode Comparison: 'modes-comparison.md'\n    - Philosophy: 'concepts/philosophy.md'\n    - API Reference: 'api.md'\n    - FAQ: 'faq.md'\n    - Help with Instructor: 'help.md'\n    - Contributing: 'contributing.md'\n    - Newsletter: 'newsletter.md'\n    - Tutorials: 'tutorials/index.md'\n  - Learning:\n    - Installation: 'learning/getting_started/installation.md'\n    - Overview: 'learning/index.md'\n    - Getting Started with Structured Outputs: 'learning/getting_started/structured_outputs.md'\n    - Your First Extraction: 'learning/getting_started/first_extraction.md'\n    - Understanding Response Models: 'learning/getting_started/response_models.md'\n    - Simple Object Extraction: 'learning/patterns/simple_object.md'\n    - List Extraction: 'learning/patterns/list_extraction.md'\n    - Simple Nested Structure: 'learning/patterns/nested_structure.md'\n    - Field Validation: 'learning/patterns/field_validation.md'\n    - Optional Fields: 'learning/patterns/optional_fields.md'\n    - Prompt Templates: 'learning/patterns/prompt_templates.md'\n    - Streaming Basics: 'learning/streaming/basics.md'\n    - Streaming Lists: 'learning/streaming/lists.md'\n    - Validation Basics: 'learning/validation/basics.md'\n    - Custom Validators: 'learning/validation/custom_validators.md'\n    - Retry Mechanisms: 'learning/validation/retry_mechanisms.md'\n    - Field-level Validation: 'learning/validation/field_level_validation.md'\n  - Integrations:\n    - Overview: 'integrations/index.md'\n    # Major cloud providers\n    - OpenAI: 'integrations/openai.md'\n    - OpenAI Responses: 'integrations/openai-responses.md'\n    - DeepSeek: 'integrations/deepseek.md'\n    - llama-cpp-python: 'integrations/llama-cpp-python.md'\n    - Gemini: 'integrations/google.md'\n    - Anthropic: 'integrations/anthropic.md'\n    - xAI: 'integrations/xai.md'\n    - Azure OpenAI: 'integrations/azure.md'\n    - Google GenAI: 'integrations/genai.md'\n    - AWS Bedrock: 'integrations/bedrock.md'\n    - Vertex AI: 'integrations/vertex.md'\n    \n    # Fast inference providers\n    - Groq: 'integrations/groq.md'\n    - Fireworks: 'integrations/fireworks.md'\n    - Together: 'integrations/together.md'\n    - Anyscale: 'integrations/anyscale.md'\n    \n    # Other commercial providers\n    - Cerebras: 'integrations/cerebras.md'\n    - Cohere: 'integrations/cohere.md'\n    - Databricks: 'integrations/databricks.md'\n    - Cortex: 'integrations/cortex.md'\n    - LiteLLM: 'integrations/litellm.md'\n    - Mistral: 'integrations/mistral.md'\n    - Ollama: 'integrations/ollama.md'\n    - Perplexity: 'integrations/perplexity.md'\n    - Writer: 'integrations/writer.md'\n    - OpenRouter: 'integrations/openrouter.md'\n    - SambaNova: 'integrations/sambanova.md'\n    - TrueFoundry: 'integrations/truefoundry.md'\n  - Cookbook:\n    - Overview: 'examples/index.md'\n    - \"Audio Information Extraction\": 'examples/audio_extraction.md'\n    - \"Recursive Schema Examples\": 'examples/recursive.md'\n    - \"Enhancing Text Classification\": 'examples/classification.md'\n    - \"Local Classification with Llama-cpp\": 'examples/local_classification.md'\n    - \"Structured Outputs with Ollama\": 'examples/ollama.md'\n    - \"Multi-Modal Data with Gemini\": 'examples/multi_modal_gemini.md'\n    - \"Exact Citations for RAG\": 'examples/exact_citations.md'\n    - \"Extracting Knowledge Graphs\": 'examples/knowledge_graph.md'\n    - \"Table Extraction with GPT-4 Vision\": 'examples/extracting_tables.md'\n    - \"User-Defined Bulk Classification\": 'examples/bulk_classification.md'\n    - \"AI Model Self-Correction\": 'examples/self_critique.md'\n    - \"Receipt Data Extraction with GPT-4\": 'examples/extracting_receipts.md'\n    - \"Slide Data Extraction with GPT-4\": 'examples/extract_slides.md'\n    - \"Content Moderation with OpenAI\": 'examples/moderation.md'\n    - \"Complex Entity Resolution\": 'examples/entity_resolution.md'\n    - \"Expanding RAG Search Queries\": 'examples/search.md'\n    - \"RAG Query Planning\": 'examples/planning-tasks.md'\n    - \"PII Data Sanitization\": 'examples/pii.md'\n    - \"Integrating Open Source Models\": 'examples/open_source.md'\n    - \"Image to Ad Copy Generation\": 'examples/image_to_ad_copy.md'\n    - \"SQLModel Integration\": 'examples/sqlmodel.md'\n    - \"Examples in Pydantic Models\": 'examples/examples.md'\n    - \"Intelligent Document Segmentation\": 'examples/document_segmentation.md'\n    - \"Structured Output with watsonx.ai\": 'examples/watsonx.md'\n    - \"Structured Outputs with Groq\": 'examples/groq.md'\n    - \"Structured Outputs with Mistral\": 'examples/mistral.md'\n    - \"Action Items Extraction\": 'examples/action_items.md'\n    - \"Contact Information Extraction\": 'examples/extract_contact_info.md'\n    - \"Knowledge Graph Building\": 'examples/building_knowledge_graphs.md'\n    - \"Tracing with Langfuse\": 'examples/tracing_with_langfuse.md'\n    - \"Multiple Classification Tasks\": 'examples/multiple_classification.md'\n    - \"Pandas DataFrame Integration\": 'examples/pandas_df.md'\n    - \"Partial Response Streaming\": 'examples/partial_streaming.md'\n    - \"Single Classification Tasks\": 'examples/single_classification.md'\n    - \"Table Extraction from Images\": 'examples/tables_from_vision.md'\n    - \"Using Decimals\": 'examples/using_decimals.md'\n    - \"YouTube Clip Analysis\": 'examples/youtube_clips.md'\n  - Concepts:\n    - Overview: 'concepts/index.md'\n    - Error Handling: 'concepts/error_handling.md'\n    - Retrying: 'concepts/retrying.md'\n    - Fields: 'concepts/fields.md'\n    - Models: 'concepts/models.md'\n    - Parallel Tools: 'concepts/parallel.md'\n    - Templating: 'concepts/templating.md'\n    - Lists and Arrays: 'concepts/lists.md'\n    - Prompting: 'concepts/prompting.md'\n    - Citations: 'concepts/citation.md'\n    - Multimodal : 'concepts/multimodal.md'\n    - Patching: 'concepts/patching.md'\n    - from_provider: 'concepts/from_provider.md'\n    - Migration Guide: 'concepts/migration.md'\n    - Mode Migration: 'concepts/mode-migration.md'\n    - Hooks: 'concepts/hooks.md'\n    - Types: 'concepts/types.md'\n    - TypedDicts: 'concepts/typeddicts.md'\n    - Validators: \"concepts/reask_validation.md\"\n    - Usage Tokens: 'concepts/usage.md'\n    - Missing: \"concepts/maybe.md\"\n    - Stream Iterable: \"concepts/iterable.md\"\n    - Stream Partial: \"concepts/partial.md\"\n    - Raw Response: 'concepts/raw_response.md'\n    - FastAPI: 'concepts/fastapi.md'\n    - Caching: 'concepts/caching.md'\n    - Prompt Caching: 'concepts/prompt_caching.md'\n    - Logging: 'concepts/logging.md'\n    - Distillation: \"concepts/distillation.md\"\n    - Dictionary Operations: 'concepts/dictionary_operations.md'\n    - Union: 'concepts/union.md'\n    - Unions: 'concepts/unions.md'\n    - Validation: 'concepts/validation.md'\n    - Semantic Validation: 'concepts/semantic_validation.md'\n    - Alias: 'concepts/alias.md'\n    - Enums: 'concepts/enums.md'\n    - Type Adapter: 'concepts/typeadapter.md'\n  \n  - Prompt Engineering:\n    - \"prompting/index.md\"\n    - Zero-Shot:\n      - Use Emotional Language: 'prompting/zero_shot/emotion_prompting.md'\n      - Assign a Role: 'prompting/zero_shot/role_prompting.md'\n      - Define A Style: 'prompting/zero_shot/style_prompting.md'\n      - Auto-Refine The Prompt: 'prompting/zero_shot/s2a.md'\n      - Simulate A Perspective: 'prompting/zero_shot/simtom.md'\n      - Clarify Ambiguous Information: 'prompting/zero_shot/rar.md'\n      - Ask Model To Repeat Query: 'prompting/zero_shot/re2.md'\n      - Generate Follow-Up Questions: 'prompting/zero_shot/self_ask.md'\n    - Few-Shot:\n      - Example Generation:\n        - Generate In-Context Examples: 'prompting/few_shot/example_generation/sg_icl.md'\n      - Example Ordering: 'prompting/few_shot/example_ordering.md'\n      - Exemplar Selection:\n        - Select Effective Examples: 'prompting/few_shot/exemplar_selection/knn.md'\n        - Vote-K: 'prompting/few_shot/exemplar_selection/vote_k.md'\n        - Consistent Based Examples: 'prompting/few_shot/cosp.md'\n    - Thought Generation:\n      - Chain-Of-Thought (Zero-Shot):\n        - Generate Examples First: 'prompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md'\n        - Consider Higher-Level Context: 'prompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md'\n        - Examine The Context: 'prompting/thought_generation/chain_of_thought_zero_shot/thread_of_thought.md'\n        - Structure The Reasoning: 'prompting/thought_generation/chain_of_thought_zero_shot/tab_cot.md'\n      - Chain-Of-Thought (Few-Shot):\n        - Prioritize Uncertain Examples: 'prompting/thought_generation/chain_of_thought_few_shot/active_prompt.md'\n        - Automate Example Selection: 'prompting/thought_generation/chain_of_thought_few_shot/auto_cot.md'\n        - Prioritize Complex Examples: 'prompting/thought_generation/chain_of_thought_few_shot/complexity_based.md'\n        - Include Incorrect Examples: 'prompting/thought_generation/chain_of_thought_few_shot/contrastive.md'\n        - Memory-of-Thought: 'prompting/thought_generation/chain_of_thought_few_shot/memory_of_thought.md'\n        - Use Majority Voting: 'prompting/thought_generation/chain_of_thought_few_shot/uncertainty_routed_cot.md'\n        - Generate Prompt Variations: 'prompting/thought_generation/chain_of_thought_few_shot/prompt_mining.md'\n    - Ensembling:\n      - Prioritize Consistent Examples: 'prompting/ensembling/cosp.md'\n      - Use Distinct Example Subsets: 'prompting/ensembling/dense.md'\n      - Verify Responses over Majority Voting : 'prompting/ensembling/diverse.md'\n      - Use Ensembles To Test Prompts: 'prompting/ensembling/max_mutual_information.md'\n      - Combine Multiple Reasoning Chains: 'prompting/ensembling/meta_cot.md'\n      - Combine Different Specialized LLMs: 'prompting/ensembling/more.md'\n      - Generate Multiple Candidate Responses: 'prompting/ensembling/self_consistency.md'\n      - Use LLMs to Combine Different Responses: 'prompting/ensembling/universal_self_consistency.md'\n      - Use Task Specific Evaluation Metrics: 'prompting/ensembling/usp.md'\n      - Use Translation for Paraphrasing: 'prompting/ensembling/prompt_paraphrasing.md'\n    - Self-Criticism:\n      - Independently Verify Responses: 'prompting/self_criticism/chain_of_verification.md'\n      - Determine Uncertainty of Reasoning Chain: 'prompting/self_criticism/self_calibration.md'\n      - Improve With Feedback: 'prompting/self_criticism/self_refine.md'\n      - Self-Verify Responses: 'prompting/self_criticism/self_verification.md'\n      - Reconstruct Prompt from Reasoning Steps : 'prompting/self_criticism/reversecot.md'\n      - Break Down Reasoning Into Multiple Steps: 'prompting/self_criticism/cumulative_reason.md'\n    - Decomposition:\n      - Break Down Complex Tasks: 'prompting/decomposition/decomp.md'\n      - Leverage Task Specific Systems: 'prompting/decomposition/faithful_cot.md'\n      - Solve simpler subproblems: 'prompting/decomposition/least_to_most.md'\n      - Ditch Vanilla Chain Of Thought: 'prompting/decomposition/plan_and_solve.md'\n      - Generate Python for Intermediate Steps: 'prompting/decomposition/program_of_thought.md'\n      - Recurs.-of-Thought: 'prompting/decomposition/recurs_of_thought.md'\n      - Generate in Parallel: 'prompting/decomposition/skeleton_of_thought.md'\n      - Tree-of-Thought: 'prompting/decomposition/tree-of-thought.md'\n  - CLI Reference:\n      - \"CLI Reference\": \"cli/index.md\"\n      - \"Finetuning GPT-3.5\": \"cli/finetune.md\"\n      - \"Usage Tracking\": \"cli/usage.md\"\n      - \"Batch Jobs\": \"cli/batch.md\"\n  - Find Jobs (External):\n      - Jobs: \"jobs.md\"\n  - Blog:\n      - \"blog/index.md\"\nplugins:\n  - llmstxt:\n      markdown_description: >\n        Instructor is a Python library that makes it easy to work with structured outputs \n        from large language models (LLMs). Built on top of Pydantic, it provides a simple, \n        type-safe way to extract structured data from LLM responses across multiple providers \n        including OpenAI, Anthropic, Google, and many others.\n      sections:\n        Getting Started:\n          - index.md: Introduction to structured outputs with LLMs\n          - getting-started.md: Quick start guide\n          - installation.md: Installation instructions\n        Core Concepts:\n          - concepts/*.md\n        Integrations:\n          - integrations/*.md\n  - redirects:\n      redirect_maps:\n         jobs.md: https://jobs.applied-llms.org/\n         # LLM client redirects\n         hub/ollama.md: integrations/ollama.md\n         hub/llama-cpp-python.md: integrations/llama-cpp-python.md\n         hub/anthropic.md: integrations/anthropic.md\n         hub/anyscale.md: integrations/anyscale.md\n         hub/azure.md: integrations/azure.md\n         hub/bedrock.md: integrations/bedrock.md\n         hub/cerebras.md: integrations/cerebras.md\n         hub/cohere.md: integrations/cohere.md\n         hub/databricks.md: integrations/databricks.md\n         hub/fireworks.md: integrations/fireworks.md\n         hub/google.md: integrations/google.md\n         hub/genai.md: integrations/genai.md\n         hub/groq.md: integrations/groq.md\n         hub/litellm.md: integrations/litellm.md\n         hub/mistral.md: integrations/mistral.md\n         hub/openai.md: integrations/openai.md\n         hub/perplexity.md: integrations/perplexity.md\n         hub/together.md: integrations/together.md\n         hub/vertex.md: integrations/vertex.md\n         hub/vertexai.md: integrations/vertex.md  # Handle old vertexai.md references\n         # Legacy hub/clients/ redirects\n         'hub/clients/google.md': 'integrations/google.md'\n         'hub/clients/litellm.md': 'integrations/litellm.md'\n         'hub/clients/ollama.md': 'integrations/ollama.md'\n         'hub/clients/llama-cpp-python.md': 'integrations/llama-cpp-python.md'\n         'hub/clients/anthropic.md': 'integrations/anthropic.md'\n         'hub/clients/anyscale.md': 'integrations/anyscale.md'\n         'hub/clients/azure.md': 'integrations/azure.md'\n         'hub/clients/bedrock.md': 'integrations/bedrock.md'\n         'hub/clients/cerebras.md': 'integrations/cerebras.md'\n         'hub/clients/cohere.md': 'integrations/cohere.md'\n         'hub/clients/databricks.md': 'integrations/databricks.md'\n         'hub/clients/fireworks.md': 'integrations/fireworks.md'\n         'hub/clients/groq.md': 'integrations/groq.md'\n         'hub/clients/mistral.md': 'integrations/mistral.md'\n         'hub/clients/openai.md': 'integrations/openai.md'\n         'hub/clients/perplexity.md': 'integrations/perplexity.md'\n         'hub/clients/together.md': 'integrations/together.md'\n         'hub/clients/vertex.md': 'integrations/vertex.md'\n         'hub/clients/vertexai.md': 'integrations/vertex.md'\n         # Example redirects\n         'hub/action_items.md': 'examples/action_items.md'\n         'hub/batch_classification_langsmith.md': 'examples/batch_classification_langsmith.md'\n         'hub/extract_contact_info.md': 'examples/extract_contact_info.md'\n         'hub/index.md': 'examples/index.md'\n         'hub/knowledge_graph.md': 'examples/building_knowledge_graphs.md'\n         'hub/multiple_classification.md': 'examples/multiple_classification.md'\n         'hub/pandas_df.md': 'examples/pandas_df.md'\n         'hub/partial_streaming.md': 'examples/partial_streaming.md'\n         'hub/single_classification.md': 'examples/single_classification.md'\n         'hub/tables_from_vision.md': 'examples/tables_from_vision.md'\n         'hub/youtube_clips.md': 'examples/youtube_clips.md'\n  - social\n  - search:\n      separator: '[\\s\\u200b\\-_,:!=\\[\\]()\"`/]+|\\.(?!\\b)(?=[A-Z][a-z])'\n  - minify:\n      minify_html: true\n  - mkdocstrings:\n      handlers:\n        python:\n          options:\n            members_order: alphabetical\n            allow_inspection: true\n            show_bases: true\n  - blog:\n      enabled: !ENV CI\n      blog_dir: \"blog\"\n      blog_toc: true\n      post_dir: blog/posts\n      post_date_format: yyyy/MM/dd\n      post_url_format: \"{date}/{slug}\"\n      authors_file: \"{blog}/.authors.yml\"\nhooks:\n  - docs/hooks/hide_lines.py\nextra:\n  analytics:\n    provider: google\n    property: G-5CR8QXF5CN\n    feedback:\n      title: Was this page helpful?\n      ratings:\n        - icon: material/emoticon-happy-outline\n          name: This page was helpful\n          data: 1\n          note: >-\n            Thanks for your feedback!\n        - icon: material/emoticon-sad-outline\n          name: This page could be improved\n          data: 0\n          note: >-\n            Thanks for your feedback! Help us improve this page by\n            using our <a href=\"https://forms.gle/ijr9Zrcg2QWgKoWs7\" target=\"_blank\" rel=\"noopener\">feedback form</a>.\n  social:\n    - icon: fontawesome/brands/twitter\n      link: https://twitter.com/jxnlco\n    - icon: fontawesome/brands/github\n      link: https://github.com/jxnl\n"
  },
  {
    "path": "pyproject.toml",
    "content": "[project]\nauthors = [\n    { name = \"Jason Liu\" },\n    { name = \"Ivan Leo\" },\n]\nmaintainers = [\n    { email = \"jason@jxnl.co\" },\n    { email = \"ivan@jxnl.co\" },\n]\nlicense = { text = \"MIT\" }\nrequires-python = \"<4.0,>=3.9\"\ndependencies = [\n    \"openai>=2.0.0,<3.0.0\",\n    \"pydantic<3.0.0,>=2.8.0\",\n    \"docstring-parser<1.0,>=0.16\",\n    \"typer<1.0.0,>=0.9.0\",\n    \"rich<15.0.0,>=13.7.0\",\n    \"aiohttp<4.0.0,>=3.9.1\",\n    \"tenacity<10.0.0,>=8.2.3\",\n    \"pydantic-core<3.0.0,>=2.18.0\",\n    \"jiter>=0.6.1,<0.13\",\n    \"jinja2<4.0.0,>=3.1.4\",\n    \"requests<3.0.0,>=2.32.3\",\n    \"diskcache>=5.6.3\",\n]\nname = \"instructor\"\nversion = \"1.14.5\"\ndescription = \"structured outputs for llm\"\nreadme = \"README.md\"\n\n[build-system]\nrequires = [\"hatchling\"]\nbuild-backend = \"hatchling.build\"\n\n[tool.uv]\npackage = true\n\n[project.urls]\nrepository = \"https://github.com/instructor-ai/instructor\"\n\n[tool.pytest.ini_options]\nmarkers = [\n    \"unit: marks tests as unit tests (fast, no external dependencies)\",\n    \"integration: marks tests as integration tests (may require API keys)\",\n    \"llm: marks tests that make LLM API calls\",\n]\n\n[project.optional-dependencies]\ndev = [\n    \"pytest<9.0.0,>=8.3.3\",\n    \"pytest-asyncio>=0.24.0,<2.0.0\",\n    \"coverage<8.0.0,>=7.3.2\",\n    \"jsonref<2.0.0,>=1.1.0\",\n    \"pytest-examples>=0.0.15\",\n    \"python-dotenv>=1.0.1\",\n    \"pytest-xdist>=3.8.0\",\n    \"pre-commit>=4.2.0\",\n    \"ty>=0.0.1a23\",\n    \"anthropic==0.76.0\",\n    \"xmltodict>=0.13,<1.1\",\n]\ndocs = [\n    \"mkdocs<2.0.0,>=1.6.1\",\n    \"mkdocs-material[imaging]<10.0.0,>=9.5.9\",\n    \"mkdocstrings>=0.27.1,<0.31.0\",\n    \"mkdocstrings-python<2.0.0,>=1.12.2\",\n    \"pytest-examples>=0.0.15\",\n    \"mkdocs-jupyter<0.26.0,>=0.24.6\",\n    \"mkdocs-rss-plugin<2.0.0,>=1.12.0\",\n    \"mkdocs-minify-plugin<1.0.0,>=0.8.0\",\n    \"mkdocs-redirects<2.0.0,>=1.2.1\",\n    \"mkdocs-material-extensions>=1.3.1\",\n    \"mkdocs-material>=9.6.14\",\n]\ntest-docs = [\n    \"fastapi>=0.109.2,<0.129.0\",\n    \"redis>=5.0.1,<8.0.0\",\n    \"diskcache<6.0.0,>=5.6.3\",\n    \"pandas<3.0.0,>=2.2.0\",\n    \"tabulate<1.0.0,>=0.9.0\",\n    \"pydantic-extra-types<3.0.0,>=2.6.0\",\n    \"litellm<2.0.0,>=1.35.31\",\n    \"mistralai<2.0.0,>=1.5.1\",\n]\nanthropic = [\"anthropic==0.76.0\", \"xmltodict>=0.13,<1.1\"]\ngroq = [\"groq>=0.4.2,<1.1.0\"]\ncohere = [\"cohere<6.0.0,>=5.1.8\"]\nvertexai = [\"google-cloud-aiplatform<2.0.0,>=1.53.0\", \"jsonref<2.0.0,>=1.1.0\"]\ncerebras_cloud_sdk = [\"cerebras-cloud-sdk<2.0.0,>=1.5.0\"]\nfireworks-ai = [\"fireworks-ai<1.0.0,>=0.15.4\"]\nwriter = [\"writer-sdk<3.0.0,>=2.2.0\"]\nbedrock = [\"boto3<2.0.0,>=1.34.0\"]\nmistral = [\"mistralai<2.0.0,>=1.5.1\"]\nperplexity = [\"openai>=2.0.0,<3.0.0\"]\ngoogle-genai = [\"google-genai>=1.5.0\",\"jsonref<2.0.0,>=1.1.0\"]\nlitellm = [\"litellm<2.0.0,>=1.35.31\"]\nxai = [\"xai-sdk>=0.2.0 ; python_version >= '3.10'\"]\nphonenumbers = [\"phonenumbers>=8.13.33,<10.0.0\"]\ngraphviz = [\"graphviz<1.0.0,>=0.20.3\"]\nsqlmodel = [\"sqlmodel<1.0.0,>=0.0.22\"]\ntrafilatura = [\"trafilatura<3.0.0,>=1.12.2\"]\npydub = [\"pydub<1.0.0,>=0.25.1\"]\ndatasets = [\"datasets>=3.0.1,<5.0.0\"]\n\n[project.scripts]\ninstructor = \"instructor.cli.cli:app\"\n"
  },
  {
    "path": "requirements-doc.txt",
    "content": "mkdocs\ncairosvg\npillow\nmkdocs-minify-plugin\nmkdocstrings \nmkdocstrings-python \nmkdocs-jupyter \nmkdocs-redirects\nmkdocs-llmstxt"
  },
  {
    "path": "requirements-examples.txt",
    "content": "openai>=1.1.0\npydantic\ndocstring-parser\nrich\naiohttp\nruff==0.14.14\npre-commit==4.3.0\ntyper\ncohere\ndatasets\ntrafilatura\n"
  },
  {
    "path": "requirements.txt",
    "content": "# This file was autogenerated by uv via the following command:\n#    uv pip compile pyproject.toml -o requirements.txt\naiohappyeyeballs==2.6.1\n    # via aiohttp\naiohttp==3.13.3\n    # via instructor (pyproject.toml)\naiosignal==1.4.0\n    # via aiohttp\nannotated-types==0.7.0\n    # via pydantic\nanyio==4.12.1\n    # via\n    #   httpx\n    #   openai\nattrs==25.4.0\n    # via aiohttp\ncertifi==2026.1.4\n    # via\n    #   httpcore\n    #   httpx\n    #   requests\ncharset-normalizer==3.4.4\n    # via requests\nclick==8.1.8\n    # via typer\ndiskcache==5.6.3\n    # via instructor (pyproject.toml)\ndistro==1.9.0\n    # via openai\ndocstring-parser==0.17.0\n    # via instructor (pyproject.toml)\nfrozenlist==1.8.0\n    # via\n    #   aiohttp\n    #   aiosignal\nh11==0.16.0\n    # via httpcore\nhttpcore==1.0.9\n    # via httpx\nhttpx==0.28.1\n    # via openai\nidna==3.11\n    # via\n    #   anyio\n    #   httpx\n    #   requests\n    #   yarl\njinja2==3.1.6\n    # via instructor (pyproject.toml)\njiter==0.12.0\n    # via\n    #   instructor (pyproject.toml)\n    #   openai\nmarkdown-it-py==3.0.0\n    # via rich\nmarkupsafe==3.0.3\n    # via jinja2\nmdurl==0.1.2\n    # via markdown-it-py\nmultidict==6.7.1\n    # via\n    #   aiohttp\n    #   yarl\nopenai==2.16.0\n    # via instructor (pyproject.toml)\npropcache==0.4.1\n    # via\n    #   aiohttp\n    #   yarl\npydantic==2.12.5\n    # via\n    #   instructor (pyproject.toml)\n    #   openai\npydantic-core==2.41.5\n    # via\n    #   instructor (pyproject.toml)\n    #   pydantic\npygments==2.19.2\n    # via rich\nrequests==2.32.5\n    # via instructor (pyproject.toml)\nrich==14.3.1\n    # via\n    #   instructor (pyproject.toml)\n    #   typer\nshellingham==1.5.4\n    # via typer\nsniffio==1.3.1\n    # via openai\ntenacity==9.1.2\n    # via instructor (pyproject.toml)\ntqdm==4.67.1\n    # via openai\ntyper==0.21.1\n    # via instructor (pyproject.toml)\ntyping-extensions==4.15.0\n    # via\n    #   aiosignal\n    #   anyio\n    #   openai\n    #   pydantic\n    #   pydantic-core\n    #   typer\n    #   typing-inspection\ntyping-inspection==0.4.2\n    # via pydantic\nurllib3==2.6.3\n    # via requests\nyarl==1.22.0\n    # via aiohttp\n"
  },
  {
    "path": "scripts/README.md",
    "content": "# Scripts Directory\n\nThis directory contains utility scripts for maintaining and improving the Instructor documentation and project structure.\n\n## Available Scripts\n\n### 1. `make_clean.py` - Markdown File Cleaner\n\n**Purpose**: Cleans markdown files by removing special whitespace characters and replacing em dashes with regular dashes.\n\n**What it does**:\n- Recursively finds all `.md` files in the `docs/` directory\n- Removes special Unicode whitespace characters (non-breaking spaces, zero-width spaces, etc.)\n- Replaces em dashes (`—`) and en dashes (`–`) with regular dashes (`-`)\n- Preserves intentional formatting while cleaning problematic characters\n\n**Usage**:\n```bash\n# Clean all markdown files in docs/\npython scripts/make_clean.py\n\n# Dry run to see what would be changed\npython scripts/make_clean.py --dry-run\n\n# Clean files in a different directory\npython scripts/make_clean.py --docs-dir path/to/docs\n```\n\n**Pre-commit Integration**: This script runs automatically on commits that include markdown files in the `docs/` directory.\n\n### 2. `check_blog_excerpts.py` - Blog Post Excerpt Validator\n\n**Purpose**: Ensures all blog posts contain the `<!-- more -->` tag for proper excerpt handling.\n\n**What it does**:\n- Scans all markdown files in `docs/blog/posts/`\n- Checks for the presence of `<!-- more -->` tags\n- Reports files missing the tag\n- Exits with error code 1 if any files are missing the tag\n\n**Usage**:\n```bash\n# Check all blog posts\npython scripts/check_blog_excerpts.py\n\n# Check posts in a different directory\npython scripts/check_blog_excerpts.py --blog-posts-dir path/to/posts\n```\n\n**Pre-commit Integration**: This script runs automatically on commits that include blog post files.\n\n### 3. `make_sitemap.py` - Enhanced Documentation Sitemap Generator\n\n**Purpose**: Generates an enhanced sitemap (`sitemap.yaml`) with AI-powered content analysis and cross-link suggestions.\n\n**What it does**:\n- Recursively traverses the `docs/` directory\n- Analyzes each markdown file using OpenAI's GPT-4o-mini\n- Extracts summaries, keywords, and topics for SEO\n- Identifies internal links and references\n- Generates cross-link suggestions based on content similarity\n- Creates a comprehensive `sitemap.yaml` file\n\n**Features**:\n- **Caching**: Reuses analysis for unchanged files (based on content hash)\n- **Concurrent Processing**: Processes multiple files simultaneously\n- **Cross-linking**: Suggests related documents based on content similarity\n- **Retry Logic**: Handles API failures with exponential backoff\n\n**Usage**:\n```bash\n# Generate sitemap with default settings\npython scripts/make_sitemap.py\n\n# Customize settings\npython scripts/make_sitemap.py \\\n  --root-dir docs \\\n  --output-file sitemap.yaml \\\n  --max-concurrency 10 \\\n  --min-similarity 0.4\n\n# Use custom API key\npython scripts/make_sitemap.py --api-key your-openai-key\n```\n\n**Output**: Creates `sitemap.yaml` with structure:\n```yaml\nfile.md:\n  summary: \"Brief description of the content\"\n  keywords: [\"keyword1\", \"keyword2\", \"keyword3\"]\n  topics: [\"topic1\", \"topic2\", \"topic3\"]\n  references: [\"other-file.md\", \"another-file.md\"]\n  ai_references: [\"ai-detected-reference.md\"]\n  cross_links: [\"suggested-related-file.md\"]\n  hash: \"content-hash-for-caching\"\n```\n\n**Requirements**: \n- OpenAI API key (set as `OPENAI_API_KEY` environment variable or passed via `--api-key`)\n- Dependencies: `openai`, `typer`, `rich`, `tenacity`, `pyyaml`\n\n## Pre-commit Integration\n\nThese scripts are integrated into the project's pre-commit hooks to ensure code quality:\n\n- **`make_clean.py`**: Runs on commits with markdown files in `docs/`\n- **`check_blog_excerpts.py`**: Runs on commits with blog post files\n\nThe hooks are configured in `.pre-commit-config.yaml` and run automatically during the commit process.\n\n## Running Scripts Manually\n\nYou can run any script manually for testing or one-time operations:\n\n```bash\n# Test markdown cleaning\npython scripts/make_clean.py --dry-run\n\n# Check blog excerpts\npython scripts/check_blog_excerpts.py\n\n# Generate fresh sitemap\npython scripts/make_sitemap.py\n```\n\n### 4. `fix_api_calls.py` - API Call Standardization\n\n**Purpose**: Replaces old API call patterns with simplified versions for consistency.\n\n**What it does**:\n- Finds and replaces `client.chat.completions.create` → `client.create`\n- Finds and replaces `client.chat.completions.create_partial` → `client.create_partial`\n- Finds and replaces `client.chat.completions.create_iterable` → `client.create_iterable`\n- Finds and replaces `client.chat.completions.create_with_completion` → `client.create_with_completion`\n- Processes all markdown and notebook files in the docs directory\n\n**Usage**:\n```bash\n# Dry run to see what would be changed\npython scripts/fix_api_calls.py --dry-run\n\n# Apply changes to all files\npython scripts/fix_api_calls.py\n\n# Process a single file\npython scripts/fix_api_calls.py --file docs/index.md\n\n# Custom docs directory\npython scripts/fix_api_calls.py --docs-dir path/to/docs\n```\n\n### 5. `fix_old_patterns.py` - Client Initialization Pattern Fixer\n\n**Purpose**: Replaces old client initialization patterns with the modern `from_provider` API.\n\n**What it does**:\n- Replaces `instructor.from_openai(OpenAI())` → `instructor.from_provider(\"openai/model-name\")`\n- Replaces `instructor.from_anthropic(Anthropic())` → `instructor.from_provider(\"anthropic/model-name\")`\n- Replaces `instructor.patch(OpenAI())` → `instructor.from_provider(\"openai/model-name\")`\n- Handles all supported providers (OpenAI, Anthropic, Google, Cohere, Mistral, Groq, etc.)\n- Attempts to extract model names from existing code\n\n**Usage**:\n```bash\n# Dry run to see what would be changed\npython scripts/fix_old_patterns.py --dry-run\n\n# Apply changes to all files\npython scripts/fix_old_patterns.py\n\n# Process a single file\npython scripts/fix_old_patterns.py --file docs/integrations/openai.md\n```\n\n**Note**: Model names are extracted from existing code when possible, but may need manual review for accuracy.\n\n### 6. `audit_patterns.py` - Pattern Auditor\n\n**Purpose**: Audits documentation files to find old patterns that need updating.\n\n**What it does**:\n- Finds old API call patterns (`client.chat.completions.*`)\n- Finds old initialization patterns (`instructor.from_*`, `instructor.patch`)\n- Identifies potentially unused imports\n- Reports line numbers for each issue\n- Provides summary statistics\n\n**Usage**:\n```bash\n# Detailed report with line numbers\npython scripts/audit_patterns.py\n\n# Summary statistics only\npython scripts/audit_patterns.py --summary\n\n# Audit a single file\npython scripts/audit_patterns.py --file docs/index.md\n\n# Custom docs directory\npython scripts/audit_patterns.py --docs-dir path/to/docs\n```\n\n**Output**: Reports issues by file with line numbers, or summary statistics showing total counts per pattern type.\n\n## Adding New Scripts\n\nWhen adding new scripts to this directory:\n\n1. **Documentation**: Add a section to this README explaining the script's purpose and usage\n2. **Pre-commit Integration**: If appropriate, add the script to `.pre-commit-config.yaml`\n3. **Error Handling**: Ensure scripts exit with appropriate error codes\n4. **Help Text**: Include `--help` functionality for command-line scripts\n5. **Testing**: Test scripts manually before committing\n\n## Dependencies\n\nMost scripts use only Python standard library modules. The sitemap generator requires additional dependencies:\n\n```bash\nuv add openai typer rich tenacity pyyaml\n```\n\n## Troubleshooting\n\n**Pre-commit hooks failing**:\n- Check that scripts are executable: `chmod +x scripts/*.py`\n- Verify script paths in `.pre-commit-config.yaml`\n- Run scripts manually to identify issues\n\n**Sitemap generation issues**:\n- Ensure OpenAI API key is set correctly\n- Check network connectivity for API calls\n- Review error messages for specific file issues\n\n**Markdown cleaning issues**:\n- Use `--dry-run` to preview changes\n- Check file permissions in the docs directory\n- Verify UTF-8 encoding of markdown files "
  },
  {
    "path": "scripts/audit_patterns.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nAudit documentation files for old patterns that need to be updated.\n\nReports:\n- Old API call patterns (client.chat.completions.*)\n- Old initialization patterns (instructor.from_*, instructor.patch)\n- Unused imports\n\"\"\"\n\nimport argparse\nimport re\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Dict, List\n\n\ndef find_markdown_files(docs_dir: Path) -> List[Path]:\n    \"\"\"Find all markdown files in the docs directory.\"\"\"\n    return list(docs_dir.rglob(\"*.md\")) + list(docs_dir.rglob(\"*.ipynb\"))\n\n\ndef audit_api_calls(content: str, file_path: Path) -> Dict[str, List[int]]:\n    \"\"\"Find old API call patterns.\"\"\"\n    issues = defaultdict(list)\n\n    patterns = {\n        \"client.chat.completions.create\": r\"client\\.chat\\.completions\\.create\\(\",\n        \"client.chat.completions.create_partial\": r\"client\\.chat\\.completions\\.create_partial\\(\",\n        \"client.chat.completions.create_iterable\": r\"client\\.chat\\.completions\\.create_iterable\\(\",\n        \"client.chat.completions.create_with_completion\": r\"client\\.chat\\.completions\\.create_with_completion\\(\",\n    }\n\n    for name, pattern in patterns.items():\n        for match in re.finditer(pattern, content):\n            line_num = content[: match.start()].count(\"\\n\") + 1\n            issues[name].append(line_num)\n\n    return issues\n\n\ndef audit_old_init_patterns(content: str, file_path: Path) -> Dict[str, List[int]]:\n    \"\"\"Find old initialization patterns.\"\"\"\n    issues = defaultdict(list)\n\n    # Find instructor.from_* patterns\n    from_pattern = r\"instructor\\.from_(\\w+)\\(\"\n    for match in re.finditer(from_pattern, content):\n        provider = match.group(1)\n        line_num = content[: match.start()].count(\"\\n\") + 1\n        issues[f\"instructor.from_{provider}\"].append(line_num)\n\n    # Find instructor.patch patterns\n    patch_pattern = r\"instructor\\.patch\\(\"\n    for match in re.finditer(patch_pattern, content):\n        line_num = content[: match.start()].count(\"\\n\") + 1\n        issues[\"instructor.patch\"].append(line_num)\n\n    return issues\n\n\ndef audit_unused_imports(content: str, file_path: Path) -> Dict[str, List[int]]:\n    \"\"\"Find potentially unused imports when from_provider is used.\"\"\"\n    issues = defaultdict(list)\n\n    # Check if from_provider is used\n    uses_from_provider = \"from_provider\" in content or \"from_provider\" in content\n\n    if not uses_from_provider:\n        return issues\n\n    # Find provider imports\n    import_patterns = {\n        \"import openai\": r\"^import\\s+openai\\b\",\n        \"from openai import\": r\"^from\\s+openai\\s+import\",\n        \"import anthropic\": r\"^import\\s+anthropic\\b\",\n        \"from anthropic import\": r\"^from\\s+anthropic\\s+import\",\n    }\n\n    lines = content.split(\"\\n\")\n    for line_num, line in enumerate(lines, 1):\n        for name, pattern in import_patterns.items():\n            if re.search(pattern, line):\n                # Check if the import is actually used\n                if name.startswith(\"import \"):\n                    module = name.split()[1]\n                    # Simple check - if module name appears elsewhere, might be used\n                    if content.count(module) <= 2:  # Just import and maybe one use\n                        issues[name].append(line_num)\n\n    return issues\n\n\ndef process_file(file_path: Path) -> Dict[str, Dict[str, List[int]]]:\n    \"\"\"Process a single file and return all issues.\"\"\"\n    try:\n        content = file_path.read_text(encoding=\"utf-8\")\n\n        return {\n            \"api_calls\": audit_api_calls(content, file_path),\n            \"old_init\": audit_old_init_patterns(content, file_path),\n            \"unused_imports\": audit_unused_imports(content, file_path),\n        }\n    except Exception as e:\n        print(f\"Error processing {file_path}: {e}\")\n        return {\"api_calls\": {}, \"old_init\": {}, \"unused_imports\": {}}\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Audit documentation files for old patterns\"\n    )\n    parser.add_argument(\n        \"--docs-dir\",\n        type=Path,\n        default=Path(\"docs\"),\n        help=\"Directory containing documentation files (default: docs)\",\n    )\n    parser.add_argument(\n        \"--file\",\n        type=Path,\n        help=\"Audit a single file instead of all files\",\n    )\n    parser.add_argument(\n        \"--summary\",\n        action=\"store_true\",\n        help=\"Show only summary statistics\",\n    )\n\n    args = parser.parse_args()\n\n    if args.file:\n        files = [args.file]\n    else:\n        files = find_markdown_files(args.docs_dir)\n\n    all_issues = {}\n    total_counts = defaultdict(int)\n\n    for file_path in files:\n        issues = process_file(file_path)\n        if any(issues.values()):\n            all_issues[str(file_path)] = issues\n\n            # Count totals\n            for issue_type, patterns in issues.items():\n                for pattern, line_nums in patterns.items():\n                    total_counts[f\"{issue_type}:{pattern}\"] += len(line_nums)\n\n    if args.summary:\n        print(\"Summary Statistics:\")\n        print(\"=\" * 60)\n        for key, count in sorted(total_counts.items()):\n            issue_type, pattern = key.split(\":\", 1)\n            print(f\"  {pattern}: {count} instances\")\n    else:\n        # Detailed report\n        for file_path, issues in sorted(all_issues.items()):\n            print(f\"\\n{file_path}:\")\n            print(\"-\" * 60)\n\n            for issue_type, patterns in issues.items():\n                if patterns:\n                    print(f\"  {issue_type.replace('_', ' ').title()}:\")\n                    for pattern, line_nums in sorted(patterns.items()):\n                        lines_str = \", \".join(map(str, line_nums[:10]))\n                        if len(line_nums) > 10:\n                            lines_str += f\", ... ({len(line_nums)} total)\"\n                        print(f\"    {pattern}: lines {lines_str}\")\n\n    print(f\"\\nTotal files with issues: {len(all_issues)}\")\n    print(f\"Total issues found: {sum(total_counts.values())}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/check_blog_excerpts.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nCheck if blog posts contain the <!-- more --> tag for excerpts.\n\nThis script:\n- Recursively finds all .md files in the docs/blog/posts directory\n- Checks if each file contains the <!-- more --> tag\n- Reports files that are missing the tag\n- Exits with error code 1 if any files are missing the tag\n\"\"\"\n\nimport sys\nfrom pathlib import Path\n\n\ndef check_blog_excerpts(blog_posts_dir: str = \"docs/blog/posts\") -> bool:\n    \"\"\"\n    Check if blog posts contain the <!-- more --> tag.\n\n    Args:\n        blog_posts_dir: Path to the blog posts directory (default: \"docs/blog/posts\")\n\n    Returns:\n        True if all files have the tag, False if any are missing it\n    \"\"\"\n    blog_path = Path(blog_posts_dir)\n\n    if not blog_path.exists():\n        print(f\"Error: Directory '{blog_posts_dir}' does not exist.\")\n        return False\n\n    if not blog_path.is_dir():\n        print(f\"Error: '{blog_posts_dir}' is not a directory.\")\n        return False\n\n    # Find all markdown files recursively\n    md_files = list(blog_path.rglob(\"*.md\"))\n\n    if not md_files:\n        print(f\"No markdown files found in '{blog_posts_dir}' directory.\")\n        return True\n\n    print(f\"Checking {len(md_files)} blog post files for <!-- more --> tag...\")\n\n    missing_tag_files = []\n\n    for md_file in md_files:\n        try:\n            # Read the file content\n            with open(md_file, encoding=\"utf-8\") as f:\n                content = f.read()\n\n            # Check if the file contains the <!-- more --> tag\n            if \"<!-- more -->\" not in content:\n                missing_tag_files.append(md_file)\n                print(f\"Missing <!-- more --> tag: {md_file}\")\n            else:\n                print(f\"✓ Has <!-- more --> tag: {md_file}\")\n\n        except Exception as e:\n            print(f\"Error reading {md_file}: {e}\")\n            missing_tag_files.append(md_file)\n\n    # Summary\n    if missing_tag_files:\n        print(f\"\\n❌ Found {len(missing_tag_files)} files missing <!-- more --> tag:\")\n        for file in missing_tag_files:\n            print(f\"  - {file}\")\n        print(\n            f\"\\nPlease add <!-- more --> tag to these files for proper excerpt handling.\"\n        )\n        return False\n    else:\n        print(f\"\\n✅ All {len(md_files)} blog post files have the <!-- more --> tag!\")\n        return True\n\n\ndef main():\n    \"\"\"Main function to handle command line arguments.\"\"\"\n    import argparse\n\n    parser = argparse.ArgumentParser(\n        description=\"Check if blog posts contain the <!-- more --> tag for excerpts\"\n    )\n    parser.add_argument(\n        \"--blog-posts-dir\",\n        default=\"docs/blog/posts\",\n        help=\"Path to blog posts directory (default: docs/blog/posts)\",\n    )\n\n    args = parser.parse_args()\n\n    success = check_blog_excerpts(blog_posts_dir=args.blog_posts_dir)\n\n    # Exit with appropriate code for pre-commit\n    sys.exit(0 if success else 1)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/check_links.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nCheck for broken internal links in documentation files.\n\nFinds:\n- Broken internal links (missing target files)\n- Broken anchor links\n- Orphaned pages (no incoming links)\n\"\"\"\n\nimport argparse\nimport re\nfrom pathlib import Path\n\n\ndef find_markdown_files(docs_dir: Path) -> list[Path]:\n    \"\"\"Find all markdown files in the docs directory.\"\"\"\n    return list(docs_dir.rglob(\"*.md\"))\n\n\ndef extract_links(content: str, file_path: Path) -> list[tuple[str, int]]:  # noqa: ARG001\n    \"\"\"\n    Extract internal markdown links from content.\n\n    Returns:\n        List of (link_target, line_number) tuples\n    \"\"\"\n    links = []\n\n    # Match markdown links: [text](url)\n    for match in re.finditer(r\"\\[([^\\]]+)\\]\\(([^)]+)\\)\", content):\n        link_text = match.group(1)\n        link_url = match.group(2)\n        line_num = content[: match.start()].count(\"\\n\") + 1\n\n        # Skip external links\n        if link_url.startswith((\"http://\", \"https://\", \"mailto:\", \"#\")):\n            continue\n\n        links.append((link_url, line_num))\n\n    return links\n\n\ndef resolve_link(link_url: str, source_file: Path, docs_dir: Path) -> tuple[bool, str]:  # noqa: ARG001\n    \"\"\"\n    Resolve a relative link and check if target exists.\n\n    Returns:\n        (exists, resolved_path)\n    \"\"\"\n    # Split anchor if present\n    if \"#\" in link_url:\n        link_path, anchor = link_url.split(\"#\", 1)\n    else:\n        link_path = link_url\n        anchor = None\n\n    # Resolve relative path\n    source_dir = source_file.parent\n    target_path = (source_dir / link_path).resolve()\n\n    # Check if file exists\n    exists = target_path.exists()\n\n    return exists, str(target_path)\n\n\ndef check_file(file_path: Path, docs_dir: Path) -> dict[str, list[tuple[str, int]]]:\n    \"\"\"Check all links in a file.\"\"\"\n    issues = {}\n\n    try:\n        content = file_path.read_text(encoding=\"utf-8\")\n        links = extract_links(content, file_path)\n\n        broken_links = []\n        for link_url, line_num in links:\n            exists, resolved_path = resolve_link(link_url, file_path, docs_dir)\n            if not exists:\n                broken_links.append((link_url, line_num))\n\n        if broken_links:\n            issues[\"broken_links\"] = broken_links\n\n        return issues\n    except Exception as e:\n        return {\"error\": [(str(e), 0)]}\n\n\ndef find_orphaned_pages(files: list[Path], docs_dir: Path) -> set[Path]:\n    \"\"\"Find pages with no incoming links.\"\"\"\n    all_files = set(files)\n    referenced_files = set()\n\n    for file_path in files:\n        try:\n            content = file_path.read_text(encoding=\"utf-8\")\n            links = extract_links(content, file_path)\n\n            for link_url, _ in links:\n                exists, resolved_path = resolve_link(link_url, file_path, docs_dir)\n                if exists:\n                    referenced_files.add(Path(resolved_path))\n        except Exception:\n            pass\n\n    # Files that are not referenced (orphaned)\n    orphaned = all_files - referenced_files\n\n    # Remove index pages and special files from orphaned list\n    orphaned = {\n        f\n        for f in orphaned\n        if not any(\n            part in str(f)\n            for part in [\"index.md\", \"AGENT.md\", \"repository-overview.md\"]\n        )\n    }\n\n    return orphaned\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Check for broken internal links in documentation\"\n    )\n    parser.add_argument(\n        \"--docs-dir\",\n        type=Path,\n        default=Path(\"docs\"),\n        help=\"Directory containing documentation files (default: docs)\",\n    )\n    parser.add_argument(\n        \"--summary\",\n        action=\"store_true\",\n        help=\"Show only summary statistics\",\n    )\n    parser.add_argument(\n        \"--file\",\n        type=Path,\n        help=\"Check a single file instead of all files\",\n    )\n    parser.add_argument(\n        \"--find-orphans\",\n        action=\"store_true\",\n        help=\"Find orphaned pages with no incoming links\",\n    )\n\n    args = parser.parse_args()\n\n    if args.file:\n        files = [args.file]\n    else:\n        files = find_markdown_files(args.docs_dir)\n\n    all_issues = {}\n    total_broken = 0\n\n    for file_path in files:\n        issues = check_file(file_path, args.docs_dir)\n        if issues:\n            all_issues[str(file_path)] = issues\n            if \"broken_links\" in issues:\n                total_broken += len(issues[\"broken_links\"])\n\n    if args.summary:\n        print(\"Summary Statistics:\")\n        print(\"=\" * 60)\n        print(f\"  Files with broken links: {len(all_issues)}\")\n        print(f\"  Total broken links: {total_broken}\")\n    else:\n        # Detailed report\n        for file_path, issues in sorted(all_issues.items()):\n            if \"broken_links\" in issues:\n                print(f\"\\n{file_path}:\")\n                for link_url, line_num in issues[\"broken_links\"]:\n                    print(f\"  Line {line_num}: {link_url}\")\n\n    if args.find_orphans:\n        orphaned = find_orphaned_pages(files, args.docs_dir)\n        if orphaned:\n            print(\"\\n\" + \"=\" * 60)\n            print(\"Orphaned Pages (no incoming links):\")\n            print(\"=\" * 60)\n            for file_path in sorted(orphaned):\n                print(f\"  {file_path}\")\n            print(f\"\\nTotal orphaned pages: {len(orphaned)}\")\n\n    print(f\"\\nTotal files checked: {len(files)}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/fix_api_calls.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nFix API calls in documentation files.\n\nReplaces old API patterns with simplified versions:\n- client.chat.completions.create → client.create\n- client.chat.completions.create_partial → client.create_partial\n- client.chat.completions.create_iterable → client.create_iterable\n- client.chat.completions.create_with_completion → client.create_with_completion\n\"\"\"\n\nimport argparse\nimport re\nfrom pathlib import Path\n\n\ndef find_markdown_files(docs_dir: Path) -> list[Path]:\n    \"\"\"Find all markdown files in the docs directory.\"\"\"\n    return list(docs_dir.rglob(\"*.md\")) + list(docs_dir.rglob(\"*.ipynb\"))\n\n\ndef replace_api_calls(content: str, dry_run: bool = False) -> tuple[str, int]:  # noqa: ARG001\n    \"\"\"\n    Replace old API call patterns with simplified versions.\n\n    Returns:\n        Tuple of (new_content, number_of_replacements)\n    \"\"\"\n    replacements = 0\n\n    # Pattern mappings: (old_pattern, new_pattern)\n    patterns = [\n        (\n            r\"client\\.chat\\.completions\\.create_with_completion\\(\",\n            \"client.create_with_completion(\",\n        ),\n        (r\"client\\.chat\\.completions\\.create_partial\\(\", \"client.create_partial(\"),\n        (r\"client\\.chat\\.completions\\.create_iterable\\(\", \"client.create_iterable(\"),\n        (r\"client\\.chat\\.completions\\.create\\(\", \"client.create(\"),\n    ]\n\n    new_content = content\n    for old_pattern, new_pattern in patterns:\n        matches = len(re.findall(old_pattern, new_content))\n        if matches > 0:\n            new_content = re.sub(old_pattern, new_pattern, new_content)\n            replacements += matches\n\n    return new_content, replacements\n\n\ndef process_file(file_path: Path, dry_run: bool = False) -> int:\n    \"\"\"Process a single file and return number of replacements.\"\"\"\n    try:\n        content = file_path.read_text(encoding=\"utf-8\")\n        new_content, replacements = replace_api_calls(content, dry_run)\n\n        if replacements > 0:\n            if dry_run:\n                print(f\"Would fix {replacements} instances in {file_path}\")\n            else:\n                file_path.write_text(new_content, encoding=\"utf-8\")\n                print(f\"Fixed {replacements} instances in {file_path}\")\n\n        return replacements\n    except Exception as e:\n        print(f\"Error processing {file_path}: {e}\")\n        return 0\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Replace old API call patterns with simplified versions\"\n    )\n    parser.add_argument(\n        \"--docs-dir\",\n        type=Path,\n        default=Path(\"docs\"),\n        help=\"Directory containing documentation files (default: docs)\",\n    )\n    parser.add_argument(\n        \"--dry-run\",\n        action=\"store_true\",\n        help=\"Show what would be changed without making changes\",\n    )\n    parser.add_argument(\n        \"--file\",\n        type=Path,\n        help=\"Process a single file instead of all files\",\n    )\n\n    args = parser.parse_args()\n\n    if args.file:\n        files = [args.file]\n    else:\n        files = find_markdown_files(args.docs_dir)\n\n    total_replacements = 0\n    files_modified = 0\n\n    for file_path in files:\n        replacements = process_file(file_path, args.dry_run)\n        if replacements > 0:\n            total_replacements += replacements\n            files_modified += 1\n\n    print(f\"\\nSummary:\")\n    print(f\"  Files processed: {len(files)}\")\n    print(f\"  Files modified: {files_modified}\")\n    print(f\"  Total replacements: {total_replacements}\")\n\n    if args.dry_run:\n        print(\"\\nRun without --dry-run to apply changes\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/fix_doc_tests.py",
    "content": "#!/usr/bin/env python3\n\"\"\"Fix doc test formatting issues using --update-examples for each test file.\"\"\"\n\nimport subprocess\nimport sys\nfrom pathlib import Path\n\ntest_files = [\n    \"tests/docs/test_concepts_operations.py\",\n    \"tests/docs/test_examples_batch.py\",\n    \"tests/docs/test_examples_integrations.py\",\n    \"tests/docs/test_examples_multimodal.py\",\n    \"tests/docs/test_posts.py\",\n]\n\n\ndef run_update(test_file: str) -> bool:\n    \"\"\"Run --update-examples on a test file.\"\"\"\n    print(f\"\\n{'=' * 60}\")\n    print(f\"Processing: {test_file}\")\n    print(f\"{'=' * 60}\")\n\n    cmd = [\"uv\", \"run\", \"pytest\", test_file, \"--update-examples\", \"-q\", \"--tb=no\"]\n\n    try:\n        result = subprocess.run(\n            cmd, capture_output=True, text=True, cwd=Path(__file__).parent.parent\n        )\n\n        if result.returncode == 0:\n            print(f\"✓ Successfully updated {test_file}\")\n            return True\n        else:\n            # Even with errors, some files might have been updated\n            print(f\"⚠ Completed {test_file} with exit code {result.returncode}\")\n            if result.stdout:\n                print(\"STDOUT:\", result.stdout[-500:])  # Last 500 chars\n            return False\n    except Exception as e:\n        print(f\"✗ Error processing {test_file}: {e}\")\n        return False\n\n\nif __name__ == \"__main__\":\n    success_count = 0\n    for test_file in test_files:\n        if run_update(test_file):\n            success_count += 1\n\n    print(f\"\\n{'=' * 60}\")\n    print(f\"Summary: {success_count}/{len(test_files)} files processed\")\n    print(f\"{'=' * 60}\")\n\n    sys.exit(0 if success_count == len(test_files) else 1)\n"
  },
  {
    "path": "scripts/fix_old_patterns.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nFix old client initialization patterns in documentation files.\n\nReplaces old initialization patterns with from_provider:\n- instructor.from_openai(OpenAI()) → instructor.from_provider(\"openai/model-name\")\n- instructor.from_anthropic(Anthropic()) → instructor.from_provider(\"anthropic/model-name\")\n- instructor.patch(OpenAI()) → instructor.from_provider(\"openai/model-name\")\n- Similar patterns for all other providers\n\"\"\"\n\nimport argparse\nimport re\nfrom pathlib import Path\nfrom typing import List, Tuple\n\n\n# Mapping of provider names to their from_provider identifiers\nPROVIDER_MAPPING = {\n    \"openai\": \"openai\",\n    \"anthropic\": \"anthropic\",\n    \"google\": \"google\",\n    \"cohere\": \"cohere\",\n    \"mistral\": \"mistral\",\n    \"groq\": \"groq\",\n    \"litellm\": \"litellm\",\n    \"ollama\": \"ollama\",\n    \"azure\": \"azure\",\n    \"bedrock\": \"bedrock\",\n    \"vertex\": \"vertex\",\n    \"genai\": \"google\",  # Google GenAI\n    \"deepseek\": \"deepseek\",\n    \"fireworks\": \"fireworks\",\n    \"cerebras\": \"cerebras\",\n    \"together\": \"together\",\n    \"anyscale\": \"anyscale\",\n    \"perplexity\": \"perplexity\",\n    \"writer\": \"writer\",\n    \"openrouter\": \"openrouter\",\n    \"sambanova\": \"sambanova\",\n    \"truefoundry\": \"truefoundry\",\n    \"cortex\": \"cortex\",\n    \"databricks\": \"databricks\",\n    \"xai\": \"xai\",\n}\n\n\ndef find_markdown_files(docs_dir: Path) -> List[Path]:\n    \"\"\"Find all markdown files in the docs directory.\"\"\"\n    return list(docs_dir.rglob(\"*.md\")) + list(docs_dir.rglob(\"*.ipynb\"))\n\n\ndef extract_model_name(content: str, match_start: int, match_end: int) -> str:\n    \"\"\"\n    Try to extract model name from context around the match.\n    Looks for common patterns like model=\"...\", model='...', or model_name=...\n    \"\"\"\n    # Look backwards and forwards for model parameter\n    context_start = max(0, match_start - 200)\n    context_end = min(len(content), match_end + 200)\n    context = content[context_start:context_end]\n\n    # Try to find model parameter\n    model_match = re.search(\n        r'model\\s*[=:]\\s*[\"\\']([^\"\\']+)[\"\\']', context, re.IGNORECASE\n    )\n    if model_match:\n        return model_match.group(1)\n\n    # Default model names by provider\n    return \"gpt-4o\"  # Will need manual review for accuracy\n\n\ndef replace_from_pattern(\n    content: str, provider: str, dry_run: bool = False\n) -> Tuple[str, int]:\n    \"\"\"\n    Replace instructor.from_PROVIDER(Provider()) patterns.\n\n    Pattern: instructor.from_openai(OpenAI(model=\"...\"))\n    → instructor.from_provider(\"openai/model-name\")\n    \"\"\"\n    replacements = 0\n\n    # Pattern: instructor.from_PROVIDER(ProviderClass(...))\n    pattern = rf\"instructor\\.from_{provider}\\((\\w+)(\\([^)]*\\))?\\)\"\n\n    def replacer(match):\n        nonlocal replacements\n        provider_class = match.group(1)\n        args = match.group(2) or \"\"\n\n        # Try to extract model name from args\n        model_match = re.search(r'model\\s*=\\s*[\"\\']([^\"\\']+)[\"\\']', args)\n        if model_match:\n            model_name = model_match.group(1)\n        else:\n            # Default model - may need manual review\n            model_name = (\n                \"gpt-4o\" if provider == \"openai\" else \"claude-3-5-sonnet-20241022\"\n            )\n\n        replacements += 1\n        return f'instructor.from_provider(\"{provider}/{model_name}\")'\n\n    new_content = re.sub(pattern, replacer, content, flags=re.IGNORECASE)\n    return new_content, replacements\n\n\ndef replace_patch_pattern(content: str, dry_run: bool = False) -> Tuple[str, int]:\n    \"\"\"\n    Replace instructor.patch(Provider()) patterns.\n\n    Pattern: instructor.patch(OpenAI(model=\"...\"))\n    → instructor.from_provider(\"openai/model-name\")\n    \"\"\"\n    replacements = 0\n\n    # Pattern: instructor.patch(ProviderClass(...))\n    # Match common provider classes\n    provider_classes = \"|\".join(\n        [\n            \"OpenAI\",\n            \"Anthropic\",\n            \"GoogleGenerativeAI\",\n            \"Cohere\",\n            \"Mistral\",\n            \"Groq\",\n            \"LiteLLM\",\n            \"Ollama\",\n            \"Bedrock\",\n            \"VertexAI\",\n        ]\n    )\n\n    pattern = rf\"instructor\\.patch\\(({provider_classes})(\\([^)]*\\))?\\)\"\n\n    def replacer(match):\n        nonlocal replacements\n        provider_class = match.group(1)\n        args = match.group(2) or \"\"\n\n        # Map class name to provider identifier\n        class_to_provider = {\n            \"OpenAI\": \"openai\",\n            \"Anthropic\": \"anthropic\",\n            \"GoogleGenerativeAI\": \"google\",\n            \"Cohere\": \"cohere\",\n            \"Mistral\": \"mistral\",\n            \"Groq\": \"groq\",\n            \"LiteLLM\": \"litellm\",\n            \"Ollama\": \"ollama\",\n            \"Bedrock\": \"bedrock\",\n            \"VertexAI\": \"vertex\",\n        }\n\n        provider = class_to_provider.get(provider_class, \"openai\")\n\n        # Try to extract model name from args\n        model_match = re.search(r'model\\s*=\\s*[\"\\']([^\"\\']+)[\"\\']', args)\n        if model_match:\n            model_name = model_match.group(1)\n        else:\n            # Default models\n            defaults = {\n                \"openai\": \"gpt-4o\",\n                \"anthropic\": \"claude-3-5-sonnet-20241022\",\n                \"google\": \"gemini-1.5-pro\",\n            }\n            model_name = defaults.get(provider, \"gpt-4o\")\n\n        replacements += 1\n        return f'instructor.from_provider(\"{provider}/{model_name}\")'\n\n    new_content = re.sub(pattern, replacer, content)\n    return new_content, replacements\n\n\ndef replace_old_patterns(content: str, dry_run: bool = False) -> Tuple[str, int]:\n    \"\"\"\n    Replace all old initialization patterns.\n\n    Returns:\n        Tuple of (new_content, total_replacements)\n    \"\"\"\n    total_replacements = 0\n    new_content = content\n\n    # Replace instructor.patch() patterns first\n    new_content, patch_replacements = replace_patch_pattern(new_content, dry_run)\n    total_replacements += patch_replacements\n\n    # Replace instructor.from_* patterns for each provider\n    for provider in PROVIDER_MAPPING.keys():\n        new_content, from_replacements = replace_from_pattern(\n            new_content, provider, dry_run\n        )\n        total_replacements += from_replacements\n\n    return new_content, total_replacements\n\n\ndef process_file(file_path: Path, dry_run: bool = False) -> int:\n    \"\"\"Process a single file and return number of replacements.\"\"\"\n    try:\n        content = file_path.read_text(encoding=\"utf-8\")\n        new_content, replacements = replace_old_patterns(content, dry_run)\n\n        if replacements > 0:\n            if dry_run:\n                print(f\"Would fix {replacements} instances in {file_path}\")\n            else:\n                file_path.write_text(new_content, encoding=\"utf-8\")\n                print(f\"Fixed {replacements} instances in {file_path}\")\n\n        return replacements\n    except Exception as e:\n        print(f\"Error processing {file_path}: {e}\")\n        return 0\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Replace old client initialization patterns with from_provider\"\n    )\n    parser.add_argument(\n        \"--docs-dir\",\n        type=Path,\n        default=Path(\"docs\"),\n        help=\"Directory containing documentation files (default: docs)\",\n    )\n    parser.add_argument(\n        \"--dry-run\",\n        action=\"store_true\",\n        help=\"Show what would be changed without making changes\",\n    )\n    parser.add_argument(\n        \"--file\",\n        type=Path,\n        help=\"Process a single file instead of all files\",\n    )\n\n    args = parser.parse_args()\n\n    if args.file:\n        files = [args.file]\n    else:\n        files = find_markdown_files(args.docs_dir)\n\n    total_replacements = 0\n    files_modified = 0\n\n    for file_path in files:\n        replacements = process_file(file_path, args.dry_run)\n        if replacements > 0:\n            total_replacements += replacements\n            files_modified += 1\n\n    print(f\"\\nSummary:\")\n    print(f\"  Files processed: {len(files)}\")\n    print(f\"  Files modified: {files_modified}\")\n    print(f\"  Total replacements: {total_replacements}\")\n\n    if args.dry_run:\n        print(\"\\nRun without --dry-run to apply changes\")\n    else:\n        print(\"\\n⚠️  Note: Please review model names - defaults may need adjustment\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/make_clean.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nClean markdown files in the docs directory.\n\nThis script:\n- Recursively finds all .md files in the docs directory\n- Strips special whitespace characters (non-breaking spaces, zero-width spaces, etc.)\n- Replaces em dashes (—) with regular dashes (-)\n- Preserves the original file structure\n\"\"\"\n\nimport re\nimport unicodedata\nfrom pathlib import Path\n\n\ndef clean_markdown_content(content: str) -> str:\n    \"\"\"\n    Clean markdown content by removing special whitespace and replacing em dashes.\n\n    Args:\n        content: The original markdown content\n\n    Returns:\n        The cleaned markdown content\n    \"\"\"\n    # Replace em dashes with regular dashes\n    content = content.replace(\"—\", \"-\")\n    content = content.replace(\"–\", \"-\")  # en dash as well\n\n    # Remove special whitespace characters\n    # This includes non-breaking spaces, zero-width spaces, and other Unicode whitespace\n    cleaned_lines = []\n    for line in content.split(\"\\n\"):\n        # Normalize Unicode characters and remove special whitespace\n        cleaned_line = unicodedata.normalize(\"NFKC\", line)\n        # Remove zero-width characters and other special whitespace\n        cleaned_line = re.sub(r\"[\\u200B\\u200C\\u200D\\uFEFF]\", \"\", cleaned_line)\n        # Replace non-breaking spaces with regular spaces\n        cleaned_line = cleaned_line.replace(\"\\u00a0\", \" \")\n        # Strip leading/trailing whitespace but preserve intentional indentation\n        cleaned_line = cleaned_line.rstrip()\n        cleaned_lines.append(cleaned_line)\n\n    return \"\\n\".join(cleaned_lines)\n\n\ndef process_markdown_files(docs_dir: str = \"docs\", dry_run: bool = False) -> None:\n    \"\"\"\n    Process all markdown files in the docs directory.\n\n    Args:\n        docs_dir: Path to the docs directory (default: \"docs\")\n        dry_run: If True, show what would be changed without modifying files\n    \"\"\"\n    docs_path = Path(docs_dir)\n\n    if not docs_path.exists():\n        print(f\"Error: Directory '{docs_dir}' does not exist.\")\n        return\n\n    if not docs_path.is_dir():\n        print(f\"Error: '{docs_dir}' is not a directory.\")\n        return\n\n    # Find all markdown files recursively\n    md_files = list(docs_path.rglob(\"*.md\"))\n\n    if not md_files:\n        print(f\"No markdown files found in '{docs_dir}' directory.\")\n        return\n\n    mode_text = \"DRY RUN - \" if dry_run else \"\"\n    print(f\"{mode_text}Found {len(md_files)} markdown files to process...\")\n\n    processed_count = 0\n    modified_count = 0\n\n    for md_file in md_files:\n        try:\n            # Read the original content\n            with open(md_file, encoding=\"utf-8\") as f:\n                original_content = f.read()\n\n            # Clean the content\n            cleaned_content = clean_markdown_content(original_content)\n\n            # Check if content was modified\n            if cleaned_content != original_content:\n                if dry_run:\n                    print(f\"Would modify: {md_file}\")\n                    # Show a sample of the changes\n                    original_lines = original_content.split(\"\\n\")\n                    cleaned_lines = cleaned_content.split(\"\\n\")\n                    for i, (orig, clean) in enumerate(\n                        zip(original_lines, cleaned_lines)\n                    ):\n                        if orig != clean:\n                            print(f\"  Line {i + 1}:\")\n                            print(f\"    Original: {repr(orig)}\")\n                            print(f\"    Cleaned:  {repr(clean)}\")\n                            # Only show first difference per file\n                            break\n                else:\n                    # Write the cleaned content back to the file\n                    with open(md_file, \"w\", encoding=\"utf-8\") as f:\n                        f.write(cleaned_content)\n                    print(f\"Modified: {md_file}\")\n                modified_count += 1\n            else:\n                if not dry_run:\n                    print(f\"No changes needed: {md_file}\")\n\n            processed_count += 1\n\n        except Exception as e:\n            print(f\"Error processing {md_file}: {e}\")\n\n    action_text = \"would be\" if dry_run else \"were\"\n    print(f\"\\nProcessing complete!\")\n    print(f\"Total files processed: {processed_count}\")\n    print(f\"Files {action_text} modified: {modified_count}\")\n\n\ndef main():\n    \"\"\"Main function to handle command line arguments.\"\"\"\n    import argparse\n\n    parser = argparse.ArgumentParser(\n        description=\"Clean markdown files by removing special whitespace and replacing em dashes\"\n    )\n    parser.add_argument(\n        \"--docs-dir\", default=\"docs\", help=\"Path to docs directory (default: docs)\"\n    )\n    parser.add_argument(\n        \"--dry-run\",\n        action=\"store_true\",\n        help=\"Show what would be changed without modifying files\",\n    )\n\n    args = parser.parse_args()\n\n    process_markdown_files(docs_dir=args.docs_dir, dry_run=args.dry_run)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/make_desc.py",
    "content": "import os\nfrom typing import Optional, Literal\nimport asyncio\nfrom openai import AsyncOpenAI\nimport typer\nfrom rich.console import Console\nfrom rich.progress import Progress\nfrom rich.table import Table\nfrom pydantic import BaseModel, Field\nimport instructor\nimport frontmatter\n\nconsole = Console()\nclient = instructor.from_openai(AsyncOpenAI())\n\n\nasync def generate_ai_frontmatter(\n    client: AsyncOpenAI, title: str, content: str, categories: list[str]\n):\n    \"\"\"\n    Generate a description and categories for the given content using AI.\n\n    Args:\n        client (AsyncOpenAI): The AsyncOpenAI client.\n        title (str): The title of the markdown file.\n        content (str): The content of the file.\n        categories (List[str]): List of all available categories.\n\n    Returns:\n        DescriptionAndCategories: The generated description, categories, tags, and reasoning.\n    \"\"\"\n\n    class DescriptionAndCategories(BaseModel):\n        description: str\n        reasoning: str = Field(\n            ..., description=\"The reasoning for the correct categories\"\n        )\n        tags: list[str]\n        categories: list[\n            Literal[\n                \"OpenAI\",\n                \"Anthropic\",\n                \"LLama\",\n                \"LLM Observability\",\n                \"Data Processing\",\n                \"Python\",\n                \"LLM Techniques\",\n                \"Pydantic\",\n                \"Performance Optimization\",\n                \"Data Validation\",\n                \"API Development\",\n                \"Retrieval Augmented Generation\",\n            ]\n        ]\n\n    response = await client.chat.completions.create(\n        model=\"gpt-4o-mini\",\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are an AI assistant that generates SEO-friendly descriptions for markdown files.\",\n            },\n            {\"role\": \"user\", \"content\": f\"Title: {title}\\n\\nContent: {content}\"},\n            {\n                \"role\": \"user\",\n                \"content\": f\"Based on the title and content, generate a brief description (max 160 characters) that would be suitable for SEO purposes. Also, select up to 3 relevant categories from the following list: {', '.join(categories)}. Return both the description and the selected categories. The categories should be pretty strict, so only choose one if you're really sure it's the best choice. Also, suggest up to 5 relevant tags.\",\n            },\n        ],\n        max_tokens=150,\n        response_model=DescriptionAndCategories,\n    )\n    return response\n\n\ndef get_all_categories(root_dir: str) -> set[str]:\n    \"\"\"\n    Read all markdown files and extract unique categories.\n\n    Args:\n        root_dir (str): The root directory to start processing from.\n\n    Returns:\n        Set[str]: A set of unique categories.\n    \"\"\"\n    categories = set()\n    for root, _, files in os.walk(root_dir):\n        for file in files:\n            if file.endswith(\".md\"):\n                file_path = os.path.join(root, file)\n                post = frontmatter.load(file_path)\n                if \"categories\" in post.metadata:\n                    categories.update(post.metadata[\"categories\"])\n    return categories\n\n\ndef preview_categories(root_dir: str) -> None:\n    \"\"\"\n    Preview all categories found in markdown files.\n\n    Args:\n        root_dir (str): The root directory to start processing from.\n    \"\"\"\n    categories = get_all_categories(root_dir)\n\n    table = Table(title=\"Categories Preview\")\n    table.add_column(\"Category\", style=\"cyan\")\n\n    for category in sorted(categories):\n        table.add_row(category)\n\n    console.print(table)\n    console.print(f\"\\nTotal categories found: {len(categories)}\")\n\n\nasync def process_file(\n    client: AsyncOpenAI, file_path: str, categories: list[str], enable_comments: bool\n) -> None:\n    \"\"\"\n    Process a single file, adding or updating the description and categories in the front matter.\n\n    Args:\n        client (AsyncOpenAI): The AsyncOpenAI client.\n        file_path (str): The path to the file to process.\n        categories (List[str]): List of all available categories.\n        enable_comments (bool): Whether to enable comments in the front matter.\n    \"\"\"\n    post = frontmatter.load(file_path)\n    title = post.metadata.get(\"title\", os.path.basename(file_path))\n\n    response = await generate_ai_frontmatter(client, title, post.content, categories)\n    post.metadata[\"description\"] = response.description\n    post.metadata[\"categories\"] = response.categories\n    post.metadata[\"tags\"] = response.tags\n\n    if enable_comments:\n        post.metadata[\"comments\"] = True\n\n    with open(file_path, \"w\", encoding=\"utf-8\") as file:\n        file.write(frontmatter.dumps(post))\n\n    console.print(f\"[green]Updated front matter in {file_path}[/green]\")\n\n\nasync def process_files(\n    root_dir: str,\n    api_key: Optional[str] = None,  # noqa: ARG001\n    use_categories: bool = False,\n    enable_comments: bool = False,\n) -> None:\n    \"\"\"\n    Process all markdown files in the given directory and its subdirectories.\n\n    Args:\n        root_dir (str): The root directory to start processing from.\n        api_key (Optional[str]): The OpenAI API key. If not provided, it will be read from the OPENAI_API_KEY environment variable.\n        use_categories (bool): Whether to first read all files and generate a list of categories.\n        enable_comments (bool): Whether to enable comments in the front matter.\n    \"\"\"\n    markdown_files = []\n    for root, _, files in os.walk(root_dir):\n        for file in files:\n            if file.endswith(\".md\"):\n                markdown_files.append(os.path.join(root, file))\n\n    categories = list(get_all_categories(root_dir)) if use_categories else []\n\n    with Progress() as progress:\n        task = progress.add_task(\n            \"[green]Processing files...\", total=len(markdown_files)\n        )\n\n        async def process_and_update(file_path: str) -> None:\n            await process_file(client, file_path, categories, enable_comments)\n            progress.update(task, advance=1)\n\n        tasks = [process_and_update(file_path) for file_path in markdown_files]\n        await asyncio.gather(*tasks)\n\n    console.print(\"[bold green]All files processed successfully![/bold green]\")\n\n\napp = typer.Typer()\n\n\n@app.command()\ndef main(\n    root_dir: str = typer.Option(\"docs\", help=\"Root directory to process\"),\n    api_key: Optional[str] = typer.Option(None, help=\"OpenAI API key\"),\n    use_categories: bool = typer.Option(False, help=\"Use categories from all files\"),\n    preview_only: bool = typer.Option(\n        False, help=\"Preview categories without processing files\"\n    ),\n    enable_comments: bool = typer.Option(\n        False, help=\"Enable comments in the front matter\"\n    ),\n):\n    \"\"\"\n    Add or update description in front matter of markdown files in the given directory and its subdirectories.\n    \"\"\"\n    if preview_only:\n        preview_categories(root_dir)\n    else:\n        asyncio.run(process_files(root_dir, api_key, use_categories, enable_comments))\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "scripts/make_sitemap.py",
    "content": "import os\nimport asyncio\nimport yaml\nfrom typing import Optional, Any\nfrom collections.abc import Generator\nfrom openai import AsyncOpenAI\nimport typer\nfrom rich.console import Console\nfrom rich.progress import Progress\nimport hashlib\nfrom asyncio import as_completed\nimport tenacity\nimport re\n\nconsole = Console()\n\n\ndef traverse_docs(\n    root_dir: str = \"docs\",\n) -> Generator[tuple[str, str, str], None, None]:\n    \"\"\"\n    Recursively traverse the docs folder and yield the path, content, and content hash of each file.\n\n    Args:\n        root_dir (str): The root directory to start traversing from. Defaults to 'docs'.\n\n    Yields:\n        Tuple[str, str, str]: A tuple containing the relative path from 'docs', the file content, and the content hash.\n    \"\"\"\n    for root, _, files in os.walk(root_dir):\n        for file in files:\n            if file.endswith(\".md\"):  # Assuming we're only interested in Markdown files\n                file_path = os.path.join(root, file)\n                relative_path = os.path.relpath(file_path, root_dir)\n\n                with open(file_path, encoding=\"utf-8\") as f:\n                    content = f.read()\n\n                content_hash = hashlib.md5(content.encode()).hexdigest()\n                yield relative_path, content, content_hash\n\n\ndef extract_markdown_links(content: str) -> list[str]:\n    \"\"\"\n    Extract all markdown links from the content.\n\n    Args:\n        content (str): The markdown content to analyze\n\n    Returns:\n        List[str]: List of extracted link paths\n    \"\"\"\n    # Match markdown links [text](path)\n    link_pattern = r\"\\[([^\\]]+)\\]\\(([^)]+)\\)\"\n    matches = re.findall(link_pattern, content)\n\n    links = []\n    for _, link_path in matches:\n        # Filter out external links and anchors\n        if not link_path.startswith((\"http://\", \"https://\", \"#\", \"mailto:\")):\n            # Clean up relative paths\n            link_path = link_path.strip(\"/\")\n            if link_path.endswith(\".md\"):\n                links.append(link_path)\n            elif \".\" not in link_path:\n                # Assume it's a directory reference, add index.md\n                links.append(f\"{link_path}/index.md\")\n\n    return links\n\n\ndef normalize_path(path: str, current_path: str) -> str:\n    \"\"\"\n    Normalize a relative path based on the current file's location.\n\n    Args:\n        path (str): The path to normalize\n        current_path (str): The current file's path\n\n    Returns:\n        str: The normalized path\n    \"\"\"\n    if path.startswith(\"/\"):\n        # Absolute path from docs root\n        return path.strip(\"/\")\n\n    # Relative path\n    current_dir = os.path.dirname(current_path)\n    if current_dir:\n        normalized = os.path.normpath(os.path.join(current_dir, path))\n        # Remove any leading '../' that go outside docs/\n        while normalized.startswith(\"../\"):\n            normalized = normalized[3:]\n        return normalized\n\n    return path\n\n\n@tenacity.retry(\n    stop=tenacity.stop_after_attempt(3),\n    wait=tenacity.wait_exponential(multiplier=1, min=4, max=10),\n    retry=tenacity.retry_if_exception_type(Exception),\n    before_sleep=lambda retry_state: console.print(\n        f\"[yellow]Retrying analysis... (Attempt {retry_state.attempt_number})[/yellow]\"\n    ),\n)\nasync def analyze_content(\n    client: AsyncOpenAI, path: str, content: str\n) -> dict[str, Any]:\n    \"\"\"\n    Analyze the content of a file to extract summary, keywords, topics, and references.\n\n    Args:\n        client (AsyncOpenAI): The AsyncOpenAI client.\n        path (str): The path of the file.\n        content (str): The content of the file.\n\n    Returns:\n        Dict[str, Any]: Analysis results including summary, keywords, topics, and references.\n\n    Raises:\n        Exception: If all retry attempts fail.\n    \"\"\"\n    try:\n        response = await client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"\"\"You are a documentation analyzer. Extract and return the following information in a structured format:\n1. A concise summary (2-3 sentences) for SEO\n2. A list of important keywords (5-10 words/phrases)\n3. Main topics/concepts covered (3-5 topics)\n4. Any references to other documentation pages mentioned in the text\n\nReturn the response in this exact format:\nSUMMARY: [Your summary here]\nKEYWORDS: [keyword1, keyword2, keyword3, ...]\nTOPICS: [topic1, topic2, topic3, ...]\nREFERENCES: [referenced_page1.md, referenced_page2.md, ...]\n\nIf no references are found, write: REFERENCES: none\"\"\",\n                },\n                {\"role\": \"user\", \"content\": content},\n            ],\n            max_tokens=4000,\n        )\n\n        result_text = response.choices[0].message.content\n\n        # Parse the structured response\n        summary = \"\"\n        keywords = []\n        topics = []\n        references = []\n\n        if result_text:\n            for line in result_text.split(\"\\n\"):\n                line = line.strip()\n                if line.startswith(\"SUMMARY:\"):\n                    summary = line[8:].strip()\n                elif line.startswith(\"KEYWORDS:\"):\n                    keywords_text = line[9:].strip()\n                    if keywords_text and keywords_text != \"none\":\n                        keywords = [k.strip() for k in keywords_text.split(\",\")]\n                elif line.startswith(\"TOPICS:\"):\n                    topics_text = line[7:].strip()\n                    if topics_text and topics_text != \"none\":\n                        topics = [t.strip() for t in topics_text.split(\",\")]\n                elif line.startswith(\"REFERENCES:\"):\n                    refs_text = line[11:].strip()\n                    if refs_text and refs_text != \"none\":\n                        references = [r.strip() for r in refs_text.split(\",\")]\n\n        return {\n            \"summary\": summary,\n            \"keywords\": keywords,\n            \"topics\": topics,\n            \"ai_references\": references,\n        }\n\n    except Exception as e:\n        console.print(f\"[bold red]Error analyzing {path}: {str(e)}[/bold red]\")\n        raise\n\n\nasync def generate_sitemap(\n    root_dir: str,\n    output_file: str,\n    api_key: Optional[str] = None,\n    max_concurrency: int = 5,\n) -> None:\n    \"\"\"\n    Generate a sitemap from the given root directory.\n\n    Args:\n        root_dir (str): The root directory to start traversing from.\n        output_file (str): The output file to save the sitemap.\n        api_key (Optional[str]): The OpenAI API key. If not provided, it will be read from the OPENAI_API_KEY environment variable.\n        max_concurrency (int): The maximum number of concurrent tasks. Defaults to 5.\n    \"\"\"\n    client = AsyncOpenAI(api_key=api_key)\n\n    # Load existing sitemap if it exists\n    existing_sitemap: dict[str, dict[str, Any]] = {}\n    if os.path.exists(output_file):\n        with open(output_file, encoding=\"utf-8\") as sitemap_file:\n            existing_sitemap = yaml.safe_load(sitemap_file) or {}\n\n    sitemap_data: dict[str, dict[str, Any]] = {}\n\n    async def process_file(\n        path: str, content: str, content_hash: str\n    ) -> tuple[str, dict[str, Any]]:\n        # Check if we can reuse existing data\n        if (\n            path in existing_sitemap\n            and existing_sitemap[path].get(\"hash\") == content_hash\n        ):\n            # Extract markdown links even for cached content\n            links = extract_markdown_links(content)\n            normalized_links = []\n            for link in links:\n                normalized = normalize_path(link, path)\n                if normalized:\n                    normalized_links.append(normalized)\n\n            existing_data = existing_sitemap[path].copy()\n            existing_data[\"references\"] = normalized_links\n            return path, existing_data\n\n        try:\n            # Extract markdown links\n            links = extract_markdown_links(content)\n            normalized_links = []\n            for link in links:\n                normalized = normalize_path(link, path)\n                if normalized:\n                    normalized_links.append(normalized)\n\n            # Get AI analysis\n            analysis = await analyze_content(client, path, content)\n\n            return path, {\n                \"summary\": analysis[\"summary\"],\n                \"keywords\": analysis[\"keywords\"],\n                \"topics\": analysis[\"topics\"],\n                \"references\": normalized_links,\n                \"ai_references\": analysis[\"ai_references\"],\n                \"hash\": content_hash,\n            }\n        except Exception as e:\n            console.print(\n                f\"[bold red]Failed to analyze {path} after multiple attempts: {str(e)}[/bold red]\"\n            )\n            return path, {\n                \"summary\": \"Failed to generate summary\",\n                \"keywords\": [],\n                \"topics\": [],\n                \"references\": normalized_links,\n                \"ai_references\": [],\n                \"hash\": content_hash,\n            }\n\n    files_to_process: list[tuple[str, str, str]] = list(traverse_docs(root_dir))\n    total_files = len(files_to_process)\n\n    with Progress() as progress:\n        task = progress.add_task(\"[green]Processing files...\", total=total_files)\n\n        semaphore = asyncio.Semaphore(max_concurrency)\n\n        async def bounded_process_file(*args):\n            async with semaphore:\n                return await process_file(*args)\n\n        tasks = [\n            bounded_process_file(path, content, content_hash)\n            for path, content, content_hash in files_to_process\n        ]\n\n        for completed_task in as_completed(tasks):\n            path, result = await completed_task\n            sitemap_data[path] = result\n            progress.update(task, advance=1)\n\n    # Save final results\n    with open(output_file, \"w\", encoding=\"utf-8\") as sitemap_file:\n        yaml.dump(sitemap_data, sitemap_file, default_flow_style=False, sort_keys=True)\n\n    console.print(\n        f\"[bold green]Sitemap has been generated and saved to {output_file}[/bold green]\"\n    )\n    console.print(f\"[green]Processed {total_files} files[/green]\")\n\n\napp = typer.Typer()\n\n\n@app.command()\ndef main(\n    root_dir: str = typer.Option(\"docs\", help=\"Root directory to traverse\"),\n    output_file: str = typer.Option(\"sitemap.yaml\", help=\"Output file for the sitemap\"),\n    api_key: Optional[str] = typer.Option(None, help=\"OpenAI API key\"),\n    max_concurrency: int = typer.Option(5, help=\"Maximum number of concurrent tasks\"),\n):\n    \"\"\"\n    Generate a sitemap with keywords, topics, and reference analysis.\n    \"\"\"\n    asyncio.run(generate_sitemap(root_dir, output_file, api_key, max_concurrency))\n\n\nif __name__ == \"__main__\":\n    app()\n"
  },
  {
    "path": "scripts/validate_headings.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nValidate heading structure in documentation files.\n\nChecks for:\n- Multiple H1 tags (should only have one)\n- Heading hierarchy violations (e.g., H1 → H3 skipping H2)\n- Missing H1 tags\n\"\"\"\n\nimport argparse\nimport re\nfrom collections import defaultdict\nfrom pathlib import Path\n\n\ndef find_markdown_files(docs_dir: Path) -> list[Path]:\n    \"\"\"Find all markdown files in the docs directory.\"\"\"\n    return list(docs_dir.rglob(\"*.md\"))\n\n\ndef extract_headings(content: str) -> list[tuple[int, str, int]]:\n    \"\"\"\n    Extract all headings from markdown content.\n\n    Returns:\n        List of (level, text, line_number) tuples\n    \"\"\"\n    headings = []\n    lines = content.split(\"\\n\")\n\n    for line_num, line in enumerate(lines, 1):\n        # Match markdown headings: # Title, ## Title, etc.\n        match = re.match(r\"^(#{1,6})\\s+(.+)$\", line)\n        if match:\n            level = len(match.group(1))\n            text = match.group(2).strip()\n            headings.append((level, text, line_num))\n\n    return headings\n\n\ndef validate_headings(headings: list[tuple[int, str, int]]) -> dict[str, list[str]]:\n    \"\"\"Validate heading structure.\"\"\"\n    issues = {}\n\n    if not headings:\n        issues[\"no_headings\"] = [\"No headings found in file\"]\n        return issues\n\n    # Check for H1\n    h1_headings = [h for h in headings if h[0] == 1]\n    if not h1_headings:\n        issues[\"missing_h1\"] = [\"No H1 heading found\"]\n    elif len(h1_headings) > 1:\n        issues[\"multiple_h1\"] = [\n            f\"Line {line}: {text}\" for level, text, line in h1_headings\n        ]\n\n    # Check heading hierarchy\n    prev_level = 0\n    hierarchy_violations = []\n    for level, text, line_num in headings:\n        if prev_level > 0 and level > prev_level + 1:\n            hierarchy_violations.append(\n                f\"Line {line_num}: Skipped from H{prev_level} to H{level}: {text[:50]}\"\n            )\n        prev_level = level\n\n    if hierarchy_violations:\n        issues[\"hierarchy_violations\"] = hierarchy_violations\n\n    return issues\n\n\ndef process_file(file_path: Path) -> dict[str, list[str]]:\n    \"\"\"Process a single file and return issues.\"\"\"\n    try:\n        content = file_path.read_text(encoding=\"utf-8\")\n        headings = extract_headings(content)\n        return validate_headings(headings)\n    except Exception as e:\n        return {\"error\": [str(e)]}\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Validate heading structure in documentation files\"\n    )\n    parser.add_argument(\n        \"--docs-dir\",\n        type=Path,\n        default=Path(\"docs\"),\n        help=\"Directory containing documentation files (default: docs)\",\n    )\n    parser.add_argument(\n        \"--summary\",\n        action=\"store_true\",\n        help=\"Show only summary statistics\",\n    )\n    parser.add_argument(\n        \"--file\",\n        type=Path,\n        help=\"Validate a single file instead of all files\",\n    )\n\n    args = parser.parse_args()\n\n    if args.file:\n        files = [args.file]\n    else:\n        files = find_markdown_files(args.docs_dir)\n\n    all_issues = {}\n    total_counts = defaultdict(int)\n\n    for file_path in files:\n        issues = process_file(file_path)\n        if issues:\n            all_issues[str(file_path)] = issues\n            for issue_type, messages in issues.items():\n                total_counts[issue_type] += len(messages)\n\n    if args.summary:\n        print(\"Summary Statistics:\")\n        print(\"=\" * 60)\n        for issue_type, count in sorted(total_counts.items()):\n            print(f\"  {issue_type.replace('_', ' ').title()}: {count}\")\n    else:\n        # Detailed report\n        for file_path, issues in sorted(all_issues.items()):\n            print(f\"\\n{file_path}:\")\n            for issue_type, messages in issues.items():\n                print(f\"  {issue_type.replace('_', ' ').title()}:\")\n                for message in messages:\n                    print(f\"    {message}\")\n\n    print(f\"\\nTotal files checked: {len(files)}\")\n    print(f\"Files with issues: {len(all_issues)}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/validate_meta_tags.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nValidate frontmatter meta tags in documentation files.\n\nChecks for:\n- Missing title/description\n- Title length (50-60 chars recommended)\n- Description length (150-160 chars recommended)\n- Duplicate titles/descriptions\n\"\"\"\n\nimport argparse\nimport re\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Dict, List\n\n\ndef find_markdown_files(docs_dir: Path) -> List[Path]:\n    \"\"\"Find all markdown files in the docs directory.\"\"\"\n    return list(docs_dir.rglob(\"*.md\"))\n\n\ndef extract_frontmatter(content: str) -> Dict[str, str]:\n    \"\"\"Extract frontmatter from markdown content.\"\"\"\n    frontmatter = {}\n\n    # Match YAML frontmatter between --- markers\n    match = re.match(r\"^---\\s*\\n(.*?)\\n---\\s*\\n\", content, re.DOTALL)\n    if not match:\n        return frontmatter\n\n    yaml_content = match.group(1)\n\n    # Extract title\n    title_match = re.search(r\"^title:\\s*(.+)$\", yaml_content, re.MULTILINE)\n    if title_match:\n        frontmatter[\"title\"] = title_match.group(1).strip(\" \\\"'\")\n\n    # Extract description\n    desc_match = re.search(r\"^description:\\s*(.+)$\", yaml_content, re.MULTILINE)\n    if desc_match:\n        frontmatter[\"description\"] = desc_match.group(1).strip(\" \\\"'\")\n\n    # Extract keywords\n    keywords_match = re.search(r\"^keywords:\\s*(.+)$\", yaml_content, re.MULTILINE)\n    if keywords_match:\n        frontmatter[\"keywords\"] = keywords_match.group(1).strip(\" \\\"'\")\n\n    return frontmatter\n\n\ndef validate_file(file_path: Path) -> Dict[str, List[str]]:\n    \"\"\"Validate a single file's frontmatter.\"\"\"\n    issues = {}\n\n    try:\n        content = file_path.read_text(encoding=\"utf-8\")\n        frontmatter = extract_frontmatter(content)\n\n        # Check for missing frontmatter\n        if not frontmatter:\n            issues[\"missing_frontmatter\"] = [\"No frontmatter found\"]\n            return issues\n\n        # Check title\n        if \"title\" not in frontmatter:\n            issues[\"missing_title\"] = [\"Title missing from frontmatter\"]\n        else:\n            title = frontmatter[\"title\"]\n            title_len = len(title)\n            if title_len < 50:\n                issues[\"title_too_short\"] = [\n                    f\"Title is {title_len} chars (recommend 50-60 for SEO)\"\n                ]\n            elif title_len > 60:\n                issues[\"title_too_long\"] = [\n                    f\"Title is {title_len} chars (recommend 50-60 for SEO)\"\n                ]\n\n        # Check description\n        if \"description\" not in frontmatter:\n            issues[\"missing_description\"] = [\"Description missing from frontmatter\"]\n        else:\n            desc = frontmatter[\"description\"]\n            desc_len = len(desc)\n            if desc_len < 150:\n                issues[\"description_too_short\"] = [\n                    f\"Description is {desc_len} chars (recommend 150-160 for SEO)\"\n                ]\n            elif desc_len > 160:\n                issues[\"description_too_long\"] = [\n                    f\"Description is {desc_len} chars (recommend 150-160 for SEO)\"\n                ]\n\n        return issues\n    except Exception as e:\n        return {\"error\": [str(e)]}\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Validate frontmatter meta tags in documentation files\"\n    )\n    parser.add_argument(\n        \"--docs-dir\",\n        type=Path,\n        default=Path(\"docs\"),\n        help=\"Directory containing documentation files (default: docs)\",\n    )\n    parser.add_argument(\n        \"--summary\",\n        action=\"store_true\",\n        help=\"Show only summary statistics\",\n    )\n    parser.add_argument(\n        \"--file\",\n        type=Path,\n        help=\"Validate a single file instead of all files\",\n    )\n    parser.add_argument(\n        \"--check-duplicates\",\n        action=\"store_true\",\n        help=\"Check for duplicate titles and descriptions\",\n    )\n\n    args = parser.parse_args()\n\n    if args.file:\n        files = [args.file]\n    else:\n        files = find_markdown_files(args.docs_dir)\n\n    all_issues = {}\n    total_counts = defaultdict(int)\n\n    # Track titles and descriptions for duplicate checking\n    titles = defaultdict(list)\n    descriptions = defaultdict(list)\n\n    for file_path in files:\n        issues = validate_file(file_path)\n        if issues:\n            all_issues[str(file_path)] = issues\n            for issue_type, messages in issues.items():\n                total_counts[issue_type] += len(messages)\n\n        # Collect titles and descriptions for duplicate checking\n        if args.check_duplicates:\n            content = file_path.read_text(encoding=\"utf-8\")\n            frontmatter = extract_frontmatter(content)\n            if \"title\" in frontmatter:\n                titles[frontmatter[\"title\"]].append(str(file_path))\n            if \"description\" in frontmatter:\n                descriptions[frontmatter[\"description\"]].append(str(file_path))\n\n    if args.summary:\n        print(\"Summary Statistics:\")\n        print(\"=\" * 60)\n        for issue_type, count in sorted(total_counts.items()):\n            print(f\"  {issue_type.replace('_', ' ').title()}: {count} files\")\n    else:\n        # Detailed report\n        for file_path, issues in sorted(all_issues.items()):\n            print(f\"\\n{file_path}:\")\n            for issue_type, messages in issues.items():\n                for message in messages:\n                    print(f\"  - {message}\")\n\n    # Check for duplicates\n    if args.check_duplicates:\n        print(\"\\n\" + \"=\" * 60)\n        print(\"Duplicate Titles:\")\n        print(\"=\" * 60)\n        for title, file_list in sorted(titles.items()):\n            if len(file_list) > 1:\n                print(f\"\\n{title}\")\n                for f in file_list:\n                    print(f\"  - {f}\")\n\n        print(\"\\n\" + \"=\" * 60)\n        print(\"Duplicate Descriptions:\")\n        print(\"=\" * 60)\n        for desc, file_list in sorted(descriptions.items()):\n            if len(file_list) > 1:\n                print(f\"\\n{desc}\")\n                for f in file_list:\n                    print(f\"  - {f}\")\n\n    print(f\"\\nTotal files checked: {len(files)}\")\n    print(f\"Files with issues: {len(all_issues)}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "sitemap.yaml",
    "content": "api.md:\n  cross_links: []\n  hash: 4512e518bca21bfdbbc97752e007d64f\n  references: []\n  summary: 'The API Reference Guide provides a thorough overview of various components\n    related to instructors, validation, iteration, and function calls within a programming\n    framework. Key topics include OpenAI instructors, DSL validators, iterable structures,\n    partial applications, parallel processing, and optional operations through the\n    ''maybe'' moniker. It also delves into function call mechanisms, offering developers\n    essential information for implementing efficient and robust APIs. This guide serves\n    as a vital resource for those seeking to enhance their understanding and application\n    of API-related functionalities. Keywords: API reference, instructors, validation,\n    iteration, function calls, OpenAI, DSL validators, parallel processing.'\narchitecture.md:\n  ai_references: []\n  cross_links: []\n  hash: 141a2c4c63d93091402d5bf4e39b04f8\n  keywords:\n  - Instructor\n  - LLM providers\n  - Pydantic Model\n  - Schema Converter\n  - API Request\n  - Response Parser\n  - Validator\n  - Retry Mechanism\n  references: []\n  summary: The Instructor Architecture document elucidates the internal workings of\n    the Instructor system and its integration with various Large Language Model (LLM)\n    providers. It details the core components that facilitate seamless interactions\n    and structured data handling in a consistent manner across different providers.\n  topics:\n  - Core Components\n  - Request Flow\n  - Data Validation\n  - LLM Integration\n  - Structured Output\nblog/index.md:\n  cross_links:\n  - blog/posts/aisummit-2023.md\n  - blog/posts/announcing-unified-provider-interface.md\n  - blog/posts/caching.md\n  - blog/posts/chain-of-density.md\n  - blog/posts/citations.md\n  - blog/posts/distilation-part1.md\n  - blog/posts/generator.md\n  - blog/posts/langsmith.md\n  - blog/posts/learn-async.md\n  - blog/posts/llms-txt-adoption.md\n  - blog/posts/logfire.md\n  - blog/posts/rag-and-beyond.md\n  - blog/posts/validation-part1.md\n  - concepts/partial.md\n  - examples/batch_job_oai.md\n  - examples/bulk_classification.md\n  - examples/image_to_ad_copy.md\n  - integrations/llama-cpp-python.md\n  - integrations/ollama.md\n  - integrations/together.md\n  - prompting/decomposition/least_to_most.md\n  - prompting/self_criticism/chain_of_verification.md\n  - prompting/self_criticism/cumulative_reason.md\n  - prompting/self_criticism/reversecot.md\n  hash: 04ec2689ed366f014bc3f15ce4fd0b42\n  references:\n  - blog/posts/announcing-unified-provider-interface.md\n  - blog/posts/llms-txt-adoption.md\n  - blog/posts/rag-and-beyond.md\n  - blog/posts/chain-of-density.md\n  - blog/posts/validation-part1.md\n  - blog/posts/citations.md\n  - blog/posts/distilation-part1.md\n  - blog/posts/langsmith.md\n  - blog/posts/logfire.md\n  - blog/posts/caching.md\n  - blog/posts/learn-async.md\n  - blog/posts/generator.md\n  - examples/batch_job_oai.md\n  - examples/bulk_classification.md\n  - examples/image_to_ad_copy.md\n  - prompting/decomposition/least_to_most.md\n  - prompting/self_criticism/chain_of_verification.md\n  - prompting/self_criticism/cumulative_reason.md\n  - prompting/self_criticism/reversecot.md\n  - integrations/ollama.md\n  - integrations/llama-cpp-python.md\n  - integrations/together.md\n  - concepts/partial.md\n  - blog/posts/aisummit-2023.md\n  summary: This document outlines various resources and updates available for users\n    interested in AI development, optimization, and language model techniques. It\n    encourages subscribing to a newsletter to receive updates on new features and\n    tips for using \"Instructor.\" The content includes topics on advanced AI techniques\n    like the Unified Provider Interface, llms.txt adoption, and GPT-4 level summaries\n    using GPT-3.5-turbo. It also covers AI model validation, function caching in Python,\n    batch processing, and integrations with tools like Logfire and Pandas. Additionally,\n    it introduces prompting techniques such as Least-to-Most prompting and the Reverse\n    Chain of Thought (RCoT) for enhancing language model performance. Key objectives\n    are to keep users informed with the latest advancements and provide practical\n    tips for AI model refinement and deployment. Keywords include AI development,\n    language models, optimization, Python, integrations, and prompting techniques.\nblog/posts/aisummit-2023.md:\n  ai_references:\n  - '[AI Engineer Summit](https://www.ai.engineer/summit)'\n  - '[Pydantic Documentation](https://docs.pydantic.dev/latest/)'\n  - '[full talk](https://www.youtube.com/watch?v=yj-wSRJwrrc)'\n  cross_links: []\n  hash: f0b52aac48499d18ab5101d10da676ed\n  keywords:\n  - Pydantic\n  - Prompt Engineering\n  - AI Summit\n  - Machine Learning\n  - Data Validation\n  references: []\n  summary: This document provides insights from a keynote at the AI Engineer Summit\n    on utilizing Pydantic for effective prompt engineering. The talk includes a deep\n    dive into the related documentation and aims to refine the art of prompt engineering\n    in AI applications.\n  topics:\n  - Pydantic usage\n  - Prompt engineering techniques\n  - AI in engineering\n  - Machine learning applications\nblog/posts/announcing-gemini-tool-calling-support.md:\n  cross_links: []\n  hash: 9918d92d63a5005bc11f4df8593d1411\n  references: []\n  summary: \"This article introduces the latest support for structured outputs via\\\n    \\ tool calling in the instructor library for both Gemini and VertexAI SDKs, enhancing\\\n    \\ AI model interactions. It highlights easy installation options for Gemini (`instructor[google-generativeai]`)\\\n    \\ and VertexAI (`instructor[vertexai]`), emphasizing Gemini\\u2019s advantages\\\n    \\ such as a higher free token quota and simpler setup with just a Google API key.\\\n    \\ The guide provides step-by-step examples of using instructor with Gemini and\\\n    \\ VertexAI models (`gemini-3-flash`, `gemini-1.5-pro-latest`) for chat\\\n    \\ completions and structured output extraction, focusing on AI SDKs, tool calling,\\\n    \\ structured outputs, and generative models for AI developers.\"\nblog/posts/announcing-instructor-responses-support.md:\n  cross_links:\n  - integrations/openai-responses.md\n  hash: 8ce4314b2dee3e0af9a37baeee08ed87\n  references:\n  - integrations/openai-responses.md\n  - integrations/openai-responses.md\n  summary: The announcement highlights Instructor's integration with OpenAI's new\n    Responses API, providing a streamlined, type-safe interface for structured outputs,\n    web search, and citation tools. Key features include easy client initialization,\n    full Pydantic validation, built-in tools for real-time information retrieval,\n    and async support. This integration enhances LLM applications by simplifying external\n    data referencing, maintaining compatibility with existing chat workflows, and\n    enabling powerful capabilities like file search and citations without additional\n    complexity. Core keywords include Instructor, Responses API, OpenAI, structured\n    outputs, type safety, web search, citations, Pydantic, async support, LLM development.\nblog/posts/announcing-unified-provider-interface.md:\n  ai_references:\n  - '[../../integrations/anthropic.md#caching'\n  - ../posts/anthropic-prompt-caching.md\n  - ../../concepts/prompt_caching.md\n  - ../../concepts/multimodal.md\n  - /concepts/patching\n  - /integrations/\n  - string-based-init\n  - best_framework\n  - introduction]\n  cross_links:\n  - blog/posts/anthropic-prompt-caching.md\n  - blog/posts/best_framework.md\n  - blog/posts/string-based-init.md\n  - concepts/multimodal.md\n  - concepts/prompt_caching.md\n  - integrations/anthropic.md\n  hash: c88097d85ac482f5383e301293764cea\n  keywords:\n  - from_provider\n  - LLM providers\n  - client initialization\n  - synchronous\n  - asynchronous\n  - model comparison\n  - structured outputs\n  - multi-provider strategies\n  - rapid prototyping\n  references:\n  - blog/posts/anthropic-prompt-caching.md\n  - concepts/prompt_caching.md\n  - concepts/multimodal.md\n  - blog/posts/concepts/patching/index.md\n  - blog/posts/integrations/index.md\n  - blog/posts/string-based-init/index.md\n  - blog/posts/best_framework/index.md\n  - blog/posts/introduction/index.md\n  summary: The `from_provider()` function in the Instructor library allows users to\n    easily switch between various LLM providers using a single string identifier,\n    simplifying client initialization and model experimentation. This enhancement\n    automates setup procedures and supports both synchronous and asynchronous operations,\n    improving efficiency for developers working with multiple language models.\n  topics:\n  - Functionality of from_provider\n  - Key benefits of using from_provider\n  - Internal workings of from_provider\n  - Example usage of from_provider\n  - Future improvements in LLM integration\nblog/posts/anthropic-prompt-caching.md:\n  ai_references:\n  - '[Caching Strategies](/concepts/caching)'\n  - '[Anthropic Integration](/integrations/anthropic)'\n  - '[Anthropic Structured Outputs](structured-output-anthropic)'\n  - '[Response Caching](caching)'\n  - '[Performance Monitoring](logfire)'\n  cross_links: []\n  hash: 54da38a45472225872357555af50eb10\n  keywords:\n  - prompt caching\n  - Anthropic\n  - API optimization\n  - cost reduction\n  - latency improvement\n  - caching limitations\n  - developer guide\n  references:\n  - blog/posts/concepts/caching/index.md\n  - blog/posts/integrations/anthropic/index.md\n  - blog/posts/structured-output-anthropic/index.md\n  - blog/posts/caching/index.md\n  - blog/posts/logfire/index.md\n  summary: This document explores the benefits of using prompt caching with Anthropic,\n    highlighting its ability to improve response times and reduce costs for applications\n    requiring large context management. It includes a quickstart guide, implementation\n    examples, and discusses key limitations and considerations for developers eager\n    to optimize API interactions.\n  topics:\n  - prompt caching implementation\n  - API usage optimization\n  - caching limitations\n  - character extraction example\n  - performance monitoring\nblog/posts/anthropic-web-search-structured.md:\n  cross_links: []\n  hash: 9a5a79e8e389eb7265944a8968db3fa9\n  references: []\n  summary: Learn how to leverage Anthropic's web search tool with Instructor to access\n    real-time, structured data from the web. This powerful combination enables AI\n    models like Claude to fetch the latest information, generate organized responses\n    using Pydantic models, and cite sources for verification. Key features include\n    enhanced accuracy, reduced hallucinations, and customizable search configurations\n    like domain restrictions and search limits. Ideal for building dynamic applications\n    that require up-to-date data on topics such as sports, news, or market trends.\nblog/posts/anthropic.md:\n  cross_links: []\n  hash: 44073f09c95cb56e33653923ef4e83c8\n  references: []\n  summary: This article discusses integrating Anthropic's powerful language models\n    with Instructor and Pydantic for structured output generation in Python. It provides\n    step-by-step guidance on installing the `instructor[anthropic]` package, configuring\n    the Anthropic client with enhanced capabilities, and creating custom data models\n    for precise JSON responses. Key topics include handling nested types, leveraging\n    the `anthropic` client, and supporting models like Claude-3 for AI-driven applications.\n    The content highlights ongoing feature development, including streaming support,\n    and encourages community feedback to improve compatibility and functionality in\n    API development and LLM techniques.\nblog/posts/bad-schemas-could-break-llms.md:\n  cross_links:\n  - blog/posts/matching-language.md\n  - blog/posts/timestamp.md\n  - examples/index.md\n  - index.md\n  hash: 8d3274500a88eb0bfe0171d9f00504f8\n  references:\n  - blog/posts/matching-language.md\n  - blog/posts/timestamp.md\n  - index.md\n  - examples/index.md\n  summary: This article emphasizes the critical impact of response models and schemas\n    on Large Language Model (LLM) performance, particularly with Claude and GPT-4o.\n    Key insights include how field naming, chain-of-thought reasoning, and response\n    mode choices (JSON vs. Tool Calling) significantly influence accuracy, with performance\n    gains of up to 60% through optimized schemas. The content highlights the importance\n    of designing well-structured response models, testing different permutations systematically,\n    and using tools like Instructor for prototyping. Core keywords include LLM response\n    models, structured outputs, JSON mode, tool calling, GPT-4o, Claude, reasoning\n    prompts, and model performance optimization.\nblog/posts/best_framework.md:\n  cross_links:\n  - blog/posts/introduction.md\n  - concepts/iterable.md\n  - concepts/parallel.md\n  - concepts/partial.md\n  - concepts/patching.md\n  - concepts/philosophy.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/types.md\n  - concepts/unions.md\n  - examples/index.md\n  - integrations/groq.md\n  - integrations/index.md\n  - integrations/llama-cpp-python.md\n  - integrations/ollama.md\n  - integrations/together.md\n  hash: 41b529a5e2d92400da24c6f6c1e8146f\n  references:\n  - concepts/retrying.md\n  - concepts/reask_validation.md\n  - concepts/parallel.md\n  - concepts/partial.md\n  - concepts/iterable.md\n  - concepts/types.md\n  - concepts/unions.md\n  - examples/index.md\n  - integrations/index.md\n  - integrations/together.md\n  - integrations/ollama.md\n  - integrations/groq.md\n  - integrations/llama-cpp-python.md\n  - concepts/philosophy.md\n  - concepts/patching.md\n  - concepts/retrying.md\n  - concepts/partial.md\n  - blog/posts/introduction.md\n  - integrations/index.md\n  - concepts/types.md\n  summary: Instructor is a lightweight Python library that enhances the OpenAI SDK\n    by enabling seamless mapping of LLM outputs to structured, type-safe data using\n    Pydantic models and Python type annotations. It simplifies extracting structured\n    data from GPTs and other compatible providers, supports features like retrying,\n    validation, streaming, and parallel tool calling, and allows direct access to\n    message parameters for advanced prompt engineering. Designed for easy integration\n    and incremental adoption, Instructor helps teams convert unstructured LLM text\n    into validated data, making it ideal for improving data consistency and reducing\n    \"string hell\" in AI applications. Key keywords include LLM outputs, structured\n    data, Python, Pydantic, OpenAI SDK, GPT, data mapping, response_model.\nblog/posts/caching.md:\n  cross_links:\n  - blog/posts/anthropic-prompt-caching.md\n  - blog/posts/learn-async.md\n  - concepts/caching.md\n  - concepts/parallel.md\n  - concepts/prompt_caching.md\n  - examples/batch_job_oai.md\n  hash: 11fdb88f500185d84f0a06cc2a4b4c41\n  references:\n  - concepts/caching.md\n  - concepts/prompt_caching.md\n  - concepts/parallel.md\n  - blog/posts/anthropic-prompt-caching.md\n  - blog/posts/learn-async.md\n  - examples/batch_job_oai.md\n  summary: This article explores advanced caching techniques in Python to optimize\n    performance when working with Pydantic models and language model APIs like OpenAI.\n    It covers in-memory caching with `functools.cache`, persistent caching with `diskcache`,\n    and distributed caching using `redis`. The content emphasizes creating custom\n    decorators to cache API responses effectively, with a focus on serialization,\n    cache invalidation considerations, and selecting appropriate caching strategies\n    for small and large-scale applications. Keywords include Python caching, Pydantic\n    models, performance optimization, in-memory caching, diskcache, Redis, API response\n    caching, and distributed systems.\nblog/posts/chain-of-density.md:\n  cross_links:\n  - blog/posts/validation-part1.md\n  - cli/finetune.md\n  hash: 1ff99278946f900cba0eb4b22d8c663a\n  references:\n  - blog/posts/validation-part1.md\n  - cli/finetune.md\n  summary: \"This article explores advanced AI summarization techniques, focusing on\\\n    \\ the Chain of Density method with GPT-3.5 and GPT-4. It details how to implement\\\n    \\ iterative, entity-dense summaries, fine-tune GPT-3.5 models for improved performance,\\\n    \\ and achieve significant efficiency gains\\u2014up to 20x faster and 50x cost\\\n    \\ savings. The guide covers data modeling, validation with Pydantic, and custom\\\n    \\ prompting for high-quality summaries. Keywords include GPT-3.5, GPT-4, Chain\\\n    \\ of Density, summarization, fine-tuning, LLM techniques, entity density, AI text\\\n    \\ summarization, Instructor library, model distillation, OpenAI, cost efficiency,\\\n    \\ latency reduction.\"\nblog/posts/chat-with-your-pdf-with-gemini.md:\n  ai_references:\n  - '[multimodal-gemini.md'\n  - generating-pdf-citations.md\n  - rag-and-beyond.md\n  - ../../concepts/retrying.md\n  - ../../index.md]\n  cross_links:\n  - blog/posts/generating-pdf-citations.md\n  - blog/posts/multimodal-gemini.md\n  - blog/posts/rag-and-beyond.md\n  - concepts/retrying.md\n  - index.md\n  hash: 902b85d5f28f8de856e9e59b6bb79faf\n  keywords:\n  - '[Google Gemini'\n  - Document Processing\n  - PDF Analysis\n  - Pydantic\n  - Python\n  - Multimodal Capabilities\n  - Structured Output]\n  references:\n  - concepts/retrying.md\n  - blog/posts/multimodal-gemini.md\n  - blog/posts/concepts/multimodal/index.md\n  - blog/posts/multimodal-gemini/index.md\n  - blog/posts/generating-pdf-citations/index.md\n  - blog/posts/rag-and-beyond/index.md\n  - index.md\n  summary: This documentation provides a comprehensive guide on using Google's Gemini\n    model with Instructor to efficiently process PDFs and extract structured information.\n    The integration simplifies typical document processing challenges, allowing users\n    to leverage multimodal capabilities to streamline data extraction into a structured\n    format easily.\n  topics:\n  - '[PDF Processing'\n  - Google Gemini Model\n  - Instructor Integration\n  - Multimodal Data Extraction\n  - Benefits of Structured Outputs]\nblog/posts/citations.md:\n  ai_references:\n  - '[Validation Guide](/concepts/validation)'\n  - '[RAG Techniques](rag-and-beyond)'\n  - '[PDF Citations](generating-pdf-citations)'\n  - '[Validation Basics](validation-part1)'\n  - '[finetuning a better summarizer](https://jxnl.github.io/instructor/blog/2023/11/05/chain-of-density/)'\n  cross_links: []\n  hash: bdc9538dce76ab09cb897edab533e546\n  keywords:\n  - Pydantic\n  - LLM\n  - Citation Verification\n  - Data Accuracy\n  - Python\n  - Validation\n  - Error Handling\n  - Context Validation\n  - Model Validation\n  references:\n  - blog/posts/concepts/validation/index.md\n  - blog/posts/rag-and-beyond/index.md\n  - blog/posts/generating-pdf-citations/index.md\n  - blog/posts/validation-part1/index.md\n  summary: This blog post explores how Pydantic can be utilized to enhance the verification\n    of citations in large language models (LLMs) to improve data accuracy and reliability.\n    It provides practical examples of using substring checks and LLMs for citation\n    validation, as well as techniques for aligning answers with their corresponding\n    citations.\n  topics:\n  - Citation Verification\n  - Data Accuracy\n  - Pydantic Validators\n  - LLM Integration\n  - Error Handling Techniques\nblog/posts/consistent-stories.md:\n  cross_links: []\n  hash: b11eb15649a2a818d4d6bfcf26507cdb\n  references: []\n  summary: 'This article discusses how to generate complex Directed Acyclic Graphs\n    (DAGs) using GPT-4o, focusing on creating consistent and coherent Choose Your\n    Own Adventure stories. The challenge of generating large graphs is addressed with\n    a two-phase approach: first generating a story outline, then expanding choices\n    in parallel to manage context limitations and allow deeper story branches. Key\n    benefits include path-specific context, parallel generation, controlled growth\n    via a max_depth parameter, and rate-limiting using semaphores. The article emphasizes\n    structured validation, using Pydantic models, and highlights the efficiency of\n    parallel processing for content generation in large-scale language models, applicable\n    through tools like instructor with OpenAI''s API. Keywords: DAGs, GPT-4o, Choose\n    Your Own Adventure, story generation, language models, parallel processing, Pydantic,\n    OpenAI.'\nblog/posts/course.md:\n  cross_links: []\n  hash: 8424fc0d6b49b24ad11707b30daaddde\n  references: []\n  summary: 'Discover a free, one-hour course on Weights and Biases, exploring essential\n    techniques for steering language models in machine learning. This comprehensive\n    course covers material from detailed tutorials and is accessible to everyone interested\n    in AI and machine learning. Perfect for both beginners and experienced practitioners,\n    it offers valuable insights and practical tools for leveraging language models\n    effectively. Access this open resource at [wandb.courses](https://www.wandb.courses/courses/steering-language-models).\n    Keywords: Weights and Biases, language models, machine learning, AI course, free\n    resources.'\nblog/posts/cursor-rules.md:\n  ai_references:\n  - '[version-control-for-the-vibe-coder-part-1.md'\n  - version-control-for-the-vibe-coder-part-2.md]\n  cross_links: []\n  hash: fccc7d93ee9d7b15bbfb41e09fd91660\n  keywords:\n  - '[Cursor rules'\n  - Git workflows\n  - AI-assisted coding\n  - small commits\n  - pull requests]\n  references: []\n  summary: This documentation discusses how Instructor's Cursor rules enhance Git\n    workflows for contributors by promoting AI-assisted coding practices. It emphasizes\n    the importance of small, frequent commits and provides guidance for managing pull\n    requests, making contributions to projects simpler and more organized.\n  topics:\n  - '[Git practices'\n  - AI coding\n  - contributor guidelines\n  - version control\n  - pull request management]\nblog/posts/distilation-part1.md:\n  cross_links: []\n  hash: 2b0cffc5cf2701d20f0f294b843aaf1e\n  references: []\n  summary: This guide explores using the `Instructor` library to enhance Python functions\n    through fine-tuning and distillation. The library streamlines the process of developing\n    task-specific language models by simplifying function calls and managing data\n    preparation. Key features include automatic dataset generation for fine-tuning,\n    efficient function integration, and backward compatibility. The guide covers logging\n    outputs, the importance of structured outputs, and future plans for function implementation.\n    Essential keywords include Instructor, fine-tuning, distillation, language models,\n    Python, and dataset generation.\nblog/posts/extract-model-looks.md:\n  cross_links: []\n  hash: 1a96f01876050a880e6d2f67bee23cb2\n  references: []\n  summary: \"This article presents a two-phase, parallel approach to generating complex,\\\n    \\ consistent Directed Acyclic Graphs (DAGs) and stories with GPT-4o, overcoming\\\n    \\ limitations of large graph sizes and context window constraints. By first creating\\\n    \\ a detailed story outline\\u2014including setting, plot, choices, and visual style\\u2014\\\n    and then expanding branches concurrently while maintaining path-specific context,\\\n    \\ the method ensures coherence and efficiency. Key concepts include state isolation,\\\n    \\ parallel processing, structured validation with Pydantic, and controllable story\\\n    \\ depth. Ideal for generating large, interconnected content at scale, this approach\\\n    \\ enhances story and graph generation speed, consistency, and complexity using\\\n    \\ AI models like OpenAI\\u2019s GPT-4o.\"\nblog/posts/extracting-model-metadata.md:\n  ai_references:\n  - '[../../concepts/multimodal.md]'\n  cross_links:\n  - concepts/multimodal.md\n  hash: caa1adf0f1bb9d67726b3f7cf6b332a4\n  keywords:\n  - '[metadata extraction'\n  - structured extraction\n  - gpt-4o\n  - multimodal\n  - taxonomy\n  - product recommendations\n  - e-commerce\n  - personalization\n  - instructor]\n  references:\n  - concepts/multimodal.md\n  summary: This documentation explains how to effectively extract structured metadata\n    from images using the Structured Extraction technique in conjunction with multimodal\n    language models like gpt-4o. It provides insights into creating a taxonomy for\n    e-commerce product categorization and demonstrates practical implementations using\n    Python, making it essential for enhancing personalized recommendations in online\n    retail settings.\n  topics:\n  - '[metadata extraction'\n  - product taxonomy\n  - multimodal language models\n  - Python implementation\n  - e-commerce personalization]\nblog/posts/fake-data.md:\n  ai_references: []\n  cross_links: []\n  hash: e94f325f97c0441ee1cdc670f4feb925\n  keywords:\n  - '[Synthetic Data'\n  - Pydantic\n  - OpenAI\n  - Data Generation\n  - Python\n  - data modeling\n  - JSON schema\n  - AI-generated data]\n  references: []\n  summary: This documentation provides a comprehensive guide on generating synthetic\n    data using Pydantic and OpenAI's models, featuring practical examples and configurations.\n    Users can learn to customize synthetic data generation through various methods\n    such as example setting, model adjustments, and descriptive influences on data\n    output.\n  topics:\n  - '[Data generation with Pydantic'\n  - Using OpenAI models\n  - Customizing synthetic data\n  - Practical examples in Python\n  - JSON schema configurations]\nblog/posts/full-fastapi-visibility.md:\n  cross_links:\n  - blog/posts/learn-async.md\n  hash: b86decf8772b03d62dd49c2700936cc3\n  references:\n  - blog/posts/learn-async.md\n  summary: This article demonstrates how Logfire enhances FastAPI applications with\n    comprehensive observability and OpenTelemetry integration. It highlights easy\n    setup and code integration for logging, profiling, and monitoring API endpoints,\n    including handling asynchronous operations with asyncio and streaming responses\n    using Instructor's Iterable support. Key topics include FastAPI, Logfire, OpenTelemetry,\n    Pydantic, AsyncIO, streaming responses, and performance tracking, providing practical\n    examples to improve application visibility, debugging, and error reproduction\n    in production environments.\nblog/posts/generating-pdf-citations.md:\n  cross_links:\n  - index.md\n  hash: d293a327202394d87adcd15ec894381e\n  references:\n  - index.md\n  summary: This article demonstrates how to leverage Google's Gemini model with Instructor\n    and Pydantic for accurate PDF data extraction and citation generation. It highlights\n    the importance of structured outputs to reduce hallucinations, ensure source-truthfulness,\n    and improve reliability in document processing. The process involves PDF parsing\n    with PyMuPDF, uploading files to Gemini, and creating citations for precise referencing,\n    making it ideal for legal, academic, and financial applications. Key topics include\n    PDF analysis, structured data validation, GPT integration, citation highlighting,\n    and reducing errors in AI-generated content, with keywords like Gemini, PDF processing,\n    citations, structured outputs, Pydantic, document verification, and AI accuracy.\nblog/posts/generator.md:\n  cross_links:\n  - concepts/fastapi.md\n  hash: b9ebcb6883c21f0ba7d87980c45817dd\n  references:\n  - concepts/fastapi.md\n  summary: 'This article explores the use of Python generators to enhance Large Language\n    Model (LLM) streaming, improving latency and user experience in applications like\n    eCommerce and chat interfaces. It explains how generators enable efficient, real-time\n    data processing and extraction, allowing for faster rendering and responsiveness.\n    The post demonstrates practical implementations using the Instructor library for\n    structured data extraction from streaming LLM responses, highlighting their benefits\n    over traditional approaches. Key concepts include Python generators, LLM streaming,\n    data pipeline optimization, and fast API integration, emphasizing how real-time\n    streaming can boost performance and customer engagement. Core keywords: Python\n    generators, LLM streaming, data processing, real-time API, latency reduction,\n    fastapi, instructor library, structured extraction, performance optimization.'\nblog/posts/google-openai-client.md:\n  cross_links:\n  - blog/posts/bad-schemas-could-break-llms.md\n  - blog/posts/multimodal-gemini.md\n  - concepts/retrying.md\n  hash: 26e8561156b73b2a9b6da501c1aa7c04\n  references:\n  - blog/posts/bad-schemas-could-break-llms.md\n  - blog/posts/multimodal-gemini.md\n  - concepts/retrying.md\n  summary: \"This article explains why Instructor remains essential despite Google's\\\n    \\ recent OpenAI compatibility for Gemini models. While the new integration simplifies\\\n    \\ interactions with Gemini via OpenAI's API, it has limitations such as limited\\\n    \\ schema support, lack of streaming, and no multimodal capabilities. Instructor\\\n    \\ offers a provider-agnostic API, advanced schema management, streaming, multimodal\\\n    \\ support, automatic validation, retries, and seamless provider switching\\u2014\\\n    features crucial for building reliable, production-grade LLM applications. Keywords\\\n    \\ include Gemini, OpenAI integration, Instructor, multimodal support, schema management,\\\n    \\ streaming, provider agnostic, robust AI applications.\"\nblog/posts/introducing-structured-outputs-with-cerebras-inference.md:\n  cross_links: []\n  hash: 9cae7568e3f7431ca1ee3b73b8a7a1b0\n  references: []\n  summary: Explore how to leverage Cerebras Inference for structured outputs and faster\n    model processing with seamless Pydantic integration. Cerebras offers up to 20x\n    faster inference compared to GPUs, making it an excellent choice for efficient\n    API development. The article guides you through setting up a Cerebras Inference\n    API key and using the Cerebras SDK with Pydantic models for validated responses.\n    Key functionality includes creating instructor clients, using models like \"llama3.1-70b\",\n    and supporting both synchronous and asynchronous operations. Enhance your API\n    integration with features such as streaming responses in `CEREBRAS_JSON` mode\n    for real-time data processing. Key topics include Cerebras Inference, Pydantic,\n    fast inference, structured outputs, and API integration.\nblog/posts/introducing-structured-outputs.md:\n  ai_references:\n  - '[../../concepts/reask_validation.md'\n  - ../../concepts/lists.md\n  - ../../concepts/partial.md]\n  cross_links:\n  - concepts/lists.md\n  - concepts/partial.md\n  - concepts/reask_validation.md\n  hash: 85ac9a93f1b6892914274bd21ebc8498\n  keywords:\n  - '[OpenAI'\n  - Structured Outputs\n  - instructor\n  - Pydantic\n  - Data Validation\n  - LLM Workflows\n  - API\n  - Vendor Lock-in]\n  references:\n  - concepts/reask_validation.md\n  - concepts/lists.md\n  - concepts/partial.md\n  summary: This article explores the challenges associated with OpenAI's Structured\n    Outputs and introduces 'instructor' as a solution to enhance LLM workflows. It\n    discusses issues such as validation limitations, streaming difficulties, and latency\n    problems while highlighting the advantages of using 'instructor' for automatic\n    retries and provider flexibility.\n  topics:\n  - '[OpenAI Structured Outputs'\n  - Validation Logic\n  - Streaming Challenges\n  - Latency Issues\n  - instructor Features]\nblog/posts/introduction.md:\n  cross_links:\n  - blog/posts/best_framework.md\n  - blog/posts/structured-output-anthropic.md\n  - concepts/models.md\n  - concepts/reask_validation.md\n  - index.md\n  - integrations/index.md\n  hash: 33cd1df34b63e686b253b5ebca7b433d\n  references:\n  - index.md\n  - integrations/index.md\n  - concepts/reask_validation.md\n  - concepts/models.md\n  - blog/posts/best_framework.md\n  - blog/posts/structured-output-anthropic.md\n  - examples/chain-of-thought.md\n  summary: This article explores how Pydantic simplifies working with Language Learning\n    Models (LLMs) in Python, particularly through structured JSON outputs. It highlights\n    the difficulties developers face with existing LLM frameworks and showcases how\n    the Pydantic-powered Instructor library streamlines interactions with language\n    models, focusing on ease of use, widespread adoption, and compatibility with tools\n    like OpenAI's Function Calling. By supporting modular schemas, easy validation,\n    and relationship definition, Pydantic offers a more organized code structure,\n    enhancing the developer experience. The piece also parallels LLM architecture\n    with FastAPI, offering simple, Pythonic approaches to utilizing LLMs effectively.\n    Key phrases include Pydantic, LLMs, structured JSON, OpenAI, Python, and language\n    model interaction.\nblog/posts/jinja-proposal.md:\n  ai_references: []\n  cross_links: []\n  hash: c49c3ea11717caead70f820614a48932\n  keywords:\n  - '[Jinja'\n  - Templating\n  - Pydantic\n  - API Development\n  - Data Validation\n  - Prompt Formatting\n  - Versioning\n  - Logging\n  - Security]\n  references: []\n  summary: This document outlines the integration of Jinja templating into the Instructor\n    platform to enhance prompt formatting, validation, versioning, and secure logging\n    capabilities. By leveraging Jinja's features, Instructor will provide improved\n    handling of complex prompts and better data management, ultimately boosting its\n    functionality for users.\n  topics:\n  - '[Integration of Jinja'\n  - Enhanced Formatting Capabilities\n  - Data Validation\n  - Version Control\n  - Secure Logging]\nblog/posts/langsmith.md:\n  cross_links:\n  - blog/posts/learn-async.md\n  - examples/bulk_classification.md\n  hash: 3f9c1608a2030bf77928eb024d6326e4\n  references:\n  - examples/bulk_classification.md\n  - blog/posts/learn-async.md\n  summary: \"This blog post explores how LangChain's LangSmith can be integrated with\\\n    \\ the OpenAI client to enhance functionality through seamless LLM observability.\\\n    \\ By wrapping the OpenAI client with LangSmith and using the `instructor` package,\\\n    \\ developers can improve their LLM applications by enabling features such as question\\\n    \\ classification and asynchronous processing with `asyncio`. The article provides\\\n    \\ a step-by-step guide on setting up LangSmith, installing necessary SDKs, and\\\n    \\ implementing multi-label classification of questions using Python. It highlights\\\n    \\ LangSmith\\u2019s capabilities as a DevOps platform for developing, collaborating,\\\n    \\ deploying, and monitoring language model applications. Key points include the\\\n    \\ use of `wrap_openai`, rate limiting via `asyncio.Semaphore`, and customizing\\\n    \\ the classification prompt to fit specific use cases.\"\nblog/posts/learn-async.md:\n  ai_references:\n  - '[../concepts/error_handling.md'\n  - ../concepts/retrying.md\n  - https://docs.python.org/3/library/asyncio.html\n  - https://realpython.com/async-io-python/\n  - https://python.useinstructor.com\n  - https://platform.openai.com/docs/guides/async]\n  cross_links:\n  - concepts/error_handling.md\n  - concepts/retrying.md\n  hash: 510b01ac35458a0b82a7f5055913fb4f\n  keywords:\n  - '[asyncio'\n  - asyncio.gather\n  - asyncio.as_completed\n  - Python\n  - LLM processing\n  - concurrent processing\n  - async programming\n  - rate limiting\n  - performance optimization]\n  references:\n  - blog/concepts/error_handling.md\n  - blog/concepts/retrying.md\n  summary: This documentation provides an in-depth guide on using Python's asyncio.gather\n    and asyncio.as_completed for efficient concurrent processing of Large Language\n    Models (LLMs). It covers various async programming patterns, rate limiting techniques,\n    and performance optimization strategies vital for AI applications.\n  topics:\n  - '[asyncio methods'\n  - concurrent execution\n  - performance comparison\n  - rate-limited processing\n  - error handling]\nblog/posts/llm-as-reranker.md:\n  ai_references:\n  - '[rag-and-beyond'\n  - validation-part1\n  - logfire]\n  cross_links:\n  - blog/posts/validation-part1.md\n  hash: 67f340dc144300698dca7905ebdefc6b\n  keywords:\n  - '[LLM'\n  - Pydantic\n  - Instructor\n  - Search Relevance\n  - Reranking\n  - Retrieval-Augmented Generation\n  - synthetic data\n  - evaluation pipeline]\n  references:\n  - blog/posts/rag-and-beyond/index.md\n  - blog/posts/validation-part1/index.md\n  - blog/posts/logfire/index.md\n  summary: This blog post guides you through creating an LLM-based reranker using\n    Instructor and Pydantic for enhancing search results relevance in Retrieval-Augmented\n    Generation (RAG) pipelines. By utilizing structured outputs and large language\n    models, you will learn to label synthetic data for fine-tuning and build an accurate\n    evaluation pipeline.\n  topics:\n  - '[Setting Up the Environment'\n  - Defining the Reranking Models\n  - Creating the Reranker Function\n  - Testing the Reranker]\nblog/posts/llms-txt-adoption.md:\n  ai_references:\n  - '[llms.txt specification](https://github.com/AnswerDotAI/llms-txt)'\n  - '[standard format](https://github.com/AnswerDotAI/llms-txt#format)'\n  - '[GitHub](https://github.com/instructor-ai/instructor)'\n  - '[Twitter](https://x.com/jxnl.co)'\n  cross_links: []\n  hash: 4c6baf0df522771e1991d14f88965af2\n  keywords:\n  - llms.txt\n  - AI language models\n  - documentation accessibility\n  - Instructor\n  - coding assistants\n  - standardization\n  - markdown\n  - implementation\n  references: []\n  summary: Instructor has adopted the llms.txt specification to enhance the accessibility\n    of its documentation for AI language models. This implementation allows AI tools\n    to better interpret and navigate the documentation, resulting in improved code\n    suggestions and a cleaner access experience for users.\n  topics:\n  - llms.txt specification\n  - AI-documentation interaction\n  - benefits of llms.txt\n  - implementation guidelines\n  - future of AI in coding\nblog/posts/logfire.md:\n  cross_links: []\n  hash: 7ce79e21910ace0347fba9fd9615cfca\n  references: []\n  summary: The article introduces **Logfire**, an observability platform developed\n    by the creators of **Pydantic**, which integrates seamlessly with libraries like\n    **HTTPx** and **Instructor**. It demonstrates how Logfire can enhance application\n    performance tracking through examples such as spam email classification, validation\n    using `llm_validator`, and data extraction from images with **GPT-4V**. The guide\n    details how to set up and use these features with Logfire, emphasizing its ease\n    of integration, efficient logging capabilities, and ability to provide in-depth\n    insights into application processes. Core components include **OpenAI**, **Logfire**,\n    **LLM Observability**, and integration with Pydantic.\nblog/posts/matching-language.md:\n  ai_references: []\n  cross_links: []\n  hash: d3478db3ed6545cb29034b23ad22a955\n  keywords:\n  - '[multilingual summarization'\n  - language detection\n  - Pydantic\n  - langdetect\n  - language models\n  - data validation\n  - summaries\n  - language match\n  - AI\n  - machine learning]\n  references: []\n  summary: This documentation explores methods to ensure that language models generate\n    summaries in the same language as the source text, leveraging Pydantic for validation\n    and langdetect for language identification. By integrating these techniques, the\n    accuracy of multilingual summarization improves significantly.\n  topics:\n  - '[language model optimization'\n  - summary generation\n  - language detection methods\n  - Pydantic usage\n  - multilingual data handling]\nblog/posts/migrating-to-uv.md:\n  cross_links: []\n  hash: 226ee4a165a8d84023029357089b8443\n  references: []\n  summary: This article details the migration from Poetry to UV for dependency management\n    and build automation in a Python project. The author highlights UV's faster CI/CD\n    performance, automatic caching, cargo-style lockfiles, and easier adoption of\n    new PEP features. The article provides a step-by-step guide to converting Poetry\n    lockfiles using UV, updating build configurations to use hatchling, and modifying\n    GitHub Actions workflows to implement UV commands like `uv sync` and `uv run`.\n    Overall, the transition resulted in a ~3x speed increase in CI jobs, simplifying\n    dependency management and enhancing development efficiency. Keywords include UV,\n    Poetry migration, dependency management, CI/CD speedup, Python, build automation,\n    UV lockfile, GitHub actions.\nblog/posts/multimodal-gemini.md:\n  ai_references:\n  - '[concepts/multimodal'\n  - concepts/images\n  - integrations/google\n  - openai-multimodal\n  - structured-output-anthropic\n  - chat-with-your-pdf-with-gemini]\n  cross_links:\n  - blog/posts/openai-multimodal.md\n  - blog/posts/structured-output-anthropic.md\n  - integrations/google.md\n  hash: 4d4d4773381b446dfd30f7438ec93e7a\n  keywords:\n  - '[Gemini'\n  - Multimodal AI\n  - Travel Recommendations\n  - Pydantic\n  - Python\n  - Video Analysis\n  - Structured Extraction\n  - Recommendations]\n  references:\n  - blog/posts/concepts/multimodal/index.md\n  - blog/posts/concepts/images/index.md\n  - blog/posts/integrations/google/index.md\n  - blog/posts/openai-multimodal/index.md\n  - blog/posts/structured-output-anthropic/index.md\n  - blog/posts/chat-with-your-pdf-with-gemini/index.md\n  summary: This documentation provides a comprehensive guide on utilizing Google's\n    Gemini model for multimodal structured extraction from YouTube travel videos,\n    enabling users to derive structured recommendations for tourist destinations.\n    By integrating video analysis with Pydantic data models, users can effectively\n    extract and organize travel information for enhanced user experiences.\n  topics:\n  - '[Gemini Model'\n  - Video Processing\n  - Pydantic Data Models\n  - Travel Recommendations\n  - Multimodal AI Applications]\nblog/posts/open_source.md:\n  cross_links:\n  - concepts/patching.md\n  - integrations/groq.md\n  - integrations/llama-cpp-python.md\n  - integrations/mistral.md\n  - integrations/ollama.md\n  - integrations/together.md\n  hash: b3cb29bb72d1746982e2bb01087f8cdf\n  references:\n  - integrations/llama-cpp-python.md\n  - concepts/patching.md\n  - integrations/ollama.md\n  - integrations/groq.md\n  - integrations/together.md\n  - concepts/patching.md\n  - integrations/mistral.md\n  summary: This article explores Instructor's enhanced capabilities for integrating\n    with a variety of open source and local large language models (LLMs), including\n    OpenAI, Ollama, llama-cpp-python, Groq, Together AI, and Mistral. It highlights\n    how Instructor supports structured data extraction and outputs through JSON mode\n    and JSON schema, utilizing Pydantic for data validation. Key features include\n    model patching, multi-platform compatibility, and simplified API interactions\n    for in-process and remote models. The content emphasizes adaptability in AI workflows,\n    offering practical code examples for implementing structured outputs with different\n    providers, aiming to streamline AI development and improve model control. Core\n    keywords include Instructor, structured outputs, LLMs, OpenAI, Pydantic, JSON\n    schema, Ollama, llama-cpp-python, Groq, Together AI, Mistral, API integration,\n    local models, AI development.\nblog/posts/openai-distilation-store.md:\n  cross_links: []\n  hash: f192d6f81e391bb953541405d9656871\n  references: []\n  summary: OpenAI's API Model Distillation with Instructor enables developers to create\n    smaller, efficient, and specialized AI models tailored to specific tasks. By combining\n    Instructor's structured output capabilities with API Model Distillation, users\n    can produce validated, consistent results while reducing latency and costs. The\n    integration supports metadata, proxy kwargs, and seamlessly leverages OpenAI's\n    API parameters, enhancing workflow flexibility. This approach improves model efficiency,\n    precision, and scalability for AI applications, making it ideal for personalized\n    and high-performance implementations. Key words include API Model Distillation,\n    Instructor, openAI, structured output, model optimization, AI efficiency, and\n    customized AI models.\nblog/posts/openai-multimodal.md:\n  ai_references:\n  - '[Multimodal Guide](/concepts/multimodal)'\n  - '[OpenAI Integration](/integrations/openai)'\n  - '[Gemini Multimodal](multimodal-gemini)'\n  - '[Prompt Caching](anthropic-prompt-caching)'\n  - '[Monitoring with Logfire](logfire)'\n  cross_links: []\n  hash: dfb11af3ff9283e4bd538a1cb2b2b19d\n  keywords:\n  - OpenAI\n  - Chat Completions API\n  - audio processing\n  - gpt-4o-audio-preview\n  - natural voices\n  - audio input\n  - machine learning\n  - accessibility features\n  references:\n  - blog/posts/concepts/multimodal/index.md\n  - blog/posts/integrations/openai/index.md\n  - blog/posts/multimodal-gemini/index.md\n  - blog/posts/anthropic-prompt-caching/index.md\n  - blog/posts/logfire/index.md\n  summary: OpenAI has launched audio capabilities in its Chat Completions API, utilizing\n    the new `gpt-4o-audio-preview` model. This update allows developers to process\n    audio and text inputs flexibly, enhancing user interaction through natural voice\n    generation and integrated tool functionality.\n  topics:\n  - audio support\n  - key features\n  - practical implementation\n  - use cases\n  - considerations\nblog/posts/pairwise-llm-judge.md:\n  cross_links: []\n  hash: 306360d9c8a466ffc3083651c8c295df\n  references: []\n  summary: The article explores how to create a pairwise LLM judge utilizing the Instructor\n    library and Pydantic to evaluate text relevance, demonstrating a practical application\n    of structured outputs in language model interactions. It provides a detailed guide\n    on setting up the environment, defining a `Judgment` model using Pydantic for\n    structured results, and developing a function to assess the relevance between\n    a question and a text using OpenAI's GPT-4 model. This tool, beneficial for improving\n    search relevance, evaluating question-answering systems, and aiding content recommendation\n    algorithms, highlights the potential of combining structured outputs with large\n    language models for creating intelligent AI systems. Key concepts include LLM,\n    text relevance, AI evaluation, structured outputs, and Pydantic.\nblog/posts/parea.md:\n  cross_links: []\n  hash: 3384d1bea79b6e46e8b6c9e6681cc1cf\n  references: []\n  summary: 'The blog post explores how the Parea platform enhances the OpenAI instructor\n    client by improving monitoring, collaboration, testing, and error tracking for\n    LLM applications. Core features include automatic grouping of retries into a single\n    trace, tracking validation error counts, and providing a UI for labeling JSON\n    responses. It demonstrates using Parea with the OpenAI instructor to write emails\n    containing links from instructor documentation, emphasizes validation error tracking\n    for minimizing costs and latency, and highlights a labeling feature for fine-tuning\n    using subject-matter experts. Keywords: Parea, OpenAI, LLM, instructor, validation,\n    fine-tuning, error tracking, collaboration.'\nblog/posts/pydantic-is-still-all-you-need.md:\n  ai_references:\n  - '[Data Validation with Pydantic](../../concepts/models.md)'\n  - '[Ollama Integration](../../integrations/ollama.md)'\n  - '[llama-cpp-python Integration](../../integrations/llama-cpp-python.md)'\n  - '[Anthropic Integration](../../integrations/anthropic.md)'\n  - '[Cohere Integration](../../integrations/cohere.md)'\n  - '[Google Integration](../../integrations/google.md)'\n  - '[Vertex AI Integration](../../integrations/vertex.md)'\n  - '[Streaming Support](../../concepts/partial.md)'\n  - '[Partial Documentation](../../concepts/partial.md)'\n  - '[Reasking and Validation](../../concepts/reask_validation.md)'\n  - '[Structured Data Extraction from Images](../../examples/image_to_ad_copy.md)'\n  - '[examples](../../examples/index.md)'\n  - '[Instructor Philosophy](/concepts/philosophy)'\n  - '[Validation Guide](/concepts/validation)'\n  - '[Validation Deep Dive](validation-part1)'\n  - '[Best Framework Comparison](best_framework)'\n  - '[Introduction to Instructor](introduction)'\n  cross_links:\n  - concepts/models.md\n  - concepts/partial.md\n  - concepts/reask_validation.md\n  - examples/image_to_ad_copy.md\n  - examples/index.md\n  - index.md\n  - integrations/anthropic.md\n  - integrations/cohere.md\n  - integrations/google.md\n  - integrations/llama-cpp-python.md\n  - integrations/ollama.md\n  - integrations/vertex.md\n  hash: 7aee5b3518acc01228f94114cd940d56\n  keywords:\n  - Pydantic\n  - Structured Outputs\n  - Data Validation\n  - LLM Techniques\n  - Performance Optimization\n  - APIs\n  - Function Calling\n  - Generative UI\n  - Streaming\n  references:\n  - concepts/models.md\n  - integrations/ollama.md\n  - integrations/llama-cpp-python.md\n  - integrations/anthropic.md\n  - integrations/cohere.md\n  - integrations/google.md\n  - integrations/vertex.md\n  - concepts/partial.md\n  - concepts/partial.md\n  - concepts/reask_validation.md\n  - examples/image_to_ad_copy.md\n  - examples/index.md\n  - blog/posts/concepts/philosophy/index.md\n  - blog/posts/concepts/validation/index.md\n  - blog/posts/validation-part1/index.md\n  - blog/posts/best_framework/index.md\n  - blog/posts/introduction/index.md\n  summary: This documentation highlights the advantages of using Pydantic for structured\n    outputs in language model applications. It emphasizes improved data management,\n    reliability, and performance optimization by leveraging Pydantic's features such\n    as validation and modular structures.\n  topics: []\nblog/posts/rag-and-beyond.md:\n  ai_references:\n  - '[validation.md'\n  - llm-as-reranker.md\n  - citations.md\n  - chat-with-your-pdf-with-gemini.md]\n  cross_links:\n  - blog/posts/citations.md\n  - blog/posts/generating-pdf-citations.md\n  - blog/posts/llm-as-reranker.md\n  - examples/exact_citations.md\n  hash: 6ebc57a8dc30b182b29b88b7b7e09b39\n  keywords:\n  - '[Retrieval Augmented Generation'\n  - query understanding\n  - LLMs\n  - Pydantic\n  - search optimization\n  - information retrieval\n  - Python\n  - data modeling]\n  references:\n  - blog/posts/concepts/validation/index.md\n  - blog/posts/llm-as-reranker/index.md\n  - blog/posts/citations/index.md\n  - blog/posts/chat-with-your-pdf-with-gemini/index.md\n  summary: This documentation explores enhancing Retrieval Augmented Generation (RAG)\n    through improved query understanding to facilitate smarter search solutions. It\n    outlines the limitations of basic RAG models and introduces advanced techniques\n    for crafting tailored queries that leverage multiple search backends, thereby\n    improving the retrieval performance in applications like personal assistants and\n    search optimizations.\n  topics:\n  - '[RAG Model'\n  - Query Understanding\n  - Search Backends\n  - Case Studies\n  - Pydantic Integration]\nblog/posts/rag-timelines.md:\n  cross_links: []\n  hash: 38763a866b0564e24d4eadb49e515684\n  references: []\n  summary: This article explores enhancing retrieval-augmented generation (RAG) systems\n    with time filtering using the Python library Instructor and Pydantic models. It\n    discusses how to effectively handle time-based constraints in queries, such as\n    those asking for information \"from the past week.\" By using Pydantic to model\n    time filters and Instructor to integrate large language models (LLMs), developers\n    can provide accurate, relevant responses to temporal queries. The article also\n    addresses the nuances of handling dates and time zones, emphasizing the importance\n    of standardizing and validating these aspects for consistent system performance.\n    Key techniques include defining structured output models, prompting LLMs to generate\n    query objects, and managing date-related complexities.\nblog/posts/semantic-validation-structured-outputs.md:\n  ai_references:\n  - '[Semantic Validation documentation](https://python.useinstructor.com/concepts/semantic_validation/)'\n  - '[Validation Fundamentals](/concepts/validation)'\n  - '[LLM Validation](/concepts/llm_validation)'\n  - '[Validation Deep Dive](validation-part1)'\n  - '[Anthropic Prompt Caching](anthropic-prompt-caching)'\n  - '[Monitoring with Logfire](logfire)'\n  cross_links: []\n  hash: dc3c6a4efc89c2c049393c852c9a106a\n  keywords:\n  - Semantic Validation\n  - LLMs\n  - Structured Outputs\n  - Pydantic\n  - Data Quality\n  - Instructor API\n  - Validation Strategies\n  references:\n  - blog/posts/concepts/validation/index.md\n  - blog/posts/concepts/llm_validation/index.md\n  - blog/posts/validation-part1/index.md\n  - blog/posts/anthropic-prompt-caching/index.md\n  - blog/posts/logfire/index.md\n  summary: Discover how semantic validation with LLMs enhances the evaluation of structured\n    outputs by incorporating complex, subjective, and contextual criteria beyond traditional\n    rule-based systems. This innovative approach is vital for ensuring quality and\n    safety in applications leveraging natural language processing.\n  topics: []\nblog/posts/situate-context.md:\n  cross_links:\n  - blog/posts/learn-async.md\n  hash: 89cec5544c213f53918318c2b2ba37f9\n  references:\n  - blog/posts/learn-async.md\n  summary: 'Learn about implementing Anthropic''s Contextual Retrieval technique to\n    enhance Retrieval-Augmented Generation (RAG) systems using async processing for\n    performance optimization. The technique addresses context loss when documents\n    are chunked, by adding explanatory context before embedding, improving search\n    retrieval. The implementation utilizes async processing with Python to process\n    document chunks concurrently, achieving significant retrieval failure rate reductions.\n    Key features include structured output with Pydantic models, prompt caching, and\n    efficient chunking methods. This approach is ideal for optimizing RAG systems\n    with improved contextual understanding and retrieval efficiency. Keywords: Contextual\n    Retrieval, Async Processing, RAG Systems, Document Chunking, Performance Optimization.'\nblog/posts/string-based-init.md:\n  ai_references: []\n  cross_links: []\n  hash: 6f5961ec4076927835b157fad2542b23\n  keywords:\n  - Unified provider interface\n  - string-based initialization\n  - LLM providers\n  - consistent interface\n  - model switching\n  - error handling\n  - environment variables\n  - asynchronous clients\n  references: []\n  summary: The Unified Provider Interface with String-Based Initialization simplifies\n    the process of working with various LLM providers by allowing users to initialize\n    models using a consistent string format. This approach increases code portability\n    and reduces the complexity of switching between different providers, making it\n    easy to manage structured outputs.\n  topics:\n  - Initialization of LLM providers\n  - benefits of string-based initialization\n  - supported providers\n  - error handling and troubleshooting\n  - environment variable support\nblog/posts/structured-output-anthropic.md:\n  ai_references:\n  - '[How Patching Works](/concepts/patching)'\n  - '[Anthropic Integration](/integrations/anthropic)'\n  - '[Anthropic Prompt Caching](anthropic-prompt-caching)'\n  - '[Unified Provider Interface](announcing-unified-provider-interface)'\n  - '[Framework Comparison](best_framework)'\n  cross_links: []\n  hash: fa7532f861f82b3de44245cc6fae6dae\n  keywords:\n  - Anthropic\n  - Claude\n  - Instructor\n  - structured outputs\n  - prompt caching\n  - API Development\n  - Pydantic\n  - Python\n  - LLM Techniques\n  references:\n  - blog/posts/concepts/patching/index.md\n  - blog/posts/integrations/anthropic/index.md\n  - blog/posts/anthropic-prompt-caching/index.md\n  - blog/posts/announcing-unified-provider-interface/index.md\n  - blog/posts/best_framework/index.md\n  summary: This guide explores how to utilize Anthropic's Claude with Instructor for\n    structured outputs and prompt caching, enhancing AI application development. By\n    integrating Pydantic models and leveraging prompt caching, developers can achieve\n    efficiency and cost savings in their AI projects.\n  topics:\n  - Structured Outputs\n  - Prompt Caching\n  - API Integration\n  - Pydantic Models\n  - AI Application Development\nblog/posts/tidy-data-from-messy-tables.md:\n  cross_links:\n  - index.md\n  hash: bb66ca67fa1b7f8e98d10be0f9aff2e1\n  references:\n  - index.md\n  summary: \"This article discusses how to convert messy, unstructured tables into\\\n    \\ tidy data using the instructor tool with structured outputs, simplifying data\\\n    \\ cleaning and analysis. It highlights common issues with messy exports\\u2014\\\n    such as merged cells, implicit relationships, and mixed data types\\u2014and demonstrates\\\n    \\ how defining custom types and leveraging AI-powered extraction can automatically\\\n    \\ produce clean pandas DataFrames. The approach enables efficient processing of\\\n    \\ multiple tables from images, facilitating seamless integration with data analysis\\\n    \\ and visualization workflows. Key concepts include data tidying, structured outputs,\\\n    \\ pandas, AI-driven data extraction, and productivity in data analysis pipelines.\"\nblog/posts/timestamp.md:\n  cross_links:\n  - blog/posts/matching-language.md\n  hash: 1c148db378a535746af59ac0dd3c1cfb\n  references:\n  - blog/posts/matching-language.md\n  summary: This article discusses solving timestamp format inconsistencies in video\n    content parsing using Pydantic for data validation and a custom parser. It addresses\n    the challenge of varying timestamp formats like \"HH:MM:SS\" and \"MM:SS,\" which\n    can cause errors in language model outputs, especially in video processing and\n    NLP tasks. The solution involves defining expected formats and using a custom\n    validator to normalize timestamps to a consistent \"HH:MM:SS\" structure, which\n    reduces ambiguity and parsing errors. This method offers a robust framework for\n    handling this common issue, outperforming alternative approaches like constrained\n    sampling and simple JSON schema validation. The post includes test cases to demonstrate\n    the solution's effectiveness. Key terms include timestamp, Pydantic, data validation,\n    video processing, and NLP.\nblog/posts/using_json.md:\n  cross_links:\n  - concepts/lists.md\n  - concepts/partial.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - integrations/llama-cpp-python.md\n  - integrations/ollama.md\n  - integrations/together.md\n  hash: c38638ce4dbfc143d9de932bda098e96\n  references:\n  - integrations/together.md\n  - integrations/ollama.md\n  - integrations/llama-cpp-python.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/lists.md\n  - concepts/partial.md\n  summary: Instructor is a Python library that simplifies extracting well-structured\n    JSON data from Large Language Models (LLMs) like GPT-3.5, GPT-4, and open-source\n    models using Pydantic models. It offers seamless integration with the OpenAI SDK,\n    enabling developers to map LLM outputs to validated, type-enforced JSON structures\n    with minimal syntax learning. Instructor emphasizes ease of use, validation, and\n    serialization, making it ideal for working with complex JSON data in LLM applications.\n    Key features include support for multiple programming languages, validation, retries,\n    streaming responses, and compatibility with various LLM platforms, making it a\n    powerful tool for developers seeking reliable JSON output extraction from LLMs.\nblog/posts/validation-part1.md:\n  ai_references:\n  - '[concepts/validation'\n  - concepts/reask_validation\n  - semantic-validation-structured-outputs\n  - bad-schemas-could-break-llms\n  - pydantic-is-still-all-you-need]\n  cross_links:\n  - blog/posts/bad-schemas-could-break-llms.md\n  - blog/posts/semantic-validation-structured-outputs.md\n  - concepts/reask_validation.md\n  hash: c4181c084569e3181494b163bdc2af05\n  keywords:\n  - '[Pydantic'\n  - validation\n  - machine learning\n  - software reliability\n  - dynamic validation\n  - Instructor\n  - LLM\n  - Python\n  - software development]\n  references:\n  - blog/posts/concepts/validation/index.md\n  - blog/posts/concepts/reask_validation/index.md\n  - blog/posts/semantic-validation-structured-outputs/index.md\n  - blog/posts/bad-schemas-could-break-llms/index.md\n  - blog/posts/pydantic-is-still-all-you-need/index.md\n  summary: This documentation discusses the integration of dynamic, machine learning-driven\n    validation using Python's Pydantic and Instructor to improve software reliability.\n    It outlines methods to enhance validation processes, including the creation of\n    custom validators powered by language models, thereby transitioning from traditional\n    static validation techniques to a more adaptive approach.\n  topics:\n  - '[dynamic validation'\n  - Pydantic usage\n  - LLM integration\n  - software reliability\n  - error handling]\nblog/posts/version-1.md:\n  cross_links:\n  - blog/posts/best_framework.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - contributing.md\n  - why.md\n  hash: a3436323e8334df26966f3b6ecf07788\n  references:\n  - why.md\n  - blog/posts/best_framework.md\n  - concepts/retrying.md\n  - concepts/reask_validation.md\n  - contributing.md\n  summary: The announcement introduces Instructor 1.0.0, a simplified API for interfacing\n    with OpenAI that enhances usability by providing improved typing support, data\n    validation, and streamlined integration while maintaining compatibility with existing\n    standards. Key features include the introduction of `instructor.from_openai` for\n    client creation, consistent handling of default arguments, and support for type\n    inference with methods like `create_with_completion`, `create_partial`, and `create_iterable`.\n    With robust validation and error handling, the tool is designed to support multiple\n    languages, maintaining ease of use across platforms. Popular amongst developers,\n    Instructor boasts over 4000 GitHub stars and 120k monthly downloads. Key keywords\n    include API Development, OpenAI, Data Validation, Python, and LLM Techniques.\nblog/posts/why-care-about-mcps.md:\n  cross_links: []\n  hash: 12f0fc031ffca52b4b3526c950d51777\n  references: []\n  summary: \"The article provides a detailed overview of the Model Context Protocol\\\n    \\ (MCP), a standardized protocol developed by Anthropic to facilitate the interaction\\\n    \\ between AI models and external systems. It highlights the importance of MCP\\\n    \\ in solving integration challenges by transforming the complex M\\xD7N problem\\\n    \\ into a simplified M+N problem, allowing seamless integration of AI applications\\\n    \\ with various tools. The article compares MCP with OpenAPI, underscoring MCP's\\\n    \\ role in enabling AI models to autonomously discover and utilize tools with semantic\\\n    \\ understanding, as opposed to OpenAPI's focus on human developers. Additionally,\\\n    \\ it outlines growing adoption, development tips, and the practical applications\\\n    \\ of MCP with platforms like Claude Desktop, Cursor, and OpenAI's Agent SDK. Keywords\\\n    \\ include Model Context Protocol, MCP, AI integration, OpenAI, Anthropic, OpenAPI,\\\n    \\ and AI standardization.\"\nblog/posts/writer-support.md:\n  cross_links: []\n  hash: 90cad38cf2523db99ce9dd0f6d00fcb3\n  references: []\n  summary: The article announces the integration of Writer's enterprise-grade LLMs,\n    including the Palmyra X 004 model, with the Instructor platform to enable structured\n    outputs and enterprise AI workflows. It explains how to set up the integration,\n    generate structured data extraction, and stream responses for improved responsiveness.\n    Key features include automatic request retries, support for async processing,\n    and usage examples for data extraction, classification, and validation. Keywords\n    include Writer, Instructor, enterprise AI, structured outputs, Palmyra X 004,\n    API integration, streaming, retries, and AI workflows.\nblog/posts/youtube-flashcards.md:\n  ai_references:\n  - '[youtube-transcripts.md'\n  - ../../examples/exact_citations.md\n  - ../../examples/knowledge_graph.md\n  - ../../concepts/retrying.md\n  - https://burr.dagworks.io/examples/deployment/web-server/\n  - https://burr.dagworks.io/concepts/state-persistence/\n  - https://burr.dagworks.io/concepts/additional-visibility/\n  - https://burr.dagworks.io/concepts/streaming-actions/]\n  cross_links:\n  - blog/posts/youtube-transcripts.md\n  - concepts/retrying.md\n  - examples/exact_citations.md\n  - examples/knowledge_graph.md\n  hash: 885c1f1a27cca5ec2eeaa7d0bad3951f\n  keywords:\n  - flashcard generator\n  - Instructor\n  - Burr\n  - LLM\n  - YouTube transcripts\n  - OpenAI\n  - data processing\n  - observability\n  - application development\n  - Python\n  references:\n  - blog/posts/youtube-transcripts.md\n  - examples/exact_citations.md\n  - examples/knowledge_graph.md\n  - concepts/retrying.md\n  summary: This blog post demonstrates how to create a flashcard generator application\n    using Instructor and Burr, leveraging LLMs to produce structured question-answer\n    pairs from YouTube transcripts. The process involves defining output models, retrieving\n    video transcripts, and utilizing the Burr framework to build an interactive application\n    for enhanced learning experiences.\n  topics: []\nblog/posts/youtube-transcripts.md:\n  cross_links: []\n  hash: f6904e13b76dc8a15942b76c76104f90\n  references: []\n  summary: This article outlines how to extract and summarize YouTube video transcripts\n    into structured chapters using Python, Pydantic, and OpenAI's GPT models. It demonstrates\n    how to fetch transcripts with the `youtube_transcript_api`, define Pydantic models\n    for chapters and other content types, and generate detailed chapter summaries\n    with AI. The tutorial focuses on analyzing video content, creating adaptable data\n    models for study notes, content summaries, and quizzes, enhancing content organization\n    and application development for video summarization, data processing, and AI-powered\n    content analysis. Key keywords include YouTube transcripts, Python, Pydantic,\n    GPT, data processing, video summarization, and AI applications.\ncli/batch.md:\n  ai_references: []\n  cross_links: []\n  hash: 15ff29a13a9e380bdd9396887977adb9\n  keywords:\n  - '[OpenAI CLI'\n  - batch jobs\n  - manage jobs\n  - cancel job\n  - create job\n  - download results\n  - Anthropic\n  - command line interface]\n  references: []\n  summary: This documentation provides a guide on managing batch jobs using the OpenAI\n    Command Line Interface (CLI), detailing commands for creating, listing, canceling,\n    and downloading batch jobs. It highlights dual support for both OpenAI and Anthropic\n    platforms, enabling efficient job management suited to user needs.\n  topics:\n  - '[Batch Job Management'\n  - CLI Commands\n  - OpenAI\n  - Anthropic\n  - Job Creation and Handling]\ncli/finetune.md:\n  ai_references: []\n  cross_links: []\n  hash: a54a9cf44d3d0e7830eb2d66a854c720\n  keywords:\n  - Instructor CLI\n  - fine-tuning jobs\n  - OpenAI\n  - command line interface\n  - job management\n  - upload files\n  - training models\n  - monitoring jobs\n  references: []\n  summary: This documentation provides an overview of managing fine-tuning jobs using\n    the Instructor CLI for OpenAI, detailing essential commands and options to create,\n    view, and manage these jobs effectively. Users can easily upload files for training,\n    monitor job statuses, and contribute to the development of the CLI tool.\n  topics:\n  - Managing Fine-Tuning Jobs\n  - Creating Fine-Tuning Jobs\n  - Viewing Files and Jobs\n  - CLI Commands\ncli/index.md:\n  cross_links:\n  - cli/finetune.md\n  - cli/usage.md\n  hash: 8331441083b208ef53688aa8ca292269\n  references:\n  - cli/usage.md\n  - cli/finetune.md\n  - cli/usage.md\n  - cli/finetune.md\n  - cli/usage.md\n  - cli/finetune.md\n  summary: 'The Instructor CLI Tools offer a suite of command-line utilities designed\n    to enhance workflows when using OpenAI''s API by monitoring usage, fine-tuning\n    models, and accessing documentation. Key features include commands for tracking\n    API usage and costs, creating and managing fine-tuned models, and quick access\n    to documentation directly from the terminal. Users can install the tools via `pip\n    install instructor` and must set the OpenAI API key as an environment variable.\n    Additional resources and support are available through GitHub and the community\n    Discord. Keywords: Instructor CLI Tools, command-line utilities, OpenAI API, usage\n    monitoring, model fine-tuning, documentation access.'\ncli/usage.md:\n  cross_links: []\n  hash: 95aa3f140fe59a144287c98679c27c15\n  references: []\n  summary: 'The OpenAI API Usage CLI Guide provides detailed instructions on monitoring\n    OpenAI API usage using a command-line interface tool. This tool allows users to\n    track API usage by model, date, and cost, offering commands like `list` to display\n    usage data over the past few days. Key features include listing usage for a specified\n    number of days and checking today''s usage. The guide also invites users to contribute\n    to the development of this utility via GitHub. Keywords: OpenAI API, CLI tool,\n    API usage monitoring, command-line interface, OpenAI models, usage tracking, GitHub\n    contribution.'\nconcepts/alias.md:\n  cross_links: []\n  hash: 8c7fc8fbbe513d178333a7986a8227bb\n  references: []\n  summary: This overview highlights the use of aliases in Pydantic for improved data\n    validation and model serialization. It explains how aliases enable mapping between\n    external data field names and internal model attributes, facilitating seamless\n    data parsing. The page emphasizes exploring Pydantic's latest features and documentation\n    related to aliases, essential for efficient data handling and validation in Python\n    applications. Key concepts include alias definition, usage, and best practices\n    for leveraging aliases to enhance data model flexibility.\nconcepts/caching.md:\n  cross_links:\n  - blog/posts/caching.md\n  hash: ac0e8043ff4b03799692dbd4910d2e64\n  references:\n  - blog/posts/caching.md\n  summary: This guide explores various Python caching techniques including in-memory,\n    disk-based, and Redis caching to optimize application performance. It covers the\n    use of `functools.cache` for simple in-memory caching, ideal for small to medium\n    applications with immutable arguments. Additionally, it demonstrates persistent\n    caching with `diskcache` and distributed caching with Redis, both utilizing a\n    shared `instructor_cache` decorator that serializes Pydantic models for efficient\n    data storage. Key concepts include cache invalidation considerations, cache key\n    generation, and serialization techniques, making these methods suitable for reducing\n    computation time, handling large datasets, and supporting scalable, distributed\n    systems. Core keywords include Python caching, in-memory cache, diskcache, Redis,\n    Pydantic, cache decorators, performance optimization, and persistent storage.\nconcepts/dictionary_operations.md:\n  ai_references: []\n  cross_links: []\n  hash: cb4a0b1f3bdaf4825aea51d32aead1ef\n  keywords:\n  - dictionary operations\n  - performance optimization\n  - message extraction\n  - retry functions\n  - message handler\n  - system message handling\n  references: []\n  summary: This document details the optimizations made to dictionary operations in\n    the Instructor codebase, focusing on functions related to message passing and\n    configuration management. Enhancements such as direct key lookups and reduced\n    overhead have led to significant performance improvements in high-throughput applications.\n  topics:\n  - dictionary operation optimizations\n  - message extraction improvements\n  - retry function enhancements\n  - performance benchmarks\n  - testing methodologies\nconcepts/distillation.md:\n  cross_links: []\n  hash: 88f400b35fb27b4235f08e4c61053267\n  references: []\n  summary: 'The article introduces Instructor''s `Instructions` library for seamless\n    fine-tuning of Python functions with language models like GPT-3.5-turbo. It explains\n    how to automate dataset creation for model training by annotating functions that\n    return Pydantic objects, simplifying the fine-tuning process, and logging outputs\n    for efficient data management. The approach enables distilling function behavior\n    into model weights, facilitating backward compatibility and model-switching via\n    the `dispatch` mode. Key features include streamlined data preparation, automatic\n    dataset generation, and easy integration for function-level fine-tuning, making\n    Instructor a powerful tool for optimizing language models in Python applications.\n    Keywords: Instructor, Instructions, fine-tuning, Python functions, language models,\n    GPT-3.5, distillation, Pydantic, model training, dataset automation, function\n    calling, backward compatibility.'\nconcepts/enums.md:\n  cross_links: []\n  hash: 727e8787171ecd5104e0689e1d83184c\n  references: []\n  summary: The article discusses using Enums and Literals in Pydantic for effective\n    role management, highlighting their role in preventing data misalignment by standardizing\n    user roles. Key topics include the implementation of Enums with a fallback \"Other\"\n    option to handle uncertainties, and an alternative approach using Literals for\n    role definitions. Core ideas emphasize the importance of standardization and flexibility\n    in model design, specifically for roles like \"PRINCIPAL\", \"TEACHER\", \"STUDENT\",\n    and \"OTHER\". Keywords include Enums, Literals, Pydantic, role management, data\n    standardization, and fallback options.\nconcepts/error_handling.md:\n  cross_links:\n  - concepts/hooks.md\n  - concepts/retrying.md\n  - concepts/validation.md\n  hash: 5007d7c8abe6942912b823c5e9d22130\n  references:\n  - concepts/retrying.md\n  - concepts/validation.md\n  - concepts/hooks.md\n  summary: This guide on Error Handling in Instructor provides a comprehensive overview\n    of managing exceptions and errors when using Instructor for structured outputs.\n    It details the exception hierarchy, including `InstructorError` and specific exceptions\n    like `IncompleteOutputException`, `InstructorRetryException`, `ValidationError`,\n    `ProviderError`, `ConfigurationError`, `ModeError`, and `ClientError`. The content\n    offers best practices for catching specific exceptions, handling provider and\n    configuration errors, logging, graceful degradation, and integrating hooks for\n    error monitoring. Key concepts include exception hierarchy, error handling strategies,\n    provider setup issues, validation failures, mode errors, and retry logic, ensuring\n    robust and resilient use of Instructor for AI model integrations. Keywords include\n    Instructor error handling, exceptions, validation, retries, provider errors, configuration\n    issues, hooks, and debugging.\nconcepts/fastapi.md:\n  cross_links: []\n  hash: 4a9d66d0b46d7f503078520ae02f08fa\n  references: []\n  summary: 'This guide explores how to integrate Pydantic models with FastAPI for\n    efficient API development. FastAPI is a high-performance Python web framework\n    known for its seamless Pydantic integration, automatic OpenAPI documentation,\n    and JSON Schema validation. The article provides code examples demonstrating how\n    to start a FastAPI app with POST requests, handle data with Pydantic models, and\n    implement streaming responses using FastAPI and large language models (LLMs).\n    Key features include automatic interactive API documentation accessible via a\n    `/docs` page, making API testing straightforward. SEO Keywords: FastAPI, Pydantic\n    models, API development, Python, OpenAPI, JSON Schema, streaming responses, AsyncIO.'\nconcepts/fields.md:\n  ai_references:\n  - '[fields.md]'\n  cross_links: []\n  hash: e65b44dd148bbd793a17c362400b05f6\n  keywords:\n  - Pydantic\n  - Field\n  - metadata\n  - JSON schema\n  - default values\n  - exclude\n  - Annotated\n  - customization\n  - model generation\n  references: []\n  summary: This documentation provides comprehensive guidance on customizing Pydantic\n    models using field metadata through the `Field` function. It covers setting default\n    values, excluding fields, omitting fields from schemas, and customizing JSON schema\n    properties to enhance model definitions effectively.\n  topics:\n  - Default values\n  - Exclude parameter\n  - Skipping fields in schemas\n  - JSON schema customization\n  - Using Annotated\nconcepts/hooks.md:\n  ai_references:\n  - '[instructor/hooks.py'\n  - instructor/retry.py]\n  cross_links: []\n  hash: 3bfaa1615e24ee4bfe165847f04e2f78\n  keywords:\n  - '[Instructor library'\n  - hooks\n  - event handling\n  - logging\n  - error handling\n  - custom hooks\n  - completion\n  - response]\n  references: []\n  summary: This documentation explains the use of hooks in the Instructor library\n    for managing event handling during API interactions. It details various hook events,\n    their implementation, types, and examples of usage for logging, error handling,\n    and creating custom hooks to enhance functionality.\n  topics:\n  - '[Overview of hooks'\n  - Supported hook events\n  - Implementation details\n  - Example usage\n  - Advanced custom hooks]\nconcepts/index.md:\n  ai_references:\n  - '[models.md'\n  - patching.md\n  - types.md\n  - validation.md\n  - prompting.md\n  - multimodal.md\n  - fields.md\n  - lists.md\n  - typeddicts.md\n  - unions.md\n  - enums.md\n  - maybe.md\n  - alias.md\n  - partial.md\n  - iterable.md\n  - raw_response.md\n  - retrying.md\n  - reask_validation.md\n  - hooks.md\n  - caching.md\n  - prompt_caching.md\n  - usage.md\n  - parallel.md\n  - fastapi.md\n  - typeadapter.md\n  - templating.md\n  - distillation.md\n  - philosophy.md\n  - examples/index.md\n  - getting-started.md\n  - integrations/index.md]\n  cross_links:\n  - api.md\n  - blog/posts/anthropic-prompt-caching.md\n  - blog/posts/caching.md\n  - blog/posts/openai-multimodal.md\n  - cli/usage.md\n  - concepts/alias.md\n  - concepts/caching.md\n  - concepts/distillation.md\n  - concepts/enums.md\n  - concepts/fastapi.md\n  - concepts/fields.md\n  - concepts/hooks.md\n  - concepts/iterable.md\n  - concepts/lists.md\n  - concepts/maybe.md\n  - concepts/models.md\n  - concepts/multimodal.md\n  - concepts/parallel.md\n  - concepts/partial.md\n  - concepts/patching.md\n  - concepts/philosophy.md\n  - concepts/prompt_caching.md\n  - concepts/prompting.md\n  - concepts/raw_response.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/semantic_validation.md\n  - concepts/templating.md\n  - concepts/typeadapter.md\n  - concepts/typeddicts.md\n  - concepts/types.md\n  - concepts/unions.md\n  - concepts/usage.md\n  - concepts/validation.md\n  - examples/index.md\n  - getting-started.md\n  - index.md\n  - integrations/index.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/optional_fields.md\n  - learning/streaming/lists.md\n  - learning/validation/field_level_validation.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md\n  - prompting/zero_shot/emotion_prompting.md\n  - prompting/zero_shot/role_prompting.md\n  - prompting/zero_shot/style_prompting.md\n  hash: c930b21dfb81d99009dc6a26057ba894\n  keywords:\n  - '[Instructor'\n  - Pydantic\n  - LLM clients\n  - data validation\n  - performance optimization\n  - streaming responses\n  - integration features\n  - error handling]\n  references:\n  - concepts/models.md\n  - concepts/patching.md\n  - concepts/types.md\n  - concepts/validation.md\n  - concepts/prompting.md\n  - concepts/multimodal.md\n  - concepts/fields.md\n  - concepts/lists.md\n  - concepts/typeddicts.md\n  - concepts/unions.md\n  - concepts/enums.md\n  - concepts/maybe.md\n  - concepts/alias.md\n  - concepts/partial.md\n  - concepts/iterable.md\n  - concepts/raw_response.md\n  - concepts/retrying.md\n  - concepts/reask_validation.md\n  - concepts/hooks.md\n  - concepts/caching.md\n  - concepts/prompt_caching.md\n  - concepts/usage.md\n  - concepts/parallel.md\n  - concepts/fastapi.md\n  - concepts/typeadapter.md\n  - concepts/templating.md\n  - concepts/distillation.md\n  - concepts/philosophy.md\n  - concepts/models.md\n  - concepts/patching.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/partial.md\n  - concepts/iterable.md\n  - concepts/caching.md\n  - concepts/usage.md\n  - examples/index.md\n  - getting-started.md\n  - examples/index.md\n  - integrations/index.md\n  summary: The Instructor library provides essential concepts and features for effectively\n    utilizing Pydantic models to manage structured outputs and stream responses from\n    LLM clients. This documentation covers core concepts, data handling, performance\n    optimization, and integration features essential for developers looking to enhance\n    their applications with robust validation and error handling.\n  topics:\n  - '[Core Concepts'\n  - Data Handling and Structures\n  - Streaming Features\n  - Error Handling and Validation\n  - Performance Optimization]\nconcepts/iterable.md:\n  ai_references: []\n  cross_links: []\n  hash: 08ea17041c45f8851c91538db7d24f85\n  keywords:\n  - '[structured data'\n  - Streaming\n  - Pydantic\n  - OpenAI\n  - Iterable\n  - create_iterable\n  - multi-task outputs\n  - asynchronous usage\n  - synchronous usage\n  - entity extraction]\n  references: []\n  summary: This document provides guidance on extracting structured data in Python\n    using Iterable and streaming techniques with Pydantic and OpenAI. It covers both\n    synchronous and asynchronous usage, highlighting best practices for implementing\n    the `create_iterable` method for efficient entity extraction and multi-task outputs.\n  topics:\n  - '[Iterable usage'\n  - Pydantic integration\n  - Synchronous and Asynchronous methods\n  - Entity extraction techniques\n  - Best practices for OpenAI API]\nconcepts/lists.md:\n  cross_links: []\n  hash: 87115c5871b7f897999d87d86cd68cbd\n  references: []\n  summary: This article explores advanced techniques for structured data extraction\n    in Python using iterable and streaming capabilities with Pydantic and OpenAI.\n    It demonstrates how to define schemas and utilize `Iterable[T]` for multi-task\n    extraction, enabling dynamic class creation, prompt generation, and efficient\n    token streaming. The guide also covers synchronous and asynchronous streaming\n    methods, showcasing examples with GPT-3.5 and GPT-4 models. Key concepts include\n    data serialization, real-time token processing, and leveraging instructor's API\n    for scalable, schema-based entity extraction in Python, making it ideal for developers\n    working on AI-driven data parsing and automation.\nconcepts/logging.md:\n  ai_references: []\n  cross_links: []\n  hash: b617e0bf45b01dbbe95601ea7228f2c9\n  keywords:\n  - OpenAI\n  - Python logging\n  - DEBUG level\n  - debugging\n  - chat completion\n  - logging setup\n  - user detail extraction\n  - instructor library\n  references: []\n  summary: This document provides a guide on how to enable DEBUG level logging for\n    OpenAI requests and responses in Python. By implementing efficient logging practices,\n    developers can enhance their debugging process and gain insight into the functionality\n    of their OpenAI queries.\n  topics:\n  - logging configuration\n  - debugging OpenAI requests\n  - Python implementation\n  - user detail model\n  - OpenAI chat completion\nconcepts/maybe.md:\n  cross_links: []\n  hash: 4e245b781d8f282eb06813ed10498526\n  references: []\n  summary: The article explores the implementation of the Maybe pattern for error\n    handling in functional programming using Python's Pydantic library. It focuses\n    on how the Maybe pattern can encapsulate results and potential errors without\n    resorting to exceptions or returning `None`, enhancing robust error handling.\n    The pattern is implemented in a Pydantic `MaybeUser` class, which includes fields\n    for the result, error status, and error message. This approach is particularly\n    useful for language model (LLM) calls, reducing hallucinations. A practical example\n    is provided, demonstrating how the pattern is used to extract user details from\n    text inputs. Key topics include functional programming, error handling, Pydantic,\n    Maybe pattern, and structural pattern matching.\nconcepts/models.md:\n  cross_links:\n  - blog/posts/rag-and-beyond.md\n  hash: 14c6638223e145cb56f78b01ad3c745f\n  references:\n  - blog/posts/rag-and-beyond.md\n  summary: This article explains how to use Pydantic for defining dynamic and static\n    response models for Large Language Models (LLMs), including creating schemas with\n    `BaseModel`, optional values, and runtime model generation with `create_model`.\n    It highlights how to use prompt annotations and docstrings for prompt generation,\n    validate API responses, and add custom behaviors or methods to models. Key concepts\n    include dynamic model creation based on database or configuration data, omitting\n    fields from prompts, and integrating custom logic for tailored LLM responses,\n    making Pydantic a flexible tool for managing LLM output schemas and response validation.\nconcepts/multimodal.md:\n  cross_links:\n  - integrations/genai.md\n  hash: 6b81751a99a294b562c47fcef3e3f496\n  references:\n  - integrations/genai.md\n  summary: 'The article discusses Instructor''s seamless multimodal interface for\n    handling images, PDFs, and audio files across various AI models like OpenAI, Anthropic,\n    Google GenAI, and Mistral. Key features include creating media instances from\n    URLs, file paths, and base64 strings, alongside automatic provider-specific formatting,\n    ensuring clean, adaptable code. The Image, Audio, and PDF classes simplify interaction\n    by abstracting differences among AI providers, while additional features like\n    Anthropic prompt caching and Google GenAI file support enhance functionality.\n    This comprehensive approach streamlines application development, emphasizing consistency,\n    efficiency, and adaptability across AI technologies. Key terms: multimodal interface,\n    AI models, image analysis, PDF parsing, audio processing, Anthropic caching, Google\n    GenAI, Instructor API.'\nconcepts/parallel.md:\n  cross_links: []\n  hash: ef1722f94742cadf3b5dbfa93d7c62f1\n  references: []\n  summary: OpenAI's experimental Parallel Function Calling enables developers to call\n    multiple functions simultaneously within a single request, significantly reducing\n    application latency. Supported currently by Google and OpenAI, this feature allows\n    for efficient execution of tools such as weather data retrieval and web searches\n    without needing complex parent schemas. Using specific modes like `PARALLEL_TOOLS`\n    for OpenAI and `VERTEXAI_PARALLEL_TOOLS` for Vertex AI, developers can specify\n    response models as iterables of multiple object types (e.g., Weather, GoogleSearch).\n    Key concepts include reduced latency, parallel tool execution, and dynamic response\n    handling with Pydantic models, making it an important optimization for AI-powered\n    applications.\nconcepts/partial.md:\n  cross_links: []\n  hash: d8cf2df0b922d2a39bf024aeabca278e\n  references: []\n  summary: This article explains how to use instructor and OpenAI for streaming partial\n    responses in Python, enabling incremental model outputs suitable for real-time\n    applications like UI rendering. It covers field-level streaming with `create_partial`,\n    handling incomplete data with `PartialLiteralMixin`, and managing response models\n    as generators that yield progressive updates. The guide highlights limitations\n    such as unsupported validators during streaming and provides practical examples,\n    including extracting conference information with asynchronous streaming support.\n    Key concepts include field-level partial responses, model streaming, generator-based\n    incremental updates, and integration with OpenAI's APIs for real-time data processing.\nconcepts/patching.md:\n  cross_links:\n  - concepts/parallel.md\n  - integrations/vertex.md\n  hash: 73bf8b99f5d3d3eb6601921d99f93932\n  references:\n  - integrations/vertex.md\n  - concepts/parallel.md\n  summary: The document discusses how the Instructor tool enhances Large Language\n    Model (LLM) client libraries by patching them to support structured outputs. Core\n    features include adding parameters like `response_model`, `max_retries`, and `validation_context`\n    to methods in the client, enabling structured responses. It outlines different\n    patching modes such as TOOL, GEMINI, and JSON for various LLM providers like OpenAI\n    and Gemini, helping ensure compatibility and improved data handling. Patching\n    is aimed at facilitating stable tool calling, managing validations, and providing\n    JSON outputs. Keywords include structured output, LLM client libraries, Instructor\n    tool, OpenAI, Gemini, patching, and tool calling.\nconcepts/philosophy.md:\n  ai_references: []\n  cross_links: []\n  hash: 9506a8bcecbdedb5e5b9c6098031e787\n  keywords:\n  - Instructor\n  - simplicity\n  - Pydantic\n  - LLMs\n  - composability\n  - observability\n  - vendor lock-in\n  - Python\n  references: []\n  summary: The Philosophy documentation of Instructor outlines its fundamental principles\n    emphasizing simplicity and developer familiarity. By leveraging existing knowledge\n    of frameworks like Pydantic, Instructor aims to minimize complexity while enhancing\n    observability and composability, ensuring developers maintain control and can\n    evolve their code naturally without fear of vendor lock-in.\n  topics:\n  - Philosophy of Instructor\n  - Developer Familiarity\n  - Observability and Debugging\n  - Composability of Code\n  - Avoiding Lock-in\nconcepts/prompt_caching.md:\n  cross_links:\n  - blog/posts/anthropic-prompt-caching.md\n  hash: 580600c0f70f02c1892b24456a32cdcc\n  references:\n  - blog/posts/anthropic-prompt-caching.md\n  summary: Prompt caching is an optimization feature in OpenAI and Anthropic APIs\n    that enhances performance and reduces costs by caching shared prompt segments.\n    In OpenAI, prompt caching works automatically for models like gpt-4o and gpt-4o-mini\n    with prefix matching, requiring no code changes. Anthropic's prompt caching, now\n    generally available, necessitates explicit use of the `cache_control` parameter\n    and is especially beneficial for large prompts exceeding token minimums (2048\n    tokens for Claude Haiku, 1024 for Claude Sonnet). This feature significantly lowers\n    response times and costs by enabling cache reuse during multiple API calls, making\n    it essential for efficient, large-scale language model applications. Key keywords\n    include prompt caching, API optimization, OpenAI, Anthropic, cost reduction, response\n    time, model models, cache management, and large prompt handling.\nconcepts/prompting.md:\n  cross_links: []\n  hash: e27dde9b271c8c6944f53125f39a0042\n  references: []\n  summary: The article provides a comprehensive guide on effective prompt engineering\n    using Pydantic and Instructor, focusing on enhancing modularity, flexibility,\n    and data integrity in Python models. Key strategies include designing self-descriptive\n    and reusable components, employing enums and literals for standardization, and\n    handling errors with the Maybe pattern. The guide also recommends using optional\n    attributes, reiterating long instructions, managing list lengths, and defining\n    entity relationships to improve data quality. By incorporating these practices,\n    developers can ensure better structure, clarity, and maintainability in their\n    applications.\nconcepts/raw_response.md:\n  cross_links: []\n  hash: 44557d68c40cf4d99ef68b41047544ef\n  references: []\n  summary: This guide provides a tutorial on creating custom models using OpenAI's\n    API with Python. It specifically demonstrates how to use the `instructor` library\n    to extract user data efficiently by integrating OpenAI's GPT model, such as \"gpt-3.5-turbo,\"\n    with Pydantic for response validation. The example illustrates extracting user\n    attributes like name and age from a text input using the `UserExtract` model.\n    Additionally, the tutorial explains accessing raw responses from Anthropic models\n    for debugging purposes. Key concepts include OpenAI completions, data extraction,\n    custom client, and Pydantic models.\nconcepts/reask_validation.md:\n  cross_links:\n  - examples/exact_citations.md\n  hash: eda13e17af5b47f10ddff3a58680307f\n  references:\n  - examples/exact_citations.md\n  summary: This article explores enhancing AI validation processes using Pydantic's\n    flexible validation framework for both code-based and LLM-based outputs. Key techniques\n    include defining custom validators, leveraging reasking with retry mechanisms,\n    and advanced validation methods like model-level validation and context-aware\n    checks. It emphasizes improving AI output accuracy, handling validation errors\n    effectively, and optimizing token usage by disabling URL links in error messages.\n    Core keywords include Pydantic, AI validation, LLM validation, reasking, validation\n    errors, JSON decoding, token optimization, and autonomous system improvement.\nconcepts/retrying.md:\n  ai_references:\n  - '[error_handling.md'\n  - validation.md\n  - async.md]\n  cross_links:\n  - concepts/error_handling.md\n  - concepts/reask_validation.md\n  - concepts/semantic_validation.md\n  - concepts/validation.md\n  - learning/patterns/field_validation.md\n  - learning/validation/field_level_validation.md\n  hash: 3d4bfd872b30538bfe5f7f3d124da08b\n  keywords:\n  - tenacity python\n  - python retry\n  - instructor retry logic\n  - exponential backoff\n  - python error handling\n  - LLM retry\n  - API retry\n  - python resilience\n  - automatic retries\n  - circuit breaker pattern\n  references:\n  - concepts/error_handling.md\n  - concepts/validation.md\n  - concepts/async.md\n  - concepts/error_handling.md\n  - concepts/async.md\n  summary: This comprehensive guide covers Python retry logic using the Tenacity library\n    and Instructor for handling various failure scenarios in LLM applications. It\n    details concepts such as exponential backoff, conditional retries, and logging\n    practices to ensure robust error handling and resilience in API interactions.\n  topics:\n  - Tenacity library\n  - Python error handling\n  - exponential backoff strategies\n  - conditional retries\n  - robust API integration\nconcepts/semantic_validation.md:\n  ai_references:\n  - '[validation.md'\n  - custom_validators.md\n  - api.md]\n  cross_links:\n  - api.md\n  - concepts/validation.md\n  - learning/validation/custom_validators.md\n  hash: 5de312bf6c73ce978ffc4ce041c00493\n  keywords:\n  - '[semantic validation'\n  - LLMs\n  - natural language criteria\n  - Instructor framework\n  - content moderation\n  - validation criteria]\n  references:\n  - concepts/validation.md\n  - learning/validation/custom_validators.md\n  summary: This guide explains how to implement semantic validation using LLMs in\n    the Instructor framework, allowing for validation against complex natural language\n    criteria. By leveraging LLM capabilities, it addresses situations where traditional\n    rule-based validation falls short, including subjective qualities and contextual\n    relationships in data.\n  topics:\n  - '[semantic validation'\n  - implementation with LLMs\n  - content moderation\n  - validation flow\n  - advanced validation patterns]\nconcepts/templating.md:\n  cross_links: []\n  hash: 8b3f459aae3b028d9cdfc85a670095de\n  references: []\n  summary: This guide explores effective prompt templating using Jinja and Pydantic\n    to create dynamic, secure, and maintainable prompts for AI models. It highlights\n    how to pass context variables for prompt rendering and validation, implement complex\n    logic with Jinja syntax, and integrate Pydantic validators for context-aware validation,\n    including handling sensitive data with SecretStr. Emphasis is placed on security\n    through sandboxed Jinja environments and best practices for managing sensitive\n    information, enabling flexible, secure, and scalable prompt engineering for AI\n    applications. Key keywords include prompt templating, Jinja, Pydantic, context\n    variables, validation, security, secrets, and dynamic prompts.\nconcepts/typeadapter.md:\n  cross_links: []\n  hash: 40fefdf3e9f6d305e1c2280d9fc8b944\n  references: []\n  summary: This page provides an overview of Pydantic's Type Adapter concepts, detailing\n    ongoing updates and developments. It highlights the core ideas of adapting and\n    customizing data validation and serialization using Pydantic's type system. The\n    page serves as a work in progress, directing users to the official Pydantic documentation\n    for latest information on Type Adapters, a key feature for flexible data modeling\n    and type management. Key keywords include Pydantic, Type Adapter, data validation,\n    type customization, and Python data modeling.\nconcepts/typeddicts.md:\n  cross_links: []\n  hash: 81e543be61c6eae101e7f1fc5bd324ec\n  references: []\n  summary: The document provides a tutorial on using TypedDicts in Python when working\n    with the OpenAI API for structured data responses. It explains how to define a\n    TypedDict class to specify structured data types, such as strings and integers,\n    and demonstrates its integration with the OpenAI API through the `instructor`\n    library. The example provided showcases the creation of a structured response\n    model, using a `User` TypedDict to parse a response from the GPT-3.5-turbo model,\n    highlighting ease of use and strong typing for better handling API responses.\n    Key concepts include Python TypedDicts, OpenAI API integration, structured data\n    handling, and typed responses.\nconcepts/types.md:\n  cross_links:\n  - concepts/lists.md\n  - concepts/partial.md\n  hash: 4399736e0701f581b37e9ba09635169b\n  references:\n  - concepts/lists.md\n  - concepts/partial.md\n  summary: The article \"Working with Types in Instructor\" explores how to effectively\n    utilize various data types in the Instructor platform, enhancing structured outputs\n    from basic primitives to complex structures. Key elements include the use of simple\n    types such as `str`, `int`, `float`, and `bool`, as well as complex types like\n    `List`, `Dict`, `Union`, `Literal`, and `Enum`. It covers how to employ `pydantic.BaseModel`\n    for structuring data and emphasizes the use of `typing.Annotated` for adding context\n    and descriptions. The article also delves into advanced examples, such as converting\n    markdown data to a pandas DataFrame and using lists of unions for diverse response\n    types. These concepts are illustrated with practical code snippets, highlighting\n    the versatility and capabilities of the Instructor framework in managing various\n    data types for better API response modeling. Keywords include Instructor, data\n    types, Pydantic, Python, structured outputs, and API response modeling.\nconcepts/union.md:\n  cross_links:\n  - concepts/unions.md\n  hash: d19fc6ce0a547f93d856b9a2a64f2f16\n  references:\n  - concepts/unions.md\n  summary: 'This page explains how to implement Union types in Pydantic models to\n    manage multiple action types in Python applications. It highlights best practices\n    for using Union types to enable flexible data validation and modeling, allowing\n    models to accept different data structures. The content emphasizes handling diverse\n    input scenarios effectively with Pydantic''s Union feature, providing valuable\n    guidance for developers working with complex data validation and type hinting.\n    Key keywords include Union types, Pydantic models, data validation, Python, type\n    hints, and flexible data handling. Note: the original page has been consolidated\n    into a comprehensive Union Types guide for more detailed information.'\nconcepts/unions.md:\n  cross_links: []\n  hash: eaaf35658f139d7cce326903aad2e9c2\n  references: []\n  summary: This guide explores the use of Union types in Instructor to handle multiple\n    response formats from language models, emphasizing core concepts like basic, discriminated,\n    and nested unions, as well as optional fields. It covers best practices for type\n    hints, validation, and documentation, along with practical patterns such as multiple\n    response types and dynamic action selection. The content highlights integrating\n    Union types with Instructor for validation, streaming, error handling, and type\n    checking, providing key examples and workflows for building flexible, robust LLM-based\n    applications. Key words include Union types, Instructor, Pydantic, response models,\n    discriminated unions, validation, streaming, error handling, dynamic actions,\n    AI models, OpenAI, and type safety.\nconcepts/usage.md:\n  cross_links: []\n  hash: 80711f0189c13e1c0625c56bf2b16f58\n  references: []\n  summary: 'This guide explains how to handle non-streaming requests in OpenAI using\n    Python, with a focus on tracking token usage and managing exceptions. It demonstrates\n    accessing raw response data to monitor token consumption, including detailed usage\n    metrics like prompt and completion tokens. The content also covers handling the\n    IncompleteOutputException, which occurs when the context length is exceeded, by\n    catching the exception and adjusting the prompt accordingly. Key concepts include\n    OpenAI API, usage tracking, token management, error handling, and Python implementation.\n    Keywords: OpenAI, non-streaming requests, token usage, completion metrics, IncompleteOutputException,\n    Python, API management.'\nconcepts/validation.md:\n  ai_references:\n  - '[Semantic Validation](./semantic_validation.md)'\n  - '[Pydantic Documentation](https://docs.pydantic.dev/)'\n  - '[OpenAI Function Calling](https://platform.openai.com/docs/guides/gpt/function-calling)'\n  - '[Instructor Examples](../examples/index.md)'\n  cross_links:\n  - concepts/semantic_validation.md\n  - examples/index.md\n  - index.md\n  hash: f03282574862cea2b03ed9f3e727fa6e\n  keywords:\n  - validation\n  - Instructor\n  - Pydantic\n  - type safety\n  - error handling\n  - semantic validation\n  - custom validators\n  - LLM outputs\n  - data consistency\n  references:\n  - concepts/semantic_validation.md\n  - concepts/semantic_validation.md\n  - examples/index.md\n  summary: This guide details the process of validating outputs from language models\n    using the Pydantic library in the Instructor framework, emphasizing the importance\n    of type safety, error handling, and maintaining data consistency. It also covers\n    various validation strategies, including field validation, semantic validation,\n    and the implementation of custom validators.\n  topics: []\ncontributing.md:\n  ai_references:\n  - '[scripts/README.md]'\n  cross_links: []\n  hash: 6289db2bfabdfe2f10244ea2a3b7bd7d\n  keywords:\n  - Instructor library\n  - contribute\n  - evaluation tests\n  - GitHub\n  - development environment\n  - issues\n  - pull requests\n  - documentation\n  - code style\n  references:\n  - ../scripts/README.md\n  summary: This document outlines how to contribute to the Instructor library, including\n    writing evaluation tests, reporting issues, and submitting pull requests on GitHub.\n    Contributors are encouraged to set up their development environments, follow code\n    style guidelines, and enhance documentation for better collaboration and project\n    quality.\n  topics: []\nexamples/action_items.md:\n  cross_links: []\n  hash: 330c78a61f002ff6c56b77dda4ac62bf\n  references: []\n  summary: This article explains how to automate the extraction of action items from\n    meeting transcripts using OpenAI's API and Pydantic. It details modeling action\n    items as Ticket objects with subtasks, priorities, assignees, and dependencies,\n    enabling efficient project management. The guide includes code examples for generating\n    actionable tasks from transcripts, visualizing data with Graphviz, and emphasizes\n    the importance of automating task identification to improve productivity and prevent\n    overlooked responsibilities in meetings. Key keywords include action item extraction,\n    meeting transcripts, OpenAI API, Pydantic, project management automation, task\n    dependency, and GPT-4.\nexamples/audio_extraction.md:\n  ai_references:\n  - '[multi_modal_gemini.md'\n  - ../integrations/openai.md]\n  cross_links:\n  - examples/multi_modal_gemini.md\n  - integrations/openai.md\n  hash: e0963a9b102bdd979542bcde8571c834\n  keywords:\n  - OpenAI\n  - audio information extraction\n  - Instructor library\n  - Pydantic model\n  - WAV format\n  - GPT-4 audio\n  - audio processing\n  - structured information\n  references:\n  - examples/multi_modal_gemini.md\n  - integrations/openai.md\n  summary: This documentation provides a comprehensive guide on using OpenAI's audio\n    capabilities with the Instructor library to extract structured information from\n    audio files. It includes code examples demonstrating the extraction process into\n    a defined Pydantic model, highlighting various use cases and best practices for\n    effective audio processing.\n  topics:\n  - Audio processing\n  - Information extraction\n  - Code examples\n  - Use cases\n  - Pydantic models\nexamples/batch_classification_langsmith.md:\n  cross_links: []\n  hash: 996b30c651684530af4333e94df8f6a7\n  references: []\n  summary: This article explains how to enhance the OpenAI client with LangSmith and\n    Instructor for improved observability, monitoring, and functionality in LLM applications.\n    It demonstrates integrating LangSmith's SDK with OpenAI's chat completion API,\n    using features like client wrapping and rate limiting. The guide also showcases\n    applying Instructor to patch the client in TOOL mode, enabling additional capabilities.\n    Key topics include LangSmith, OpenAI client integration, Instructor, rate limiting,\n    question classification, and application monitoring, making it ideal for developers\n    seeking scalable, observable AI solutions.\nexamples/batch_job_oai.md:\n  cross_links: []\n  hash: d13fc5a068b73df1e50ff653f20588b5\n  references: []\n  summary: This guide explains how to efficiently generate large-scale synthetic question-answer\n    pairs using OpenAI's Batch API with Instructor. It covers creating JSONL files\n    from datasets like ms-marco, leveraging batch jobs for cost-effective and high-rate\n    data generation, and managing batch workflows through CLI commands. Key features\n    include using Pydantic models for response parsing, handling batch job creation,\n    monitoring progress, and downloading results. Important keywords include synthetic\n    data generation, OpenAI Batch API, Instructor, large-scale datasets, ms-marco,\n    question-answer pairs, cost-effective AI workflows, and data parsing.\nexamples/building_knowledge_graphs.md:\n  cross_links: []\n  hash: 4055c02b7485da53099015c6d456b1fc\n  references: []\n  summary: This tutorial offers a comprehensive guide to building knowledge graphs\n    from textual data using OpenAI's API and Pydantic. It demonstrates how to extract\n    structured information from unstructured text, such as identifying entities and\n    relationships, and representing them as nodes and edges in a graph. The example\n    includes Python code for defining graph models with Pydantic, integrating OpenAI's\n    API for text processing, and generating visualizable knowledge graphs. Key concepts\n    include automated knowledge graph construction, natural language processing, entity\n    and relationship extraction, and Python implementation, making it an essential\n    resource for data scientists and developers interested in semantic data modeling\n    and knowledge graph automation.\nexamples/bulk_classification.md:\n  cross_links:\n  - blog/posts/learn-async.md\n  hash: 21849e9a44f226f43e8b94a17846fa12\n  references:\n  - blog/posts/learn-async.md\n  summary: 'This tutorial provides a comprehensive guide on implementing user-provided\n    tag classification using FastAPI, Pydantic models, and the OpenAI API with async\n    functions for parallel processing. It emphasizes defining flexible tag schemas\n    with identifiers, instructions, and optional confidence scores, as well as validating\n    tags against context to prevent hallucinations. The core objective is to enable\n    effective classification of text snippets with minimal hallucination risk by constraining\n    the language model through validation contexts. The tutorial demonstrates creating\n    request and response models, parallelizing classification tasks with asyncio.gather,\n    and integrating the system into a FastAPI endpoint. Key concepts include asynchronous\n    classification, schema validation, multi-class tagging, confidence scores, and\n    production deployment considerations. Key phrases: user-defined tags, text classification,\n    fastapi, pydantic, openai, async processing, parallel classification, schema validation,\n    confidence scoring, API integration.'\nexamples/classification.md:\n  ai_references:\n  - '[bulk_classification.md'\n  - prompting_guide.md\n  - prompting/index.md\n  - concepts/prompting.md#literals\n  - concepts/prompting.md#chain-of-thought]\n  cross_links:\n  - concepts/prompting.md\n  - examples/bulk_classification.md\n  - index.md\n  - prompting/index.md\n  hash: 36f9aeedada9921ccdab7afbbd6151c5\n  keywords:\n  - OpenAI\n  - text classification\n  - Pydantic models\n  - single-label classification\n  - multi-label classification\n  - spam detection\n  - NLP\n  - Python\n  references:\n  - examples/bulk_classification.md\n  - examples/bulk_classification.md\n  - prompting/index.md\n  summary: This tutorial provides a comprehensive guide to implementing single-label\n    and multi-label text classification using the OpenAI API and Pydantic models in\n    Python. By leveraging tips like using Literals for classification labels and including\n    few-shot examples, users can enhance the accuracy of their NLP applications such\n    as spam detection and support ticket categorization.\n  topics:\n  - Single-Label Classification\n  - Multi-Label Classification\n  - Pydantic Models\n  - Chain of Thought\n  - Few-Shot Examples\nexamples/document_segmentation.md:\n  cross_links: []\n  hash: 121491f63507430563385c90fc98a84f\n  references: []\n  summary: 'This comprehensive guide explores document segmentation using Large Language\n    Models (LLMs), particularly Cohere''s command-r-plus model with 128k context length.\n    It demonstrates how to organize long, complex texts into meaningful sections centered\n    around key concepts by leveraging structured data classes (`Section`, `StructuredDocument`)\n    and line numbering preprocessing. The approach enhances understanding of lengthy\n    articles, such as tutorials on Transformer architectures, by extracting sections\n    with specific topics. Key techniques include using LLMs for segmentation via system\n    prompts, and reconstructing section texts based on start and end line indices.\n    This method is applicable across domains for breaking down complex documents,\n    code snippets, and mathematical content, improving content comprehension, summarization,\n    and indexing. Keywords: document segmentation, Large Language Models, Cohere,\n    Transformer, structured output, NLP, long documents, LLM-based text splitting,\n    AI text organization.'\nexamples/entity_resolution.md:\n  cross_links: []\n  hash: b3f456d3d8db72c6526db22f548acca3\n  references: []\n  summary: This guide explains how to extract, resolve, and visualize entities from\n    legal documents and contracts using AI and graph visualization tools. It details\n    the data structures for representing entities and their properties, methods for\n    utilizing OpenAI's GPT-4 to automate entity extraction and resolution, and techniques\n    for creating interactive entity graphs with Graphviz. Key topics include legal\n    document analysis, entity resolution, dependency mapping, legal tech applications,\n    and data visualization. This approach enhances understanding of complex legal\n    contracts by highlighting interconnected clauses, obligations, and key terms for\n    improved legal analysis and workflow efficiency.\nexamples/exact_citations.md:\n  ai_references:\n  - '[examples/citation_fuzzy_match.py'\n  - https://docs.pydantic.dev/usage/validators/#model-validators]\n  cross_links: []\n  hash: 5aba7f1ff1813838fe1fc55245ce7b53\n  keywords:\n  - '[AI validation'\n  - Python citations\n  - Fact class\n  - QuestionAnswer class\n  - preventing hallucinations\n  - OpenAI API\n  - data structures\n  - model validators]\n  references: []\n  summary: This documentation outlines how to validate AI-generated answers in Python\n    using contextual citations, preventing inaccuracies and misinformation. It introduces\n    two Python classes, `Fact` and `QuestionAnswer`, that encapsulate statements and\n    their validation, ensuring responses from AI are backed by direct quotes from\n    provided context.\n  topics:\n  - '[AI-generated answers'\n  - Python class validation\n  - contextual citations\n  - preventing hallucinations\n  - OpenAI integration]\nexamples/examples.md:\n  cross_links: []\n  hash: 44560a6b059cd1c58184b4e7fccc0bb4\n  references: []\n  summary: This article explains how to incorporate examples into Pydantic models\n    using the `json_schema_extra` parameter. By embedding practical examples within\n    the model's schema, developers can enhance clarity and usability, especially for\n    JSON schema generation and API documentation. The provided example demonstrates\n    adding illustrative question-answer pairs to a `SyntheticQA` model, showcasing\n    how to improve model documentation and facilitate synthetic data generation with\n    OpenAI's GPT models. Keywords include Pydantic, JSON schema, model examples, data\n    validation, API documentation, synthetic data, OpenAI, and schema customization.\nexamples/extract_contact_info.md:\n  cross_links: []\n  hash: 7a678fb17f5c490628a5f68d70bd67c9\n  references: []\n  summary: This guide demonstrates how to automate customer lead information extraction\n    using OpenAI's API and Pydantic for data validation. It focuses on modeling lead\n    data with validated attributes like name and phone number, including handling\n    phone number formats with country codes. The tutorial covers creating a function\n    to extract multiple leads from user messages, ensuring accurate data collection\n    for applications like chatbots. Key concepts include OpenAI integration, Pydantic\n    data modeling, phone number validation, and automated lead extraction to streamline\n    customer data management.\nexamples/extract_slides.md:\n  ai_references: []\n  cross_links: []\n  hash: 1a730ef2e3541d3c778bf48e330a7242\n  keywords:\n  - '[AI'\n  - data extraction\n  - competitor analysis\n  - presentation slides\n  - industry categorization]\n  references: []\n  summary: This guide presents a method for extracting competitor data from presentation\n    slides using AI technologies. It outlines the necessary data structures and functions\n    needed to categorize competitors by industry, ensuring thorough information gathering\n    from both text and images in slides.\n  topics:\n  - '[Data extraction techniques'\n  - Competitor categorization\n  - Industry analysis\n  - AI implementation\n  - Pydantic data models]\nexamples/extracting_receipts.md:\n  cross_links: []\n  hash: 1ce877006d4831a5eeeb0b64fb943fd0\n  references: []\n  summary: This guide demonstrates how to use Python and GPT-4, combined with Pydantic\n    for data validation, to extract and validate receipt data from images for automated\n    expense tracking. It covers defining structured models for items and receipts,\n    implementing custom validation to ensure total amounts match itemized sums, and\n    utilizing the OpenAI GPT-4 API through the Instructor library for image analysis.\n    Practical examples illustrate extracting receipt details from images, enabling\n    efficient financial data processing and expense management. Keywords include GPT-4,\n    Python, Pydantic, receipt data extraction, expense tracking, image analysis, data\n    validation, OpenAI, automation.\nexamples/extracting_tables.md:\n  cross_links: []\n  hash: f7e39386e65d144db40b0549fc836164\n  references: []\n  summary: This article demonstrates how to extract and convert tables from images\n    into Markdown format using Python and OpenAI's GPT-Vision model. It covers building\n    custom data types with Pydantic for handling Markdown tables, defining a Table\n    class, and utilizing instructor's patched OpenAI client for image-based table\n    extraction. Practical examples include extracting top-grossing app data from images,\n    facilitating data analysis and automation. Key topics include GPT-Vision, Python\n    data processing, image-to-table conversion, Markdown serialization, and leveraging\n    AI for automated data extraction from images.\nexamples/groq.md:\n  cross_links: []\n  hash: 680f259ac1258ea7fe4eb11dc80babbf\n  references: []\n  summary: 'Learn how to perform inference using Groq with the mixtral-8x7b model,\n    including setup instructions, API key acquisition from GroqCloud, and practical\n    Python examples. The guide covers package installations, environment variable\n    configuration, and integrating Groq with the instructor library for seamless chat\n    completions. Key topics include deploying Groq for AI inference, using the from_groq\n    method, and creating structured JSON outputs, making it ideal for developers seeking\n    efficient AI deployment solutions with Groq''s hardware and API. Keywords: Groq\n    inference, AI deployment, mixtral-8x7b model, GroqCloud API, Python example, structured\n    output, chat completions, AI inference setup.'\nexamples/image_to_ad_copy.md:\n  cross_links: []\n  hash: 70f33d5dd56c606567dafe15c58c5316\n  references: []\n  summary: This content demonstrates how to leverage GPT-4 Vision API and ChatGPT\n    to automatically generate advertising copy from product images, ideal for e-commerce,\n    marketing, and retail teams. It details the process of identifying products within\n    images, extracting key features and descriptions using AI models, and creating\n    engaging ad headlines and persuasive marketing messages. The approach includes\n    defining structured data models for products, error handling, and generating compelling\n    ad copy tailored to each product. Key features include dynamic product attribute\n    extraction, integration with OpenAI's vision models, and automated ad content\n    creation to enhance online marketing efficiency and boost sales potential through\n    effective visual-to-text conversion and advertising automation.\nexamples/index.md:\n  cross_links:\n  - examples/action_items.md\n  - examples/batch_classification_langsmith.md\n  - examples/batch_job_oai.md\n  - examples/building_knowledge_graphs.md\n  - examples/bulk_classification.md\n  - examples/classification.md\n  - examples/document_segmentation.md\n  - examples/entity_resolution.md\n  - examples/exact_citations.md\n  - examples/examples.md\n  - examples/extract_contact_info.md\n  - examples/extract_slides.md\n  - examples/extracting_receipts.md\n  - examples/extracting_tables.md\n  - examples/groq.md\n  - examples/image_to_ad_copy.md\n  - examples/knowledge_graph.md\n  - examples/local_classification.md\n  - examples/mistral.md\n  - examples/moderation.md\n  - examples/multi_modal_gemini.md\n  - examples/multiple_classification.md\n  - examples/ollama.md\n  - examples/pandas_df.md\n  - examples/partial_streaming.md\n  - examples/pii.md\n  - examples/planning-tasks.md\n  - examples/search.md\n  - examples/self_critique.md\n  - examples/single_classification.md\n  - examples/sqlmodel.md\n  - examples/tables_from_vision.md\n  - examples/tracing_with_langfuse.md\n  - examples/watsonx.md\n  - examples/youtube_clips.md\n  - tutorials/index.md\n  hash: 260e691fbc028547afdea7dfe29cccfe\n  references:\n  - examples/single_classification.md\n  - examples/multiple_classification.md\n  - examples/classification.md\n  - examples/bulk_classification.md\n  - examples/batch_classification_langsmith.md\n  - examples/local_classification.md\n  - examples/entity_resolution.md\n  - examples/extract_contact_info.md\n  - examples/pii.md\n  - examples/exact_citations.md\n  - examples/action_items.md\n  - examples/search.md\n  - examples/document_segmentation.md\n  - examples/planning-tasks.md\n  - examples/knowledge_graph.md\n  - examples/building_knowledge_graphs.md\n  - examples/tables_from_vision.md\n  - examples/extracting_tables.md\n  - examples/extracting_receipts.md\n  - examples/extract_slides.md\n  - examples/image_to_ad_copy.md\n  - examples/youtube_clips.md\n  - examples/multi_modal_gemini.md\n  - examples/sqlmodel.md\n  - examples/pandas_df.md\n  - examples/partial_streaming.md\n  - examples/self_critique.md\n  - examples/moderation.md\n  - examples/batch_job_oai.md\n  - examples/examples.md\n  - examples/tracing_with_langfuse.md\n  - examples/groq.md\n  - examples/mistral.md\n  - examples/watsonx.md\n  - examples/ollama.md\n  - tutorials/index.md\n  summary: The Instructor Cookbook Collection offers practical examples and recipes\n    for solving real-world problems using structured outputs across various domains,\n    including text processing, multi-modal media, data tools, and deployment options.\n    It features comprehensive guides on text classification, information extraction,\n    document processing, vision processing, database integration, streaming, API integration,\n    observability, and deployment with model providers like Groq, Mistral, IBM watsonx.ai,\n    and Ollama. Designed to assist developers and AI practitioners, these cookbooks\n    provide complete code, explanations, and best practices for implementing AI solutions\n    effectively in production environments. Key keywords include AI recipes, structured\n    outputs, text processing, multi-modal AI, data integration, deployment, model\n    APIs, and open-source models.\nexamples/knowledge_graph.md:\n  cross_links: []\n  hash: 1a9bafb73950d7297949d435080373a4\n  references: []\n  summary: This guide demonstrates how to create, visualize, and iteratively update\n    knowledge graphs using Python, OpenAI's API, Pydantic, and Graphviz. It covers\n    defining data structures with Node and Edge models, generating detailed knowledge\n    graphs from complex topics like quantum mechanics, and visualizing these graphs\n    with Graphviz. Key techniques include extracting key concepts and relationships\n    with GPT-4, updating graphs step-by-step, and deduplicating nodes and edges for\n    clarity. The tutorial emphasizes leveraging the Instructor library for structured\n    outputs and iterative graph building, making it ideal for understanding complex\n    subjects through visualizations. Core keywords include knowledge graphs, Python,\n    OpenAI API, Pydantic, Graphviz, data visualization, AI, GPT-4, iterative updates,\n    complex topics, and structured data modeling.\nexamples/local_classification.md:\n  cross_links: []\n  hash: c0f945e2d931625f632d70b4bfd3c92c\n  references: []\n  summary: This article explains how to securely classify and handle confidential\n    data using local AI models with llama-cpp-python and instructor, ensuring data\n    privacy and infrastructure control. It covers setup instructions for installing\n    models like Mistral-7B-Instruct-v0.2-GGUF, including GPU and CPU configurations,\n    along with example Python code for processing confidential document queries such\n    as content analysis, access permissions, and document metadata. The guide emphasizes\n    maintaining data security by performing inference locally, making it ideal for\n    organizations seeking secure AI solutions for sensitive information. Key keywords\n    include local AI models, confidential data classification, llama-cpp-python, instructor,\n    privacy-focused AI, and secure document handling.\nexamples/mistral.md:\n  ai_references: []\n  cross_links: []\n  hash: d9d17c1c67170f2291fa82e49cce4666\n  keywords:\n  - MistralAI\n  - API key\n  - inference\n  - structured outputs\n  - Python example\n  - installation\n  - pip packages\n  - '`from_mistral`'\n  - Mistral tools\n  references: []\n  summary: This documentation provides a comprehensive guide on using MistralAI models\n    for generating structured outputs through inference. It covers the steps needed\n    for setup, including API key generation, necessary package installations, and\n    example code to demonstrate the process.\n  topics:\n  - MistralAI API setup\n  - Package installation\n  - Example usage in Python\n  - User model implementation\n  - Structured output generation\nexamples/moderation.md:\n  cross_links: []\n  hash: c0d290b445a8b1d1076bc82a9fd8b361\n  references: []\n  summary: \"This document provides an example of utilizing OpenAI's moderation endpoint\\\n    \\ to ensure content compliance with usage policies by filtering harmful content.\\\n    \\ It explains how to implement an `AfterValidator` to automatically assess messages\\\n    \\ for categories like hate, harassment, self-harm, sexual content, and violence.\\\n    \\ The example includes code snippets demonstrating how to set up the moderation\\\n    \\ validation with OpenAI\\u2019s API, highlighting its ability to flag and reject\\\n    \\ harmful or policy-violating messages. Key concepts include OpenAI moderation,\\\n    \\ content filtering, safety validation, Pydantic integration, and ensuring API\\\n    \\ input/output compliance for safe AI interactions.\"\nexamples/multi_modal_gemini.md:\n  cross_links: []\n  hash: d2d5cffd4469c75c6730fa3f130fecd1\n  references: []\n  summary: 'This guide explains how to utilize Gemini with Google Generative AI for\n    multi-modal data processing, specifically focusing on audio files. It details\n    three methods: uploading entire audio files as normal messages, passing audio\n    segments inline after installing pydub, and using lists of mixed content for flexible\n    processing. The instructions emphasize setting the correct mode (GEMINI_JSON),\n    uploading files with genai.upload_file, and providing audio data either as file\n    objects or inline audio segments. These approaches enable efficient summarization,\n    transcription, and analysis of audio recordings, supporting SEO by extracting\n    core ideas, objectives, key details, and relevant keywords related to audio content\n    processing with Gemini and Generative AI.'\nexamples/multiple_classification.md:\n  cross_links: []\n  hash: d80a59dabf71466f2ed5bc4178dc557b\n  references: []\n  summary: This guide demonstrates how to implement multi-label classification for\n    support ticket categorization using OpenAI's API and Pydantic. It introduces a\n    custom enum and a Pydantic model to handle multiple labels such as \"ACCOUNT,\"\n    \"BILLING,\" and \"GENERAL_QUERY,\" enabling effective multi-label predictions. The\n    example illustrates how to set up the classification process with a tailored prompt\n    and retrieve labels indicating multiple relevant categories for a given support\n    ticket. Keywords include multi-label classification, OpenAI API, Pydantic, support\n    ticket categorization, multi-label prediction, GPT-4, and effective support workflows.\nexamples/ollama.md:\n  cross_links:\n  - concepts/models.md\n  - concepts/partial.md\n  - concepts/patching.md\n  - concepts/reask_validation.md\n  - examples/index.md\n  - index.md\n  - prompting/index.md\n  - why.md\n  hash: 56fe05f28e384bbef8372e921efa4648\n  references:\n  - concepts/models.md\n  - concepts/models.md\n  - concepts/reask_validation.md\n  - concepts/partial.md\n  - examples/index.md\n  - concepts/models.md\n  - concepts/patching.md\n  - index.md\n  - why.md\n  - why.md\n  - concepts/models.md\n  - examples/index.md\n  - prompting/index.md\n  summary: \"This article explains how to utilize Ollama's local LLM server with the\\\n    \\ Instructor library to generate structured outputs using Pydantic models. It\\\n    \\ highlights the benefits of Instructor, such as a simple API, validation, reasking,\\\n    \\ streaming support, and prompt control, enabling more precise and reliable AI\\\n    \\ interactions. The guide provides practical steps and code examples for integrating\\\n    \\ Ollama models like Llama 3 with Instructor\\u2019s JSON schema validation, making\\\n    \\ it easier to extract structured data from large language models for AI applications\\\n    \\ and development.\"\nexamples/open_source.md:\n  ai_references:\n  - '[instructor_examples.md]'\n  cross_links: []\n  hash: a3046643d8e10ca464ec3be1302d1cd2\n  keywords:\n  - OpenAI chat API\n  - open source models\n  - OpenRouter\n  - Perplexity\n  - RunPod\n  - text-generation-webui\n  references: []\n  summary: This document provides an overview of open source model providers that\n    are compatible with the OpenAI chat API, highlighting options like OpenRouter,\n    Perplexity, and RunPod LLMs. It serves as a guide for users looking to explore\n    and implement these models in their applications.\n  topics:\n  - Open source model providers\n  - compatibility with OpenAI API\n  - implementation examples\n  - usage of text-generation-webui\nexamples/pandas_df.md:\n  cross_links: []\n  hash: d08c46a6a8d4445ab9bf656ba28f6247\n  references: []\n  summary: This guide demonstrates how to extract and convert Markdown tables directly\n    into Pandas DataFrames in Python. It features techniques for parsing Markdown\n    data, validating the DataFrame structure, and serializing it back to Markdown\n    format using Pydantic annotations. The code showcases creating functions to extract\n    tables with OpenAI's GPT-3.5-turbo model, enabling efficient data extraction from\n    formatted Markdown tables. Key concepts include Markdown to DataFrame conversion,\n    custom annotations for validation and serialization, and extracting structured\n    data like tables with titles. Keywords include Pandas, Markdown parsing, data\n    extraction, GPT-3.5-turbo, Python, DataFrame, table extraction, Pydantic, and\n    OpenAI.\nexamples/partial_streaming.md:\n  cross_links: []\n  hash: b4fa99932aca3dffc93d4dea2b69e036\n  references: []\n  summary: This article explains how to implement field-level streaming with the Instructor\n    library in Python for dynamic UI rendering. It demonstrates using `Partial[T]`\n    to create incremental, partial snapshots of model responses, enabling real-time\n    updates. The example showcases extracting meeting and participant information\n    from a text block using OpenAI's GPT-4, with streaming responses displayed via\n    the Rich library. Key concepts include partial responses, stream processing, dynamic\n    UI updates, and leveraging Instructor for field-level data handling in Python.\nexamples/pii.md:\n  cross_links: []\n  hash: 6cb6a88f6b787857b8da7d9a072b8cab\n  references: []\n  summary: This guide demonstrates how to extract and scrub Personally Identifiable\n    Information (PII) from documents using OpenAI's ChatCompletion model and Python.\n    It covers defining Pydantic data models to structure PII data, utilizing OpenAI's\n    API to extract sensitive information such as names, emails, phone numbers, addresses,\n    and SSNs, and implementing a method to scrub PII by replacing values with placeholders.\n    Key features include leveraging AI for accurate PII detection, data sanitization\n    techniques, and customizable scrubbing methods to ensure privacy compliance in\n    document processing workflows. Suitable keywords include PII extraction, data\n    scrubbing, privacy, OpenAI, Python, AI-powered data anonymization, sensitive data\n    protection, and document privacy.\nexamples/planning-tasks.md:\n  cross_links:\n  - concepts/lists.md\n  - examples/knowledge_graph.md\n  - examples/recursive.md\n  hash: 00bfdb223b5c59a4fcafe1e6e020cfe8\n  references:\n  - concepts/lists.md\n  - examples/knowledge_graph.md\n  - examples/recursive.md\n  summary: This guide explains how to use OpenAI's Function Call ChatCompletion API\n    for query planning in complex question-answering systems. It demonstrates how\n    to define structured query models with Pydantic, create a query planner that breaks\n    down a main question into dependent sub-questions, and leverages system prompts\n    to generate organized query plans. The approach facilitates systematic information\n    gathering, iterative querying, workflow automation, and process optimization,\n    making it ideal for handling multi-step queries and knowledge graph extraction.\n    Key concepts include structured schema design, dependency management, and leveraging\n    OpenAI's models for automated query decomposition.\nexamples/recursive.md:\n  cross_links:\n  - examples/knowledge_graph.md\n  - examples/planning-tasks.md\n  hash: 32eb7db1d5fc4dc8fa262770848b0592\n  references:\n  - examples/planning-tasks.md\n  - examples/knowledge_graph.md\n  summary: This guide explains how to implement recursive schemas using Pydantic models\n    in Instructor, enabling the handling of hierarchical and nested data structures\n    such as organizational charts, file systems, comment threads, and task dependencies.\n    It covers defining recursive models, best practices like calling `model_rebuild()`,\n    validation techniques for limiting recursion depth, and performance tips for managing\n    complex data. The content emphasizes the importance of clear structure, validation,\n    and practical examples to effectively work with recursive schemas in AI-powered\n    applications.\nexamples/search.md:\n  cross_links: []\n  hash: 86f8d684546f51c59453bfcfcdf256cc\n  references: []\n  summary: This article demonstrates how to segment search queries into actionable\n    tasks using OpenAI Function Call and Pydantic. It showcases defining data structures\n    with Pydantic, leveraging OpenAI's multi-task capabilities to split complex queries\n    into multiple sub-queries, and executing them concurrently with asyncio. The example\n    emphasizes extracting tasks like web searches, images, and videos from user input\n    to improve virtual assistant functionality. Key concepts include OpenAI Function\n    Call, Pydantic models, query segmentation, parallel execution, and applications\n    in virtual assistants and search optimization.\nexamples/self_critique.md:\n  cross_links: []\n  hash: 15eeaa0bb27f7fc4c235f752faee8823\n  references: []\n  summary: This guide explains how to implement self-correction in NLP applications\n    using `llm_validator` for enhanced response accuracy. It demonstrates integrating\n    validation callbacks within pydantic models to catch objectionable content, provide\n    helpful error messages, and enable automatic retries with corrections. Key concepts\n    include the use of `response_model`, custom validation with `llm_validator`, and\n    retry mechanisms for self-healing language model outputs, making it a valuable\n    resource for improving NLP model safety, reliability, and quality control. Keywords\n    include self-correction, NLP validation, `llm_validator`, pydantic validation,\n    self-healing AI, response accuracy, and prompt engineering.\nexamples/single_classification.md:\n  cross_links: []\n  hash: e57ed79f3f4234a0606723bb8c07d2ee\n  references: []\n  summary: 'This guide demonstrates how to perform single-label text classification\n    using the OpenAI API, specifically with the GPT-3.5-turbo and GPT-4 models. It\n    showcases how to classify text as \"SPAM\" or \"NOT_SPAM\" with a response model,\n    leveraging the instructor library for enhanced functionality. The example includes\n    code for setting up the classification function, defining the response schema\n    with Pydantic, and verifying predictions through sample inputs. Key features include\n    the use of response_model for structured outputs, and the approach emphasizes\n    simplicity and accuracy in spam detection and text classification tasks. Keywords:\n    OpenAI API, single-label classification, GPT-3.5-turbo, GPT-4, text classification,\n    spam detection, machine learning, natural language processing.'\nexamples/sqlmodel.md:\n  ai_references:\n  - '[concepts/fastapi.md]'\n  cross_links:\n  - api.md\n  - concepts/fastapi.md\n  hash: ef554168dab29e30a9050ba01b8122d8\n  keywords:\n  - '[Instructor'\n  - SQLModel\n  - Python\n  - database integration\n  - API development\n  - OpenAI\n  - FastAPI\n  - models]\n  references:\n  - concepts/fastapi.md\n  summary: This documentation provides a comprehensive guide on how to integrate the\n    `Instructor` library with `SQLModel` in Python to facilitate database interactions.\n    It includes step-by-step examples on defining models, generating records, and\n    saving them to a database, ensuring seamless functionality and improved developer\n    experience.\n  topics:\n  - '[Integration of Instructor and SQLModel'\n  - Model Definition\n  - Generating Records\n  - Inserting data into DB\n  - JSON schema management]\nexamples/tables_from_vision.md:\n  cross_links: []\n  hash: 02f100035905072561af66bed755ecf7\n  references: []\n  summary: This guide explains how to extract and convert tables from images into\n    markdown format using OpenAI's GPT-4 Vision model. It details the process of analyzing\n    images to identify table headers, generate descriptive titles and summaries, and\n    output structured markdown tables with captions. The method leverages Python,\n    pandas, and pydantic for data handling, emphasizing automatic data extraction,\n    table serialization, and effective data presentation from visual content. Key\n    concepts include image analysis, data extraction, markdown formatting, and GPT-4's\n    powerful vision capabilities for accurate table conversion.\nexamples/tracing_with_langfuse.md:\n  cross_links: []\n  hash: 2b1caa40e9da271b66e341c45b463b28\n  references: []\n  summary: This guide introduces Langfuse, an open-source observability and tracing\n    platform for AI applications, showcasing how to integrate it with Instructor and\n    OpenAI clients for enhanced monitoring and debugging of large language model (LLM)\n    calls. It provides setup instructions, including installation and environment\n    configuration for both synchronous and asynchronous OpenAI clients. The content\n    highlights key use cases such as tracing API calls, classifying customer feedback,\n    scoring relevance, and visualizing detailed traces via the Langfuse dashboard.\n    Core keywords include Langfuse, observability, AI monitoring, tracing, LLM, API\n    performance, debugging, Instructor, OpenAI, and asynchronous AI integration.\nexamples/watsonx.md:\n  cross_links: []\n  hash: dafd5f18905aa8c25b71a9f2f9bc8a65\n  references: []\n  summary: This guide details how to use IBM watsonx.ai for inference with LiteLLM\n    to generate structured outputs. It covers prerequisites such as IBM Cloud account,\n    API key, and project ID, and provides installation instructions using Poetry.\n    The example demonstrates creating a custom data model and performing JSON-mode\n    inference with watsonx.ai, showcasing how to set environment variables, initialize\n    the client, and generate structured data like company information from text input.\n    Key concepts include IBM watsonx.ai, LiteLLM, inference, structured outputs, setup,\n    API integration, and Python coding examples.\nexamples/youtube_clips.md:\n  cross_links: []\n  hash: 972f468e337dd6fc72cfc12cbd129226\n  references: []\n  summary: This guide explains how to generate concise, engaging YouTube clips from\n    video transcripts using the `instructor` library and OpenAI models. It demonstrates\n    extracting transcript segments with timing information from YouTube videos using\n    `youtube_transcript_api`, and then leveraging GPT-4 to identify key moments and\n    create specific clip titles and descriptions. The process involves fetching transcripts,\n    prompting GPT-4 to produce notable clips, and displaying the results in a structured\n    format. Key concepts include transcript extraction, AI-powered clip generation,\n    content summarization, and leveraging OpenAI for enhanced video editing and content\n    segmentation. This approach helps content creators enhance engagement by recutting\n    videos into focused, shareable clips.\nfaq.md:\n  cross_links: []\n  hash: bca382d72ff309ba7f12a9213923c7e5\n  references:\n  - ./integrations/index.md\n  - ./concepts/patching.md\n  summary: Instructor is a versatile Python library designed to simplify extracting\n    structured data from Large Language Models (LLMs) by leveraging Pydantic schemas\n    for validation and consistency across various providers like OpenAI, Anthropic,\n    Google Gemini, Cohere, and open-source models. It offers multiple modes, such\n    as JSON, Tools, and Function Calling, to suit different provider capabilities,\n    along with features like response validation, automatic retries, raw response\n    access, and streaming support. Ideal for integrating LLMs into applications, Instructor\n    also supports fastapi compatibility, async operations, and cost optimization through\n    prompt design and caching. Core keywords include LLM, Pydantic, structured data,\n    AI integration, OpenAI, Anthropic, Google Gemini, function calling, retries, streaming,\n    API, and chat models.\ngetting-started.md:\n  ai_references:\n  - '[concepts/patching.md'\n  - concepts/reask_validation.md\n  - examples/index.md\n  - concepts/hooks.md\n  - concepts/index.md]\n  cross_links:\n  - concepts/hooks.md\n  - concepts/index.md\n  - concepts/patching.md\n  - concepts/reask_validation.md\n  - examples/index.md\n  - index.md\n  hash: e7e29e4fba34d06eccbad39d295041eb\n  keywords:\n  - Instructor\n  - structured data\n  - language models\n  - installation\n  - validation\n  - API keys\n  - LLM providers\n  references:\n  - ./concepts/patching.md\n  - ./concepts/reask_validation.md\n  - ./examples/index.md\n  - ./concepts/hooks.md\n  - ./concepts/index.md\n  summary: This guide provides a comprehensive introduction to using Instructor for\n    extracting structured data from language models. It covers installation, environment\n    setup, and key functionalities including structured output extraction, validation,\n    and usage with various LLM providers. By following the steps outlined, users can\n    effectively leverage Instructor to enhance data output from language models.\n  topics:\n  - Installation\n  - Environment Setup\n  - Structured Output Extraction\n  - Validation and Error Handling\n  - Streaming Responses\nhelp.md:\n  cross_links:\n  - blog/index.md\n  - concepts/prompting.md\n  - examples/index.md\n  hash: 8aa79aef3783bdc81724f7d3d6d1b7d1\n  references:\n  - concepts/prompting.md\n  - examples/index.md\n  - blog/index.md\n  summary: This guide provides essential resources for getting help with Instructor,\n    an AI model prompting tool. Key support options include the Discord community,\n    detailed concepts on prompting, practical cookbooks with usage examples, and informative\n    blog articles. Additionally, users can leverage GitHub Discussions for questions\n    and collaboration, report bugs and request features via GitHub Issues, or contact\n    the creator on Twitter. These resources ensure users can effectively learn, troubleshoot,\n    and optimize their experience with Instructor.\nindex.md:\n  ai_references:\n  - '[./concepts/reask_validation.md'\n  - ./concepts/retrying.md\n  - ./concepts/lists.md\n  - ./concepts/partial.md\n  - ./integrations/openai.md\n  - ./integrations/ollama.md\n  - ./integrations/anthropic.md\n  - ./integrations/google.md\n  - ./integrations/vertex.md\n  - ./integrations/cohere.md\n  - ./integrations/litellm.md\n  - ./integrations/llama-cpp-python.md\n  - ./integrations/cerebras.md\n  - ./integrations/fireworks.md\n  - ./concepts/models.md\n  - ./concepts/hooks.md\n  - ./concepts/templating.md]\n  cross_links:\n  - concepts/hooks.md\n  - concepts/lists.md\n  - concepts/models.md\n  - concepts/partial.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/templating.md\n  - integrations/anthropic.md\n  - integrations/cerebras.md\n  - integrations/cohere.md\n  - integrations/fireworks.md\n  - integrations/google.md\n  - integrations/litellm.md\n  - integrations/llama-cpp-python.md\n  - integrations/ollama.md\n  - integrations/openai.md\n  - integrations/vertex.md\n  hash: 77cda4e6af3f3243dd7d6f77c532ad75\n  keywords:\n  - '[LLM structured outputs'\n  - Python library\n  - data extraction\n  - Pydantic validation\n  - OpenAI\n  - Anthropic\n  - Google\n  - streaming support\n  - multi-provider API\n  - open source models]\n  references:\n  - ./concepts/reask_validation.md\n  - ./concepts/retrying.md\n  - ./concepts/lists.md\n  - ./concepts/partial.md\n  - ./examples/index.md\n  - ./prompting/index.md\n  - ./integrations/openai.md\n  - ./integrations/ollama.md\n  - ./integrations/llama-cpp-python.md\n  - ./integrations/anthropic.md\n  - ./integrations/google.md\n  - ./integrations/vertex.md\n  - ./integrations/groq.md\n  - ./integrations/litellm.md\n  - ./integrations/cohere.md\n  - ./integrations/cerebras.md\n  - ./integrations/fireworks.md\n  - ./concepts/models.md\n  - ./concepts/reask_validation.md\n  - ./concepts/partial.md\n  - ./integrations/openai.md\n  - ./integrations/anthropic.md\n  - ./integrations/google.md\n  - ./integrations/vertex.md\n  - ./integrations/together.md\n  - ./integrations/ollama.md\n  - ./integrations/llama-cpp-python.md\n  - ./integrations/cohere.md\n  - ./integrations/litellm.md\n  - ./integrations/index.md\n  - ./concepts/hooks.md\n  - ./concepts/templating.md\n  - ./concepts/retrying.md\n  - ./concepts/reask_validation.md\n  - ./concepts/reask_validation.md\n  - ./concepts/partial.md\n  - ./integrations/index.md\n  - ./concepts/retrying.md\n  - ./concepts/models.md\n  summary: Instructor is the leading Python library designed for extracting structured\n    outputs from various Large Language Models (LLMs) like OpenAI, Anthropic, and\n    Google. Utilizing Pydantic for type safety and validation, it ensures reliable\n    data extraction while supporting over 15 providers with features like automatic\n    retries and streaming responses.\n  topics:\n  - '[Python library for LLMs'\n  - Structured data extraction\n  - Pydantic type validation\n  - Multi-provider support\n  - Error handling and retries]\ninstallation.md:\n  cross_links: []\n  hash: a6fe720590b602e1f753c067be9c3121\n  references: []\n  summary: Learn how to install Instructor, an advanced Python tool for building CLIs,\n    using pip. Instructor requires dependencies such as openai, typer, docstring-parser,\n    and pydantic, making setup straightforward for Python 3.9 and above. This guide\n    provides a simple, quick installation process to enhance your Python projects\n    with powerful, type-hint-based CLI development.\nintegrations/anthropic.md:\n  ai_references:\n  - '[../concepts/multimodal.md'\n  - ../concepts/caching.md\n  - https://docs.anthropic.com/en/docs/build-with-claude/tool-use]\n  cross_links:\n  - concepts/caching.md\n  - concepts/multimodal.md\n  hash: fe54a665c05aa5770971338d42cef867\n  keywords:\n  - Anthropic\n  - Claude models\n  - structured data extraction\n  - Python\n  - Instructor\n  - multimodal inputs\n  - streaming support\n  - caching\n  references:\n  - concepts/multimodal.md\n  - concepts/caching.md\n  summary: This tutorial provides a comprehensive guide on using Anthropic's Claude\n    models with the Instructor for structured data extraction in Python. It covers\n    installation, basic usage, multimodal inputs, and advanced features such as streaming\n    support, caching, and using various response models effectively.\n  topics: []\nintegrations/anyscale.md:\n  cross_links: []\n  hash: 53e83cd7c07b43d303cb4a8696300408\n  references: []\n  summary: This guide provides instructions on using Anyscale, a platform offering\n    access to open-source LLMs like Mistral and Llama models, with the Instructor\n    library to produce structured outputs. It covers installation, API key setup,\n    and offers a practical example of extracting structured data using Anyscale's\n    API and the Instructor client in JSON schema mode. Supported modes include JSON,\n    JSON_SCHEMA, TOOLS, and MD_JSON, and the platform features a variety of models\n    such as Mistral and Llama, making it a comprehensive resource for leveraging open-source\n    LLMs for structured data extraction and AI development.\nintegrations/azure.md:\n  cross_links: []\n  hash: 3a23c67e1ceafad28834395d384f37ff\n  references: []\n  summary: This comprehensive guide explains how to use Azure OpenAI with Instructor\n    for structured outputs, including synchronous and asynchronous implementations,\n    streaming, nested models, and response validation. It covers installation, authentication,\n    deploying models, and working with various response modes such as JSON, tools,\n    and function calling. Key features include streaming partial and iterable responses,\n    handling complex nested data, and leveraging different Instructor modes to optimize\n    structured output generation. This resource is ideal for developers seeking secure,\n    enterprise-grade AI solutions with Azure OpenAI and Instructor for reliable, scalable\n    structured data extraction.\nintegrations/bedrock.md:\n  cross_links: []\n  hash: 52ede618fbd9c3a9edc9355537e1eb51\n  references: []\n  summary: This guide explains how to use AWS Bedrock with Instructor and Pydantic\n    for generating structured, validated JSON outputs from Amazon's foundational AI\n    models. It covers setting up the AWS Bedrock client, implementing type-safe responses\n    with Pydantic models, and utilizing different modes like BEDROCK_TOOLS and BEDROCK_JSON\n    for flexible output formats. The tutorial also demonstrates handling nested objects\n    and complex data structures, enabling developers to create robust, structured\n    AI interactions in Python. Core keywords include AWS Bedrock, Instructor, Pydantic,\n    JSON outputs, structured responses, AI models, and type safety.\nintegrations/cerebras.md:\n  cross_links: []\n  hash: 30881d913bf857193a0b5af812d259c2\n  references: []\n  summary: This comprehensive guide details how to use Instructor with Cerebras's\n    hardware-accelerated AI models for generating structured, type-safe outputs. It\n    covers installation, both synchronous and asynchronous usage examples, and advanced\n    features like nested outputs and streaming support, including partial and iterable\n    streaming modes. The guide highlights customization through Instructor hooks and\n    explains different response modes such as CEREBRAS_JSON and CEREBRAS_TOOLS, emphasizing\n    the flexibility and future-proofing of these modes for high-performance, validated\n    AI responses. Key terms include Cerebras, Instructor, structured outputs, JSON\n    parsing, streaming, validation hooks, and AI model integration.\nintegrations/cohere.md:\n  cross_links: []\n  hash: bcabf6169d2e18732d09f41a2b03ee9a\n  references: []\n  summary: This guide provides a comprehensive tutorial on generating structured,\n    type-safe outputs with Cohere's command models using the Instructor library in\n    Python. It covers setup instructions, including installing the library and obtaining\n    an API key. The tutorial demonstrates how to define data models with Pydantic,\n    patch the Cohere client with Instructor for enhanced capabilities, and generate\n    structured responses such as creating a detailed Group object based on provided\n    text. Key features include leveraging Cohere's command models like \"command-r-plus\"\n    to produce accurate, JSON-formatted data, making it ideal for tasks requiring\n    structured outputs, data extraction, and automation. This resource is valuable\n    for developers seeking to enhance NLP workflows with reliable, structured data\n    generation.\nintegrations/cortex.md:\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  hash: 5dc3985ba626ba07487689f654305962\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This guide provides a comprehensive overview of using Cortex with Instructor\n    to achieve structured outputs from local open-source large language models (LLMs).\n    It covers quick setup, both synchronous and asynchronous API usage, and demonstrates\n    advanced nested extraction examples with Pydantic models. Key topics include model\n    deployment with Cortex, integration with OpenAI clients, and effective prompt\n    handling for structured data extraction. Essential keywords include Cortex, Instructor,\n    LLM, structured outputs, local models, open-source, API integration, Pydantic,\n    and AI prompt engineering.\nintegrations/databricks.md:\n  cross_links: []\n  hash: 10a70b86eb06ad1262a58d8050984151\n  references: []\n  summary: This guide provides a comprehensive overview of using Databricks with the\n    Instructor library to obtain structured outputs from AI models. It covers installation,\n    setting up environment variables with Databricks API keys and workspace URL, and\n    demonstrates a basic example of extracting structured data such as user information\n    using Databricks models. The guide highlights supported modes like TOOLS, JSON,\n    FUNCTIONS, and more, and explains that Databricks offers access to various models,\n    including foundation, fine-tuned, and open-source models deployed on the platform.\n    Keywords include Databricks, Instructor, structured outputs, AI models, API integration,\n    and machine learning.\nintegrations/deepseek.md:\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  hash: 8e0bf42ff9f31e84527488ce3b43e8d9\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This guide provides a comprehensive overview of using DeepSeek models with\n    Instructor for type-safe, structured outputs. DeepSeek, a Chinese AI company,\n    offers various models including the deepseek coder, chat model, and R1 reasoning\n    model. The tutorial demonstrates how to set up and utilize models for both synchronous\n    and asynchronous scenarios using the OpenAI API. Key features include creating\n    structured outputs with Pydantic, streaming with iterables and partials, and integrating\n    reasoning models for detailed completion traces. Essential steps for setting up\n    include initializing the `instructor` package, configuring the API key, and using\n    the appropriate Instructor modes. Core keywords include DeepSeek, AI models, structured\n    outputs, type-safe, OpenAI API, Instructor, Pydantic, synchronous, asynchronous,\n    and reasoning models.\nintegrations/fireworks.md:\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  hash: 542aa4056ddd0ae3132abdbd10cbffa2\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This comprehensive guide provides instructions on utilizing Instructor\n    with Fireworks AI models to create structured, type-safe outputs. It covers installation,\n    basic synchronous and asynchronous user examples, and complex nested examples,\n    emphasizing high-performance and cost-effective AI capabilities. The guide also\n    demonstrates streaming support, including iterables and partial streaming, using\n    Pydantic models for type validation. Key points include integration with `Fireworks`,\n    usage of `instructor` modes for structured outputs, and maintaining compatibility\n    with the latest Fireworks API versions. Essential keywords include Fireworks AI,\n    Instructor, structured outputs, type-safe, streaming support, and Pydantic.\nintegrations/genai.md:\n  ai_references:\n  - '[official Google AI documentation for the GenAI SDK](https://googleapis.github.io/python-genai/)'\n  - '[official documentation](https://ai.google.dev/gemini-api/docs/thinking)'\n  - '[documentation for models](https://ai.google.dev/gemini-api/docs/models)'\n  cross_links: []\n  hash: 7ca74881599d14d3795d4e09e0723e84\n  keywords:\n  - Google GenAI\n  - structured outputs\n  - Gemini models\n  - Python SDK\n  - multimodal processing\n  - data extraction\n  - Instructor\n  - Pydantic models\n  references: []\n  summary: This guide provides step-by-step instructions on using Google's Generative\n    AI SDK (genai) with Instructor to extract structured data from Gemini models.\n    It covers essential modes, installation instructions, message formatting, and\n    multimodal capabilities, enabling users to efficiently handle various input types\n    such as audio, images, and PDFs.\n  topics: []\nintegrations/google.md:\n  ai_references:\n  - '[Google''s documentation on Gemini configuration parameters](https://cloud.google.com/vertex-ai/generative-ai/docs/samples/generativeaionvertexai-gemini-pro-config-example)'\n  - '[Using Geminin To Extract Travel Video Recommendations](../blog/posts/multimodal-gemini.md)'\n  - '[Parsing PDFs with Gemini](../blog/posts/chat-with-your-pdf-with-gemini.md)'\n  - '[Generating Citations with Gemini](../blog/posts/generating-pdf-citations.md)'\n  - '[Google AI Documentation](https://ai.google.dev/)'\n  - '[Instructor Core Concepts](../concepts/index.md)'\n  - '[Type Validation Guide](../concepts/validation.md)'\n  - '[Advanced Usage Examples](../examples/index.md)'\n  - '[changelog](https://github.com/jxnl/instructor/blob/main/CHANGELOG.md)'\n  cross_links:\n  - blog/posts/chat-with-your-pdf-with-gemini.md\n  - blog/posts/generating-pdf-citations.md\n  - blog/posts/multimodal-gemini.md\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  - index.md\n  hash: 469bedc93eaca35e535263d257d81094\n  keywords:\n  - Google Gemini\n  - structured data extraction\n  - Instructor library\n  - multimodal AI\n  - type-safe outputs\n  - configuration options\n  - async support\n  - response models\n  references:\n  - blog/posts/multimodal-gemini.md\n  - blog/posts/chat-with-your-pdf-with-gemini.md\n  - blog/posts/generating-pdf-citations.md\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: \"This tutorial provides a comprehensive guide on using Google's Gemini\\\n    \\ models\\u2014Pro, Flash, and Ultra\\u2014with the Instructor library for structured\\\n    \\ data extraction. Learn to process multimodal inputs, customize model behavior,\\\n    \\ and utilize type-safe outputs effectively through detailed examples and configurations.\"\n  topics: []\nintegrations/groq.md:\n  cross_links: []\n  hash: 1b2b59a31e2e4ce05dff63e482192a95\n  references: []\n  summary: The article provides a detailed guide on using Groq AI with Pydantic to\n    generate structured outputs in Python. It highlights using the `llama-3-groq-70b-8192-tool-use-preview`\n    model to create type-safe, structured responses via synchronous and asynchronous\n    examples. The guide emphasizes setting up with an API key, employing Groq's LLM\n    models, and integrating Pydantic for defining response structures. It also demonstrates\n    creating nested object responses for complex data extraction. Key terms include\n    Groq AI, Pydantic, structured outputs, type-safe responses, and Python API integration.\nintegrations/index.md:\n  ai_references:\n  - '[openai.md'\n  - openai-responses.md\n  - azure.md\n  - anthropic.md\n  - google.md\n  - vertex.md\n  - bedrock.md\n  - genai.md\n  - cohere.md\n  - mistral.md\n  - deepseek.md\n  - together.md\n  - groq.md\n  - fireworks.md\n  - cerebras.md\n  - writer.md\n  - perplexity.md\n  - sambanova.md\n  - ollama.md\n  - llama-cpp-python.md\n  - patching.md\n  - models.md\n  - validation.md\n  - partial.md\n  - iterable.md\n  - hooks.md\n  - modes-comparison.md\n  - examples/index.md]\n  cross_links:\n  - blog/posts/anthropic.md\n  - blog/posts/structured-output-anthropic.md\n  - concepts/hooks.md\n  - concepts/iterable.md\n  - concepts/models.md\n  - concepts/partial.md\n  - concepts/patching.md\n  - concepts/reask_validation.md\n  - concepts/semantic_validation.md\n  - concepts/validation.md\n  - examples/groq.md\n  - examples/index.md\n  - examples/mistral.md\n  - examples/ollama.md\n  - index.md\n  - integrations/anthropic.md\n  - integrations/azure.md\n  - integrations/bedrock.md\n  - integrations/cerebras.md\n  - integrations/cohere.md\n  - integrations/deepseek.md\n  - integrations/fireworks.md\n  - integrations/genai.md\n  - integrations/google.md\n  - integrations/groq.md\n  - integrations/litellm.md\n  - integrations/llama-cpp-python.md\n  - integrations/mistral.md\n  - integrations/ollama.md\n  - integrations/openai-responses.md\n  - integrations/openai.md\n  - integrations/openrouter.md\n  - integrations/perplexity.md\n  - integrations/sambanova.md\n  - integrations/together.md\n  - integrations/vertex.md\n  - integrations/writer.md\n  - learning/getting_started/response_models.md\n  - learning/patterns/field_validation.md\n  - learning/validation/field_level_validation.md\n  - modes-comparison.md\n  hash: 0cd377c30ed32c1e1436c3194f87f72c\n  keywords:\n  - '[LLM integration'\n  - AI model providers\n  - structured output\n  - OpenAI\n  - Anthropic\n  - Google Gemini\n  - local models\n  - Pydantic\n  - cloud services]\n  references:\n  - integrations/openai.md\n  - integrations/openai-responses.md\n  - integrations/azure.md\n  - integrations/anthropic.md\n  - integrations/google.md\n  - integrations/vertex.md\n  - integrations/bedrock.md\n  - integrations/genai.md\n  - integrations/cohere.md\n  - integrations/mistral.md\n  - integrations/deepseek.md\n  - integrations/together.md\n  - integrations/groq.md\n  - integrations/fireworks.md\n  - integrations/cerebras.md\n  - integrations/writer.md\n  - integrations/perplexity.md\n  - integrations/sambanova.md\n  - integrations/ollama.md\n  - integrations/llama-cpp-python.md\n  - integrations/litellm.md\n  - integrations/openrouter.md\n  - concepts/patching.md\n  - concepts/models.md\n  - concepts/validation.md\n  - concepts/partial.md\n  - concepts/iterable.md\n  - concepts/hooks.md\n  - modes-comparison.md\n  - examples/index.md\n  - examples/index.md\n  summary: This documentation provides comprehensive tutorials for integrating the\n    Instructor framework with over 15 LLM providers, including major names like OpenAI,\n    Anthropic, and Google. Users can learn to utilize structured data extraction and\n    various integration modes through clear examples and feature descriptions.\n  topics:\n  - '[Integration with AI providers'\n  - Core features\n  - Provider modes\n  - Getting started\n  - Troubleshooting]\nintegrations/litellm.md:\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  hash: d6fc058af4b92fbded142d630ec90055\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This comprehensive guide explains how to use Instructor with LiteLLM's\n    unified interface to generate structured, type-safe outputs across multiple LLM\n    providers like GPT-3.5 and Claude-3. It covers both synchronous and asynchronous\n    implementations, demonstrating how to create validated responses using Pydantic\n    models. Additionally, the guide details cost calculation via response cost attributes\n    and emphasizes LiteLLM's compatibility and easy model switching. Key topics include\n    structured output generation, response validation, cost tracking, and integration\n    with various LLM providers.\nintegrations/llama-cpp-python.md:\n  cross_links:\n  - examples/index.md\n  - index.md\n  - why.md\n  hash: d4baa4f29b79ed75acefbd1acaec8481\n  references:\n  - index.md\n  - why.md\n  - examples/index.md\n  summary: This comprehensive guide explores how to generate structured, type-safe\n    outputs using llama-cpp-python with Instructor, focusing on JSON schema mode and\n    speculative decoding. By leveraging open-source LLMs, users can achieve structured\n    outputs with constrained sampling techniques and avoid network dependencies using\n    an OpenAI-compatible client. The guide highlights features such as the `response_model`\n    and `max_retries` for enhanced functionality in `create` calls, showcasing the\n    use of Pydantic for efficient data validation. An advanced example using JSON\n    schema to extract data within a streaming context is also presented. Key terms\n    include llama-cpp-python, JSON schema mode, speculative decoding, Pydantic, and\n    structured outputs.\nintegrations/mistral.md:\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  hash: f821daf9ad84fd47d59dd265143b200b\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This comprehensive guide explains how to use Mistral AI's Large model with\n    Instructor to generate structured, type-safe outputs and JSON schema-based function\n    calling. It covers setup instructions, including API key configuration, and showcases\n    how to utilize Mistral's capabilities in both synchronous and asynchronous modes,\n    with support for nested models, streaming, and multimodal PDF analysis. Key features\n    include modes for structured outputs, partial response streaming, iterable responses,\n    and advanced multimodal extraction, making it an essential resource for leveraging\n    Mistral's powerful AI models with Instructor for reliable data extraction and\n    structured AI responses.\nintegrations/ollama.md:\n  ai_references:\n  - '[../index.md'\n  - ../why.md]\n  cross_links:\n  - index.md\n  - why.md\n  hash: 0e9679037802bdef503c474201b3e5dd\n  keywords:\n  - '[Ollama'\n  - Instructor\n  - JSON schema\n  - structured outputs\n  - timeout handling\n  - open source\n  - local LLMs\n  - Pydantic]\n  references:\n  - index.md\n  - why.md\n  summary: This comprehensive guide teaches you how to leverage Ollama with Instructor\n    to generate structured outputs using JSON schema, enhancing response safety and\n    reliability. You will explore key features like timeout handling and automated\n    client modes for optimal performance when working with local LLMs.\n  topics:\n  - '[Using Ollama with Instructor'\n  - Patching\n  - Timeout Handling\n  - Quick Start with Auto Client\n  - Manual Setup]\nintegrations/openai-responses.md:\n  ai_references:\n  - '[OpenAI Documentation](https://platform.openai.com/docs)'\n  - '[Instructor Core Concepts](../concepts/index.md)'\n  - '[Type Validation Guide](../concepts/validation.md)'\n  - '[Advanced Usage Examples](../examples/index.md)'\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  - index.md\n  hash: d79a5a73ed7e5610674465ceb9217177\n  keywords:\n  - OpenAI\n  - Responses API\n  - structured outputs\n  - Python\n  - examples\n  - web search\n  - file search\n  - type-safe\n  - validated outputs\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: The OpenAI Responses API Guide provides comprehensive instructions on leveraging\n    the new API for structured outputs with OpenAI models, focusing on best practices\n    and examples. This guide highlights various response modes, core methods, and\n    built-in tools to enhance functionality, making it ideal for developers looking\n    to implement type-safe, validated outputs in their applications.\n  topics: []\nintegrations/openai.md:\n  cross_links:\n  - concepts/index.md\n  - concepts/multimodal.md\n  - concepts/validation.md\n  - examples/batch_job_oai.md\n  - examples/index.md\n  hash: e590f98025395a6720663e19033615a5\n  references:\n  - concepts/multimodal.md\n  - examples/batch_job_oai.md\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This comprehensive guide explores using OpenAI models with Instructor for\n    structured, type-safe outputs, including GPT-4, GPT-3.5, and multimodal capabilities\n    with images, audio, and PDFs. It covers setup, both synchronous and asynchronous\n    examples, nested data extraction, multimodal analysis, streaming, batching, and\n    various response modes like tools and JSON modes. The tutorial emphasizes best\n    practices for model selection, performance optimization, and common use cases\n    such as data extraction, document analysis, form parsing, and API response structuring.\n    Keywords include OpenAI, Instructor, structured outputs, GPT-4, multimodal, streaming,\n    batch API, data extraction, type-safe responses, and API integrations.\nintegrations/openrouter.md:\n  cross_links: []\n  hash: bd8e8fdd749c0da0250180d12cc97e4e\n  references: []\n  summary: 'This comprehensive guide explains how to use Instructor with OpenRouter\n    to achieve structured, type-safe outputs across multiple large language model\n    (LLM) providers. It details how to integrate Instructor with the OpenAI client,\n    supporting synchronous and asynchronous usage, nested object extraction, and various\n    modes including Structured Outputs and JSON. The guide emphasizes the importance\n    of model compatibility with tool calling and structured outputs, provides code\n    examples for different scenarios, and highlights how to enable streaming responses.\n    Key topics include multi-provider API switching, schema validation with Pydantic\n    models, handling models without tool calling support, and leveraging OpenRouter''s\n    unified API for enhanced LLM integrations. Core keywords: OpenRouter, Instructor,\n    LLM, structured outputs, tool calling, API integration, type-safe responses, multi-provider,\n    GPT models, JSON mode, streaming.'\nintegrations/perplexity.md:\n  ai_references:\n  - '[Perplexity API Documentation](https://docs.perplexity.ai/)'\n  - '[Perplexity API Reference](https://docs.perplexity.ai/reference/post_chat_completions)'\n  cross_links: []\n  hash: 75d7e6c97db652b39aa3eeafad8db003\n  keywords:\n  - Perplexity AI\n  - Instructor\n  - structured outputs\n  - Pydantic\n  - JSON\n  - API key\n  - type-safe\n  - validated responses\n  - nested objects\n  references: []\n  summary: This guide explains how to utilize Perplexity AI with the Instructor library\n    to create structured JSON outputs using Pydantic models in Python. It covers both\n    synchronous and asynchronous examples, as well as details on creating nested objects\n    for type-safe and validated responses from Perplexity's Sonar models.\n  topics: []\nintegrations/sambanova.md:\n  cross_links: []\n  hash: 81003730e09b4b43bccdd04a11b7f3ae\n  references: []\n  summary: SambaNova integration with Instructor allows users to leverage SambaNova's\n    LLM API for structured output generation in Python. The setup involves installing\n    the `instructor[openai]` package and configuring the client with the SambaNova\n    API endpoint and API key. It supports both synchronous and asynchronous usage,\n    enabling detailed prompt and response modeling with Pydantic. Key models include\n    Meta-Llama-3.1-405B-Instruct, and users can explore additional options via SambaNova's\n    documentation. This integration facilitates advanced AI workflows with SambaNova's\n    large language models for enhanced NLP applications.\nintegrations/together.md:\n  cross_links:\n  - index.md\n  - why.md\n  hash: 39d3ac703bab17e5ad0cb06d6c0cafd6\n  references:\n  - index.md\n  - why.md\n  summary: 'This comprehensive guide explains how to use Together AI with Instructor\n    to generate structured, type-safe outputs through function calling. It highlights\n    open-source LLM support, patching features like response models and retries, and\n    demonstrates how to integrate Instructor with Together''s models using Python.\n    Key topics include leveraging Pydantic for data validation, utilizing Together\n    AI''s API, and creating custom models for accurate output extraction. Keywords:\n    Together AI, Instructor, structured outputs, function calling, open-source LLMs,\n    Python, Pydantic, type-safe responses, API integration.'\nintegrations/vertex.md:\n  ai_references:\n  - '[../concepts/index.md'\n  - ../concepts/validation.md\n  - ../examples/index.md\n  - https://cloud.google.com/vertex-ai/docs\n  - https://github.com/jxnl/instructor/blob/main/CHANGELOG.md]\n  cross_links:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  - index.md\n  hash: 555e953a46c2db86bfd7ae9ff1a071f3\n  keywords:\n  - '[Vertex AI'\n  - Instructor\n  - structured outputs\n  - type-safe responses\n  - asynchronous streaming\n  - Python examples\n  - Google Cloud\n  - generative models]\n  references:\n  - concepts/index.md\n  - concepts/validation.md\n  - examples/index.md\n  summary: This comprehensive guide demonstrates how to utilize Instructor with Google\n    Cloud's Vertex AI to generate structured, type-safe outputs. It explores synchronous\n    and asynchronous usage, provides concrete examples, and highlights the newly added\n    streaming capabilities for efficient data handling.\n  topics:\n  - '[Getting Started'\n  - Synchronous User Example\n  - Asynchronous User Example\n  - Streaming Support\n  - Updates and Compatibility]\nintegrations/writer.md:\n  cross_links: []\n  hash: 27299f8967d9a30443039b93e1d233dd\n  references: []\n  summary: 'This guide provides a comprehensive overview of using Writer for structured\n    outputs with the latest Palmyra-X-004 model, which enhances reliability using\n    tool-calling functionality. It includes setup instructions, such as obtaining\n    an API key and integrating with Python using Writer''s `instructor` module. The\n    guide offers synchronous and asynchronous examples for extracting structured data,\n    including support for nested objects and streaming responses with iterables and\n    partial streaming. Key topics include structured data extraction, API integration,\n    Python scripting, and advanced data handling with Writer''s Palmyra-X-004 model.\n    Keywords: Writer, Palmyra-X-004, structured outputs, API key, data extraction,\n    nested objects, streaming support, Python integration.'\njobs.md:\n  cross_links: []\n  hash: d41d8cd98f00b204e9800998ecf8427e\n  references: []\n  summary: Of course! Please provide the text that you would like me to summarize,\n    and I'll be happy to assist you.\nlearning/getting_started/client_setup.md:\n  ai_references:\n  - '[../patterns/simple_object.md'\n  - ../patterns/list_extraction.md\n  - ../patterns/nested_structure.md\n  - ../validation/basics.md]\n  cross_links:\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/simple_object.md\n  - learning/validation/basics.md\n  hash: 7d7ea676cc2058a2fa58216ab56d366c\n  keywords:\n  - '[client setup'\n  - Instructor\n  - OpenAI\n  - Anthropic\n  - Google Gemini\n  - Cohere\n  - Mistral\n  - async clients\n  - modes]\n  references:\n  - learning/patterns/simple_object.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/validation/basics.md\n  - learning/patterns/optional_fields.md\n  summary: This guide provides step-by-step instructions on setting up various client\n    configurations for utilizing the Instructor with multiple LLM providers, including\n    OpenAI, Anthropic, Google, Cohere, and Mistral. It covers default and JSON modes,\n    async client usage, and advanced configurations for better integration with these\n    providers.\n  topics:\n  - '[Client configuration'\n  - Modes of operation\n  - Asynchronous clients\n  - Advanced configurations\n  - Compatibility with other providers]\nlearning/getting_started/first_extraction.md:\n  ai_references:\n  - '[response_models.md'\n  - client_setup.md\n  - ../patterns/simple_object.md]\n  cross_links:\n  - learning/getting_started/client_setup.md\n  - learning/getting_started/response_models.md\n  - learning/patterns/simple_object.md\n  hash: b253293c79f241efc1338bd19fddfee4\n  keywords:\n  - LLM extraction\n  - structured data\n  - Pydantic\n  - Instructor\n  - OpenAI\n  - Python objects\n  - data validation\n  - field descriptions\n  - optional data\n  references:\n  - learning/getting_started/response_models.md\n  - learning/getting_started/client_setup.md\n  - learning/patterns/simple_object.md\n  - learning/getting_started/response_models.md\n  summary: This tutorial guides users through extracting structured data using LLMs\n    with Instructor, focusing on converting unstructured text into validated Python\n    objects. It includes step-by-step instructions for configuring the model and emphasizes\n    the importance of using Pydantic for type-safe extraction.\n  topics:\n  - LLM extraction process\n  - Pydantic models\n  - configuring an LLM client\n  - handling optional data\n  - common extraction patterns\nlearning/getting_started/installation.md:\n  ai_references:\n  - '[first_extraction.md'\n  - response_models.md\n  - client_setup.md]\n  cross_links:\n  - learning/getting_started/client_setup.md\n  - learning/getting_started/first_extraction.md\n  - learning/getting_started/response_models.md\n  hash: ffd0b4e3d308c123750dc4648591c9fc\n  keywords:\n  - Instructor\n  - LLM\n  - structured outputs\n  - Python\n  - installation\n  - OpenAI\n  - Claude\n  - Gemini\n  - Pydantic\n  references:\n  - learning/getting_started/first_extraction.md\n  - learning/getting_started/response_models.md\n  - learning/getting_started/client_setup.md\n  - learning/getting_started/first_extraction.md\n  summary: This guide provides step-by-step instructions on installing the Instructor\n    library for extracting structured data from various large language models (LLMs)\n    including OpenAI's GPT-4, Anthropic's Claude, and Google's Gemini. It covers installation\n    steps, configuration for different LLM providers, and verification of the setup\n    for beginners looking to enhance their LLM application development.\n  topics:\n  - Installation guide\n  - LLM provider setup\n  - API configuration\n  - verification tests\n  - common issues\nlearning/getting_started/response_models.md:\n  ai_references:\n  - '[../patterns/field_validation.md'\n  - ../validation/basics.md\n  - ../patterns/nested_structure.md\n  - ../patterns/optional_fields.md\n  - ../patterns/list_extraction.md\n  - ../validation/custom_validators.md\n  - client_setup.md]\n  cross_links:\n  - learning/getting_started/client_setup.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/validation/basics.md\n  - learning/validation/custom_validators.md\n  hash: fe9bd1a857fd36a269a55a0b05c8f7e5\n  keywords:\n  - '[response models'\n  - Pydantic\n  - field validation\n  - nested models\n  - enums\n  - optional fields\n  - model documentation\n  - data extraction]\n  references:\n  - learning/patterns/field_validation.md\n  - learning/validation/basics.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/list_extraction.md\n  - learning/validation/custom_validators.md\n  - learning/getting_started/client_setup.md\n  summary: This guide provides an in-depth look at response models in Instructor,\n    outlining how to create, validate, and document different types of models using\n    Pydantic. It covers basic and advanced topics including field metadata, validation\n    rules, nested models, enums, optional fields, and more to effectively extract\n    data for various use cases.\n  topics:\n  - '[Basic Models'\n  - Field Metadata\n  - Field Validation\n  - Nested Models\n  - Using Enums]\nlearning/getting_started/structured_outputs.md:\n  ai_references:\n  - '[first_extraction.md'\n  - response_models.md\n  - client_setup.md]\n  cross_links:\n  - learning/getting_started/client_setup.md\n  - learning/getting_started/first_extraction.md\n  - learning/getting_started/response_models.md\n  hash: e909556e4995fb3ac4ae5cc34a0c901e\n  keywords:\n  - structured outputs\n  - large language models\n  - data extraction\n  - Pydantic\n  - consistency\n  - validation\n  - type safety\n  - Instructor\n  references:\n  - learning/getting_started/first_extraction.md\n  - learning/getting_started/response_models.md\n  - learning/getting_started/client_setup.md\n  summary: This guide introduces the concept of structured outputs for large language\n    models, emphasizing the benefits of using Pydantic models to enforce data consistency,\n    validation, and type safety. It provides examples of extracting structured data\n    from LLMs and discusses the installation and setup of the Instructor package for\n    improved data handling.\n  topics:\n  - structured data extraction\n  - Pydantic models\n  - handling unstructured outputs\n  - installation and setup\n  - complex data structures\nlearning/index.md:\n  ai_references:\n  - '[getting_started/installation.md'\n  - getting_started/first_extraction.md\n  - getting_started/response_models.md\n  - getting_started/client_setup.md\n  - patterns/simple_object.md\n  - patterns/list_extraction.md\n  - patterns/nested_structure.md\n  - patterns/optional_fields.md\n  - patterns/field_validation.md\n  - patterns/prompt_templates.md\n  - validation/basics.md\n  - validation/field_level_validation.md\n  - validation/custom_validators.md\n  - validation/retry_mechanisms.md\n  - streaming/basics.md\n  - streaming/lists.md]\n  cross_links:\n  - installation.md\n  - learning/getting_started/client_setup.md\n  - learning/getting_started/first_extraction.md\n  - learning/getting_started/installation.md\n  - learning/getting_started/response_models.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/prompt_templates.md\n  - learning/patterns/simple_object.md\n  - learning/streaming/basics.md\n  - learning/streaming/lists.md\n  - learning/validation/basics.md\n  - learning/validation/custom_validators.md\n  - learning/validation/field_level_validation.md\n  - learning/validation/retry_mechanisms.md\n  hash: 3e793197ba6ac51caef1d12f465dd1d6\n  keywords:\n  - Instructor library\n  - LLM integration\n  - structured outputs\n  - data extraction\n  - Python tutorial\n  - AI applications\n  - output validation\n  - real-time processing\n  references:\n  - learning/getting_started/installation.md\n  - learning/getting_started/first_extraction.md\n  - learning/getting_started/response_models.md\n  - learning/getting_started/client_setup.md\n  - learning/patterns/simple_object.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/prompt_templates.md\n  - learning/validation/basics.md\n  - learning/validation/field_level_validation.md\n  - learning/validation/custom_validators.md\n  - learning/validation/retry_mechanisms.md\n  - learning/streaming/basics.md\n  - learning/streaming/lists.md\n  - learning/getting_started/installation.md\n  summary: This comprehensive tutorial for the Instructor library provides a complete\n    guide on utilizing LLMs for structured outputs, covering everything from installation\n    to advanced data extraction patterns. It is designed for developers aiming to\n    create reliable AI applications using various language models like GPT-4, Claude,\n    and Gemini.\n  topics:\n  - LLM integration basics\n  - structured output patterns\n  - data extraction tutorials\n  - output validation\n  - streaming LLM responses\nlearning/patterns/field_validation.md:\n  ai_references:\n  - '[Fields](../../concepts/fields.md)'\n  - '[Custom Validators](../validation/custom_validators.md)'\n  - '[Nested Structure](nested_structure.md)'\n  - '[Validation Basics](../validation/basics.md)'\n  - '[Field-level Validation](../validation/field_level_validation.md)'\n  - '[Retry Mechanisms](../validation/retry_mechanisms.md)'\n  - '[Enums](../../concepts/enums.md)'\n  - '[Optional Fields](optional_fields.md)'\n  cross_links:\n  - concepts/enums.md\n  - concepts/fields.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/validation/basics.md\n  - learning/validation/custom_validators.md\n  - learning/validation/field_level_validation.md\n  - learning/validation/retry_mechanisms.md\n  hash: cbe2f1fced3d98448d736783e49fcd08\n  keywords:\n  - field validation\n  - Pydantic\n  - data quality\n  - validation logic\n  - structured data extraction\n  - custom validators\n  - model validation\n  - error handling\n  - instructor\n  references:\n  - concepts/fields.md\n  - learning/validation/custom_validators.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/list_extraction.md\n  - concepts/enums.md\n  - learning/validation/retry_mechanisms.md\n  - learning/validation/basics.md\n  - learning/validation/custom_validators.md\n  - learning/validation/field_level_validation.md\n  - learning/validation/retry_mechanisms.md\n  - concepts/fields.md\n  - concepts/enums.md\n  - learning/patterns/optional_fields.md\n  - learning/validation/custom_validators.md\n  - learning/patterns/nested_structure.md\n  summary: This guide explains how to implement field validation for structured data\n    extraction using the Instructor framework, leveraging Pydantic's validation features\n    to ensure data quality and compliance with defined criteria. It discusses basic\n    and complex validation methods, including field-level, model-level, and validation\n    with enumerations, while providing practical code examples.\n  topics:\n  - field validation methods\n  - basic field constraints\n  - complex validation logic\n  - validation in nested structures\n  - error handling\nlearning/patterns/list_extraction.md:\n  ai_references:\n  - '[../streaming/basics.md'\n  - ../streaming/lists.md\n  - ./field_validation.md\n  - ../validation/basics.md\n  - ./simple_object.md\n  - ./nested_structure.md\n  - ../../concepts/lists.md\n  - ../../examples/action_items.md]\n  cross_links:\n  - concepts/lists.md\n  - examples/action_items.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/simple_object.md\n  - learning/streaming/basics.md\n  - learning/streaming/lists.md\n  - learning/validation/basics.md\n  hash: 85a3d3e972d716f00c22f1128ae94c7e\n  keywords:\n  - '[list extraction'\n  - LLM\n  - GPT-4\n  - Pydantic\n  - data validation\n  - streaming\n  - Python\n  - nested lists\n  - Instructor\n  - structured data]\n  references:\n  - learning/streaming/basics.md\n  - learning/streaming/lists.md\n  - learning/patterns/field_validation.md\n  - learning/validation/basics.md\n  - examples/action_items.md\n  - learning/patterns/simple_object.md\n  - learning/patterns/nested_structure.md\n  - learning/streaming/lists.md\n  - concepts/lists.md\n  - learning/patterns/nested_structure.md\n  - learning/streaming/lists.md\n  - learning/patterns/field_validation.md\n  summary: This tutorial provides a comprehensive guide on extracting lists and arrays\n    from language models like GPT-4, Claude, and Gemini using the Instructor package.\n    It covers basic list extraction, nested lists, streaming capabilities, validation\n    techniques, and constraints on list properties, making it an essential resource\n    for developers working with structured data extraction.\n  topics:\n  - '[Basic List Extraction'\n  - Nested Lists\n  - List Validation\n  - Direct List Extraction\n  - Real-world Example]\nlearning/patterns/nested_structure.md:\n  ai_references:\n  - '[list_extraction.md'\n  - optional_fields.md\n  - field_validation.md\n  - recursive.md\n  - simple_object.md]\n  cross_links:\n  - examples/recursive.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/simple_object.md\n  - learning/validation/basics.md\n  hash: 3e670af52d96c2ad1f78e1c8c38a4eb0\n  keywords:\n  - nested structures\n  - hierarchical data\n  - data extraction\n  - Pydantic\n  - Instructor library\n  - validation\n  - optional fields\n  - recursive structures\n  - Python\n  references:\n  - learning/patterns/list_extraction.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/field_validation.md\n  - learning/validation/basics.md\n  - examples/recursive.md\n  - learning/patterns/simple_object.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/optional_fields.md\n  - examples/recursive.md\n  - learning/patterns/field_validation.md\n  summary: This guide provides comprehensive instructions on extracting nested structured\n    data using the Instructor library. It covers various topics such as basic nested\n    structures, multiple levels of nesting, handling optional fields, and validating\n    nested structures, making it a valuable resource for developers working with hierarchical\n    data relationships.\n  topics:\n  - nested structures\n  - multiple levels of nesting\n  - optional nested fields\n  - nested structure validation\n  - recursive structures\nlearning/patterns/optional_fields.md:\n  ai_references:\n  - '[Missing Concepts](../../concepts/maybe.md)'\n  - '[Simple Object Extraction](./simple_object.md)'\n  - '[Field Validation](./field_validation.md)'\n  - '[Nested Structure](./nested_structure.md)'\n  - '[Prompt Templates](./prompt_templates.md)'\n  cross_links:\n  - concepts/maybe.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/prompt_templates.md\n  - learning/patterns/simple_object.md\n  hash: 5912fc79517ab7b3180183d20e725802\n  keywords:\n  - optional fields\n  - Python\n  - Pydantic\n  - data models\n  - validation\n  - Maybe type\n  - nested structures\n  - default values\n  references:\n  - concepts/maybe.md\n  - learning/patterns/simple_object.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/nested_structure.md\n  - concepts/maybe.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/prompt_templates.md\n  summary: This guide provides an overview of how to implement optional fields in\n    data models using Python and Pydantic. It explains their benefits, how to set\n    default values, and discusses validation techniques, including handling nested\n    structures and uncertain fields with the Maybe type.\n  topics:\n  - working with optional fields\n  - setting default values\n  - validation techniques\n  - handling uncertain fields\n  - using nested structures\nlearning/patterns/prompt_templates.md:\n  ai_references:\n  - '[simple_object.md'\n  - list_extraction.md\n  - optional_fields.md\n  - prompting.md\n  - templating.md\n  - field_validation.md\n  - nested_structure.md]\n  cross_links:\n  - concepts/prompting.md\n  - concepts/templating.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/optional_fields.md\n  - learning/patterns/simple_object.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md\n  - prompting/zero_shot/emotion_prompting.md\n  - prompting/zero_shot/role_prompting.md\n  - prompting/zero_shot/style_prompting.md\n  hash: ea4ae44a438b1728732ed8bdc0573961\n  keywords:\n  - prompt templates\n  - structured data extraction\n  - parameterized prompts\n  - Python\n  - OpenAI\n  references:\n  - learning/patterns/simple_object.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/optional_fields.md\n  - concepts/prompting.md\n  - concepts/templating.md\n  - learning/patterns/field_validation.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  summary: This guide provides an overview of using prompt templates with Instructor\n    for structured data extraction. It outlines the benefits of prompt templates,\n    demonstrates how to create basic and complex templates using Python, and shares\n    best practices for effective prompt engineering.\n  topics:\n  - importance of prompt templates\n  - creating basic and complex templates\n  - best practices for prompts\n  - using f-strings\n  - template functions\nlearning/patterns/simple_object.md:\n  ai_references:\n  - '[list_extraction.md'\n  - nested_structure.md\n  - field_validation.md]\n  cross_links:\n  - learning/patterns/field_validation.md\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  hash: 80d435d6ae347a1e8d1ef3dfa715526c\n  keywords:\n  - '[LLM extraction'\n  - Pydantic\n  - structured data\n  - Python\n  - GPT-4\n  - data validation\n  - object extraction\n  - schema definition]\n  references:\n  - learning/patterns/list_extraction.md\n  - learning/patterns/nested_structure.md\n  - learning/patterns/field_validation.md\n  summary: This tutorial provides a comprehensive guide on extracting structured data\n    from unstructured text using Large Language Models (LLMs) like GPT-4 and Claude.\n    It covers various topics including schema definitions, handling missing data,\n    and validation with Pydantic, as well as offers practical code examples and common\n    use cases for LLM object extraction.\n  topics:\n  - '[LLM Object Extraction'\n  - Pydantic Validation\n  - Handling Missing Data\n  - Nested Object Extraction\n  - Common Use Cases]\nlearning/streaming/basics.md:\n  ai_references:\n  - '[lists.md'\n  - ../validation/basics.md]\n  cross_links:\n  - learning/streaming/lists.md\n  - learning/validation/basics.md\n  hash: b5246fcd0ecaf2a3d6cb1c7c2bf0f8b7\n  keywords:\n  - '[streaming'\n  - structured response\n  - user interface\n  - real-time updates\n  - Python example\n  - OpenAI\n  - progressive updates\n  - data processing\n  - completion tracking]\n  references:\n  - learning/streaming/lists.md\n  - learning/validation/basics.md\n  summary: Streaming enables immediate receipt of structured data responses, enhancing\n    user experience with faster perceived responses and dynamic UI updates. By leveraging\n    streaming, users can begin to process information as soon as it is available,\n    rather than waiting for a complete response.\n  topics:\n  - '[Streaming benefits'\n  - Python implementation\n  - progress tracking\n  - data processing\n  - structured responses]\nlearning/streaming/lists.md:\n  ai_references:\n  - '[basics.md'\n  - ../../learning/patterns/list_extraction.md\n  - ../../learning/validation/basics.md\n  - ../../concepts/partial.md\n  - ../../learning/validation/field_level_validation.md\n  - ../../integrations/index.md]\n  cross_links:\n  - concepts/partial.md\n  - index.md\n  - integrations/index.md\n  - learning/patterns/list_extraction.md\n  - learning/streaming/basics.md\n  - learning/validation/basics.md\n  - learning/validation/field_level_validation.md\n  hash: e761179dfde4bfb077da2e8da9b5ed15\n  keywords:\n  - streaming lists\n  - structured data\n  - Pydantic model\n  - OpenAI\n  - responsiveness\n  - task generation\n  - Python typing\n  - project tasks\n  - validation\n  references:\n  - learning/streaming/basics.md\n  - learning/patterns/list_extraction.md\n  - learning/validation/basics.md\n  - concepts/partial.md\n  - learning/validation/basics.md\n  - learning/validation/field_level_validation.md\n  - integrations/index.md\n  summary: This guide explains how to stream lists of structured data using Instructor,\n    enabling the processing of collection items as they are generated for enhanced\n    responsiveness, especially with larger outputs. It includes detailed examples\n    demonstrating the streaming of books and tasks, while highlighting the integration\n    with Python's typing and Pydantic models.\n  topics:\n  - list streaming\n  - data processing\n  - real-world examples\n  - Pydantic and typing\n  - validation concepts\nlearning/validation/basics.md:\n  ai_references:\n  - '[custom_validators.md'\n  - retry_mechanisms.md\n  - field_level_validation.md]\n  cross_links:\n  - learning/validation/custom_validators.md\n  - learning/validation/field_level_validation.md\n  - learning/validation/retry_mechanisms.md\n  hash: 81731be3c79784e84c33b91d626d2ca4\n  keywords:\n  - LLM validation\n  - data integrity\n  - business compliance\n  - structured data\n  - Pydantic\n  - constraint validation\n  - automatic retry\n  - age verification\n  - validation rules\n  references:\n  - learning/validation/custom_validators.md\n  - learning/validation/retry_mechanisms.md\n  - learning/validation/field_level_validation.md\n  summary: This tutorial guides users through the process of validating outputs from\n    Language Learning Models (LLMs) using Instructor's validation system. It ensures\n    that LLM-generated structured data meets data integrity, business compliance,\n    and production reliability standards.\n  topics:\n  - LLM output validation\n  - validation pipeline\n  - constraint validation patterns\n  - common use cases\n  - error messaging\nlearning/validation/custom_validators.md:\n  ai_references:\n  - '[Validation Basics](../../concepts/validation.md)'\n  - '[Retrying](../../concepts/retrying.md)'\n  - '[Field-level Validation](../../concepts/fields.md)'\n  - '[Validators](../../concepts/reask_validation.md)'\n  - '[Contact Information Extraction](../../examples/extract_contact_info.md)'\n  - '[Semantic Validation](../../concepts/semantic_validation.md)'\n  - '[Self-Correction](../../examples/self_critique.md)'\n  - '[Fields](../../concepts/fields.md)'\n  - '[Models](../../concepts/models.md)'\n  - '[Types](../../concepts/types.md)'\n  cross_links:\n  - concepts/fields.md\n  - concepts/models.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/semantic_validation.md\n  - concepts/types.md\n  - concepts/validation.md\n  - examples/extract_contact_info.md\n  - examples/self_critique.md\n  hash: 9185a575da5ee54cba0ee4af777506dc\n  keywords:\n  - custom validators\n  - data quality\n  - Pydantic\n  - semantic validation\n  - GPT-4\n  - Claude\n  - validation techniques\n  - rule-based validation\n  - validation failures\n  references:\n  - concepts/validation.md\n  - concepts/retrying.md\n  - concepts/fields.md\n  - concepts/reask_validation.md\n  - examples/extract_contact_info.md\n  - concepts/semantic_validation.md\n  - concepts/retrying.md\n  - examples/self_critique.md\n  - concepts/validation.md\n  - concepts/fields.md\n  - concepts/models.md\n  - concepts/types.md\n  summary: This tutorial provides a comprehensive guide on building custom validators\n    for outputs from language models like GPT-4 and Claude, focusing on rule-based\n    and semantic validation techniques. By utilizing Pydantic, it demonstrates effective\n    validation strategies to enhance data quality and ensure compliance with specific\n    requirements when working with LLMs.\n  topics: []\nlearning/validation/field_level_validation.md:\n  ai_references:\n  - '[Fields](../../concepts/fields.md)'\n  - '[Custom Validators](../../concepts/reask_validation.md)'\n  - '[Validation Basics](../../concepts/validation.md)'\n  - '[Retry Mechanisms](../../concepts/retrying.md)'\n  - '[Fallback Strategies](../../concepts/error_handling.md)'\n  - '[Types](../../concepts/types.md)'\n  cross_links:\n  - concepts/error_handling.md\n  - concepts/fields.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/types.md\n  - concepts/validation.md\n  hash: 035cef4c2f6e04d1df6a9474fee288cd\n  keywords:\n  - field-level validation\n  - Pydantic\n  - custom validators\n  - validation errors\n  - data models\n  - business rules\n  - error handling\n  - data cleaning\n  references:\n  - concepts/fields.md\n  - concepts/fields.md\n  - concepts/reask_validation.md\n  - concepts/validation.md\n  - concepts/retrying.md\n  - concepts/error_handling.md\n  - concepts/types.md\n  summary: This guide provides an overview of field-level validation using Instructor\n    and Pydantic, detailing how to create specific validation rules for individual\n    fields in data models, including custom validators and handling validation errors.\n    It offers practical examples and best practices to ensure effective validation\n    processes for your applications.\n  topics:\n  - field-level validation\n  - basic field validation\n  - custom field validators\n  - validating multiple fields\n  - best practices\nlearning/validation/retry_mechanisms.md:\n  ai_references:\n  - '[Retrying](../../concepts/retrying.md)'\n  - '[Fallback Strategies](../../concepts/error_handling.md)'\n  - '[Custom Validators](custom_validators.md)'\n  - '[Field-level Validation](field_level_validation.md)'\n  - '[Validation](../../concepts/validation.md)'\n  - '[Self Critique](../../examples/self_critique.md)'\n  cross_links:\n  - concepts/error_handling.md\n  - concepts/reask_validation.md\n  - concepts/retrying.md\n  - concepts/validation.md\n  - examples/self_critique.md\n  - learning/validation/custom_validators.md\n  - learning/validation/field_level_validation.md\n  hash: edfb05018be5a6e42122afbdf99d2292\n  keywords:\n  - retry mechanisms\n  - validation failures\n  - feedback loop\n  - customization options\n  - error handling\n  - Pydantic model\n  - validation messages\n  - complex schemas\n  references:\n  - concepts/retrying.md\n  - concepts/error_handling.md\n  - learning/validation/custom_validators.md\n  - learning/validation/field_level_validation.md\n  - concepts/retrying.md\n  - concepts/validation.md\n  - concepts/reask_validation.md\n  - concepts/error_handling.md\n  - examples/self_critique.md\n  - learning/validation/field_level_validation.md\n  - learning/validation/custom_validators.md\n  summary: This guide provides an overview of retry mechanisms in Instructor that\n    manage validation failures, allowing the LLM to generate valid responses by reattempting\n    with feedback. It includes examples and customization options for retry behavior,\n    error handling strategies, and advanced validation patterns for complex schemas.\n  topics:\n  - Retry Mechanisms\n  - Customizing Retry Behavior\n  - Handling Retry Failures\n  - Error Messages and Feedback\n  - Advanced Validation Patterns\nmodes-comparison.md:\n  cross_links: []\n  hash: 34ad27dd0581822450f815a8043699ce\n  references: []\n  summary: This Mode Comparison Guide explains the different structured data extraction\n    modes available in Instructor for various large language model (LLM) providers,\n    including OpenAI, Anthropic, Google Gemini, Vertex AI, and more. It highlights\n    key modes such as `TOOLS`, `JSON`, `MD_JSON`, and provider-specific options, detailing\n    their best use cases, advantages, and compatibility. The guide offers practical\n    recommendations for selecting the appropriate mode based on complexity, reliability,\n    and provider capabilities, with a focus on optimizing data extraction, structured\n    output, and multi-modal inputs. Key keywords include LLM, Instructor modes, AI\n    tool calling, JSON output, structured data, OpenAI, Anthropic, Google Gemini,\n    Vertex AI, AI prompt engineering, and API integration.\nnewsletter.md:\n  cross_links: []\n  hash: c286128e131ad3635534c9cd9bae2668\n  references: []\n  summary: \"Subscribe to the Instructor Newsletter to stay updated on AI tips, blog\\\n    \\ posts, research, and new features. The newsletter provides insights into AI\\\n    \\ development, structured outputs, LLM research, and community tricks to enhance\\\n    \\ your projects. Stay informed about Instructor\\u2019s latest updates and community\\\n    \\ insights to improve your AI skills and leverage Instructor effectively. Keywords\\\n    \\ include AI updates, Instructor features, structured outputs, LLM research, AI\\\n    \\ development, and community tips.\"\nprompting/decomposition/decomp.md:\n  cross_links: []\n  hash: dd1d49ee871acabb8d368a16ea3150fe\n  references: []\n  summary: 'Decomposed Prompting leverages a Language Model (LLM) to break down complex\n    tasks into manageable sub-tasks, streamlining the problem-solving process. By\n    implementing a system of data models and functions, such as `Split`, `StrPos`,\n    and `Merge`, this approach enables systematic handling of intricate problems.\n    The `derive_action_plan` function orchestrates action plans using specified functions,\n    executed step-by-step to achieve the task goals. This modular method optimizes\n    LLM performance for challenging tasks, demonstrating effective AI-driven automation\n    and problem decomposition. Key terms: Decomposed Prompting, Language Model (LLM),\n    task decomposition, AI automation, action plan, modular approach.'\nprompting/decomposition/faithful_cot.md:\n  cross_links: []\n  hash: f5dd3db43b8242151bac111cab990918\n  references: []\n  summary: 'The concept of \"Faithful Chain of Thought\" in language models focuses\n    on enhancing the accuracy of reasoning by dividing the process into two stages:\n    Translation and Problem Solving. In the Translation stage, a user query is broken\n    down into executable reasoning steps, which are task-specific and deterministically\n    executed in the Problem Solving stage, ensuring consistency in the derived answer.\n    Examples include converting math word problems into executable Python code, using\n    multi-step reasoning in Multi-Hop QA with Python and Datalog, and generating plans\n    with symbolic goals through a PDDL Planner. The approach aims to improve the faithfulness\n    and effectiveness of language models in problem-solving tasks.'\nprompting/decomposition/least_to_most.md:\n  ai_references:\n  - '[Least-to-Most Prompting Enables Complex Reasoning in Large Language Models](https://arxiv.org/abs/2205.10625)'\n  - '[The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)'\n  cross_links: []\n  hash: b2b9a6686aaa01df537e9fc5d8155f0f\n  keywords:\n  - Least-to-Most\n  - prompting technique\n  - language models\n  - subproblems\n  - complex reasoning\n  - sequential solving\n  references: []\n  summary: The Least-to-Most prompting technique is designed to decompose complex\n    problems into simpler, sequentially solved subproblems. This approach allows language\n    models to leverage earlier answers to inform subsequent solutions effectively.\n  topics:\n  - prompting techniques\n  - problem decomposition\n  - language model solutions\n  - subproblem analysis\nprompting/decomposition/plan_and_solve.md:\n  cross_links: []\n  hash: 7efc5f74390a69beeaf130c9b6c31583\n  references: []\n  summary: 'Plan and Solve enhances zero-shot Chain of Thought prompting by incorporating\n    detailed instructions to improve reasoning accuracy in large language models.\n    This approach involves a two-step process: first, devising a comprehensive problem-solving\n    plan with explicit reasoning, and second, extracting the final answer based on\n    this reasoning. By guiding models to pay closer attention to intermediate calculations\n    and logical steps, Plan and Solve achieves more robust performance on various\n    reasoning tasks, making it a valuable technique for improving LLM reasoning capabilities\n    and accuracy. Key words include zero-shot Chain of Thought, reasoning, prompt\n    engineering, large language models, problem-solving, and step-by-step reasoning.'\nprompting/decomposition/program_of_thought.md:\n  cross_links: []\n  hash: 8413ae10bbc35a4f1128759ca3e4f673\n  references: []\n  summary: The \"Program Of Thought\" is an innovative approach that leverages an external\n    Python interpreter to generate intermediate reasoning steps, enhancing performance\n    in mathematical and programming tasks. It involves systematically writing executable\n    code within designated frameworks, such as the instructor system, to derive precise\n    answers. Key features include the use of a specific program prefix, validation\n    of code execution, and integration with AI models like GPT-4 to generate detailed\n    problem-solving workflows, predictions, and accurate answer selection for complex\n    questions. This method aims to ground AI reasoning in deterministic code execution,\n    improving accuracy and transparency in problem-solving.\nprompting/decomposition/recurs_of_thought.md:\n  cross_links: []\n  hash: 5ef001050e89f56ecc769095df6300f4\n  references: []\n  summary: This document is a work in progress (WIP) and currently does not contain\n    specific content. Once completed, it will outline the core ideas, objectives,\n    and key points for effective SEO optimization, focusing on relevant keywords and\n    important details.\nprompting/decomposition/skeleton_of_thought.md:\n  cross_links: []\n  hash: 0aa74871fabd9647ac212ef8198a86b2\n  references: []\n  summary: '\"Skeleton-of-Thought\" is a technique to decrease latency in LLM (Large\n    Language Model) pipelines by generating a skeleton outline of a response before\n    expanding on each point in parallel. The method involves using parallel API calls\n    or batched decoding to enhance efficiency. The core process includes formulating\n    a question, creating a brief skeleton outline with 3-10 points, and then expanding\n    each point simultaneously. An example implementation with Python demonstrates\n    how to achieve this using the `instructor` library and `AsyncOpenAI` for faster\n    response generation. Key terms include \"Skeleton-of-Thought,\" \"parallel generation,\"\n    \"LLM pipeline,\" and \"response efficiency.\"'\nprompting/decomposition/tree-of-thought.md:\n  cross_links: []\n  hash: 5ef001050e89f56ecc769095df6300f4\n  references: []\n  summary: The content appears to be a placeholder or work-in-progress (WIP) without\n    any available details, title, or description. To optimize for search engines (SEO),\n    ensure to include key concepts, objectives, and important keywords once the content\n    is finalized. Focus on crafting a summary that highlights central themes or topics,\n    such as the purpose of the document, its main points, and any crucial information\n    it aims to convey.\nprompting/ensembling/cosp.md:\n  cross_links:\n  - prompting/ensembling/self_consistency.md\n  hash: 4b8eb058102072272fcb938bb8861a5c\n  references:\n  - prompting/ensembling/self_consistency.md\n  summary: Consistency Based Self Adaptive Prompting (COSP) is an ensemble technique\n    designed to enhance large language model (LLM) performance by generating high-quality\n    few-shot examples through self-consistency and normalized entropy metrics. It\n    automatically selects the most reliable responses from multiple reasoning chains\n    based on answer diversity and repetitiveness, then incorporates these examples\n    into prompts for improved accuracy. COSP employs strategies like cosine similarity\n    for evaluating repetitiveness and aims to optimize answer correctness without\n    ground truth labels, making it a key method for self-adaptive prompt engineering,\n    ensemble reasoning, and LLM accuracy improvement.\nprompting/ensembling/dense.md:\n  cross_links: []\n  hash: 4b90091a3795f75f4fc3162a22bf6ec7\n  references: []\n  summary: Demonstration Ensembling (DENSE) is a technique to improve language model\n    performance by generating multiple responses using different subsets of training\n    examples and then aggregating these outputs for a final decision. This method\n    involves prompting models like GPT-4 with varied few-shot prompts, partitioning\n    examples equally or sampling via embedding clustering. The approach enhances accuracy\n    by leveraging self-consistent responses and ensemble methods. Implementation can\n    be achieved using tools like the `instructor` library and asynchronous programming\n    in Python. Key concepts include few-shot learning, in-context learning, model\n    ensembling, prompt engineering, and response aggregation, making DENSE a valuable\n    strategy for tasks like classification and decision-making in NLP applications.\nprompting/ensembling/diverse.md:\n  cross_links: []\n  hash: b93329f06d2f82403fdc0efd37b286f3\n  references: []\n  summary: Diverse Verifier On Reasoning Step (DiVeRSe) is an advanced prompting technique\n    that enhances reasoning accuracy by generating multiple diverse prompts and leveraging\n    AI-based scoring to select the best response. It utilizes self-consistency through\n    multiple reasoning paths, combined with a fine-tuned verifier (initially DeBERTa-V3-Large,\n    now GPT-4o) to assess response quality and individual reasoning steps. DiVeRSe\n    aims to improve multi-step reasoning, accuracy, and robustness in AI models, making\n    it suitable for applications like question-answering, problem-solving, and reasoning\n    tasks. Key concepts include diverse prompt generation, self-consistency, step-wise\n    verification, and AI-based scoring for optimal decision-making in language models.\nprompting/ensembling/max_mutual_information.md:\n  cross_links: []\n  hash: 2ec748390bb663e6c289e4ec676cb6f2\n  references: []\n  summary: Max Mutual Information is a prompting technique for optimizing large language\n    models (LLMs) by generating multiple prompt templates and selecting the one that\n    maximizes mutual information between the prompt and the model's output. It focuses\n    on reducing uncertainty by calculating entropy and mutual information, which measures\n    the reduction in entropy when the prompt is used. The method involves estimating\n    probabilities and entropies to identify the most effective prompt for eliciting\n    accurate responses, especially in complex tasks like story comprehension. Implementation\n    involves generating responses with different prompts, scoring model confidence,\n    and calculating mutual information to select the best prompt, enhancing LLM performance\n    in applications such as the Story Cloze dataset. Key concepts include mutual information,\n    entropy, prompt optimization, LLM prompting strategies, and OpenAI API integration.\nprompting/ensembling/meta_cot.md:\n  cross_links: []\n  hash: d6d91ade7fb984ca99f6e2097c2cb08f\n  references: []\n  summary: 'Meta Chain Of Thought (Meta COT) is an advanced reasoning framework that\n    decomposes complex queries into multiple sub-questions, aggregates responses,\n    and leverages multiple reasoning chains to improve accuracy. Implemented using\n    OpenAI''s models, it facilitates step-by-step problem solving by generating sub-queries,\n    evaluating reasoning pathways, and synthesizing final answers through a multi-stage\n    process. Key features include query decomposition, reasoning chain generation,\n    and context-aware final responses, making Meta COT ideal for complex question\n    answering, AI reasoning, and improving model accuracy. Keywords: Meta Chain Of\n    Thought, multi-step reasoning, query decomposition, AI reasoning, OpenAI, question\n    answering, model accuracy.'\nprompting/ensembling/more.md:\n  cross_links: []\n  hash: 1f26fd2b6a81ae83f6db67299dde096c\n  references: []\n  summary: MoRE (Mixture of Reasoning Experts) enhances AI question-answering by combining\n    diverse specialized reasoning models, such as Factual, Multihop, Math, and Commonsense\n    experts. Each expert employs distinct prompts and reasoning techniques to generate\n    responses, which are then scored using a classifier like a random forest to select\n    the best answer or abstain if quality is low. A simplified implementation using\n    OpenAI's instructor facilitates multi-expert responses and scoring, improving\n    overall accuracy across varied reasoning tasks. Key keywords include reasoning\n    experts, AI, question answering, multi-step reasoning, factual retrieval, mathematical\n    reasoning, commonsense, prompt engineering, and model scoring.\nprompting/ensembling/prompt_paraphrasing.md:\n  cross_links: []\n  hash: e8f28524643be6affb1b760f6e930184\n  references: []\n  summary: 'This guide explores using Large Language Models (LLMs) for back translation\n    to enhance prompt performance and diversity. It details methods for paraphrasing\n    prompts through translation into different languages and back to English, leveraging\n    tools like the instructor package with OpenAI''s GPT-4. The approach improves\n    prompt phrasing and robustness, especially for tasks like sentiment analysis of\n    user reviews. Key techniques include multilingual translation, prompt variation,\n    and leveraging AI for more effective, diverse prompt generation to optimize LLM\n    responses. Keywords: Large Language Models, back translation, prompt paraphrasing,\n    prompt engineering, multilingual translation, AI prompt optimization, sentiment\n    analysis.'\nprompting/ensembling/self_consistency.md:\n  cross_links: []\n  hash: 6b158b0f8d82d71ae624d4f277ef6824\n  references: []\n  summary: Self-Consistency is a technique aimed at improving large language model\n    (LLM) performance by generating multiple potential responses and selecting the\n    most common answer through majority voting. It involves sampling several candidate\n    solutions in parallel and analyzing their consistency to enhance accuracy in tasks\n    like question answering. The approach is implemented using Python code with the\n    `instructor` library and OpenAI's API, showcasing how to generate and aggregate\n    multiple responses to derive the most probable correct answer. This method leverages\n    concepts from the research paper \"Self-Consistency Improves Chain Of Thought Reasoning\n    In Language Models\" and emphasizes improved reasoning, accuracy, and model performance\n    through sampling, majority voting, and ensemble techniques. Key keywords include\n    Self-Consistency, large language models, multiple responses, accuracy, ensemble\n    method, majority vote, and chain-of-thought reasoning.\nprompting/ensembling/universal_self_consistency.md:\n  cross_links: []\n  hash: c56d66bc14be41f9caa4b7b50a9354cb\n  references: []\n  summary: Universal Self-Consistency is an advanced approach that enhances traditional\n    self-consistency techniques by employing a second large language model (LLM) to\n    evaluate and select the most consistent answer among multiple candidates. This\n    method improves response diversity and accuracy by supporting various response\n    formats and leveraging consensus-based evaluation. Implemented using tools like\n    OpenAI's GPT models and the Instructor framework, it involves generating multiple\n    responses, assessing their consistency, and choosing the most reliable answer.\n    Key concepts include large language models, self-consistency, response evaluation,\n    answer selection, and AI accuracy enhancement, making it a valuable strategy for\n    improving LLM performance in complex reasoning tasks.\nprompting/ensembling/usp.md:\n  cross_links:\n  - prompting/few_shot/cosp.md\n  hash: 3a3df5b548bd422f3f7f84ef8e488300\n  references:\n  - prompting/few_shot/cosp.md\n  summary: \"Universal Self Prompting (USP) is a two-step technique for enhancing large\\\n    \\ language models by generating and selecting exemplars from unlabeled data. The\\\n    \\ process involves first creating candidate responses for different task types\\u2014\\\n    classification, short form generation, and long form generation\\u2014using specific\\\n    \\ evaluation metrics tailored to each task. These metrics include normalized entropy,\\\n    \\ pairwise ROUGE scores, and label probabilities. In the second step, the best\\\n    \\ examples are appended as prompts for the LLM to produce final predictions with\\\n    \\ a single inference. USP aims to improve model performance across diverse NLP\\\n    \\ tasks through data-driven exemplar generation and selection, utilizing methods\\\n    \\ like confidence-based sampling and task-specific scoring. Keywords include self\\\n    \\ prompting, large language models, unlabeled data, exemplar generation, task-specific\\\n    \\ evaluation, NLP, classification, text summarization, question answering, and\\\n    \\ prompt optimization.\"\nprompting/few_shot/cosp.md:\n  cross_links:\n  - prompting/ensembling/usp.md\n  hash: c7e5e6103a5c6b02a7c30633495c3282\n  references:\n  - prompting/ensembling/usp.md\n  summary: 'Consistency Based Self Adaptive Prompting (COSP) is an advanced technique\n    for enhancing few-shot learning by selecting high-quality examples based on response\n    consistency and confidence metrics such as entropy and repetitiveness. The method\n    involves generating multiple responses for potential examples, calculating their\n    entropy to measure variability, and evaluating repetitiveness to ensure reliability.\n    COSP automates the selection of optimal examples, improving prompt effectiveness\n    and model performance, while reducing manual curation. Key features include automated\n    example selection, quantifiable quality metrics, and improved accuracy in few-shot\n    prompting. Limitations include increased computational cost due to multiple API\n    calls, but overall, COSP advances prompt engineering with a focus on consistency\n    and confidence metrics for better language model outputs. Keywords: COSP, self-adaptive\n    prompting, few-shot learning, response consistency, entropy, repetitiveness, prompt\n    optimization, machine learning, language models.'\nprompting/few_shot/example_generation/sg_icl.md:\n  cross_links: []\n  hash: 68c7f1b6ec1060da68f0da9a83eea8e1\n  references: []\n  summary: Self-Generated In-Context Learning (SG-ICL) is a technique that leverages\n    large language models (LLMs) to automatically generate example prompts for tasks\n    like sentiment analysis. By using tools such as the `instructor` library, SG-ICL\n    creates in-context examples that improve model understanding and performance without\n    manual data labeling. The method involves generating multiple example reviews\n    with associated sentiments, which are then used to guide the model's predictions.\n    This approach enhances prompt-based learning, utilizing GPT models like GPT-4,\n    and is grounded in recent research on demonstration generation and prompt engineering.\n    Key keywords include in-context learning, self-generated examples, LLM, prompt\n    engineering, sentiment analysis, GPT, OpenAI, and demonstration generation.\nprompting/few_shot/example_ordering.md:\n  cross_links: []\n  hash: 46fe78ea46e5f89593be648f251c8628\n  references: []\n  summary: This document highlights the significant impact of example ordering in\n    few-shot prompting for large language models (LLMs), referencing studies that\n    demonstrate how permutating example sequences can improve model performance. It\n    discusses various methods to optimize example selection, including manual combinatorics,\n    KATE (k-Nearest Example Tuning), and using unsupervised retrieval techniques to\n    identify the most relevant in-context examples. These strategies aim to enhance\n    few-shot learning, prompt engineering, and prompt relevance, making it essential\n    for AI researchers and practitioners to consider example order and selection methods\n    to maximize LLM effectiveness. Key keywords include few-shot prompting, LLM, prompt\n    optimization, example ordering, KATE, unsupervised retrieval, prompt engineering,\n    and in-context learning.\nprompting/few_shot/exemplar_selection/knn.md:\n  cross_links: []\n  hash: 043cf2bc9050b9d8ac79ce9f24180ca2\n  references: []\n  summary: This guide demonstrates how to select effective in-context examples for\n    language models using KNN and embeddings. The process involves embedding query\n    examples, calculating cosine similarity-based distances, and retrieving the k\n    most similar examples to improve response accuracy. The code showcases embedding\n    questions, computing distances, selecting closest examples, and generating concise,\n    precise answers using OpenAI's GPT-4 model. Keywords include KNN, in-context learning,\n    embeddings, cosine similarity, prompt optimization, GPT-4, and language model\n    tuning.\nprompting/few_shot/exemplar_selection/vote_k.md:\n  cross_links: []\n  hash: 5ef001050e89f56ecc769095df6300f4\n  references: []\n  summary: The content appears to be a work in progress (wip) and does not include\n    specific details or key points yet. To create an effective SEO summary, more information\n    about the topic, objectives, and main ideas are needed. Once provided, I can generate\n    a concise and keyword-rich summary suitable for SEO purposes.\nprompting/index.md:\n  ai_references:\n  - '[The Prompt Report](https://trigaten.github.io/Prompt_Survey_Site)'\n  - '[Learn Prompting](https://learnprompting.org)'\n  cross_links:\n  - prompting/decomposition/decomp.md\n  - prompting/decomposition/faithful_cot.md\n  - prompting/decomposition/least_to_most.md\n  - prompting/decomposition/plan_and_solve.md\n  - prompting/decomposition/program_of_thought.md\n  - prompting/decomposition/recurs_of_thought.md\n  - prompting/decomposition/skeleton_of_thought.md\n  - prompting/decomposition/tree-of-thought.md\n  - prompting/ensembling/cosp.md\n  - prompting/ensembling/dense.md\n  - prompting/ensembling/diverse.md\n  - prompting/ensembling/max_mutual_information.md\n  - prompting/ensembling/meta_cot.md\n  - prompting/ensembling/more.md\n  - prompting/ensembling/prompt_paraphrasing.md\n  - prompting/ensembling/self_consistency.md\n  - prompting/ensembling/universal_self_consistency.md\n  - prompting/ensembling/usp.md\n  - prompting/few_shot/example_generation/sg_icl.md\n  - prompting/few_shot/example_ordering.md\n  - prompting/few_shot/exemplar_selection/knn.md\n  - prompting/few_shot/exemplar_selection/vote_k.md\n  - prompting/self_criticism/chain_of_verification.md\n  - prompting/self_criticism/cumulative_reason.md\n  - prompting/self_criticism/reversecot.md\n  - prompting/self_criticism/self_calibration.md\n  - prompting/self_criticism/self_refine.md\n  - prompting/self_criticism/self_verification.md\n  - prompting/thought_generation/chain_of_thought_few_shot/active_prompt.md\n  - prompting/thought_generation/chain_of_thought_few_shot/auto_cot.md\n  - prompting/thought_generation/chain_of_thought_few_shot/complexity_based.md\n  - prompting/thought_generation/chain_of_thought_few_shot/contrastive.md\n  - prompting/thought_generation/chain_of_thought_few_shot/memory_of_thought.md\n  - prompting/thought_generation/chain_of_thought_few_shot/prompt_mining.md\n  - prompting/thought_generation/chain_of_thought_few_shot/uncertainty_routed_cot.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/tab_cot.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/thread_of_thought.md\n  - prompting/zero_shot/emotion_prompting.md\n  - prompting/zero_shot/rar.md\n  - prompting/zero_shot/re2.md\n  - prompting/zero_shot/role_prompting.md\n  - prompting/zero_shot/s2a.md\n  - prompting/zero_shot/self_ask.md\n  - prompting/zero_shot/simtom.md\n  - prompting/zero_shot/style_prompting.md\n  hash: 05e6342a3a1492d7650955429328dc88\n  keywords:\n  - advanced prompting techniques\n  - LLM performance\n  - zero-shot\n  - few-shot\n  - reasoning methods\n  - self-assessment\n  - collaboration\n  references:\n  - prompting/zero_shot/emotion_prompting.md\n  - prompting/zero_shot/role_prompting.md\n  - prompting/zero_shot/style_prompting.md\n  - prompting/zero_shot/s2a.md\n  - prompting/zero_shot/simtom.md\n  - prompting/zero_shot/rar.md\n  - prompting/zero_shot/re2.md\n  - prompting/zero_shot/self_ask.md\n  - prompting/few_shot/example_generation/sg_icl.md\n  - prompting/few_shot/example_ordering.md\n  - prompting/few_shot/exemplar_selection/knn.md\n  - prompting/few_shot/exemplar_selection/vote_k.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/thread_of_thought.md\n  - prompting/thought_generation/chain_of_thought_zero_shot/tab_cot.md\n  - prompting/thought_generation/chain_of_thought_few_shot/active_prompt.md\n  - prompting/thought_generation/chain_of_thought_few_shot/auto_cot.md\n  - prompting/thought_generation/chain_of_thought_few_shot/complexity_based.md\n  - prompting/thought_generation/chain_of_thought_few_shot/contrastive.md\n  - prompting/thought_generation/chain_of_thought_few_shot/memory_of_thought.md\n  - prompting/thought_generation/chain_of_thought_few_shot/uncertainty_routed_cot.md\n  - prompting/thought_generation/chain_of_thought_few_shot/prompt_mining.md\n  - prompting/ensembling/cosp.md\n  - prompting/ensembling/dense.md\n  - prompting/ensembling/diverse.md\n  - prompting/ensembling/max_mutual_information.md\n  - prompting/ensembling/meta_cot.md\n  - prompting/ensembling/more.md\n  - prompting/ensembling/self_consistency.md\n  - prompting/ensembling/universal_self_consistency.md\n  - prompting/ensembling/usp.md\n  - prompting/ensembling/prompt_paraphrasing.md\n  - prompting/self_criticism/chain_of_verification.md\n  - prompting/self_criticism/self_calibration.md\n  - prompting/self_criticism/self_refine.md\n  - prompting/self_criticism/self_verification.md\n  - prompting/self_criticism/reversecot.md\n  - prompting/self_criticism/cumulative_reason.md\n  - prompting/decomposition/decomp.md\n  - prompting/decomposition/faithful_cot.md\n  - prompting/decomposition/least_to_most.md\n  - prompting/decomposition/plan_and_solve.md\n  - prompting/decomposition/program_of_thought.md\n  - prompting/decomposition/recurs_of_thought.md\n  - prompting/decomposition/skeleton_of_thought.md\n  - prompting/decomposition/tree-of-thought.md\n  summary: This guide offers an in-depth overview of advanced prompting techniques\n    designed to enhance the performance of large language models (LLMs) through research-backed\n    methods. It includes a comprehensive mapping of various strategies, including\n    zero-shot, few-shot, and reasoning techniques, tailored for implementation with\n    the Instructor framework.\n  topics:\n  - prompting techniques\n  - reasoning methods\n  - example usage\n  - verification methods\n  - implementation\nprompting/self_criticism/chain_of_verification.md:\n  cross_links: []\n  hash: 73ebc5e56042b7f72031c9b68be3dc97\n  references: []\n  summary: Chain Of Verification (CoVe) is a method designed to enhance the reliability\n    of large language model (LLM) responses through a multi-step validation process.\n    It involves generating an initial answer, creating follow-up questions to verify\n    key facts and assumptions, independently answering these questions, and finally\n    using a final API call to confirm or correct the original response. This approach\n    reduces hallucinations and improves accuracy, making it highly effective for ensuring\n    trustworthy AI-generated content. Core keywords include LLM verification, AI validation,\n    reducing hallucinations, prompt engineering, and response accuracy.\nprompting/self_criticism/cumulative_reason.md:\n  cross_links: []\n  hash: dc7fbab50e534f394dab15dc2d13816c\n  references: []\n  summary: \"Cumulative Reasoning enhances large language model performance by dividing\\\n    \\ the reasoning process into three steps: propose, verify, and report. This structured\\\n    \\ approach improves logical inference and mathematical problem-solving accuracy\\\n    \\ by generating potential reasoning steps, validating their correctness, and determining\\\n    \\ the conclusion. Implemented using OpenAI\\u2019s API, this method ensures disciplined,\\\n    \\ step-by-step deduction rooted in First-Order Logic, making it ideal for logical,\\\n    \\ mathematical, and AI reasoning tasks. Key concepts include reasoning steps,\\\n    \\ validation, logical inference, and advanced LLM prompting techniques for improved\\\n    \\ reasoning accuracy.\"\nprompting/self_criticism/reversecot.md:\n  cross_links: []\n  hash: 718094a1f90e542c567a278e52e4b731\n  references: []\n  summary: Reverse Chain Of Thought (RCoT) is a method for identifying logical inconsistencies\n    in a large language model's reasoning process by reconstructing the original question\n    from the generated solution. This three-step approach involves reconstructing\n    the question, pinpointing discrepancies between original and reconstructed conditions,\n    and providing targeted feedback for improvement. Implemented via a specialized\n    framework, RCoT enhances prompt accuracy, logical coherence, and response quality,\n    making it an effective tool for refining AI-generated reasoning and solutions.\n    Key concepts include problem reconstruction, inconsistency detection, targeted\n    feedback, and improving AI reasoning accuracy.\nprompting/self_criticism/self_calibration.md:\n  cross_links: []\n  hash: 10cd8050ef8c5a0154316edb507747c1\n  references: []\n  summary: Self Calibration is a technique to help language models assess the confidence\n    and validity of their responses. By evaluating their output using a structured\n    prompt template and tools like the Instructor library, models can generate reasoning\n    and determine whether answers are correct, without relying on internal hidden\n    states. This approach enhances model reliability by enabling self-assessment of\n    knowledge and uncertainties, which is essential for improving question-answering\n    accuracy and trustworthiness in AI systems. Key concepts include self-calibration,\n    confidence estimation, language model evaluation, prompt engineering, and AI reliability.\nprompting/self_criticism/self_refine.md:\n  ai_references:\n  - '[Self-Refine: Iterative Refinement with Self-Feedback](https://arxiv.org/abs/2303.17651)'\n  - '[The Prompt Report: A Systematic Survey of Prompting Techniques](https://arxiv.org/abs/2406.06608)'\n  cross_links: []\n  hash: 9339448f16ae6cc7645aba733b2efdcb\n  keywords:\n  - Self-refine\n  - feedback\n  - language model\n  - iterative improvement\n  - Python coding\n  - refinement process\n  - stopping condition\n  - LLM\n  references: []\n  summary: Self-refine is a methodology that utilizes a language model to iteratively\n    generate, evaluate, and improve its outputs based on user feedback. This process\n    continues until specified stopping criteria are fulfilled, ensuring the output\n    becomes more accurate and refined with each iteration.\n  topics:\n  - Iterative feedback loop\n  - Generating initial responses\n  - Providing feedback\n  - Refining outputs\n  - Implementing stopping conditions\nprompting/self_criticism/self_verification.md:\n  cross_links: []\n  hash: 77d9f2d4e8bf08216987b11d2bf8679a\n  references: []\n  summary: 'This document outlines a self-verification framework for validating Large\n    Language Model (LLM) responses through a two-stage process: forward reasoning\n    and backward verification. The approach involves generating multiple response\n    candidates using chain-of-thought reasoning, then verifying each candidate by\n    rewriting the question into a declarative form and constructing verification prompts\n    using True-False Item Verification (TFV) or Condition Mask Verification (CMV).\n    The verification process repeats multiple times, and the candidate with the highest\n    verification score is selected as the final answer. The framework is implemented\n    with code examples using OpenAI''s API and aims to improve the accuracy and reliability\n    of LLM outputs. Key concepts include self-verification, prompt engineering, declarative\n    rewriting, LLM verification, chain-of-thought, and model prompting techniques.'\nprompting/thought_generation/chain_of_thought_few_shot/active_prompt.md:\n  cross_links: []\n  hash: ec50ae930bfa92be2db89c937e696404\n  references: []\n  summary: 'Active prompting is a technique to enhance Large Language Model (LLM)\n    performance by selecting effective examples for human annotation. This process\n    involves four main steps: uncertainty estimation, selection, annotation, and inference.\n    The uncertainty estimation step uses metrics like disagreement, entropy, and variance\n    to measure how confident the LLM is in its responses. By querying the LLM multiple\n    times, the differences in responses indicate areas of uncertainty. Selection involves\n    choosing the most uncertain examples for human annotation, which are then used\n    to improve the LLM''s inference capabilities. This method optimizes the use of\n    labeled data to boost LLM accuracy and performance.'\nprompting/thought_generation/chain_of_thought_few_shot/auto_cot.md:\n  cross_links: []\n  hash: aa45163a89881ec54d814f68e369d2df\n  references: []\n  summary: The article discusses improving the performance of few-shot Chain of Thought\n    (CoT) reasoning by automating the selection of diverse examples. The method involves\n    clustering potential examples, sorting them based on distance from cluster centers,\n    and selecting those that meet predefined criteria, such as a maximum of five reasoning\n    steps. This automated approach reduces reasoning errors by ensuring the examples\n    are varied and representative. The implementation includes clustering with KMeans,\n    encoding with Sentence Transformers, and using AI models like GPT-4 for processing.\n    This technique enhances large language models' accuracy by systematically selecting\n    examples for optimal performance. Key terms include few-shot CoT, clustering,\n    diverse examples, reasoning error reduction, and automated example selection.\nprompting/thought_generation/chain_of_thought_few_shot/complexity_based.md:\n  cross_links: []\n  hash: 08f5ce3a728a741234799bbaaede1acf\n  references: []\n  summary: 'The article discusses \"Complexity Based Prompting\" to enhance language\n    model performance by selecting examples with more reasoning steps or longer responses\n    when reasoning lengths aren''t available. This approach, known as \"Complexity\n    Based Consistency,\" involves sampling multiple responses and selecting the most\n    complex ones based on reasoning step length. The process is implemented using\n    tools like `instructor` and `AsyncOpenAI`, leveraging structured reasoning steps\n    in query responses. By generating and ranking multiple responses, the method identifies\n    top responses to derive accurate answers, as demonstrated with a practical example.\n    Keywords: Complexity Based Prompting, language models, multi-step reasoning, AI\n    performance, Complexity Based Consistency, `instructor`, `AsyncOpenAI`.'\nprompting/thought_generation/chain_of_thought_few_shot/contrastive.md:\n  cross_links: []\n  hash: 607e1e5586ac745bccb961f0df089c17\n  references: []\n  summary: The document discusses the technique of Contrastive Chain Of Thought (CoT)\n    to enhance language model performance by deliberately including incorrect reasoning\n    examples alongside correct ones during training. This method helps the AI learn\n    from mistakes and improve its response generation. The approach involves using\n    a specific template with correct and incorrect examples to guide the AI in providing\n    accurate answers. An example implementation is provided using Python and the `instructor`\n    package to demonstrate the process. Key concepts include chain-of-thought prompting,\n    incorrect reasoning, language model training, and AI performance enhancement.\nprompting/thought_generation/chain_of_thought_few_shot/memory_of_thought.md:\n  cross_links: []\n  hash: 5ef001050e89f56ecc769095df6300f4\n  references: []\n  summary: It seems like the content is still a work in progress, as indicated by\n    the \"[wip]\" tag. Since the title, description, and keywords are left empty, more\n    information is needed to provide an accurate SEO summary. To optimize for SEO,\n    consider focusing on the main topic of the content, its objectives, and any unique\n    selling points or important details. Once more details are available, including\n    keywords relevant to the content's subject, an effective summary can be crafted\n    to improve search visibility.\nprompting/thought_generation/chain_of_thought_few_shot/prompt_mining.md:\n  cross_links: []\n  hash: 214b95070291158fec9b154f77370f57\n  references: []\n  summary: 'The article discusses \"Prompt Mining,\" a technique used to enhance the\n    performance of Large Language Models (LLMs) by discovering effective prompt formats\n    from text corpora, such as Wikipedia. The approach aims to identify better prompt\n    structures that allow LLMs to respond more accurately. It contrasts manual prompts\n    with mined prompts, presenting examples of both to illustrate improved prompt\n    efficiency. The document outlines a method using the `instructor` library, demonstrating\n    how to implement Prompt Mining to generate concise and clear prompt templates.\n    Key points include the importance of prompt formatting, the use of placeholder\n    templates, and the effectiveness of automated prompt discovery in improving language\n    model outputs. Keywords: Prompt Mining, Large Language Models, prompt templates,\n    language model performance, automated prompt discovery, `instructor` library.'\nprompting/thought_generation/chain_of_thought_few_shot/uncertainty_routed_cot.md:\n  cross_links: []\n  hash: b90fa988c085d0dde6594aa75eac0544\n  references: []\n  summary: \"The Uncertainty-Routed Chain Of Thought technique, detailed in the Gemini\\\n    \\ Paper, enhances traditional Chain Of Thought methods by generating multiple\\\n    \\ reasoning chains\\u2014either 8 or 32\\u2014and selecting the majority answer\\\n    \\ only if it meets a specified threshold of agreement. Implemented in Python with\\\n    \\ OpenAI's models, this approach involves using asynchronous prompts to create\\\n    \\ a batch of responses, counting the majority vote, and comparing it to the confidence\\\n    \\ threshold (e.g., 0.6) to determine the final answer. This technique is designed\\\n    \\ to improve the accuracy and reliability of AI-generated answers in complex decision-making\\\n    \\ scenarios. Key elements include uncertainty routing, batch processing, majority\\\n    \\ voting, and threshold evaluation.\"\nprompting/thought_generation/chain_of_thought_zero_shot/analogical_prompting.md:\n  cross_links: []\n  hash: daa15bd030a6f2d0584e310e29f781c0\n  references: []\n  summary: 'Analogical Prompting is a method designed to enhance the accuracy of large\n    language models (LLMs) by prompting the model to generate relevant examples before\n    addressing a user''s query. This technique leverages the extensive knowledge acquired\n    by the LLM during training, encouraging it to recall pertinent problems and solutions.\n    The process involves providing a problem, recalling three relevant and distinct\n    problems with their solutions, and then solving the initial problem. A Python\n    implementation using the `instructor` module demonstrates this method with an\n    example query about calculating the area of a square using given vertices. This\n    approach is based on research into LLMs as analogical reasoners, aimed at improving\n    problem-solving capabilities. Key points include the use of templates, structured\n    recall of problem-solving instances, and enhanced accuracy in query responses.\n    Keywords: Analogical Prompting, large language models, LLMs, problem-solving,\n    language model training, accuracy enhancement, Python implementation, example\n    generation, query response.'\nprompting/thought_generation/chain_of_thought_zero_shot/step_back_prompting.md:\n  cross_links: []\n  hash: 266f50f0729c9faf17ee37f0ee9ef6a2\n  references: []\n  summary: Step-back prompting is a two-step technique utilized with Large Language\n    Models (LLMs) to improve contextual understanding and reasoning capabilities.\n    The method involves first asking a high-level, topic-specific question, known\n    as the \"step-back question,\" to gather broader context. This is followed by \"abstracted-grounded\n    reasoning,\" where the LLM answers the initial query within the context provided\n    by the step-back response. This technique has proven effective in enhancing performance\n    on reasoning benchmarks for models like PaLM-2L and GPT-4. The implementation\n    often involves generating step-back questions with LLM queries to ensure precise\n    abstract questioning.\nprompting/thought_generation/chain_of_thought_zero_shot/tab_cot.md:\n  cross_links: []\n  hash: 9d53b891d95c8c14d3bd15758757e736\n  references: []\n  summary: 'The text discusses the concept of Tabular Chain of Thought (Tab-CoT),\n    a method to improve the reasoning and output quality of language models by structuring\n    their reasoning in the form of markdown tables. It introduces a process using\n    Python, OpenAI, and the `instructor` library to generate structured reasoning\n    responses. This approach involves defining reasoning steps as objects, breaking\n    down queries into subquestions, and detailing procedures and results, thus enhancing\n    clarity and precision in model outputs. The example provided calculates the remaining\n    loaves of bread at a bakery, showcasing the structured reasoning process. Keywords:\n    Tabular Chain of Thought, Tab-CoT, language models, structured reasoning, markdown\n    tables, Python, OpenAI, reasoning steps.'\nprompting/thought_generation/chain_of_thought_zero_shot/thread_of_thought.md:\n  cross_links: []\n  hash: 2549f9996ba2068ab4cfd1b7f23cb083\n  references: []\n  summary: The article introduces the \"Thread of Thought\" technique, which enhances\n    AI model responses by systematically focusing on relevant context and ignoring\n    irrelevant information. This method improves reasoning performance and response\n    quality by encouraging models to analyze and summarize information incrementally.\n    The implementation involves using templates in Python with the OpenAI API to assess\n    each piece of context for its significance. Key phrases and approaches are suggested\n    for guiding models through the context effectively. This technique can be particularly\n    useful for complex question-answering tasks that involve large datasets or lengthy\n    documents.\nprompting/zero_shot/emotion_prompting.md:\n  cross_links: []\n  hash: a9ad30ffe419f260e612691bf23edf9f\n  references: []\n  summary: This article explores the use of emotional stimuli in prompts to enhance\n    the performance of language models. It highlights how adding emotionally significant\n    phrases, such as \"This is very important to my career,\" can influence model responses.\n    The implementation example demonstrates prompting GPT-4 with emotional cues to\n    generate curated outputs, like a list of musical albums from the 2000s. The content\n    references research on emotional stimuli's impact on large language models and\n    provides code snippets for practical application. Keywords include emotion prompting,\n    language models, emotional stimuli, prompt engineering, GPT-4, AI performance,\n    and AI enhancement.\nprompting/zero_shot/rar.md:\n  ai_references:\n  - '[Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves](https://arxiv.org/abs/2311.04205)'\n  cross_links: []\n  hash: 90516c9b6f140155c4c52e871db56b47\n  keywords:\n  - Rephrase and Respond\n  - ambiguous prompts\n  - human intention\n  - Python implementation\n  - model interpretation\n  - OpenAI\n  - query clarification\n  references: []\n  summary: This documentation details the Rephrase and Respond (RaR) approach, designed\n    to help models accurately interpret ambiguous prompts. It discusses identifying\n    ambiguities in questions and provides an implementation example using Python code\n    to demonstrate how to rephrase and respond effectively to queries.\n  topics:\n  - Ambiguity in language\n  - Implementation guide\n  - Python code example\n  - Model interaction\n  - Query rephrasing\nprompting/zero_shot/re2.md:\n  ai_references:\n  - '[Re-Reading Improves Reasoning in Large Language Models](https://arxiv.org/abs/2309.06275)'\n  cross_links: []\n  hash: 75c4357d9ceaf62751ff55b2a874ac36\n  keywords:\n  - Re2\n  - Re-Reading\n  - query understanding\n  - critical thinking\n  - OpenAI\n  - reasoning\n  - implementation\n  - Python\n  - prompt template\n  references: []\n  summary: Re2 (Re-Reading) is a technique designed to enhance a model's comprehension\n    of queries by prompting it to read the question again, encouraging critical thinking\n    and step-by-step reasoning. This technique can be implemented using OpenAI's API\n    to improve response accuracy in applications requiring deeper understanding.\n  topics:\n  - Re2 technique\n  - model enhancement\n  - critical thinking prompts\n  - Python implementation\n  - querying with OpenAI\nprompting/zero_shot/role_prompting.md:\n  ai_references:\n  - '[RoleLLM](https://arxiv.org/abs/2310.00746)'\n  - '[social roles evaluation](https://arxiv.org/abs/2311.10054)'\n  - '[Multi-Persona Self-Collaboration](https://arxiv.org/abs/2307.05300)'\n  cross_links: []\n  hash: 42f69bb3b65ab7208e766d80c96c50ac\n  keywords:\n  - role prompting\n  - persona prompting\n  - model performance\n  - open-ended tasks\n  - AI assistant\n  - poetry generation\n  - social roles\n  - multi-persona collaboration\n  references: []\n  summary: Role prompting, also known as persona prompting, enhances model performance\n    on open-ended tasks by assigning specific roles to the model. This approach allows\n    models to adopt a particular persona, which can significantly influence the quality\n    and style of the output generated.\n  topics:\n  - role prompting implementation\n  - influence of roles on AI output\n  - examples of role assignments\n  - systematic approach to choosing roles\nprompting/zero_shot/s2a.md:\n  cross_links: []\n  hash: f3b55fc1bf5a617fa1dd82134ecaa495\n  references: []\n  summary: 'The System 2 Attention (S2A) technique enhances prompt relevance by auto-refining\n    user input through a two-step process: rewriting prompts to include only pertinent\n    information and then generating accurate responses. Implemented using GPT-4, S2A\n    leverages prompt engineering inspired by recent research (arXiv:2311.11829) to\n    improve model focus and answer precision. Key features include extracting relevant\n    context from user queries and minimizing irrelevant data, making it valuable for\n    optimized AI communication, prompt refinement, and advanced language model applications.\n    Keywords: System 2 Attention, prompt refinement, AI prompt engineering, GPT-4,\n    relevance extraction, model focus, arXiv 2311.11829.'\nprompting/zero_shot/self_ask.md:\n  cross_links: []\n  hash: f25cf054eea8c90dcca3ab21a56f51b7\n  references: []\n  summary: Self-Ask is an innovative prompting technique designed to improve language\n    model reasoning by addressing the compositionality gap. It encourages models to\n    determine if follow-up questions are needed, generate and answer those questions,\n    and then use these answers to produce a more accurate overall solution. Implemented\n    using a zero-shot prompt with the instructor framework, Self-Ask enhances the\n    ability of models like GPT-4 to handle complex queries through dynamic sub-problem\n    solving. Key concepts include compositionality gap, follow-up questions, zero-shot\n    prompting, and sub-problem answering for improved reasoning accuracy.\nprompting/zero_shot/simtom.md:\n  cross_links: []\n  hash: b6f1003c8f869a54c705cd1f71861c44\n  references: []\n  summary: SimToM (Simulated Theory of Mind) is a two-step prompting technique designed\n    to enhance large language models' ability to consider specific perspectives. It\n    involves first isolating relevant information related to an entity within a context,\n    and then asking the model to answer questions solely based on those facts from\n    the entity's viewpoint. This method is especially useful for complex scenarios\n    with multiple entities, improving the model's understanding and reasoning about\n    different perspectives. Implementation includes structured prompts and code examples\n    using OpenAI's GPT-4, focusing on perspective-taking and context-specific responses.\n    Key concepts include perspective-taking, multi-entity reasoning, and advanced\n    prompt engineering for improved model comprehension.\nprompting/zero_shot/style_prompting.md:\n  ai_references:\n  - '[Bounding the Capabilities of Large Language Models in Open Text Generation with\n    Prompt Constraints](https://arxiv.org/abs/2302.09185)'\n  cross_links: []\n  hash: 279e6d51353749a88799508a048d3213\n  keywords:\n  - style prompting\n  - model response\n  - writing style\n  - tone\n  - mood\n  - genre\n  - email generation\n  - OpenAI\n  references: []\n  summary: The \"Style Prompting\" documentation explains how to constrain a model's\n    responses using stylistic guidelines, including writing style, tone, mood, and\n    genre. By specifying these elements, users can ensure that the generated outputs\n    align with their intended context and purpose. Code implementation for generating\n    tailored email responses is also provided.\n  topics:\n  - stylistic constraints\n  - implementation example\n  - code usage\n  - email generation\nrepository-overview.md:\n  cross_links: []\n  hash: 16a893aa592a4478f0bd70ce059ce714\n  references: []\n  summary: The Instructor repository provides a comprehensive codebase for structured\n    output management, featuring core libraries in the `instructor/` directory, and\n    command-line tools in `cli/`. It also includes documentation sources in `docs/`,\n    practical examples in `examples/`, and testing scripts in `tests/`. This layout\n    supports efficient development, usage, and evaluation of Instructor's functionalities\n    for clients, adapters, utilities, and job management, making it essential for\n    developers working on structured output tasks.\nstart-here.md:\n  ai_references:\n  - '[getting-started.md'\n  - examples/index.md\n  - concepts/validation.md\n  - concepts/partial.md\n  - integrations/index.md\n  - faq.md]\n  cross_links:\n  - concepts/index.md\n  - concepts/partial.md\n  - concepts/validation.md\n  - examples/index.md\n  - faq.md\n  - getting-started.md\n  - index.md\n  - integrations/index.md\n  hash: f92d563955521efd3c8a1b98ef845dd2\n  keywords:\n  - Instructor\n  - Python library\n  - structured outputs\n  - language models\n  - data extraction\n  - API integration\n  - Pydantic\n  - validation\n  - OpenAI\n  - Claude\n  references:\n  - getting-started.md\n  - examples/index.md\n  - concepts/validation.md\n  - concepts/partial.md\n  - integrations/index.md\n  - faq.md\n  - examples/index.md\n  - concepts/index.md\n  summary: This guide provides beginners with an introduction to Instructor, a Python\n    library designed for obtaining structured outputs from language models such as\n    GPT-4 and Claude. It explains how to use Instructor to define response structures,\n    validate outputs, and solve common challenges related to data extraction from\n    language models.\n  topics: []\ntemplates/concept_template.md:\n  ai_references:\n  - '[../concepts/related1.md'\n  - ../concepts/related2.md\n  - ../examples/example1.md\n  - ../examples/example2.md]\n  cross_links: []\n  hash: 07c431f7b4a798b09df99bc65c26543a\n  keywords:\n  - '[Concept Name'\n  - Instructor\n  - OpenAI\n  - advanced usage\n  - best practices\n  - error handling\n  - language models\n  - JSON mode\n  - model examples]\n  references:\n  - concepts/related1.md\n  - concepts/related2.md\n  - examples/example1.md\n  - examples/example2.md\n  summary: This documentation covers the [Concept Name], a key feature in the Instructor\n    framework designed to enhance user interactions with language models. It provides\n    an overview, use cases, basic and advanced implementation examples, and best practices\n    for effectively utilizing this concept within various contexts.\n  topics:\n  - '[Overview'\n  - Usage Scenarios\n  - Basic and Advanced Usage\n  - Working with Different Providers\n  - Common Patterns and Best Practices]\ntemplates/cookbook_template.md:\n  ai_references:\n  - '[related1.md'\n  - related2.md\n  - related1.md\n  - related2.md]\n  cross_links: []\n  hash: 6e692507cf928faa03b61bf27ca6722d\n  keywords:\n  - Instructor library\n  - OpenAI API\n  - data processing\n  - Python code\n  - structured output\n  - API keys\n  - implementation steps\n  - error handling\n  references:\n  - concepts/related1.md\n  - concepts/related2.md\n  - examples/related1.md\n  - examples/related2.md\n  summary: This example provides a practical guide on how to utilize the Instructor\n    library to process data with OpenAI's API effectively. It covers installation,\n    prerequisites, step-by-step implementation, and customization options to enhance\n    the solution's functionality.\n  topics:\n  - Use case scenarios\n  - prerequisites for setup\n  - implementation steps\n  - customization options\n  - limitations\ntemplates/provider_template.md:\n  ai_references: []\n  cross_links: []\n  hash: 10ab7b29ad592ebc6b4fe5f9bbf88415\n  keywords:\n  - '[Provider Name'\n  - instructor toolkit\n  - data extraction\n  - API key\n  - asynchronous programming]\n  references: []\n  summary: This guide provides a comprehensive overview of using the instructor toolkit\n    with [Provider Name], detailing installation, authentication, and both synchronous\n    and asynchronous examples for data extraction. It also covers supported modes,\n    streaming support, and the models offered by the provider.\n  topics:\n  - '[Installation'\n  - Authentication\n  - Synchronous Example\n  - Asynchronous Example\n  - Supported Modes]\ntutorials/index.md:\n  ai_references:\n  - '[core concepts](../concepts/index.md)'\n  - '[frequently asked questions](../faq.md)'\n  - '[practical examples](../examples/index.md)'\n  cross_links:\n  - concepts/index.md\n  - examples/index.md\n  - faq.md\n  - index.md\n  hash: 4da1d02c578cd8b59a99a83811f38f6b\n  keywords:\n  - Instructor\n  - tutorials\n  - Jupyter notebooks\n  - AI applications\n  - learning path\n  - structured extraction\n  - validation techniques\n  - running options\n  - Python environment\n  - support\n  references:\n  - concepts/index.md\n  - faq.md\n  - examples/index.md\n  summary: The Instructor Tutorials provide an interactive platform for learning how\n    to effectively use the Instructor tool through a structured learning path. Users\n    can engage in various tutorials that range from basic concepts to advanced applications,\n    building practical skills in AI and LLMs (Large Language Models) along the way.\n  topics: []\nwhy.md:\n  ai_references:\n  - '[../index.md]'\n  cross_links:\n  - index.md\n  hash: 0c27bf9a45800a453a61a41fdb9df8ac\n  keywords:\n  - '[Instructor'\n  - LLMs\n  - structured outputs\n  - JSON parsing\n  - API integration\n  - error handling\n  - user model\n  - retries\n  - provider-specific code]\n  references: []\n  summary: Instructor is an innovative tool designed to streamline the interaction\n    with LLMs by providing structured outputs without the usual complexities. It minimizes\n    issues such as JSON parsing, retries, and provider-specific code, making it an\n    ideal solution for developers needing reliable integration with various LLM providers.\n  topics:\n  - '[unstructured outputs'\n  - benefits of Instructor\n  - simplification of LLM integration\n  - error handling in LLM applications\n  - user modeling with Pydantic]\n"
  },
  {
    "path": "tests/__init__.py",
    "content": "\n"
  },
  {
    "path": "tests/conftest.py",
    "content": "from dotenv import load_dotenv\n\n# Support .env for local development\nload_dotenv()\n"
  },
  {
    "path": "tests/docs/_concept_groups.py",
    "content": "from __future__ import annotations\n\nimport glob\nimport os\nfrom collections.abc import Iterable\n\nfrom pytest_examples import find_examples\n\nCORE = {\n    \"alias.md\",\n    \"dictionary_operations.md\",\n    \"distillation.md\",\n    \"enums.md\",\n    \"fastapi.md\",\n    \"fields.md\",\n    \"index.md\",\n    \"iterable.md\",\n    \"lists.md\",\n    \"logging.md\",\n    \"maybe.md\",\n    \"models.md\",\n    \"parallel.md\",\n    \"partial.md\",\n    \"philosophy.md\",\n    \"prompting.md\",\n    \"typeadapter.md\",\n    \"typeddicts.md\",\n    \"types.md\",\n    \"union.md\",\n    \"unions.md\",\n    \"validation.md\",\n}\n\nOPERATIONS = {\n    \"caching.md\",\n    \"prompt_caching.md\",\n    \"raw_response.md\",\n    \"retrying.md\",\n    \"error_handling.md\",\n}\n\nPROVIDERS = {\n    \"from_provider.md\",\n    \"migration.md\",\n    \"mode-migration.md\",\n    \"patching.md\",\n    \"usage.md\",\n}\n\nADVANCED = {\n    \"batch.md\",\n    \"hooks.md\",\n    \"multimodal.md\",\n    \"reask_validation.md\",\n    \"semantic_validation.md\",\n    \"templating.md\",\n}\n\n\ndef concept_paths(names: Iterable[str]) -> list[str]:\n    return [os.path.join(\"docs\", \"concepts\", name) for name in names]\n\n\ndef all_concept_files() -> list[str]:\n    return sorted(glob.glob(\"docs/concepts/*.md\"))\n\n\ndef core_concept_files() -> list[str]:\n    excluded = OPERATIONS | PROVIDERS | ADVANCED\n    return [\n        path for path in all_concept_files() if os.path.basename(path) not in excluded\n    ]\n\n\ndef collect_examples(files: Iterable[str]):\n    examples = []\n    for markdown_file in files:\n        examples.extend(find_examples(markdown_file))\n    return examples\n"
  },
  {
    "path": "tests/docs/_example_groups.py",
    "content": "from __future__ import annotations\n\nimport glob\nimport os\nfrom collections.abc import Iterable\n\nfrom pytest_examples import find_examples\n\nEXCLUDED = {\n    \"ollama.md\",\n    \"watsonx.md\",\n    \"local_classification.md\",\n}\n\nBATCH = {\n    \"batch_classification_langsmith.md\",\n    \"batch_in_memory.md\",\n    \"batch_job_oai.md\",\n}\n\nMULTIMODAL = {\n    \"audio_extraction.md\",\n    \"extract_slides.md\",\n    \"extracting_receipts.md\",\n    \"image_to_ad_copy.md\",\n    \"multi_modal_gemini.md\",\n    \"tables_from_vision.md\",\n    \"youtube_clips.md\",\n}\n\nPROVIDERS = {\n    \"groq.md\",\n    \"mistral.md\",\n    \"open_source.md\",\n}\n\nINTEGRATIONS = {\n    \"search.md\",\n    \"tracing_with_langfuse.md\",\n}\n\n\ndef example_paths(names: Iterable[str]) -> list[str]:\n    return [os.path.join(\"docs\", \"examples\", name) for name in names]\n\n\ndef all_example_files() -> list[str]:\n    return sorted(glob.glob(\"docs/examples/*.md\"))\n\n\ndef core_example_files() -> list[str]:\n    excluded = EXCLUDED | BATCH | MULTIMODAL | PROVIDERS | INTEGRATIONS\n    return [\n        path for path in all_example_files() if os.path.basename(path) not in excluded\n    ]\n\n\ndef collect_examples(files: Iterable[str]):\n    examples = []\n    for markdown_file in files:\n        examples.extend(find_examples(markdown_file))\n    return examples\n"
  },
  {
    "path": "tests/docs/conftest.py",
    "content": "from __future__ import annotations\n\nfrom pathlib import Path\n\nimport pytest\nfrom pytest_examples import CodeExample, EvalExample\n\n\ndef pytest_addoption(parser: pytest.Parser) -> None:\n    group = parser.getgroup(\"docs\")\n    group.addoption(\n        \"--run-doc-examples\",\n        action=\"store_true\",\n        help=\"Execute doc code examples (requires network access and API keys).\",\n    )\n\n\n@pytest.fixture(name=\"eval_example\")\ndef eval_example(\n    tmp_path: Path,\n    request: pytest.FixtureRequest,\n    _examples_to_update: list[CodeExample],\n):\n    eval_ex = EvalExample(tmp_path=tmp_path, pytest_request=request)\n    run_live = bool(\n        request.config.getoption(\"run_doc_examples\")\n        or request.config.getoption(\"update_examples\")\n    )\n    if not run_live:\n\n        def _skip_run(_example: CodeExample) -> None:\n            return None\n\n        eval_ex.run = _skip_run  # type: ignore[assignment]\n        eval_ex.run_print_update = _skip_run  # type: ignore[assignment]\n\n    yield eval_ex\n\n    if request.config.getoption(\"update_examples\"):\n        _examples_to_update.extend(eval_ex.to_update)\n"
  },
  {
    "path": "tests/docs/test_concepts.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._concept_groups import collect_examples, core_concept_files\n\ncode_examples = collect_examples(core_concept_files())\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_format_concepts_core(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n        eval_example.run(example)\n"
  },
  {
    "path": "tests/docs/test_concepts_advanced.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._concept_groups import ADVANCED, collect_examples, concept_paths\n\ncode_examples = collect_examples(concept_paths(ADVANCED))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_format_concepts_advanced(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_concepts_operations.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._concept_groups import OPERATIONS, collect_examples, concept_paths\n\ncode_examples = collect_examples(concept_paths(OPERATIONS))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_format_concepts_operations(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_concepts_providers.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._concept_groups import PROVIDERS, collect_examples, concept_paths\n\ncode_examples = collect_examples(concept_paths(PROVIDERS))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_format_concepts_providers(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_docs.py",
    "content": "import pytest\nfrom pytest_examples import find_examples, CodeExample, EvalExample\n\n\n@pytest.mark.parametrize(\"example\", find_examples(\"README.md\"), ids=str)\ndef test_readme(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n    else:\n        eval_example.lint(example)\n\n\n@pytest.mark.parametrize(\"example\", find_examples(\"docs/index.md\"), ids=str)\ndef test_index(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_examples.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\nfrom tests.docs._example_groups import collect_examples, core_example_files\n\ncode_examples = collect_examples(core_example_files())\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_index(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_examples_batch.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._example_groups import BATCH, collect_examples, example_paths\n\ncode_examples = collect_examples(example_paths(BATCH))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_examples_batch(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_examples_integrations.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._example_groups import INTEGRATIONS, collect_examples, example_paths\n\ncode_examples = collect_examples(example_paths(INTEGRATIONS))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_examples_integrations(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_examples_multimodal.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._example_groups import MULTIMODAL, collect_examples, example_paths\n\ncode_examples = collect_examples(example_paths(MULTIMODAL))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_examples_multimodal(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_examples_providers.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\nfrom tests.docs._example_groups import PROVIDERS, collect_examples, example_paths\n\ncode_examples = collect_examples(example_paths(PROVIDERS))\n\n\n@pytest.mark.parametrize(\"example\", code_examples, ids=str)\ndef test_examples_providers(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_hub.py",
    "content": "import pytest\nfrom pytest_examples import CodeExample, EvalExample\n\n\n@pytest.mark.skip(reason=\"Hub functionality is being removed\")\ndef test_format_blog(example: CodeExample, eval_example: EvalExample) -> None:\n    \"\"\"This test is being skipped as the hub functionality is being removed.\"\"\"\n    excluded_sources: list[str] = [\n        \"mistral\",\n        \"ollama\",\n        \"llama_cpp\",\n        \"groq\",\n        \"youtube\",\n        \"contact\",\n        \"langsmith\",\n    ]  # sources that are not supported in testing\n    if any(source in example.source for source in excluded_sources):\n        return\n\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n        eval_example.run(example)\n"
  },
  {
    "path": "tests/docs/test_mkdocs.py",
    "content": "import pathlib\nimport pytest\nimport importlib\nfrom typing import Any, cast\n\n\n# Note the use of `str`, makes for pretty output\n@pytest.mark.parametrize(\n    \"fpath\", pathlib.Path(\"docs/examples\").glob(\"**/*.md\"), ids=str\n)\n@pytest.mark.skip(reason=\"This test is not yet implemented\")\ndef test_files_good(fpath):\n    mktestdocs = cast(Any, importlib.import_module(\"mktestdocs\"))\n    check_md_file = mktestdocs.check_md_file\n\n    check_md_file(fpath=fpath, memory=True)\n"
  },
  {
    "path": "tests/docs/test_posts.py",
    "content": "import pytest\nfrom pytest_examples import find_examples, CodeExample, EvalExample\n\n\n@pytest.mark.parametrize(\"example\", find_examples(\"docs/blog/posts\"), ids=str)\ndef test_index(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n"
  },
  {
    "path": "tests/docs/test_prompt_tips.py",
    "content": "import pytest\nfrom pytest_examples import find_examples, CodeExample, EvalExample\n\n\n@pytest.mark.parametrize(\"example\", find_examples(\"docs/prompting\"), ids=str)\n@pytest.mark.skip(reason=\"Skipping this for now\")\ndef test_format_concepts(example: CodeExample, eval_example: EvalExample):\n    if eval_example.update_examples:\n        eval_example.format(example)\n        # eval_example.run_print_update(example)\n    else:\n        eval_example.lint(example)\n        # eval_example.run(example)\n"
  },
  {
    "path": "tests/dsl/test_gemini_tools_async_streaming.py",
    "content": "\"\"\"Regression test for async streaming with Mode.GEMINI_TOOLS.\n\nThe sync paths in PartialBase.from_streaming_response and\nIterableBase.from_streaming_response apply extract_json_from_stream\nfor both Mode.MD_JSON and Mode.GEMINI_TOOLS, but the async paths\nwere only applying it for Mode.MD_JSON.\n\"\"\"\n\nimport pytest\n\nfrom instructor.mode import Mode\nfrom instructor.utils.core import (\n    extract_json_from_stream,\n    extract_json_from_stream_async,\n)\n\n\ndef test_sync_extract_json_from_stream_handles_codeblock():\n    chunks = [\"```json\\n\", '{\"name\": \"Alice\",', ' \"age\": 30}', \"\\n```\"]\n    result = \"\".join(extract_json_from_stream(iter(chunks)))\n    assert result == '{\"name\": \"Alice\", \"age\": 30}'\n\n\n@pytest.mark.asyncio\nasync def test_async_extract_json_from_stream_handles_codeblock():\n    chunks = [\"```json\\n\", '{\"name\": \"Alice\",', ' \"age\": 30}', \"\\n```\"]\n\n    async def async_chunks():\n        for c in chunks:\n            yield c\n\n    result = \"\".join([c async for c in extract_json_from_stream_async(async_chunks())])\n    assert result == '{\"name\": \"Alice\", \"age\": 30}'\n\n\ndef test_sync_gemini_tools_mode_triggers_json_extraction():\n    \"\"\"Verify that GEMINI_TOOLS is in the set that triggers extract_json_from_stream\n    in the sync from_streaming_response path.\"\"\"\n    # This tests the condition that was already correct in the sync path\n    assert Mode.GEMINI_TOOLS in {Mode.MD_JSON, Mode.GEMINI_TOOLS}\n\n\ndef test_async_gemini_tools_mode_triggers_json_extraction():\n    \"\"\"Verify the fix: GEMINI_TOOLS must be in the set that triggers\n    extract_json_from_stream_async in the async from_streaming_response_async path.\n\n    Before the fix, the async path only checked `mode == Mode.MD_JSON`,\n    so GEMINI_TOOLS streaming would skip JSON extraction from code blocks.\n    \"\"\"\n    # After the fix, both sync and async paths use the same set\n    mode = Mode.GEMINI_TOOLS\n    # This is the condition in the fixed async path\n    assert mode in {Mode.MD_JSON, Mode.GEMINI_TOOLS}\n"
  },
  {
    "path": "tests/dsl/test_partial.py",
    "content": "# type: ignore[all]\nfrom copy import deepcopy\nfrom enum import Enum\nfrom typing import Literal, Optional, Union\n\nimport pytest\nfrom jiter import from_json\nfrom pydantic import BaseModel, Field, ValidationError\n\nimport instructor\nfrom instructor.dsl.partial import Partial, PartialLiteralMixin, _make_field_optional\nimport os\nfrom openai import OpenAI, AsyncOpenAI\n\nmodels = [\"gpt-4o-mini\"]\nmodes = [\n    instructor.Mode.TOOLS,\n]\n\n\nclass SampleNestedPartial(BaseModel):\n    b: int\n\n\nclass SamplePartial(BaseModel):\n    a: int\n    b: SampleNestedPartial\n\n\nclass NestedA(BaseModel):\n    a: str\n    b: Optional[str]\n\n\nclass NestedB(BaseModel):\n    c: str\n    d: str\n    e: list[Union[str, int]]\n    f: str\n\n\nclass UnionWithNested(BaseModel):\n    a: list[Union[NestedA, NestedB]]\n    b: list[NestedA]\n    c: NestedB\n\n\ndef test_partial():\n    partial = Partial[SamplePartial]\n    assert partial.model_json_schema() == {\n        \"$defs\": {\n            \"PartialSampleNestedPartial\": {\n                \"properties\": {\"b\": {\"title\": \"B\", \"type\": \"integer\"}},\n                \"required\": [\"b\"],\n                \"title\": \"PartialSampleNestedPartial\",\n                \"type\": \"object\",\n            }\n        },\n        \"properties\": {\n            \"a\": {\"title\": \"A\", \"type\": \"integer\"},\n            \"b\": {\"$ref\": \"#/$defs/PartialSampleNestedPartial\"},\n        },\n        \"required\": [\"a\", \"b\"],\n        \"title\": \"PartialSamplePartial\",\n        \"type\": \"object\",\n    }, \"Wrapped model JSON schema has changed\"\n    assert partial.get_partial_model().model_json_schema() == {\n        \"$defs\": {\n            \"PartialSampleNestedPartial\": {\n                \"properties\": {\n                    \"b\": {\n                        \"anyOf\": [{\"type\": \"integer\"}, {\"type\": \"null\"}],\n                        \"default\": None,\n                        \"title\": \"B\",\n                    }\n                },\n                \"title\": \"PartialSampleNestedPartial\",\n                \"type\": \"object\",\n            }\n        },\n        \"properties\": {\n            \"a\": {\n                \"anyOf\": [{\"type\": \"integer\"}, {\"type\": \"null\"}],\n                \"default\": None,\n                \"title\": \"A\",\n            },\n            \"b\": {\n                \"anyOf\": [\n                    {\"$ref\": \"#/$defs/PartialSampleNestedPartial\"},\n                    {\"type\": \"null\"},\n                ],\n                \"default\": {},\n            },\n        },\n        \"title\": \"PartialSamplePartial\",\n        \"type\": \"object\",\n    }, \"Partial model JSON schema has changed\"\n\n\npartial_chunks = [\"\\n\", \"\\t\", \" \", \"\\x00\", '{\"a\": 42, \"b\": {\"b\": 1}}']\nexpected_sync_models = [\n    # First model has default values (nested models show their fields as None)\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": None, \"b\": {\"b\": None}},\n    # Last model has all fields populated from JSON\n    {\"a\": 42, \"b\": {\"b\": 1}},\n]\nexpected_async_models = [\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": None, \"b\": {\"b\": None}},\n    {\"a\": 42, \"b\": {\"b\": 1}},\n]\n\n\ndef test_partial_with_whitespace():\n    partial = Partial[SamplePartial]\n    # Get the actual models from chunks - must provide complete data for final validation\n    models = list(partial.model_from_chunks(partial_chunks))\n    assert len(models) == len(expected_sync_models)\n    for i, model in enumerate(models):\n        assert model.model_dump() == expected_sync_models[i]\n\n\n@pytest.mark.asyncio\nasync def test_async_partial_with_whitespace():\n    partial = Partial[SamplePartial]\n\n    # Handle any leading whitespace from the model - must provide complete data for final validation\n    async def async_generator():\n        for chunk in partial_chunks:\n            yield chunk\n\n    i = 0\n    async for model in partial.model_from_chunks_async(async_generator()):\n        # Expected behavior: When whitespace chunks are processed, we should always get a model\n        assert model.model_dump() == expected_async_models[i]\n        i += 1\n    assert i == len(expected_async_models)\n\n\n@pytest.mark.skipif(not os.getenv(\"OPENAI_API_KEY\"), reason=\"OPENAI_API_KEY not set\")\ndef test_summary_extraction():\n    class Summary(BaseModel):\n        summary: str = Field(description=\"A detailed summary\")\n\n    client = OpenAI()\n    client = instructor.from_openai(client, mode=instructor.Mode.TOOLS)\n    extraction_stream = client.chat.completions.create_partial(\n        model=\"gpt-4o\",\n        response_model=Summary,\n        messages=[\n            {\"role\": \"system\", \"content\": \"You summarize text\"},\n            {\"role\": \"user\", \"content\": \"Summarize: Mary had a little lamb\"},\n        ],\n        stream=True,\n    )\n\n    # Collect all streaming updates and verify final result\n    final_summary = None\n    chunk_count = 0\n    for extraction in extraction_stream:\n        final_summary = extraction.summary\n        chunk_count += 1\n\n    # Verify we got streaming updates and a valid final summary\n    assert chunk_count > 0\n    assert final_summary is not None\n    assert len(final_summary) > 0\n\n\n@pytest.mark.skipif(not os.getenv(\"OPENAI_API_KEY\"), reason=\"OPENAI_API_KEY not set\")\n@pytest.mark.asyncio\nasync def test_summary_extraction_async():\n    class Summary(BaseModel):\n        summary: str = Field(description=\"A detailed summary\")\n\n    client = AsyncOpenAI()\n    client = instructor.from_openai(client, mode=instructor.Mode.TOOLS)\n    extraction_stream = client.chat.completions.create_partial(\n        model=\"gpt-4o\",\n        response_model=Summary,\n        messages=[\n            {\"role\": \"system\", \"content\": \"You summarize text\"},\n            {\"role\": \"user\", \"content\": \"Summarize: Mary had a little lamb\"},\n        ],\n        stream=True,\n    )\n\n    # Collect all streaming updates and verify final result\n    final_summary = None\n    chunk_count = 0\n    async for extraction in extraction_stream:\n        final_summary = extraction.summary\n        chunk_count += 1\n\n    # Verify we got streaming updates and a valid final summary\n    assert chunk_count > 0\n    assert final_summary is not None\n    assert len(final_summary) > 0\n\n\ndef test_union_with_nested():\n    partial = Partial[UnionWithNested]\n    partial.get_partial_model().model_validate_json(\n        '{\"a\": [{\"b\": \"b\"}, {\"d\": \"d\"}], \"b\": [{\"b\": \"b\"}], \"c\": {\"d\": \"d\"}, \"e\": [1, \"a\"]}'\n    )\n\n\ndef test_partial_with_default_factory():\n    \"\"\"Test that Partial works with fields that have default_factory.\n\n    This test ensures that when making fields optional, the default_factory\n    is properly cleared to avoid Pydantic validation errors about having\n    both default and default_factory set.\n    \"\"\"\n\n    class ModelWithDefaultFactory(BaseModel):\n        items: list[str] = Field(default_factory=list)\n        tags: dict[str, str] = Field(default_factory=dict)\n        name: str\n\n    # This should not raise a validation error about both default and default_factory\n    partial = Partial[ModelWithDefaultFactory]\n    partial_model = partial.get_partial_model()\n\n    # Verify we can instantiate and validate\n    # In Partial models, all fields are made Optional with default=None\n    instance = partial_model()\n    assert instance.items is None\n    assert instance.tags is None\n    assert instance.name is None\n\n    # Test with partial data\n    instance2 = partial_model.model_validate({\"items\": [\"a\", \"b\"]})\n    assert instance2.items == [\"a\", \"b\"]\n    assert instance2.tags is None\n    assert instance2.name is None\n\n\nclass TestMakeFieldOptionalWorksWithPydanticV2:\n    \"\"\"Tests proving that _make_field_optional with deepcopy works correctly in Pydantic v2.\n\n    These tests refute the claim that deepcopy + setting default = None doesn't work\n    in Pydantic v2. The implementation is correct and fields are properly made optional.\n\n    See: https://github.com/instructor-ai/instructor/issues/XXXX\n    \"\"\"\n\n    def test_deepcopy_approach_makes_field_optional(self):\n        \"\"\"Verify that deepcopy + default = None makes fields optional in Pydantic v2.\"\"\"\n\n        class Original(BaseModel):\n            name: str  # Required field\n\n        field = Original.model_fields[\"name\"]\n        assert field.is_required() is True, \"Original field should be required\"\n\n        # This is what _make_field_optional does\n        tmp = deepcopy(field)\n        tmp.default = None\n        tmp.annotation = Optional[str]\n\n        assert tmp.is_required() is False, \"Modified field should not be required\"\n        assert tmp.default is None, \"Default should be None\"\n\n    def test_make_field_optional_function_works(self):\n        \"\"\"Verify _make_field_optional correctly transforms required fields.\"\"\"\n\n        class TestModel(BaseModel):\n            name: str\n            age: int\n\n        for field_name, field_info in TestModel.model_fields.items():\n            assert field_info.is_required() is True, f\"{field_name} should be required\"\n\n            annotation, new_field = _make_field_optional(field_info)\n            assert new_field.is_required() is False, (\n                f\"{field_name} should be optional after transformation\"\n            )\n            assert new_field.default is None, f\"{field_name} should have None default\"\n\n    def test_partial_model_validates_empty_dict(self):\n        \"\"\"Verify Partial models can validate empty dicts (all fields None).\"\"\"\n\n        class MyModel(BaseModel):\n            name: str\n            age: int\n            status: str\n\n        PartialModel = Partial[MyModel]\n        TruePartial = PartialModel.get_partial_model()\n\n        # This should NOT raise ValidationError\n        result = TruePartial.model_validate({})\n\n        assert result.name is None\n        assert result.age is None\n        assert result.status is None\n\n    def test_partial_validates_incremental_streaming_data(self):\n        \"\"\"Verify Partial models correctly handle incremental streaming data.\"\"\"\n\n        class MyModel(BaseModel):\n            name: str\n            age: int\n\n        PartialModel = Partial[MyModel]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Simulate streaming JSON chunks\n        streaming_states = [\n            (\"{}\", None, None),\n            ('{\"name\": \"Jo', \"Jo\", None),  # Partial string\n            ('{\"name\": \"John\"}', \"John\", None),\n            ('{\"name\": \"John\", \"age\": 25}', \"John\", 25),\n        ]\n\n        for json_str, expected_name, expected_age in streaming_states:\n            obj = from_json(json_str.encode(), partial_mode=\"trailing-strings\")\n            result = TruePartial.model_validate(obj)\n            assert result.name == expected_name, f\"Failed for {json_str}\"\n            assert result.age == expected_age, f\"Failed for {json_str}\"\n\n    def test_partial_with_all_field_types(self):\n        \"\"\"Verify _make_field_optional works with various field types.\"\"\"\n\n        class ComplexModel(BaseModel):\n            string_field: str\n            int_field: int\n            float_field: float\n            bool_field: bool\n            list_field: list[str]\n            optional_field: Optional[str]\n\n        PartialModel = Partial[ComplexModel]\n        TruePartial = PartialModel.get_partial_model()\n\n        # All fields should validate with empty dict\n        result = TruePartial.model_validate({})\n\n        assert result.string_field is None\n        assert result.int_field is None\n        assert result.float_field is None\n        assert result.bool_field is None\n        assert result.list_field is None\n        assert result.optional_field is None\n\n\nclass TestLiteralTypeStreaming:\n    \"\"\"Tests for Literal type handling during streaming.\n\n    Without PartialLiteralMixin: uses partial_mode='trailing-strings', which keeps\n    incomplete strings and causes validation errors for Literal/Enum fields.\n\n    With PartialLiteralMixin: uses partial_mode='on', which drops incomplete strings\n    so fields become None.\n    \"\"\"\n\n    def test_literal_without_mixin_fails_on_incomplete_string(self):\n        \"\"\"Without PartialLiteralMixin, incomplete Literal strings cause validation errors.\"\"\"\n\n        class ModelWithLiteral(BaseModel):\n            status: Literal[\"active\", \"inactive\"]\n\n        PartialModel = Partial[ModelWithLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        # With partial_mode=\"trailing-strings\", incomplete strings are kept\n        partial_json = b'{\"status\": \"act'\n        obj = from_json(partial_json, partial_mode=\"trailing-strings\")\n        # obj is {\"status\": \"act\"} - a partial string that fails Literal validation\n\n        with pytest.raises(ValidationError):\n            TruePartial.model_validate(obj)\n\n    def test_literal_with_mixin_incomplete_string_becomes_none(self):\n        \"\"\"With PartialLiteralMixin, incomplete Literal strings are dropped.\"\"\"\n\n        class ModelWithLiteral(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n\n        PartialModel = Partial[ModelWithLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        # With partial_mode=\"on\" (enabled by PartialLiteralMixin), incomplete strings are dropped\n        partial_json = b'{\"status\": \"act'\n        obj = from_json(partial_json, partial_mode=\"on\")\n        # obj is {} because the incomplete string was dropped\n\n        result = TruePartial.model_validate(obj)\n        assert result.status is None\n\n    def test_literal_accepts_valid_complete_value(self):\n        \"\"\"Literal fields should accept valid complete values.\"\"\"\n\n        class ModelWithLiteral(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n\n        PartialModel = Partial[ModelWithLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        result = TruePartial.model_validate({\"status\": \"active\"})\n        assert result.status == \"active\"\n\n        result = TruePartial.model_validate({\"status\": \"inactive\"})\n        assert result.status == \"inactive\"\n\n    def test_literal_with_missing_field_is_none(self):\n        \"\"\"Literal fields should be None when not present in data.\"\"\"\n\n        class ModelWithLiteral(BaseModel, PartialLiteralMixin):\n            name: str\n            status: Literal[\"active\", \"inactive\"]\n\n        PartialModel = Partial[ModelWithLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        result = TruePartial.model_validate({\"name\": \"John\"})\n        assert result.name == \"John\"\n        assert result.status is None\n\n    def test_literal_rejects_complete_invalid_value(self):\n        \"\"\"Complete but invalid Literal values should fail validation.\"\"\"\n\n        class ModelWithLiteral(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n\n        PartialModel = Partial[ModelWithLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        # \"xyz\" is a complete string but not a valid Literal value\n        with pytest.raises(ValidationError):\n            TruePartial.model_validate({\"status\": \"xyz\"})\n\n\nclass TestPartialStreamingWithComplexTypes:\n    \"\"\"Tests for streaming with complex Pydantic types using PartialLiteralMixin.\n\n    With PartialLiteralMixin, partial_mode='on' is used, so incomplete values are dropped.\n    \"\"\"\n\n    def test_enum_incomplete_string_becomes_none(self):\n        \"\"\"With PartialLiteralMixin, incomplete Enum strings are dropped.\"\"\"\n\n        class Status(Enum):\n            ACTIVE = \"active\"\n            INACTIVE = \"inactive\"\n\n        class ModelWithEnum(BaseModel, PartialLiteralMixin):\n            status: Status\n\n        PartialModel = Partial[ModelWithEnum]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Incomplete string is dropped with partial_mode=\"on\"\n        obj = from_json(b'{\"status\": \"act', partial_mode=\"on\")\n        result = TruePartial.model_validate(obj)\n        assert result.status is None\n\n    def test_enum_accepts_valid_complete_value(self):\n        \"\"\"Enum fields should accept valid complete values.\"\"\"\n\n        class Status(Enum):\n            ACTIVE = \"active\"\n            INACTIVE = \"inactive\"\n\n        class ModelWithEnum(BaseModel, PartialLiteralMixin):\n            status: Status\n\n        PartialModel = Partial[ModelWithEnum]\n        TruePartial = PartialModel.get_partial_model()\n\n        result = TruePartial.model_validate({\"status\": \"active\"})\n        assert result.status == Status.ACTIVE\n\n    def test_optional_literal_incomplete_string_becomes_none(self):\n        \"\"\"With PartialLiteralMixin, incomplete Optional[Literal] strings are dropped.\"\"\"\n\n        class ModelWithOptionalLiteral(BaseModel, PartialLiteralMixin):\n            status: Optional[Literal[\"on\", \"off\"]] = None\n\n        PartialModel = Partial[ModelWithOptionalLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        obj = from_json(b'{\"status\": \"o', partial_mode=\"on\")\n        result = TruePartial.model_validate(obj)\n        assert result.status is None\n\n    def test_optional_literal_accepts_valid_value(self):\n        \"\"\"Optional[Literal] should accept valid complete values.\"\"\"\n\n        class ModelWithOptionalLiteral(BaseModel, PartialLiteralMixin):\n            status: Optional[Literal[\"on\", \"off\"]] = None\n\n        PartialModel = Partial[ModelWithOptionalLiteral]\n        TruePartial = PartialModel.get_partial_model()\n\n        result = TruePartial.model_validate({\"status\": \"on\"})\n        assert result.status == \"on\"\n\n    def test_union_literal_incomplete_string_becomes_none(self):\n        \"\"\"With PartialLiteralMixin, incomplete Union[Literal, int] strings are dropped.\"\"\"\n\n        class ModelWithUnion(BaseModel, PartialLiteralMixin):\n            value: Union[Literal[\"yes\", \"no\"], int]\n\n        PartialModel = Partial[ModelWithUnion]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Incomplete string is dropped\n        obj = from_json(b'{\"value\": \"ye', partial_mode=\"on\")\n        result = TruePartial.model_validate(obj)\n        assert result.value is None\n\n    def test_union_literal_accepts_valid_values(self):\n        \"\"\"Union[Literal, int] should accept both valid Literal and int.\"\"\"\n\n        class ModelWithUnion(BaseModel, PartialLiteralMixin):\n            value: Union[Literal[\"yes\", \"no\"], int]\n\n        PartialModel = Partial[ModelWithUnion]\n        TruePartial = PartialModel.get_partial_model()\n\n        result = TruePartial.model_validate({\"value\": \"yes\"})\n        assert result.value == \"yes\"\n\n        result = TruePartial.model_validate({\"value\": 42})\n        assert result.value == 42\n\n    def test_union_of_literals_matches_all_branches(self):\n        \"\"\"Union[Literal, Literal] should match values from all branches.\"\"\"\n\n        class ModelWithUnionLiterals(BaseModel, PartialLiteralMixin):\n            value: Union[Literal[\"a\", \"b\"], Literal[\"x\", \"y\"]]\n\n        PartialModel = Partial[ModelWithUnionLiterals]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Both branches should work\n        assert TruePartial.model_validate({\"value\": \"a\"}).value == \"a\"\n        assert TruePartial.model_validate({\"value\": \"b\"}).value == \"b\"\n        assert TruePartial.model_validate({\"value\": \"x\"}).value == \"x\"\n        assert TruePartial.model_validate({\"value\": \"y\"}).value == \"y\"\n\n    def test_list_literal_incomplete_item_dropped(self):\n        \"\"\"With PartialLiteralMixin, incomplete list items are dropped.\"\"\"\n\n        class ModelWithLiteralList(BaseModel, PartialLiteralMixin):\n            tags: list[Literal[\"admin\", \"user\", \"guest\"]]\n\n        PartialModel = Partial[ModelWithLiteralList]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Incomplete list item is dropped\n        obj = from_json(b'{\"tags\": [\"admin\", \"us', partial_mode=\"on\")\n        result = TruePartial.model_validate(obj)\n        assert result.tags == [\"admin\"]\n\n    def test_list_literal_accepts_valid_items(self):\n        \"\"\"list[Literal] should accept valid complete items.\"\"\"\n\n        class ModelWithLiteralList(BaseModel, PartialLiteralMixin):\n            tags: list[Literal[\"admin\", \"user\", \"guest\"]]\n\n        PartialModel = Partial[ModelWithLiteralList]\n        TruePartial = PartialModel.get_partial_model()\n\n        result = TruePartial.model_validate({\"tags\": [\"admin\", \"user\"]})\n        assert result.tags == [\"admin\", \"user\"]\n\n\nclass TestDiscriminatedUnionPartial:\n    \"\"\"Tests for discriminated unions with Partial streaming.\n\n    KNOWN LIMITATION: Discriminated unions don't work with Partial because:\n    - Partial makes all fields Optional\n    - Pydantic requires discriminator fields to be strictly Literal, not Optional[Literal]\n\n    Workaround: Use Union without the discriminator parameter.\n    \"\"\"\n\n    def test_discriminated_union_not_compatible_with_partial(self):\n        \"\"\"Discriminated unions fail with Partial (known limitation).\"\"\"\n\n        class Cat(BaseModel):\n            pet_type: Literal[\"cat\"]\n            meows: int\n\n        class Dog(BaseModel):\n            pet_type: Literal[\"dog\"]\n            barks: int\n\n        class PetContainer(BaseModel):\n            pet: Union[Cat, Dog] = Field(discriminator=\"pet_type\")\n\n        # Fails because Partial makes pet_type Optional, but discriminators must be Literal\n        from pydantic import PydanticUserError\n\n        PartialModel = Partial[PetContainer]\n        with pytest.raises(PydanticUserError):\n            PartialModel.get_partial_model()\n\n    def test_union_without_discriminator_works(self):\n        \"\"\"Union without discriminator works with Partial streaming.\"\"\"\n\n        class Cat(BaseModel):\n            pet_type: Literal[\"cat\"]\n            meows: int\n\n        class Dog(BaseModel):\n            pet_type: Literal[\"dog\"]\n            barks: int\n\n        class PetContainerNoDiscriminator(BaseModel):\n            pet: Union[Cat, Dog]  # No discriminator - works with Partial\n\n        PartialModel = Partial[PetContainerNoDiscriminator]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Complete value works\n        result = TruePartial.model_validate({\"pet\": {\"pet_type\": \"cat\", \"meows\": 5}})\n        assert result.pet is not None\n        assert result.pet.pet_type == \"cat\"\n\n    def test_single_value_literal_incomplete_string(self):\n        \"\"\"Single-value Literals with incomplete strings become None.\"\"\"\n\n        class Cat(BaseModel):\n            pet_type: Literal[\"cat\"]\n\n        PartialModel = Partial[Cat]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Incomplete string is dropped\n        obj = from_json(b'{\"pet_type\": \"ca', partial_mode=\"on\")\n        result = TruePartial.model_validate(obj)\n        assert result.pet_type is None\n\n        # Complete value works\n        result = TruePartial.model_validate({\"pet_type\": \"cat\"})\n        assert result.pet_type == \"cat\"\n\n\nclass TestModelValidatorsDuringStreaming:\n    \"\"\"Tests for model validators during partial streaming.\n\n    Model validators are automatically wrapped to skip during streaming\n    (when context={\"partial_streaming\": True} is passed) and only run\n    when validating without that context (final validation).\n    \"\"\"\n\n    def test_model_validator_skipped_during_streaming(self):\n        \"\"\"Model validators should be skipped when streaming context is passed.\"\"\"\n        from pydantic import model_validator\n\n        class ModelWithValidator(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n            priority: Literal[\"high\", \"low\"]\n\n            @model_validator(mode=\"after\")\n            def validate_relationships(self):\n                # This would fail during streaming without wrapping\n                if self.status is not None and self.priority is None:\n                    raise ValueError(\"If status is set, priority must also be set!\")\n                return self\n\n        PartialModel = Partial[ModelWithValidator]\n\n        # With completeness-based validation, incomplete JSON skips all validation\n        # by using model_construct() instead of model_validate()\n        chunks = ['{\"status\": \"act']  # Incomplete JSON\n        results = list(PartialModel.model_from_chunks(chunks))\n        # Incomplete JSON - no validation runs, partial value stored\n        assert results[0].status == \"act\"\n        assert results[0].priority is None\n\n    def test_model_validator_runs_when_complete(self):\n        \"\"\"Model validators should run when all fields are complete.\"\"\"\n        from pydantic import model_validator\n\n        class ModelWithValidator(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n            priority: Literal[\"high\", \"low\"]\n\n            @model_validator(mode=\"after\")\n            def validate_relationships(self):\n                if self.status == \"active\" and self.priority == \"low\":\n                    raise ValueError(\"Active status requires high priority!\")\n                return self\n\n        PartialModel = Partial[ModelWithValidator]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Valid complete data\n        result = TruePartial.model_validate({\"status\": \"active\", \"priority\": \"high\"})\n        assert result.status == \"active\"\n        assert result.priority == \"high\"\n\n        # Invalid complete data should fail\n        with pytest.raises(ValidationError):\n            TruePartial.model_validate({\"status\": \"active\", \"priority\": \"low\"})\n\n    def test_multiple_model_validators(self):\n        \"\"\"Multiple model validators should all be wrapped and run when complete.\"\"\"\n        from pydantic import model_validator\n\n        validator_calls = []\n\n        class ModelWithMultipleValidators(BaseModel, PartialLiteralMixin):\n            a: Literal[\"x\", \"y\"]\n            b: Literal[\"1\", \"2\"]\n\n            @model_validator(mode=\"after\")\n            def validator_one(self):\n                validator_calls.append(\"one\")\n                return self\n\n            @model_validator(mode=\"after\")\n            def validator_two(self):\n                validator_calls.append(\"two\")\n                return self\n\n        PartialModel = Partial[ModelWithMultipleValidators]\n\n        # During streaming with incomplete JSON, validators should be skipped\n        # because model_construct() is used instead of model_validate()\n        validator_calls.clear()\n        chunks = ['{\"a\": \"x']  # Incomplete JSON\n        list(PartialModel.model_from_chunks(chunks))\n        assert validator_calls == []\n\n        # Complete JSON - validators run during model_validate\n        validator_calls.clear()\n        chunks = ['{\"a\": \"x\", \"b\": \"1\"}']  # Complete JSON\n        list(PartialModel.model_from_chunks(chunks))\n        assert \"one\" in validator_calls\n        assert \"two\" in validator_calls\n\n    def test_validators_run_without_streaming_context(self):\n        \"\"\"Validators should run when no streaming context is passed (final validation).\"\"\"\n        from pydantic import model_validator\n\n        class ModelWithValidator(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n            priority: Literal[\"high\", \"low\"]\n\n            @model_validator(mode=\"after\")\n            def validate_relationships(self):\n                if self.status == \"active\" and self.priority == \"low\":\n                    raise ValueError(\"Active requires high priority!\")\n                return self\n\n        PartialModel = Partial[ModelWithValidator]\n        TruePartial = PartialModel.get_partial_model()\n\n        # Without streaming context, validators run even with incomplete data\n        # This is the final validation scenario\n        with pytest.raises(ValidationError):\n            TruePartial.model_validate({\"status\": \"active\", \"priority\": \"low\"})\n\n        # Valid complete data passes\n        result = TruePartial.model_validate({\"status\": \"active\", \"priority\": \"high\"})\n        assert result.status == \"active\"\n        assert result.priority == \"high\"\n\n\nclass TestFinalValidationAfterStreaming:\n    \"\"\"Tests for final validation after streaming completes.\n\n    When streaming ends, the final object is validated against the original\n    model to enforce required fields and run validators without streaming context.\n    \"\"\"\n\n    def test_final_validation_catches_missing_required_fields(self):\n        \"\"\"Final validation should fail if required fields are missing.\"\"\"\n\n        class ModelWithRequired(BaseModel):\n            name: str  # Required\n            age: int  # Required\n            nickname: Optional[str] = None  # Optional\n\n        PartialModel = Partial[ModelWithRequired]\n\n        # Simulate streaming that doesn't provide all required fields\n        chunks = ['{\"name\": \"John\"}']  # Missing 'age'\n\n        with pytest.raises(ValidationError) as exc_info:\n            list(PartialModel.model_from_chunks(iter(chunks)))\n\n        # Should fail because 'age' is required but missing\n        assert \"age\" in str(exc_info.value)\n\n    def test_final_validation_passes_with_all_required_fields(self):\n        \"\"\"Final validation should pass when all required fields are present.\"\"\"\n\n        class ModelWithRequired(BaseModel):\n            name: str\n            age: int\n\n        PartialModel = Partial[ModelWithRequired]\n\n        # Simulate streaming that provides all required fields\n        chunks = ['{\"name\": \"John\", \"age\": 30}']\n\n        results = list(PartialModel.model_from_chunks(iter(chunks)))\n        assert len(results) > 0\n        final = results[-1]\n        assert final.name == \"John\"\n        assert final.age == 30\n\n    def test_final_validation_runs_model_validators(self):\n        \"\"\"Final validation should run model validators without streaming context.\"\"\"\n        from pydantic import model_validator\n\n        class ModelWithValidator(BaseModel, PartialLiteralMixin):\n            status: Literal[\"active\", \"inactive\"]\n            priority: Literal[\"high\", \"low\"]\n\n            @model_validator(mode=\"after\")\n            def check_consistency(self):\n                if self.status == \"active\" and self.priority == \"low\":\n                    raise ValueError(\"Active tasks must have high priority\")\n                return self\n\n        PartialModel = Partial[ModelWithValidator]\n\n        # This should fail final validation due to the model validator\n        chunks = ['{\"status\": \"active\", \"priority\": \"low\"}']\n\n        with pytest.raises(ValidationError) as exc_info:\n            list(PartialModel.model_from_chunks(iter(chunks)))\n\n        assert \"Active tasks must have high priority\" in str(exc_info.value)\n\n    def test_streaming_yields_partial_objects_before_final_validation(self):\n        \"\"\"Streaming should yield partial objects even if final validation will fail.\"\"\"\n\n        class ModelWithRequired(BaseModel):\n            name: str\n            age: int\n\n        PartialModel = Partial[ModelWithRequired]\n\n        # Stream with incomplete JSON first, then complete JSON\n        # First chunk is incomplete, yields partial object\n        # Second chunk completes the JSON with all required fields\n        chunks = ['{\"name\": \"Jo', 'hn\", \"age\": 25}']\n\n        partial_objects = []\n        for obj in PartialModel.model_from_chunks(iter(chunks)):\n            partial_objects.append(obj)\n\n        # Should have yielded partial objects during streaming\n        assert len(partial_objects) >= 1\n        # First partial object has incomplete name\n        assert partial_objects[0].name == \"Jo\"\n        # Final object is fully validated\n        assert partial_objects[-1].name == \"John\"\n        assert partial_objects[-1].age == 25\n\n    def test_original_model_reference_is_stored(self):\n        \"\"\"Partial model should store reference to original model.\"\"\"\n\n        class OriginalModel(BaseModel):\n            name: str\n\n        PartialModel = Partial[OriginalModel]\n\n        assert hasattr(PartialModel, \"_original_model\")\n        assert PartialModel._original_model is OriginalModel\n\n    @pytest.mark.asyncio\n    async def test_async_final_validation_catches_missing_required_fields(self):\n        \"\"\"Async streaming should also do final validation.\"\"\"\n\n        class ModelWithRequired(BaseModel):\n            name: str\n            age: int\n\n        PartialModel = Partial[ModelWithRequired]\n\n        async def async_chunks():\n            yield '{\"name\": \"John\"}'  # Missing 'age'\n\n        with pytest.raises(ValidationError) as exc_info:\n            async for _ in PartialModel.model_from_chunks_async(async_chunks()):\n                pass\n\n        assert \"age\" in str(exc_info.value)\n\n\nclass TestRecursiveModels:\n    \"\"\"Test that Partial handles self-referential models without infinite recursion.\"\"\"\n\n    def test_basic_recursive_model(self):\n        \"\"\"Partial should work with basic recursive models.\"\"\"\n\n        class TreeNode(BaseModel):\n            value: str\n            children: Optional[list[\"TreeNode\"]] = None\n\n        TreeNode.model_rebuild()\n\n        # Should not raise RecursionError\n        PartialTreeNode = Partial[TreeNode]\n        TruePartial = PartialTreeNode.get_partial_model()\n\n        # Can validate partial data\n        result = TruePartial.model_validate({\"value\": \"root\"})\n        assert result.value == \"root\"\n        assert result.children is None\n\n    def test_nested_recursive_model(self):\n        \"\"\"Partial should work with nested children.\"\"\"\n\n        class TreeNode(BaseModel):\n            value: str\n            children: Optional[list[\"TreeNode\"]] = None\n\n        TreeNode.model_rebuild()\n\n        PartialTreeNode = Partial[TreeNode]\n        TruePartial = PartialTreeNode.get_partial_model()\n\n        # Validate with nested structure\n        data = {\n            \"value\": \"root\",\n            \"children\": [\n                {\"value\": \"child1\"},\n                {\"value\": \"child2\", \"children\": [{\"value\": \"grandchild\"}]},\n            ],\n        }\n        result = TruePartial.model_validate(data)\n        assert result.value == \"root\"\n        assert len(result.children) == 2\n        assert result.children[0].value == \"child1\"\n        assert result.children[1].children[0].value == \"grandchild\"\n\n    def test_mutually_recursive_models(self):\n        \"\"\"Partial should handle mutually recursive models.\"\"\"\n\n        class Person(BaseModel):\n            name: str\n            employer: Optional[\"Company\"] = None\n\n        class Company(BaseModel):\n            name: str\n            employees: Optional[list[Person]] = None\n\n        Person.model_rebuild()\n        Company.model_rebuild()\n\n        # Both should work without RecursionError\n        PartialPerson = Partial[Person]\n        PartialCompany = Partial[Company]\n\n        assert PartialPerson is not None\n        assert PartialCompany is not None\n\n        # Validate partial data\n        person_partial = PartialPerson.get_partial_model()\n        result = person_partial.model_validate({\"name\": \"Alice\"})\n        assert result.name == \"Alice\"\n\n    def test_direct_self_reference(self):\n        \"\"\"Partial should handle direct self-reference (linked list style).\"\"\"\n\n        class LinkedNode(BaseModel):\n            value: int\n            next: Optional[\"LinkedNode\"] = None\n\n        LinkedNode.model_rebuild()\n\n        # Should not raise RecursionError\n        PartialLinked = Partial[LinkedNode]\n        TruePartial = PartialLinked.get_partial_model()\n\n        # Validate chain\n        data = {\"value\": 1, \"next\": {\"value\": 2, \"next\": {\"value\": 3}}}\n        result = TruePartial.model_validate(data)\n        assert result.value == 1\n        assert result.next.value == 2\n        assert result.next.next.value == 3\n\n    def test_complex_recursive_with_validators(self):\n        \"\"\"Complex recursive model with validators, multiple self-refs, and nested types.\"\"\"\n        from typing import Literal\n        from pydantic import model_validator, field_validator\n        from enum import Enum\n\n        class NodeType(Enum):\n            FOLDER = \"folder\"\n            FILE = \"file\"\n            SYMLINK = \"symlink\"\n\n        class Permission(BaseModel):\n            user: str\n            level: Literal[\"read\", \"write\", \"admin\"]\n\n        class FileSystemNode(BaseModel):\n            name: str\n            node_type: NodeType\n            size_bytes: Optional[int] = None\n            children: Optional[list[\"FileSystemNode\"]] = None\n            parent: Optional[\"FileSystemNode\"] = None\n            symlink_target: Optional[\"FileSystemNode\"] = None\n            permissions: Optional[list[Permission]] = None\n            metadata: Optional[dict[str, str]] = None\n\n            @field_validator(\"name\")\n            @classmethod\n            def validate_name(cls, v):\n                if v and \"/\" in v:\n                    raise ValueError(\"Name cannot contain /\")\n                return v\n\n            @model_validator(mode=\"after\")\n            def validate_node_consistency(self):\n                # Folders must have no size, files must have size\n                if self.node_type == NodeType.FOLDER and self.size_bytes is not None:\n                    raise ValueError(\"Folders cannot have size_bytes\")\n                if self.node_type == NodeType.FILE and self.children:\n                    raise ValueError(\"Files cannot have children\")\n                if self.node_type == NodeType.SYMLINK and not self.symlink_target:\n                    raise ValueError(\"Symlinks must have a target\")\n                return self\n\n        FileSystemNode.model_rebuild()\n\n        # Should not raise RecursionError\n        PartialFS = Partial[FileSystemNode]\n        TruePartial = PartialFS.get_partial_model()\n\n        # Complex nested structure\n        data = {\n            \"name\": \"root\",\n            \"node_type\": \"folder\",\n            \"permissions\": [{\"user\": \"admin\", \"level\": \"admin\"}],\n            \"metadata\": {\"created\": \"2024-01-01\"},\n            \"children\": [\n                {\n                    \"name\": \"documents\",\n                    \"node_type\": \"folder\",\n                    \"children\": [\n                        {\n                            \"name\": \"report.pdf\",\n                            \"node_type\": \"file\",\n                            \"size_bytes\": 1024,\n                            \"permissions\": [{\"user\": \"alice\", \"level\": \"read\"}],\n                        },\n                        {\n                            \"name\": \"data\",\n                            \"node_type\": \"folder\",\n                            \"children\": [\n                                {\n                                    \"name\": \"archive.zip\",\n                                    \"node_type\": \"file\",\n                                    \"size_bytes\": 2048,\n                                }\n                            ],\n                        },\n                    ],\n                },\n                {\n                    \"name\": \"shortcut\",\n                    \"node_type\": \"symlink\",\n                    \"symlink_target\": {\n                        \"name\": \"target_file\",\n                        \"node_type\": \"file\",\n                        \"size_bytes\": 512,\n                    },\n                },\n            ],\n        }\n\n        result = TruePartial.model_validate(data)\n        assert result.name == \"root\"\n        assert result.node_type == NodeType.FOLDER\n        assert len(result.children) == 2\n        assert result.children[0].name == \"documents\"\n        assert len(result.children[0].children) == 2\n        assert result.children[0].children[0].name == \"report.pdf\"\n        assert result.children[0].children[0].size_bytes == 1024\n        assert result.children[0].children[1].children[0].name == \"archive.zip\"\n        assert result.children[1].symlink_target.name == \"target_file\"\n        assert result.permissions[0].level == \"admin\"\n\n    def test_recursive_with_union_types(self):\n        \"\"\"Recursive model with Union types containing self-references.\"\"\"\n        from typing import Union\n\n        class TextBlock(BaseModel):\n            text: str\n\n        class Container(BaseModel):\n            title: str\n            content: list[Union[TextBlock, \"Container\"]]\n\n        Container.model_rebuild()\n\n        PartialContainer = Partial[Container]\n        TruePartial = PartialContainer.get_partial_model()\n\n        data = {\n            \"title\": \"Chapter 1\",\n            \"content\": [\n                {\"text\": \"Introduction paragraph\"},\n                {\n                    \"title\": \"Section 1.1\",\n                    \"content\": [\n                        {\"text\": \"Section text\"},\n                        {\n                            \"title\": \"Subsection 1.1.1\",\n                            \"content\": [{\"text\": \"Deep nested text\"}],\n                        },\n                    ],\n                },\n                {\"text\": \"Closing paragraph\"},\n            ],\n        }\n\n        result = TruePartial.model_validate(data)\n        assert result.title == \"Chapter 1\"\n        assert len(result.content) == 3\n        assert result.content[0].text == \"Introduction paragraph\"\n        assert result.content[1].title == \"Section 1.1\"\n        assert result.content[1].content[1].title == \"Subsection 1.1.1\"\n"
  },
  {
    "path": "tests/dsl/test_simple_type.py",
    "content": "import unittest\nfrom instructor.dsl.simple_type import is_simple_type\nfrom pydantic import BaseModel\nfrom enum import Enum\nimport typing\n\n\nclass SimpleTypeTests(unittest.TestCase):\n    def test_is_simple_type_with_base_model(self):\n        class MyModel(BaseModel):\n            label: str\n\n        self.assertFalse(is_simple_type(MyModel))\n\n    def test_is_simple_type_with_str(self):\n        self.assertTrue(is_simple_type(str))\n\n    def test_is_simple_type_with_int(self):\n        self.assertTrue(is_simple_type(int))\n\n    def test_is_simple_type_with_float(self):\n        self.assertTrue(is_simple_type(float))\n\n    def test_is_simple_type_with_bool(self):\n        self.assertTrue(is_simple_type(bool))\n\n    def test_is_simple_type_with_enum(self):\n        class MyEnum(Enum):\n            VALUE = 1\n\n        self.assertTrue(is_simple_type(MyEnum))\n\n    def test_is_simple_type_with_annotated(self):\n        AnnotatedType = typing.Annotated[int, \"example\"]\n        self.assertTrue(is_simple_type(AnnotatedType))\n\n    def test_is_simple_type_with_literal(self):\n        LiteralType = typing.Literal[1, 2, 3]\n        self.assertTrue(is_simple_type(LiteralType))\n\n    def test_is_simple_type_with_union(self):\n        UnionType = typing.Union[int, str]\n        self.assertTrue(is_simple_type(UnionType))\n\n    def test_is_simple_type_with_iterable(self):\n        IterableType = typing.Iterable[int]\n        self.assertFalse(is_simple_type(IterableType))\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "tests/dsl/test_simple_type_fix.py",
    "content": "import sys\nimport unittest\nfrom typing import Union, List  # noqa: UP035\nfrom typing import get_origin, get_args\nfrom instructor.dsl.simple_type import is_simple_type\n\n\nclass TestSimpleTypeFix(unittest.TestCase):\n    def test_list_with_union_type(self):\n        \"\"\"Test that list[int | str] is correctly identified as a simple type.\"\"\"\n        # This is the type that was failing in Python 3.10\n        if sys.version_info < (3, 10):\n            self.skipTest(\"Union pipe syntax is only available in Python 3.10+\")\n        response_model = list[int | str]\n        self.assertTrue(\n            is_simple_type(response_model),\n            f\"list[int | str] should be a simple type in Python {sys.version_info.major}.{sys.version_info.minor}. Instead it was identified as {type(response_model)} with origin {get_origin(response_model)} and args {get_args(response_model)}\",\n        )\n\n    def test_list_with_union_type_alternative_syntax(self):\n        \"\"\"Test that List[Union[int, str]] is correctly identified as a simple type.\"\"\"\n        # Alternative syntax\n        response_model = List[Union[int, str]]  # noqa: UP006\n        self.assertTrue(\n            is_simple_type(response_model),\n            f\"List[Union[int, str]] should be a simple type in Python {sys.version_info.major}.{sys.version_info.minor}\",\n        )\n"
  },
  {
    "path": "tests/genai/test_safety_settings.py",
    "content": "from instructor.providers.gemini.utils import update_genai_kwargs\n\n\ndef test_update_genai_kwargs_safety_settings_with_image_content_uses_image_categories():\n    \"\"\"Image inputs should use IMAGE_* harm categories when available.\"\"\"\n    from google.genai import types\n    from google.genai.types import HarmCategory\n\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    image_categories = [\n        c\n        for c in HarmCategory\n        if c not in excluded_categories and c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n    ]\n\n    # Older SDKs may not expose separate image categories.\n    if not image_categories:\n        return\n\n    kwargs = {\n        \"contents\": [\n            types.Content(\n                role=\"user\",\n                parts=[types.Part.from_bytes(data=b\"123\", mime_type=\"image/png\")],\n            )\n        ]\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    assert \"safety_settings\" in result\n    assert isinstance(result[\"safety_settings\"], list)\n    assert len(result[\"safety_settings\"]) == len(image_categories)\n    assert {s[\"category\"] for s in result[\"safety_settings\"]} == set(image_categories)\n\n\ndef test_update_genai_kwargs_maps_text_thresholds_to_image_categories():\n    \"\"\"Text thresholds should carry over to equivalent IMAGE_* categories.\"\"\"\n    from google.genai import types\n    from google.genai.types import HarmBlockThreshold, HarmCategory\n\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    image_categories = [\n        c\n        for c in HarmCategory\n        if c not in excluded_categories and c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n    ]\n\n    if not image_categories or not hasattr(HarmCategory, \"HARM_CATEGORY_IMAGE_HATE\"):\n        return\n\n    custom_safety = {\n        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE\n    }\n\n    kwargs = {\n        \"contents\": [\n            types.Content(\n                role=\"user\",\n                parts=[types.Part.from_bytes(data=b\"123\", mime_type=\"image/png\")],\n            )\n        ],\n        \"safety_settings\": custom_safety,\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    for setting in result[\"safety_settings\"]:\n        if setting[\"category\"] == HarmCategory.HARM_CATEGORY_IMAGE_HATE:\n            assert setting[\"threshold\"] == HarmBlockThreshold.BLOCK_LOW_AND_ABOVE\n\n\ndef test_handle_genai_tools_autodetect_images_uses_image_categories():\n    \"\"\"Autodetected image content should switch safety_settings to IMAGE_* categories.\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_tools\n\n    class SimpleModel(BaseModel):\n        text: str\n\n    data_uri = (\n        \"data:image/png;base64,\"\n        \"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP8/x8AAwMCAO6q0S8AAAAASUVORK5CYII=\"\n    )\n\n    kwargs = {\n        \"messages\": [\n            {\n                \"role\": \"user\",\n                \"content\": [\"What is in this image?\", data_uri],\n            }\n        ]\n    }\n\n    _, out = handle_genai_tools(SimpleModel, kwargs, autodetect_images=True)\n\n    assert \"config\" in out\n    assert out[\"config\"].safety_settings is not None\n    assert any(\n        s.category.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n        for s in out[\"config\"].safety_settings\n    )\n"
  },
  {
    "path": "tests/llm/__init__.py",
    "content": "\n"
  },
  {
    "path": "tests/llm/shared_config.py",
    "content": "\"\"\"\nShared configuration for multi-provider tests.\n\nThis module provides common test configuration for running the same tests\nacross multiple providers (OpenAI, Anthropic, Google, Cohere, xAI, Mistral,\nCerebras, Fireworks, Writer, Perplexity).\n\"\"\"\n\nimport os\n\nimport instructor\nimport pytest\n\n\nGOOGLE_GENAI_MODEL = os.getenv(\"GOOGLE_GENAI_MODEL\", \"\")\n\n# Provider configurations: (model_string, mode, required_env_var, required_package)\nPROVIDER_CONFIGS = [\n    (\n        \"openai/gpt-4o-mini\",\n        instructor.Mode.TOOLS,\n        \"OPENAI_API_KEY\",\n        \"openai\",\n    ),\n    (\n        \"anthropic/claude-3-5-haiku-latest\",\n        instructor.Mode.ANTHROPIC_TOOLS,\n        \"ANTHROPIC_API_KEY\",\n        \"anthropic\",\n    ),\n    (\n        GOOGLE_GENAI_MODEL,\n        instructor.Mode.GENAI_STRUCTURED_OUTPUTS,\n        \"GOOGLE_API_KEY\",\n        \"google.genai\",\n    ),\n    (\n        \"cohere/command-a-03-2025\",\n        instructor.Mode.COHERE_TOOLS,\n        \"COHERE_API_KEY\",\n        \"cohere\",\n    ),\n    (\n        \"xai/grok-3-mini\",\n        instructor.Mode.XAI_TOOLS,\n        \"XAI_API_KEY\",\n        \"xai_sdk\",\n    ),\n    (\n        \"mistral/ministral-8b-latest\",\n        instructor.Mode.MISTRAL_TOOLS,\n        \"MISTRAL_API_KEY\",\n        \"mistralai\",\n    ),\n    (\n        \"cerebras/llama3.1-70b\",\n        instructor.Mode.CEREBRAS_TOOLS,\n        \"CEREBRAS_API_KEY\",\n        \"cerebras\",\n    ),\n    (\n        \"fireworks/llama-v3p1-70b-instruct\",\n        instructor.Mode.FIREWORKS_TOOLS,\n        \"FIREWORKS_API_KEY\",\n        \"fireworks\",\n    ),\n    (\n        \"writer/palmyra-x-004\",\n        instructor.Mode.WRITER_TOOLS,\n        \"WRITER_API_KEY\",\n        \"writerai\",\n    ),\n    (\n        \"perplexity/llama-3.1-sonar-large-128k-online\",\n        instructor.Mode.PERPLEXITY_JSON,\n        \"PERPLEXITY_API_KEY\",\n        \"openai\",  # Perplexity transports over OpenAI-compatible API\n    ),\n]\n\n\ndef get_available_providers() -> list[tuple[str, instructor.Mode]]:\n    \"\"\"\n    Get list of available providers based on API keys and installed packages.\n\n    Returns:\n        List of tuples (model_string, mode) for available providers\n    \"\"\"\n    available = []\n\n    for model, mode, env_var, package in PROVIDER_CONFIGS:\n        if not model:\n            continue\n        # Check if API key is set\n        if not os.getenv(env_var):\n            continue\n\n        # Check if package is installed\n        try:\n            parts = package.split(\".\")\n            if len(parts) > 1:\n                __import__(parts[0])\n                # For nested imports like google.genai\n                __import__(package)\n            else:\n                __import__(package)\n            available.append((model, mode))\n        except ImportError:\n            continue\n\n    return available\n\n\ndef pytest_generate_tests(metafunc):\n    \"\"\"\n    Pytest hook to generate parametrized tests for available providers.\n\n    This is used in test files that have 'provider_config' as a parameter.\n    \"\"\"\n    if \"provider_config\" in metafunc.fixturenames:\n        available = get_available_providers()\n        if not available:\n            pytest.skip(\"No providers available (missing API keys or packages)\")\n\n        # Generate test IDs like \"openai\" \"anthropic\" \"google\"\n        ids = [model.split(\"/\")[0] for model, _ in available]\n        metafunc.parametrize(\"provider_config\", available, ids=ids)\n\n\ndef pytest_configure(config):\n    \"\"\"Register custom markers for provider-specific tests.\"\"\"\n    config.addinivalue_line(\"markers\", \"openai: mark test as requiring OpenAI provider\")\n    config.addinivalue_line(\n        \"markers\", \"anthropic: mark test as requiring Anthropic provider\"\n    )\n    config.addinivalue_line(\"markers\", \"google: mark test as requiring Google provider\")\n    config.addinivalue_line(\"markers\", \"cohere: mark test as requiring Cohere provider\")\n    config.addinivalue_line(\"markers\", \"xai: mark test as requiring xAI provider\")\n    config.addinivalue_line(\n        \"markers\", \"mistral: mark test as requiring Mistral provider\"\n    )\n    config.addinivalue_line(\n        \"markers\", \"cerebras: mark test as requiring Cerebras provider\"\n    )\n    config.addinivalue_line(\n        \"markers\", \"fireworks: mark test as requiring Fireworks provider\"\n    )\n    config.addinivalue_line(\"markers\", \"writer: mark test as requiring Writer provider\")\n    config.addinivalue_line(\n        \"markers\", \"perplexity: mark test as requiring Perplexity provider\"\n    )\n\n\n# Convenience function to skip if specific provider not available\ndef skip_if_provider_unavailable(provider_name: str):\n    \"\"\"\n    Skip test if specific provider is not available.\n\n    Args:\n        provider_name: One of \"openai\", \"anthropic\", \"google\", \"cohere\", \"xai\",\n                       \"mistral\", \"cerebras\", \"fireworks\", \"writer\", \"perplexity\"\n    \"\"\"\n    config_map = {\n        \"openai\": (\"OPENAI_API_KEY\", \"openai\"),\n        \"anthropic\": (\"ANTHROPIC_API_KEY\", \"anthropic\"),\n        \"google\": (\"GOOGLE_API_KEY\", \"google.genai\"),\n        \"cohere\": (\"COHERE_API_KEY\", \"cohere\"),\n        \"xai\": (\"XAI_API_KEY\", \"xai_sdk\"),\n        \"mistral\": (\"MISTRAL_API_KEY\", \"mistralai\"),\n        \"cerebras\": (\"CEREBRAS_API_KEY\", \"cerebras\"),\n        \"fireworks\": (\"FIREWORKS_API_KEY\", \"fireworks\"),\n        \"writer\": (\"WRITER_API_KEY\", \"writerai\"),\n        \"perplexity\": (\"PERPLEXITY_API_KEY\", \"openai\"),\n    }\n\n    if provider_name not in config_map:\n        pytest.skip(f\"Unknown provider: {provider_name}\")\n\n    env_var, package = config_map[provider_name]\n\n    if not os.getenv(env_var):\n        pytest.skip(f\"{env_var} not set\")\n\n    try:\n        __import__(package)\n    except ImportError:\n        pytest.skip(f\"{package} package not installed\")\n"
  },
  {
    "path": "tests/llm/test_anthropic/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_anthropic/conftest.py",
    "content": "# conftest.py\nimport os\nimport pytest\nimport importlib.util\n\n\nif not os.getenv(\"ANTHROPIC_API_KEY\"):\n    pytest.skip(\n        \"ANTHROPIC_API_KEY environment variable not set\",\n        allow_module_level=True,\n    )\n\nif (\n    importlib.util.find_spec(\"anthropic\") is None\n):  # pragma: no cover - optional dependency\n    pytest.skip(\"anthropic package is not installed\", allow_module_level=True)\n"
  },
  {
    "path": "tests/llm/test_anthropic/test_multimodal.py",
    "content": "import pytest\nfrom instructor.processing.multimodal import Image, PDF, PDFWithCacheControl\nimport instructor\nfrom pydantic import Field, BaseModel\nfrom itertools import product\nfrom .util import models, modes\nimport os\nimport base64\n\n\nclass ImageDescription(BaseModel):\n    objects: list[str] = Field(..., description=\"The objects in the image\")\n    scene: str = Field(..., description=\"The scene of the image\")\n    colors: list[str] = Field(..., description=\"The colors in the image\")\n\n\nimage_url = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n\npdf_url = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\n\n\ncurr_file = os.path.dirname(__file__)\npdf_path = os.path.join(curr_file, \"../../assets/invoice.pdf\")\npdf_base64 = base64.b64encode(open(pdf_path, \"rb\").read()).decode(\"utf-8\")\npdf_base64_string = f\"data:application/pdf;base64,{pdf_base64}\"\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response = client.chat.completions.create(\n        response_model=ImageDescription,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"What is this?\",\n                    Image.from_url(image_url),\n                ],\n            },\n        ],\n        temperature=1,\n        max_tokens=1000,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, ImageDescription)\n    assert len(response.objects) > 0\n    assert response.scene != \"\"\n    assert len(response.colors) > 0\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description_autodetect(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response = client.chat.completions.create(\n        response_model=ImageDescription,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"What is this?\",\n                    image_url,\n                ],\n            },\n        ],\n        max_tokens=1000,\n        temperature=1,\n        autodetect_images=True,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, ImageDescription)\n    assert len(response.objects) > 0\n    assert response.scene != \"\"\n    assert len(response.colors) > 0\n\n    # Additional assertions can be added based on expected content of the sample image\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description_autodetect_image_params(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response = client.chat.completions.create(\n        response_model=ImageDescription,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"What is this?\",\n                    {\n                        \"type\": \"image\",\n                        \"source\": image_url,\n                    },\n                ],\n            },\n        ],\n        max_tokens=1000,\n        temperature=1,\n        autodetect_images=True,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, ImageDescription)\n    assert len(response.objects) > 0\n    assert response.scene != \"\"\n    assert len(response.colors) > 0\n\n    # Additional assertions can be added based on expected content of the sample image\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description_autodetect_image_params_cache(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    messages = client.chat.completions.create(\n        response_model=None,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images and stuff\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"Describe these images\",\n                    # Large images to activate caching\n                    {\n                        \"type\": \"image\",\n                        \"source\": \"https://assets.entrepreneur.com/content/3x2/2000/20200429211042-GettyImages-1164615296.jpeg\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                    {\n                        \"type\": \"image\",\n                        \"source\": \"https://www.bigbear.com/imager/s3_us-west-1_amazonaws_com/big-bear/images/Scenic-Snow/89xVzXp1_00588cdef1e3d54756582b576359604b.jpeg\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                ],\n            },\n        ],\n        max_tokens=1000,\n        temperature=1,\n        autodetect_images=True,\n    )\n\n    # Assert a cache write or cache hit\n    assert (\n        messages.usage.cache_creation_input_tokens > 0\n        or messages.usage.cache_read_input_tokens > 0\n    )\n\n\nclass LineItem(BaseModel):\n    name: str\n    price: int\n    quantity: int\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\n@pytest.mark.parametrize(\"pdf_source\", [pdf_path, pdf_url, pdf_base64_string])\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_pdf_file(model, mode, pdf_source):\n    client = instructor.from_provider(model, mode=mode)\n\n    # Retry logic for flaky LLM responses\n    max_retries = 3\n    for attempt in range(max_retries):\n        response = client.chat.completions.create(\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"Extract the total and items from the invoice. Be precise and only extract the final total amount and list of item names. The total should be exactly 220.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": PDF.autodetect(pdf_source),\n                },\n            ],\n            max_tokens=1000,\n            temperature=0,  # Keep at 0 for consistent responses\n            autodetect_images=False,\n            response_model=Receipt,\n        )\n\n        if response.total == 220 and len(response.items) == 2:\n            break\n        elif attempt == max_retries - 1:\n            pytest.fail(\n                f\"After {max_retries} attempts, got total={response.total}, items={response.items}, expected total=220, items=2\"\n            )\n\n    assert response.total == 220\n    assert len(response.items) == 2\n\n\n@pytest.mark.parametrize(\"pdf_source\", [pdf_path, pdf_url, pdf_base64_string])\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_pdf_file_with_cache_control(model, mode, pdf_source):\n    client = instructor.from_provider(model, mode=mode)\n\n    response, completion = client.chat.completions.create_with_completion(\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Extract the total and items from the invoice\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": PDFWithCacheControl.autodetect(pdf_source),\n            },\n        ],\n        max_tokens=1000,\n        autodetect_images=False,\n        response_model=Receipt,\n    )\n\n    assert response.total == 220\n    assert (\n        completion.usage.cache_creation_input_tokens > 0\n        or completion.usage.cache_read_input_tokens > 0\n    )\n    assert len(response.items) == 2\n"
  },
  {
    "path": "tests/llm/test_anthropic/test_reasoning.py",
    "content": "import instructor\nfrom pydantic import BaseModel\n\n\nclass Answer(BaseModel):\n    answer: float\n\n\ndef test_reasoning():\n    client = instructor.from_provider(\n        \"anthropic/claude-3-7-sonnet-latest\",\n        mode=instructor.Mode.ANTHROPIC_REASONING_TOOLS,\n    )\n    response = client.chat.completions.create(\n        response_model=Answer,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Which is larger, 9.11 or 9.8? Think carefully about decimal places.\",\n            },\n        ],\n        temperature=1,  # Required when thinking is enabled\n        max_tokens=2000,\n        thinking={\"type\": \"enabled\", \"budget_tokens\": 1024},\n        max_retries=3,  # Retry if the model gets it wrong\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, Answer)\n    assert response.answer == 9.8\n"
  },
  {
    "path": "tests/llm/test_anthropic/test_system.py",
    "content": "import pytest\nimport instructor\nfrom pydantic import BaseModel\nfrom itertools import product\nfrom .util import models, modes\nfrom anthropic.types.message import Message\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_creation(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response = client.chat.completions.create(\n        response_model=User,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": [\n                    {\"type\": \"text\", \"text\": \"<story>Mike is 37 years old</story>\"}\n                ],\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from the story.\",\n            },\n        ],\n        temperature=1,\n        max_tokens=1000,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, User)\n    assert response.name == \"Mike\"\n    assert response.age == 37\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_creation_with_system_cache(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response, message = client.chat.completions.create_with_completion(\n        response_model=User,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": [\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"<story>Mike is 37 years old \" * 200 + \"</story>\",\n                        \"cache_control\": {\"type\": \"ephemeral\"},\n                    },\n                    {\n                        \"type\": \"text\",\n                        \"text\": \"You are a helpful assistant who extracts users from stories.\",\n                    },\n                ],\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from the story.\",\n            },\n        ],\n        temperature=1,\n        max_tokens=1000,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, User)\n    assert response.name == \"Mike\"\n    assert response.age == 37\n\n    # Assert a cache write or cache hit\n    assert (\n        message.usage.cache_creation_input_tokens > 0\n        or message.usage.cache_read_input_tokens > 0\n    )\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_creation_with_system_cache_anthropic_style(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response, message = client.chat.completions.create_with_completion(\n        system=[\n            {\n                \"type\": \"text\",\n                \"text\": \"<story>Mike is 37 years old \" * 200 + \"</story>\",\n                \"cache_control\": {\"type\": \"ephemeral\"},\n            },\n            {\n                \"type\": \"text\",\n                \"text\": \"You are a helpful assistant who extracts users from stories.\",\n            },\n        ],\n        response_model=User,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from the story.\",\n            },\n        ],\n        temperature=1,\n        max_tokens=1000,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, User)\n    assert response.name == \"Mike\"\n    assert response.age == 37\n\n    # Assert a cache write or cache hit\n    assert (\n        message.usage.cache_creation_input_tokens > 0\n        or message.usage.cache_read_input_tokens > 0\n    )\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_creation_no_response_model(model, mode):\n    client = instructor.from_provider(model, mode=mode)\n    response = client.chat.completions.create(\n        response_model=None,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": [{\"type\": \"text\", \"text\": \"Mike is 37 years old\"}],\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from the story.\",\n            },\n        ],\n        temperature=1,\n        max_tokens=1000,\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, Message)\n"
  },
  {
    "path": "tests/llm/test_anthropic/util.py",
    "content": "import instructor\n\nmodels = [\"anthropic/claude-3-5-haiku-latest\"]\nmodes = [\n    instructor.Mode.ANTHROPIC_TOOLS,\n]\n"
  },
  {
    "path": "tests/llm/test_bedrock/conftest.py",
    "content": "from __future__ import annotations\nimport base64\nimport pytest\n\n\n@pytest.fixture(scope=\"session\")\ndef tiny_png_bytes() -> bytes:\n    return base64.b64decode(\n        b\"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGNgYAAAAAMA\"\n        b\"ASsJTYQAAAAASUVORK5CYII=\"\n    )\n\n\n@pytest.fixture(scope=\"session\")\ndef tiny_png_data_url(tiny_png_bytes: bytes) -> str:\n    return \"data:image/png;base64,\" + base64.b64encode(tiny_png_bytes).decode(\"utf-8\")\n\n\n@pytest.fixture(scope=\"session\")\ndef image_url() -> str:\n    # Public test asset used across the suite\n    return \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n\n\n@pytest.fixture(scope=\"session\")\ndef tiny_pdf_bytes() -> bytes:\n    return base64.b64decode(\n        b\"JVBERi0xLjQKJSVPRgoAAAAQAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\"\n    )\n"
  },
  {
    "path": "tests/llm/test_bedrock/test_bedrock_native_passthrough.py",
    "content": "from __future__ import annotations\nfrom instructor.providers.bedrock.utils import _to_bedrock_content_items\n\n\ndef test_bedrock_native_text_passthrough():\n    content = [{\"text\": \"Bedrock-native text\"}]\n    items = _to_bedrock_content_items(content)\n    assert items == [{\"text\": \"Bedrock-native text\"}]\n\n\ndef test_bedrock_native_image_passthrough(tiny_png_bytes: bytes):\n    native = {\"image\": {\"format\": \"png\", \"source\": {\"bytes\": tiny_png_bytes}}}\n    items = _to_bedrock_content_items([native])\n    assert items[0] == native\n\n\ndef test_bedrock_native_document_passthrough(tiny_pdf_bytes: bytes):\n    native = {\"document\": {\"format\": \"pdf\", \"source\": {\"bytes\": tiny_pdf_bytes}}}\n    items = _to_bedrock_content_items([native])\n    assert items[0] == native\n"
  },
  {
    "path": "tests/llm/test_bedrock/test_normalize.py",
    "content": "from __future__ import annotations\nimport pytest\nfrom instructor.providers.bedrock.utils import _normalize_bedrock_image_format\n\n\n@pytest.mark.parametrize(\n    \"inp,expected\",\n    [\n        (\"image/jpeg\", \"jpeg\"),\n        (\"image/jpg\", \"jpeg\"),\n        (\"jpg\", \"jpeg\"),\n        (\"jpeg\", \"jpeg\"),\n        (\"image/pjpeg\", \"jpeg\"),\n        (\"image/png\", \"png\"),\n        (\"png\", \"png\"),\n        (\"image/gif\", \"gif\"),\n        (\"gif\", \"gif\"),\n        (\"image/webp\", \"webp\"),\n        (\"webp\", \"webp\"),\n        (\"\", \"jpeg\"),\n        (None, \"jpeg\"),\n        (\"image/whatever\", \"jpeg\"),\n    ],\n)\ndef test_normalize_bedrock_image_format(inp, expected):\n    assert _normalize_bedrock_image_format(inp) == expected\n"
  },
  {
    "path": "tests/llm/test_bedrock/test_openai_image_conversion.py",
    "content": "from __future__ import annotations\nimport base64\nimport pytest\nfrom instructor.providers.bedrock.utils import (\n    _openai_image_part_to_bedrock,\n    _to_bedrock_content_items,\n)\n\n\ndef test_openai_image_part_to_bedrock_data_url(tiny_png_data_url: str):\n    part = {\"type\": \"image_url\", \"image_url\": {\"url\": tiny_png_data_url}}\n    out = _openai_image_part_to_bedrock(part)\n    assert \"image\" in out\n    assert out[\"image\"][\"format\"] in {\"png\", \"jpeg\", \"gif\", \"webp\"}  # png expected\n    assert out[\"image\"][\"source\"][\"bytes\"] == base64.b64decode(\n        tiny_png_data_url.split(\",\", 1)[1]\n    )\n\n\ndef test_openai_image_part_to_bedrock_https(image_url: str):\n    part = {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}}\n    out = _openai_image_part_to_bedrock(part)\n    assert \"image\" in out\n    # GitHub raw returns jpeg for the sample. Normalize is handled in utils.\n    assert out[\"image\"][\"format\"] in {\"jpeg\", \"png\", \"gif\", \"webp\"}\n    assert isinstance(out[\"image\"][\"source\"][\"bytes\"], (bytes, bytearray))\n    assert len(out[\"image\"][\"source\"][\"bytes\"]) > 0\n\n\n@pytest.mark.parametrize(\n    \"text_part\",\n    [\n        {\"type\": \"text\", \"text\": \"What is in this image?\"},\n        {\"type\": \"input_text\", \"text\": \"Describe the image.\"},\n    ],\n)\n@pytest.mark.parametrize(\"image_kind\", [\"data\", \"https\"])\ndef test_to_bedrock_content_items_openai_combo(\n    text_part, image_kind, tiny_png_data_url: str, image_url: str\n):\n    if image_kind == \"data\":\n        image_part = {\"type\": \"image_url\", \"image_url\": {\"url\": tiny_png_data_url}}\n    else:\n        image_part = {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}}\n\n    content = [text_part, image_part]\n    items = _to_bedrock_content_items(content)\n\n    assert items[0] == {\"text\": text_part[\"text\"]}\n    assert \"image\" in items[1]\n    assert isinstance(items[1][\"image\"][\"source\"][\"bytes\"], (bytes, bytearray))\n    assert len(items[1][\"image\"][\"source\"][\"bytes\"]) > 0\n"
  },
  {
    "path": "tests/llm/test_bedrock/test_prepare_kwargs.py",
    "content": "from __future__ import annotations\nfrom instructor.providers.bedrock.utils import _prepare_bedrock_converse_kwargs_internal\n\n\ndef test_prepare_bedrock_kwargs_openai_text_plus_image(image_url: str):\n    call_kwargs = {\n        \"model\": \"anthropic.claude-3-5-sonnet\",\n        \"temperature\": 0.3,\n        \"max_tokens\": 256,\n        \"top_p\": 0.9,\n        \"stop\": [\"<END>\"],\n        \"system\": [{\"text\": \"You are helpful.\"}],\n        \"messages\": [\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    {\"type\": \"text\", \"text\": \"hi\"},\n                    {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n                ],\n            },\n        ],\n    }\n\n    out = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n\n    assert out[\"modelId\"] == \"anthropic.claude-3-5-sonnet\"\n    inf = out[\"inferenceConfig\"]\n    assert inf[\"temperature\"] == 0.3\n    assert inf[\"maxTokens\"] == 256\n    assert inf[\"topP\"] == 0.9\n    assert inf[\"stopSequences\"] == [\"<END>\"]\n    assert out[\"system\"][0][\"text\"] == \"You are helpful.\"\n\n    parts = out[\"messages\"][0][\"content\"]\n    assert parts[0] == {\"text\": \"hi\"}\n    assert parts[1][\"image\"][\"format\"] in {\"jpeg\", \"png\", \"gif\", \"webp\"}\n    assert isinstance(parts[1][\"image\"][\"source\"][\"bytes\"], (bytes, bytearray))\n    assert len(parts[1][\"image\"][\"source\"][\"bytes\"]) > 0\n"
  },
  {
    "path": "tests/llm/test_core_providers/README.md",
    "content": "# Core Provider Tests\n\nThis directory contains unified tests that run across **all core providers**: OpenAI, Anthropic, Google (Gemini), Cohere, xAI, Mistral, Cerebras, Fireworks, Writer, and Perplexity.\n\n## Philosophy\n\nInstead of duplicating the same tests for each provider, we use `instructor.from_provider()` with parameterization to run the same test suite against all providers simultaneously.\n\n## Test Organization\n\n### Core Tests (Run on All Providers)\n\nThese tests verify that core instructor functionality works consistently across providers:\n\n- **test_basic_extraction.py** - Simple extraction, lists, nested models, field descriptions\n- **test_streaming.py** - Partial streaming, Iterable streaming, union types\n- **test_validation.py** - Validators, field constraints, custom validation\n- **test_retries.py** - Retry logic and max_retries parameter\n- **test_response_modes.py** - Different client methods (create, messages.create, etc.)\n- **test_simple_types.py** - Simple types (int, bool, str, Literal, Union, Enum)\n\n\n## Configuration\n\n### shared_config.py\n\nLocated in `tests/llm/shared_config.py`, this file:\n\n- Defines `PROVIDER_CONFIGS` with model names, modes, and required API keys\n- Implements `get_available_providers()` to detect which providers are available\n- Provides `pytest_generate_tests()` hook for automatic parameterization\n- Handles skipping when API keys or packages are missing\n\n### Usage in Tests\n\nTests use the `provider_config` fixture which is automatically parametrized:\n\n```python\ndef test_something(provider_config):\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode)\n\n    result = client.create(\n        response_model=MyModel,\n        messages=[{\"role\": \"user\", \"content\": \"...\"}],\n    )\n\n    assert isinstance(result, MyModel)\n```\n\nThe test will automatically run for each available provider:\n- OpenAI (if OPENAI_API_KEY is set)\n- Anthropic (if ANTHROPIC_API_KEY is set)\n- Google (if GOOGLE_API_KEY is set)\n- Cohere (if COHERE_API_KEY is set)\n- xAI (if XAI_API_KEY is set)\n- Mistral (if MISTRAL_API_KEY is set)\n- Cerebras (if CEREBRAS_API_KEY is set)\n- Fireworks (if FIREWORKS_API_KEY is set)\n- Writer (if WRITER_API_KEY is set)\n- Perplexity (if PERPLEXITY_API_KEY is set)\n\nTests automatically skip if the API key or package is not available.\n\n## Running Tests\n\n`uv` is Astral's fast Python package manager. Install it by following the [official guide](https://docs.astral.sh/uv/getting-started/install/) if it is not already on your PATH.\n\n### Run all core provider tests:\n```bash\nuv run pytest tests/llm/test_core_providers/ -v\n```\n\n### Run specific test file:\n```bash\nuv run pytest tests/llm/test_core_providers/test_basic_extraction.py -v\n```\n\n### Run specific test:\n```bash\nuv run pytest tests/llm/test_core_providers/test_basic_extraction.py::test_simple_extraction -v\n```\n\n### Run tests for specific provider only:\n```bash\n# Only OpenAI\nuv run pytest tests/llm/test_core_providers/ -k \"openai\" -v\n\n# Only Anthropic\nuv run pytest tests/llm/test_core_providers/ -k \"anthropic\" -v\n\n# Only Google\nuv run pytest tests/llm/test_core_providers/ -k \"google\" -v\n```\n\n### Skip tests when API keys are missing:\nTests automatically skip if the required API key or package is not available.\n\nRequired API keys (set only what you have):\n- `OPENAI_API_KEY` - for OpenAI\n- `ANTHROPIC_API_KEY` - for Anthropic\n- `GOOGLE_API_KEY` - for Google (Gemini)\n- `GOOGLE_GENAI_MODEL` - model string for Google GenAI tests (e.g., `google/gemini-3-flash`)\n- `COHERE_API_KEY` - for Cohere\n- `XAI_API_KEY` - for xAI (Grok)\n- `MISTRAL_API_KEY` - for Mistral\n- `CEREBRAS_API_KEY` - for Cerebras\n- `FIREWORKS_API_KEY` - for Fireworks\n- `WRITER_API_KEY` - for Writer\n- `PERPLEXITY_API_KEY` - for Perplexity\n\n## Current Models\n\nAll providers automatically skip if API keys are missing.\n\n- **OpenAI**: `gpt-4o-mini` with `Mode.TOOLS`\n- **Anthropic**: `claude-3-5-haiku-latest` with `Mode.ANTHROPIC_TOOLS`\n- **Google**: `gemini-pro` with `Mode.GENAI_STRUCTURED_OUTPUTS`\n- **Cohere**: `command-a-03-2025` with `Mode.COHERE_TOOLS`\n- **xAI**: `grok-3-mini` with `Mode.XAI_TOOLS`\n- **Mistral**: `ministral-8b-latest` with `Mode.MISTRAL_TOOLS`\n- **Cerebras**: `llama3.1-70b` with `Mode.CEREBRAS_TOOLS`\n- **Fireworks**: `llama-v3p1-70b-instruct` with `Mode.FIREWORKS_TOOLS`\n- **Writer**: `palmyra-x-004` with `Mode.WRITER_TOOLS`\n- **Perplexity**: `llama-3.1-sonar-large-128k-online` with `Mode.PERPLEXITY_JSON`\n\nTo change models, edit `tests/llm/shared_config.py`.\n\n## Benefits\n\n✅ **Less code**: ~3,500+ lines of duplicate code eliminated\n✅ **Easier maintenance**: Update test logic once, applies to all providers\n✅ **Better coverage**: Ensures all providers support core features\n✅ **Faster development**: Add new providers by updating one config file\n✅ **Consistent behavior**: Catches provider-specific quirks early\n\n## Migration Status\n\n- ✅ Shared configuration created\n- ✅ Core test files created (basic_extraction, streaming, validation, retries, response_modes, simple_types)\n- ✅ util.py files updated to use `provider/model` format\n- ✅ Provider-specific tests cleaned up (removed all duplicates)\n- ✅ Deleted 6 entire provider directories (cerebras, fireworks, perplexity, cohere, xai, mistral)\n- ✅ Deleted 35+ duplicate test files across remaining providers\n\n## Adding New Core Tests\n\n1. Create test file in `tests/llm/test_core_providers/`\n2. Use `provider_config` parameter in test functions\n3. Extract `model, mode = provider_config`\n4. Create client with `instructor.from_provider(model, mode=mode)`\n5. Write provider-agnostic assertions\n\n## Adding New Providers\n\nTo add a new provider to core tests:\n\n1. Update `PROVIDER_CONFIGS` in `tests/llm/shared_config.py`\n2. Add tuple: `(\"provider/model-name\", instructor.Mode.PROVIDER_SPECIFIC_MODE, \"API_KEY_ENV_VAR\", \"package.name\")`\n3. Pick the mode that matches the provider's client (see `instructor.Mode` or the provider guide).\n4. Tests will automatically run against the new provider!\n"
  },
  {
    "path": "tests/llm/test_core_providers/__init__.py",
    "content": "\"\"\"Core provider tests - shared test suite for OpenAI, Anthropic, and Google.\"\"\"\n"
  },
  {
    "path": "tests/llm/test_core_providers/capabilities.py",
    "content": "\"\"\"\nProvider capability definitions for test skipping.\n\nThis module defines which capabilities each provider supports, allowing tests\nto skip when a provider doesn't support a required feature.\n\"\"\"\n\nfrom typing import Literal\nimport instructor\n\n# Capability types\nCapability = Literal[\n    \"streaming\",\n    \"partial_streaming\",\n    \"iterable_streaming\",\n    \"list_extraction\",\n    \"nested_models\",\n    \"validation\",\n    \"response_model_none\",\n    \"create_with_completion\",\n    \"union_types\",\n    \"enum_types\",\n    \"union_streaming\",\n]\n\n# Provider capabilities mapping\n# Format: provider_name -> set of supported capabilities\nPROVIDER_CAPABILITIES: dict[str, set[Capability]] = {\n    \"openai\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"response_model_none\",\n        \"create_with_completion\",\n    },\n    \"anthropic\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"response_model_none\",\n        \"create_with_completion\",\n    },\n    \"google\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"response_model_none\",\n        \"create_with_completion\",\n        # Note: Gemini doesn't support Union types or Enum types, only Optional\n        # Also doesn't support union streaming\n    },\n    \"cohere\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"response_model_none\",\n        \"create_with_completion\",\n    },\n    \"xai\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        # list_extraction may have issues with tool_calls\n        \"nested_models\",\n        \"validation\",\n        \"response_model_none\",\n        \"create_with_completion\",\n    },\n    \"mistral\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"create_with_completion\",\n    },\n    \"cerebras\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"create_with_completion\",\n    },\n    \"fireworks\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"create_with_completion\",\n    },\n    \"writer\": {\n        \"streaming\",\n        \"partial_streaming\",\n        \"iterable_streaming\",\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"create_with_completion\",\n    },\n    \"perplexity\": {\n        # Limited streaming support\n        \"list_extraction\",\n        \"nested_models\",\n        \"validation\",\n        \"create_with_completion\",\n    },\n}\n\n\ndef get_provider_name(model_string: str) -> str:\n    \"\"\"Extract provider name from model string (e.g., 'openai/gpt-4' -> 'openai').\"\"\"\n    return model_string.split(\"/\")[0]\n\n\ndef provider_supports(\n    provider_config: tuple[str, instructor.Mode], capability: Capability\n) -> bool:\n    \"\"\"\n    Check if a provider supports a specific capability.\n\n    Args:\n        provider_config: Tuple of (model_string, mode)\n        capability: The capability to check\n\n    Returns:\n        True if the provider supports the capability, False otherwise\n    \"\"\"\n    model_string, _ = provider_config\n    provider_name = get_provider_name(model_string)\n    capabilities = PROVIDER_CAPABILITIES.get(provider_name, set())\n    return capability in capabilities\n\n\ndef skip_if_unsupported(\n    provider_config: tuple[str, instructor.Mode], capability: Capability\n):\n    \"\"\"\n    Skip test if provider doesn't support the capability.\n\n    Args:\n        provider_config: Tuple of (model_string, mode)\n        capability: The capability required for the test\n    \"\"\"\n    import pytest\n\n    if not provider_supports(provider_config, capability):\n        model_string, mode = provider_config\n        provider_name = get_provider_name(model_string)\n        pytest.skip(\n            f\"{provider_name} does not support {capability} \"\n            f\"(model: {model_string}, mode: {mode})\"\n        )\n"
  },
  {
    "path": "tests/llm/test_core_providers/conftest.py",
    "content": "\"\"\"\nConfiguration for core provider tests (OpenAI, Anthropic, Google).\n\"\"\"\n\nfrom tests.llm.shared_config import pytest_configure, pytest_generate_tests  # noqa: F401\n"
  },
  {
    "path": "tests/llm/test_core_providers/test_basic_extraction.py",
    "content": "\"\"\"\nBasic extraction tests that run across all core providers.\n\nTests basic functionality like simple extraction, lists, and nested models.\n\"\"\"\n\nfrom pydantic import BaseModel, Field\nimport pytest\nimport instructor\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass UserList(BaseModel):\n    users: list[User]\n\n\nclass Address(BaseModel):\n    street: str\n    city: str\n    country: str\n\n\nclass UserWithAddress(BaseModel):\n    name: str\n    age: int\n    address: Address\n\n\n@pytest.mark.asyncio\nasync def test_simple_extraction(provider_config):\n    \"\"\"Test simple single object extraction.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    user = await client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Extract: Jason is 25 years old\"}],\n    )\n\n    assert isinstance(user, User)\n    assert user.name == \"Jason\"\n    assert user.age == 25\n\n\n@pytest.mark.asyncio\nasync def test_list_extraction(provider_config):\n    \"\"\"Test extracting multiple objects in a list.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    result = await client.create(\n        response_model=list[User],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract: Alice is 30, Bob is 25, Charlie is 35\",\n            }\n        ],\n    )\n\n    assert isinstance(result, list)\n    assert len(result) == 3\n    assert {user.name for user in result} == {\"Alice\", \"Bob\", \"Charlie\"}\n    assert {user.age for user in result} == {30, 25, 35}\n\n\n@pytest.mark.asyncio\nasync def test_nested_model_extraction(provider_config):\n    \"\"\"Test extracting nested models.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    user = await client.create(\n        response_model=UserWithAddress,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract: John Doe, 28 years old, lives at 123 Main St, Springfield, USA\",\n            }\n        ],\n    )\n\n    assert isinstance(user, UserWithAddress)\n    assert user.name == \"John Doe\"\n    assert user.age == 28\n    assert isinstance(user.address, Address)\n    assert user.address.street == \"123 Main St\"\n    assert user.address.city == \"Springfield\"\n    assert user.address.country == \"USA\"\n\n\n@pytest.mark.asyncio\nasync def test_extraction_with_field_descriptions(provider_config):\n    \"\"\"Test extraction with Pydantic Field descriptions.\"\"\"\n    model, mode = provider_config\n\n    class Product(BaseModel):\n        name: str = Field(description=\"Name of the product\")\n        price: float = Field(description=\"Price in USD\")\n        in_stock: bool = Field(description=\"Whether the product is in stock\")\n\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    product = await client.create(\n        response_model=Product,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"iPhone 15 Pro costs $999 and is currently available\",\n            }\n        ],\n    )\n\n    assert isinstance(product, Product)\n    assert \"iphone\" in product.name.lower() or \"iPhone\" in product.name\n    assert product.price == 999.0\n    assert product.in_stock is True\n"
  },
  {
    "path": "tests/llm/test_core_providers/test_response_modes.py",
    "content": "\"\"\"\nResponse mode tests that run across all core providers.\n\nTests different response modes and methods available on the client.\n\"\"\"\n\nfrom pydantic import BaseModel\nimport pytest\nimport instructor\n\nfrom .capabilities import skip_if_unsupported\n\n\nclass Task(BaseModel):\n    title: str\n    description: str\n    priority: int\n\n\n@pytest.mark.asyncio\nasync def test_create_method(provider_config):\n    \"\"\"Test the create() method.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    task = await client.create(\n        response_model=Task,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a task: Fix bug in login, high priority (9)\",\n            }\n        ],\n    )\n\n    assert isinstance(task, Task)\n    assert \"bug\" in task.title.lower() or \"login\" in task.title.lower()\n    assert task.priority == 9\n\n\n@pytest.mark.asyncio\nasync def test_chat_completions_create_method(provider_config):\n    \"\"\"Test the chat.completions.create() method.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    task = await client.chat.completions.create(\n        response_model=Task,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Task: Update documentation, medium priority (5)\",\n            }\n        ],\n    )\n\n    assert isinstance(task, Task)\n    assert task.priority == 5\n\n\n@pytest.mark.asyncio\nasync def test_messages_create_method(provider_config):\n    \"\"\"Test the messages.create() method.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    task = await client.messages.create(\n        response_model=Task,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Task: Review PR, low priority (3)\",\n            }\n        ],\n    )\n\n    assert isinstance(task, Task)\n    assert task.priority == 3\n\n\n@pytest.mark.asyncio\nasync def test_create_with_completion(provider_config):\n    \"\"\"Test create_with_completion() returns both model and raw response.\"\"\"\n    skip_if_unsupported(provider_config, \"create_with_completion\")\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    task, completion = await client.chat.completions.create_with_completion(\n        response_model=Task,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Task: Deploy to production, priority 10\",\n            }\n        ],\n    )\n\n    assert isinstance(task, Task)\n    assert task.priority == 10\n    # completion should be the raw response object from the provider\n    assert completion is not None\n\n\n@pytest.mark.asyncio\nasync def test_response_model_none(provider_config):\n    \"\"\"Test that response_model=None returns raw response.\"\"\"\n    skip_if_unsupported(provider_config, \"response_model_none\")\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.messages.create(\n        response_model=None,\n        messages=[{\"role\": \"user\", \"content\": \"Say hello!\"}],\n    )\n\n    # Should return raw provider response\n    assert response is not None\n"
  },
  {
    "path": "tests/llm/test_core_providers/test_retries.py",
    "content": "\"\"\"\nRetry and error handling tests that run across all core providers.\n\"\"\"\n\nfrom pydantic import BaseModel, Field, field_validator\nimport pytest\nimport instructor\n\n\nclass ValidatedUser(BaseModel):\n    name: str\n    age: int = Field(ge=0, le=120)\n\n    @field_validator(\"name\")\n    @classmethod\n    def name_must_have_content(cls, v: str) -> str:\n        if not v or not v.strip():\n            raise ValueError(\"Name must not be empty\")\n        return v.strip()\n\n\n@pytest.mark.asyncio\nasync def test_max_retries_parameter(provider_config):\n    \"\"\"Test that max_retries parameter is accepted and works.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    user = await client.create(\n        response_model=ValidatedUser,\n        messages=[{\"role\": \"user\", \"content\": \"Create a user: John Smith, age 30\"}],\n        max_retries=3,\n    )\n\n    assert isinstance(user, ValidatedUser)\n    assert user.name.strip() == \"John Smith\"\n    assert 0 <= user.age <= 120\n\n\n@pytest.mark.asyncio\nasync def test_validation_with_retries(provider_config):\n    \"\"\"Test that validation errors trigger retries (if supported).\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    # This should work after potential retries\n    user = await client.create(\n        response_model=ValidatedUser,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract: Sarah Johnson is 25 years old\",\n            }\n        ],\n        max_retries=2,\n    )\n\n    assert isinstance(user, ValidatedUser)\n    assert user.age >= 0 and user.age <= 120\n"
  },
  {
    "path": "tests/llm/test_core_providers/test_simple_types.py",
    "content": "\"\"\"Test simple type extraction across all providers.\n\nTests that basic Python types (int, bool, str, Literal, Union, Enum) work\nconsistently across all providers using from_provider().\n\"\"\"\n\nimport enum\nfrom typing import Annotated, Literal, Union\n\nimport pytest\nfrom pydantic import Field\n\nimport instructor\nfrom .capabilities import skip_if_unsupported\n\n\n@pytest.mark.asyncio\nasync def test_int(provider_config):\n    \"\"\"Test extracting int response.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=int,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Return the number 42\",\n            },\n        ],\n    )\n    assert isinstance(response, int)\n\n\n@pytest.mark.asyncio\nasync def test_bool(provider_config):\n    \"\"\"Test extracting bool response.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=bool,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Is the sky blue? Answer true or false\",\n            },\n        ],\n    )\n    assert isinstance(response, bool)\n\n\n@pytest.mark.asyncio\nasync def test_str(provider_config):\n    \"\"\"Test extracting str response.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=str,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Say 'hello world'\",\n            },\n        ],\n    )\n    assert isinstance(response, str)\n\n\n@pytest.mark.asyncio\nasync def test_literal(provider_config):\n    \"\"\"Test extracting Literal type response.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=Literal[\"red\", \"green\", \"blue\"],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Pick one of these colors: red, green, or blue\",\n            },\n        ],\n    )\n    assert response in [\"red\", \"green\", \"blue\"]\n\n\n@pytest.mark.asyncio\nasync def test_union(provider_config):\n    \"\"\"Test extracting Union type response.\"\"\"\n    skip_if_unsupported(provider_config, \"union_types\")\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=Union[int, str],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Return either a number or a string\",\n            },\n        ],\n    )\n    assert isinstance(response, (int, str))\n\n\n@pytest.mark.asyncio\nasync def test_enum(provider_config):\n    \"\"\"Test extracting Enum type response.\"\"\"\n    skip_if_unsupported(provider_config, \"enum_types\")\n\n    class Color(enum.Enum):\n        RED = \"red\"\n        GREEN = \"green\"\n        BLUE = \"blue\"\n\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=Color,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Pick one color: red, green, or blue\",\n            },\n        ],\n    )\n    assert response in [Color.RED, Color.GREEN, Color.BLUE]\n\n\n@pytest.mark.asyncio\nasync def test_annotated_int(provider_config):\n    \"\"\"Test extracting Annotated[int] with Field description.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    response = await client.create(\n        response_model=Annotated[int, Field(description=\"A random number\")],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Give me a random number\",\n            },\n        ],\n    )\n    assert isinstance(response, int)\n"
  },
  {
    "path": "tests/llm/test_core_providers/test_streaming.py",
    "content": "\"\"\"\nStreaming tests that run across all core providers.\n\nTests streaming functionality including Partial and Iterable.\n\"\"\"\n\nfrom collections.abc import Iterable\nfrom pydantic import BaseModel\nfrom typing import Union, Literal\nimport pytest\nimport instructor\nfrom instructor.dsl.partial import Partial\nfrom .capabilities import skip_if_unsupported\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass Weather(BaseModel):\n    location: str\n    temperature: int\n    units: Literal[\"celsius\", \"fahrenheit\"]\n\n\nclass SearchQuery(BaseModel):\n    query: str\n    category: str\n\n\n@pytest.mark.asyncio\nasync def test_partial_streaming(provider_config):\n    \"\"\"Test partial streaming with incremental updates.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    updates = []\n    async for partial_user in await client.create(\n        response_model=Partial[User],\n        messages=[{\"role\": \"user\", \"content\": \"Jason Liu is 30 years old\"}],\n        stream=True,\n    ):\n        assert isinstance(partial_user, User)\n        updates.append(partial_user)\n\n    # Should receive at least one update\n    assert len(updates) >= 1\n\n    # Final update should have complete data\n    final = updates[-1]\n    assert final.name == \"Jason Liu\"\n    assert final.age == 30\n\n\n@pytest.mark.asyncio\nasync def test_iterable_streaming(provider_config):\n    \"\"\"Test streaming multiple objects with Iterable.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    users = []\n    async for user in await client.create(\n        response_model=Iterable[User],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create 3 users: Alice (25), Bob (30), Carol (35)\",\n            }\n        ],\n    ):\n        users.append(user)\n\n    assert len(users) == 3\n    assert all(isinstance(user, User) for user in users)\n    assert {user.name for user in users} == {\"Alice\", \"Bob\", \"Carol\"}\n\n\n@pytest.mark.asyncio\nasync def test_iterable_streaming_with_stream_flag(provider_config):\n    \"\"\"Test Iterable with explicit stream=True flag.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    users = []\n    async for user in await client.create(\n        response_model=Iterable[User],\n        messages=[{\"role\": \"user\", \"content\": \"Make 2 users: John (20), Jane (22)\"}],\n        stream=True,\n    ):\n        assert isinstance(user, User)\n        users.append(user)\n\n    assert len(users) == 2\n    assert {user.name for user in users} == {\"John\", \"Jane\"}\n\n\n@pytest.mark.asyncio\nasync def test_iterable_union_streaming(provider_config):\n    \"\"\"Test streaming union types with Iterable.\"\"\"\n    skip_if_unsupported(provider_config, \"union_streaming\")\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    results = []\n    async for result in await client.create(\n        response_model=Iterable[Union[Weather, SearchQuery]],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What's the weather in NYC and search for 'python tutorials'?\",\n            }\n        ],\n    ):\n        results.append(result)\n\n    assert len(results) >= 2\n    assert any(isinstance(r, Weather) for r in results)\n    assert any(isinstance(r, SearchQuery) for r in results)\n\n\n@pytest.mark.asyncio\nasync def test_create_iterable_method(provider_config):\n    \"\"\"Test create_iterable convenience method.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    users = []\n    async for user in client.chat.completions.create_iterable(\n        response_model=User,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Generate 2 users: Tom (45), Jerry (40)\",\n            }\n        ],\n    ):\n        users.append(user)\n\n    assert len(users) == 2\n    assert all(isinstance(user, User) for user in users)\n"
  },
  {
    "path": "tests/llm/test_core_providers/test_validation.py",
    "content": "\"\"\"\nValidation and retry tests that run across all core providers.\n\nTests validation logic, custom validators, and retry mechanisms.\n\"\"\"\n\nfrom pydantic import BaseModel, Field, field_validator\nimport pytest\nimport instructor\n\n\nclass UserWithValidation(BaseModel):\n    name: str = Field(description=\"User's full name\")\n    age: int = Field(description=\"User's age in years\", ge=0, le=150)\n\n    @field_validator(\"name\")\n    @classmethod\n    def name_must_not_be_empty(cls, v: str) -> str:\n        if not v.strip():\n            raise ValueError(\"Name cannot be empty\")\n        return v\n\n\nclass Email(BaseModel):\n    email: str = Field(description=\"Valid email address\")\n\n    @field_validator(\"email\")\n    @classmethod\n    def email_must_be_valid(cls, v: str) -> str:\n        if \"@\" not in v or \".\" not in v:\n            raise ValueError(\"Must be a valid email address\")\n        return v\n\n\nclass TemperatureReading(BaseModel):\n    celsius: float = Field(description=\"Temperature in Celsius\", ge=-273.15)\n    location: str\n\n\n@pytest.mark.asyncio\nasync def test_basic_validation(provider_config):\n    \"\"\"Test that basic field validation works.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    user = await client.create(\n        response_model=UserWithValidation,\n        messages=[{\"role\": \"user\", \"content\": \"John Doe is 30 years old\"}],\n    )\n\n    assert isinstance(user, UserWithValidation)\n    assert user.name == \"John Doe\"\n    assert user.age == 30\n    assert 0 <= user.age <= 150\n\n\n@pytest.mark.asyncio\nasync def test_list_with_validation(provider_config):\n    \"\"\"Test validation with lists of validated models.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    users = await client.create(\n        response_model=list[UserWithValidation],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract: Alice is 25, Bob is 30, Carol is 35\",\n            }\n        ],\n    )\n\n    assert isinstance(users, list)\n    assert len(users) == 3\n    for user in users:\n        assert isinstance(user, UserWithValidation)\n        assert 0 <= user.age <= 150\n\n\n@pytest.mark.asyncio\nasync def test_custom_validator(provider_config):\n    \"\"\"Test custom field validators work correctly.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    email = await client.create(\n        response_model=Email,\n        messages=[{\"role\": \"user\", \"content\": \"My email is john@example.com\"}],\n    )\n\n    assert isinstance(email, Email)\n    assert \"@\" in email.email\n    assert \".\" in email.email\n\n\n@pytest.mark.asyncio\nasync def test_field_constraints(provider_config):\n    \"\"\"Test Pydantic field constraints (ge, le, etc).\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    reading = await client.create(\n        response_model=TemperatureReading,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"The temperature in Paris is 20 degrees Celsius\",\n            }\n        ],\n    )\n\n    assert isinstance(reading, TemperatureReading)\n    assert reading.celsius >= -273.15  # Absolute zero constraint\n    assert reading.location == \"Paris\"\n\n\n@pytest.mark.asyncio\nasync def test_max_retries(provider_config):\n    \"\"\"Test that max_retries parameter is accepted.\"\"\"\n    model, mode = provider_config\n    client = instructor.from_provider(model, mode=mode, async_client=True)\n\n    user = await client.create(\n        response_model=UserWithValidation,\n        messages=[{\"role\": \"user\", \"content\": \"Jane Smith is 28 years old\"}],\n        max_retries=2,\n    )\n\n    assert isinstance(user, UserWithValidation)\n    assert user.name == \"Jane Smith\"\n    assert user.age == 28\n"
  },
  {
    "path": "tests/llm/test_gemini/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_gemini/conftest.py",
    "content": "import os\nimport pytest\n\nif not os.getenv(\"GOOGLE_API_KEY\"):\n    pytest.skip(\"GOOGLE_API_KEY environment variable not set\", allow_module_level=True)\n\nif not os.getenv(\"GOOGLE_GENAI_MODEL\"):\n    pytest.skip(\n        \"GOOGLE_GENAI_MODEL environment variable not set\",\n        allow_module_level=True,\n    )\n\ntry:\n    from google import genai  # noqa: F401\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\"google-genai package is not installed\", allow_module_level=True)\n"
  },
  {
    "path": "tests/llm/test_gemini/evals/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_gemini/evals/test_extract_users.py",
    "content": "import pytest\nfrom itertools import product\nfrom pydantic import BaseModel\nimport instructor\nfrom ..util import models, modes\n\n\nclass UserDetails(BaseModel):\n    name: str\n    age: int\n\n\n# Lists for models, test data, and modes\ntest_data = [\n    (\"Jason is 10\", \"Jason\", 10),\n    (\"Alice is 25\", \"Alice\", 25),\n    (\"Bob is 35\", \"Bob\", 35),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, test_data, modes))\ndef test_extract(model, data, mode):\n    sample_data, expected_name, expected_age = data\n\n    client = instructor.from_provider(model=f\"google/{model}\", mode=mode)\n\n    # Calling the extract function with the provided model, sample data, and mode\n    response = client.chat.completions.create(\n        response_model=UserDetails,\n        messages=[\n            {\"role\": \"user\", \"content\": sample_data},\n        ],\n    )\n\n    # Assertions\n    assert response.name == expected_name, (\n        f\"Expected name {expected_name}, got {response.name}\"\n    )\n    assert response.age == expected_age, (\n        f\"Expected age {expected_age}, got {response.age}\"\n    )\n"
  },
  {
    "path": "tests/llm/test_gemini/test_list_content.py",
    "content": "import os\nimport instructor\nfrom pydantic import BaseModel\nimport pytest\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass UserList(BaseModel):\n    items: list[User]\n\n\nMODEL = os.getenv(\"GOOGLE_GENAI_MODEL\", \"google/gemini-pro\")\n\n\n@pytest.mark.asyncio\nasync def test_list_of_strings():\n    client = instructor.from_provider(\n        MODEL,\n        mode=instructor.Mode.GENAI_STRUCTURED_OUTPUTS,\n        async_client=True,\n    )\n\n    users = [\n        {\n            \"name\": \"Jason\",\n            \"age\": 25,\n        },\n        {\n            \"name\": \"Elizabeth\",\n            \"age\": 12,\n        },\n        {\n            \"name\": \"Chris\",\n            \"age\": 27,\n        },\n    ]\n\n    prompt = \"\"\"\n    Extract a list of users from the following text:\n\n    {% for user in users %}\n    - Name: {{ user.name }}, Age: {{ user.age }}\n    {% endfor %}\n    \"\"\"\n\n    result = await client.chat.completions.create(\n        response_model=UserList,\n        messages=[\n            {\"role\": \"user\", \"content\": prompt},\n        ],\n        context={\"users\": users},\n    )\n\n    assert isinstance(result, UserList), \"Result should be an instance of UserList\"\n    assert isinstance(result.items, list), \"items should be a list\"\n    assert len(result.items) == 3, \"List should contain 3 items\"\n\n    names = [item.name.upper() for item in result.items]\n    assert \"JASON\" in names, \"'JASON' should be in the list\"\n    assert \"ELIZABETH\" in names, \"'ELIZABETH' should be in the list\"\n    assert \"CHRIS\" in names, \"'CHRIS' should be in the list\"\n"
  },
  {
    "path": "tests/llm/test_gemini/test_multimodal_content.py",
    "content": "import instructor\nfrom pydantic import BaseModel\nimport os\n\n\nclass Description(BaseModel):\n    relevant_speakers: list[str]\n    summary: str\n\n\ncurr_file = os.path.dirname(__file__)\nfile_path = os.path.join(curr_file, \"./test_files/sample.mp3\")\nMODEL = os.getenv(\"GOOGLE_GENAI_MODEL\", \"google/gemini-pro\")\n\n\ndef test_audio_compatability_list():\n    client = instructor.from_provider(\n        model=MODEL, mode=instructor.Mode.GENAI_STRUCTURED_OUTPUTS\n    )\n\n    # For now, we'll skip file operations since the new API might handle them differently\n    # This test might need to be updated based on the new google-genai file upload API\n    content = \"Please transcribe this recording: [audio file would go here]\"\n\n    result = client.chat.completions.create(\n        response_model=Description,\n        messages=[\n            {\"role\": \"user\", \"content\": content},\n        ],\n    )\n\n    assert isinstance(result, Description), (\n        \"Result should be an instance of Description\"\n    )\n\n\ndef test_audio_compatability_multiple_messages():\n    client = instructor.from_provider(\n        model=MODEL, mode=instructor.Mode.GENAI_STRUCTURED_OUTPUTS\n    )\n\n    # For now, we'll skip file operations since the new API might handle them differently\n    # This test might need to be updated based on the new google-genai file upload API\n\n    result = client.chat.completions.create(\n        response_model=Description,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Please transcribe this recording: [audio file would go here]\",\n            },\n        ],\n    )\n\n    assert isinstance(result, Description), (\n        \"Result should be an instance of Description\"\n    )\n"
  },
  {
    "path": "tests/llm/test_gemini/util.py",
    "content": "import os\nimport instructor\n\nmodels: list[str] = [os.getenv(\"GOOGLE_GENAI_MODEL\", \"google/gemini-pro\")]\nmodes = [instructor.Mode.GENAI_STRUCTURED_OUTPUTS]\n"
  },
  {
    "path": "tests/llm/test_genai/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_genai/conftest.py",
    "content": "# conftest.py\nimport os\nimport pytest\n\nimport instructor\n\nif not os.getenv(\"GOOGLE_API_KEY\"):\n    pytest.skip(\n        \"GOOGLE_API_KEY environment variable not set\",\n        allow_module_level=True,\n    )\n\nif not os.getenv(\"GOOGLE_GENAI_MODEL\"):\n    pytest.skip(\n        \"GOOGLE_GENAI_MODEL environment variable not set\",\n        allow_module_level=True,\n    )\n\ntry:\n    from google.genai import Client\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\"google-genai package is not installed\", allow_module_level=True)\n\n\n@pytest.fixture(scope=\"function\")\ndef client():\n    yield Client()\n\n\n@pytest.fixture(scope=\"function\")\ndef aclient():\n    yield Client()\n\n\n@pytest.fixture(scope=\"function\")\ndef genai_client():\n    # Use the recommended model for sync client, let the test set the mode\n    return instructor.from_provider(\n        os.getenv(\"GOOGLE_GENAI_MODEL\", \"google/gemini-pro\"),\n    )\n"
  },
  {
    "path": "tests/llm/test_genai/test_decimal.py",
    "content": "import pytest\nfrom decimal import Decimal\nfrom pydantic import BaseModel, field_validator\nimport instructor\nfrom .util import models, modes\n\n\nclass Receipt(BaseModel):\n    item: str\n    quantity: int\n    price: Decimal\n    total: Decimal\n\n    @field_validator(\"price\", \"total\", mode=\"before\")\n    @classmethod\n    def parse_decimals(cls, v):\n        if isinstance(v, (str, float, int)):\n            return Decimal(str(v))\n        return v\n\n\nclass Invoice(BaseModel):\n    receipts: list[Receipt]\n    grand_total: Decimal\n\n    @field_validator(\"grand_total\", mode=\"before\")\n    @classmethod\n    def parse_grand_total(cls, v):\n        if isinstance(v, (str, float, int)):\n            return Decimal(str(v))\n        return v\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_decimal_extraction(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"I bought 2 apples for $1.50 each and 3 bananas for $0.75 each. Calculate the total.\",\n            },\n        ],\n        response_model=Invoice,\n    )\n    assert isinstance(response, Invoice)\n    assert len(response.receipts) == 2\n\n    # Check apple receipt\n    apple_receipt = next(\n        (r for r in response.receipts if \"apple\" in r.item.lower()), None\n    )\n    assert apple_receipt is not None\n    assert apple_receipt.quantity == 2\n    assert isinstance(apple_receipt.price, Decimal)\n    assert isinstance(apple_receipt.total, Decimal)\n\n    # Check banana receipt\n    banana_receipt = next(\n        (r for r in response.receipts if \"banana\" in r.item.lower()), None\n    )\n    assert banana_receipt is not None\n    assert banana_receipt.quantity == 3\n    assert isinstance(banana_receipt.price, Decimal)\n    assert isinstance(banana_receipt.total, Decimal)\n\n    # Check grand total\n    assert isinstance(response.grand_total, Decimal)\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\nasync def test_decimal_extraction_async(aclient, model, mode):\n    aclient = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=True)\n    response = await aclient.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"I bought 1 coffee for $4.25 and 1 muffin for $2.75. What's the total?\",\n            },\n        ],\n        response_model=Invoice,\n    )\n    assert isinstance(response, Invoice)\n    assert len(response.receipts) == 2\n\n    # Check that all decimal fields are proper Decimal instances\n    for receipt in response.receipts:\n        assert isinstance(receipt.price, Decimal)\n        assert isinstance(receipt.total, Decimal)\n\n    assert isinstance(response.grand_total, Decimal)\n\n\nclass SimpleProduct(BaseModel):\n    name: str\n    price: Decimal\n\n    @field_validator(\"price\", mode=\"before\")\n    @classmethod\n    def parse_price(cls, v):\n        if isinstance(v, (str, float, int)):\n            return Decimal(str(v))\n        return v\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_simple_decimal_extraction(client, model, mode):\n    \"\"\"Test simple decimal extraction to ensure schema conversion works\"\"\"\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"The laptop costs $999.99\",\n            },\n        ],\n        response_model=SimpleProduct,\n    )\n    assert isinstance(response, SimpleProduct)\n    assert response.name.lower() == \"laptop\"\n    assert isinstance(response.price, Decimal)\n    assert response.price == Decimal(\"999.99\")\n"
  },
  {
    "path": "tests/llm/test_genai/test_format.py",
    "content": "import pytest\nfrom pydantic import BaseModel\nimport instructor\nfrom .util import models, modes\nfrom itertools import product\nfrom google import genai\nfrom google.genai import types\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass Users(BaseModel):\n    users: list[User]\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_simple_string_message(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        messages=[\"Ivan is 28 years old\"],  # type: ignore\n        response_model=Users,\n    )\n    assert isinstance(response, Users)\n    assert len(response.users) > 0\n    assert response.users[0].name == \"Ivan\"\n    assert response.users[0].age == 28\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_system_prompt(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"Ivan is 28 years old\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Make sure that the response is a list of users\",\n            },\n        ],\n        response_model=Users,\n    )\n    assert isinstance(response, Users)\n    assert len(response.users) > 0\n    assert response.users[0].name == \"Ivan\"\n    assert response.users[0].age == 28\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_system_kwarg(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        system=\"Ivan is 28 years old\",\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Make sure that the response is a list of users\",\n            },\n        ],\n        response_model=Users,\n    )\n    assert isinstance(response, Users)\n    assert len(response.users) > 0\n    assert response.users[0].name == \"Ivan\"\n    assert response.users[0].age == 28\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_system_kwarg_genai(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        system=\"Ivan is 28 years old\",\n        messages=[\n            genai.types.Content(\n                role=\"user\",\n                parts=[\n                    genai.types.Part.from_text(\n                        text=\"Make sure that the response is a list of users\"\n                    )\n                ],\n            ),\n        ],\n        response_model=Users,\n    )\n    assert isinstance(response, Users)\n    assert len(response.users) > 0\n    assert response.users[0].name == \"Ivan\"\n    assert response.users[0].age == 28\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_system_prompt_list(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": [\n                    \"Ivan is\",\n                    \" 28 years old\",\n                ],\n            },  # type: ignore\n            {\n                \"role\": \"user\",\n                \"content\": \"Make sure that the response is a list of users\",\n            },\n        ],\n        response_model=Users,\n    )\n    assert isinstance(response, Users)\n    assert len(response.users) > 0\n    assert response.users[0].name == \"Ivan\"\n    assert response.users[0].age == 28\n\n\n@pytest.mark.parametrize(\"model\", models)\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_format_genai_typed(client, model, mode):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n    response = client.chat.completions.create(\n        model=model,\n        response_model=User,\n        messages=[\n            types.Content(\n                role=\"user\",\n                parts=[\n                    types.Part.from_text(text=\"Extract {{name}} is {{age}} years old\")\n                ],\n            ),  # type: ignore\n        ],\n        context={\"name\": \"Jason\", \"age\": 25},\n    )\n    assert isinstance(response, User)\n    assert response.name == \"Jason\"\n    assert response.age == 25\n\n\n@pytest.mark.parametrize(\"model, mode, is_list\", product(models, modes, [True, False]))\ndef test_format_string(client, model: str, mode: instructor.Mode, is_list: bool):\n    client = instructor.from_provider(f\"google/{model}\", mode=mode, async_client=False)\n\n    content = (\n        [\"Extract {{name}} is {{age}} years old.\"]\n        if is_list\n        else \"Extract {{name}} is {{age}} years old.\"\n    )\n\n    resp = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            }\n        ],\n        response_model=User,\n        context={\"name\": \"Jason\", \"age\": 25},\n    )\n\n    assert isinstance(resp, User)\n    assert resp.name == \"Jason\"\n    assert resp.age == 25\n"
  },
  {
    "path": "tests/llm/test_genai/test_invalid_schema.py",
    "content": "import os\nimport pytest\nfrom typing import Optional, Union\n\nimport instructor\nfrom pydantic import BaseModel\nfrom .util import models, modes\nfrom itertools import product\nfrom instructor.providers.gemini.utils import map_to_gemini_function_schema\n\nMODEL = os.getenv(\"GOOGLE_GENAI_MODEL\", \"google/gemini-pro\")\n\n\n@pytest.mark.parametrize(\"mode,model\", product(modes, models))\ndef test_nested(mode, model):\n    \"\"\"Test that nested schemas are supported.\"\"\"\n    client = instructor.from_provider(f\"google/{model}\", mode=mode)\n\n    class Address(BaseModel):\n        street: str\n        city: str\n\n    class Person(BaseModel):\n        name: str\n        address: Optional[Address] = None\n\n    resp = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"John loves to go gardenning with his friends\",\n            }\n        ],\n        response_model=Person,\n    )\n\n    assert resp.name == \"John\"  # type: ignore\n    assert resp.address is None  # type: ignore\n\n\n@pytest.mark.parametrize(\"mode,model\", product(modes, models))\ndef test_union(mode, model):\n    \"\"\"Test that union types are now supported with Gemini (issue #1964).\"\"\"\n    client = instructor.from_provider(f\"google/{model}\", mode=mode)\n\n    class UserData(BaseModel):\n        name: str\n        id_value: Union[str, int]\n\n    # Union types are now supported by Google GenAI SDK\n    # See: https://github.com/googleapis/python-genai/issues/447\n    response = client.chat.completions.create(\n        model=model,\n        messages=[{\"role\": \"user\", \"content\": \"User name is Alice with ID 12345\"}],\n        response_model=UserData,\n    )\n\n    assert response.name == \"Alice\"\n    # The ID could be returned as either str or int\n    assert response.id_value in [\"12345\", 12345]\n\n\ndef test_optional_types_allowed():\n    \"\"\"Test that Optional types are correctly mapped and don't throw errors.\"\"\"\n\n    class User(BaseModel):\n        name: str\n        age: Optional[int] = None\n        email: Optional[str] = None\n\n    schema = User.model_json_schema()\n    # Should not raise an error\n    result = map_to_gemini_function_schema(schema)\n\n    assert result[\"properties\"][\"age\"][\"nullable\"] is True\n    assert result[\"properties\"][\"email\"][\"nullable\"] is True\n    assert result[\"required\"] == [\"name\"]\n\n\ndef test_union_types_allowed_schema():\n    \"\"\"Test that Union types are now allowed in schema mapping (issue #1964).\"\"\"\n\n    class UserWithUnion(BaseModel):\n        name: str\n        value: Union[int, str]\n\n    schema = UserWithUnion.model_json_schema()\n\n    # Union types are now supported - should not raise\n    result = map_to_gemini_function_schema(schema)\n\n    # The anyOf structure should be preserved\n    assert \"properties\" in result\n    assert \"value\" in result[\"properties\"]\n    assert \"anyOf\" in result[\"properties\"][\"value\"]\n\n\n@pytest.mark.parametrize(\n    \"mode\", [instructor.Mode.GENAI_STRUCTURED_OUTPUTS, instructor.Mode.GENAI_TOOLS]\n)\ndef test_genai_api_call_with_different_types(mode):\n    \"\"\"Test actual API call with genai SDK using different types.\"\"\"\n\n    class UserProfile(BaseModel):\n        name: str\n        age: int\n        email: Optional[str] = None\n        is_premium: bool\n        score: float\n\n    client = instructor.from_provider(MODEL, mode=mode)\n\n    response = client.chat.completions.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a user profile for John Doe, 25 years old, premium user with score 85.5\",\n            }\n        ],\n        response_model=UserProfile,\n    )\n\n    assert isinstance(response, UserProfile)\n    assert response.name == \"John Doe\"\n    assert response.email is None\n\n\n@pytest.mark.parametrize(\n    \"mode\", [instructor.Mode.GENAI_STRUCTURED_OUTPUTS, instructor.Mode.GENAI_TOOLS]\n)\ndef test_genai_api_call_with_nested_models(mode):\n    \"\"\"Test API call with nested models (multiple users).\"\"\"\n\n    class User(BaseModel):\n        name: str\n        age: int\n        department: Optional[str] = None\n\n    class UserList(BaseModel):\n        users: list[User]\n\n    client = instructor.from_provider(MODEL, mode=mode)\n\n    response = client.chat.completions.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a list of 3 employees: Alice (30, Engineering), Bob (25, Marketing), Charlie (35)\",\n            }\n        ],\n        response_model=UserList,\n    )\n\n    assert isinstance(response, UserList)\n    assert len(response.users) == 3\n    assert {user.name for user in response.users} == {\"Alice\", \"Bob\", \"Charlie\"}\n    assert {user.age for user in response.users} == {25, 30, 35}\n    assert {user.department for user in response.users} == {\n        None,\n        \"Engineering\",\n        \"Marketing\",\n    }\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\n    \"mode\", [instructor.Mode.GENAI_STRUCTURED_OUTPUTS, instructor.Mode.GENAI_TOOLS]\n)\nasync def test_genai_api_call_with_different_types_async(mode):\n    \"\"\"Test actual async API call with genai SDK using different types.\"\"\"\n\n    class UserProfile(BaseModel):\n        name: str\n        age: int\n        email: Optional[str] = None\n        is_premium: bool\n        score: float\n\n    client = instructor.from_provider(MODEL, mode=mode, async_client=True)\n\n    response = await client.chat.completions.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a user profile for John Doe, 25 years old, premium user with score 85.5\",\n            }\n        ],\n        response_model=UserProfile,\n    )\n\n    assert isinstance(response, UserProfile)\n    assert response.name == \"John Doe\"\n    assert response.email is None\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\n    \"mode\", [instructor.Mode.GENAI_STRUCTURED_OUTPUTS, instructor.Mode.GENAI_TOOLS]\n)\nasync def test_genai_api_call_with_nested_models_async(mode):\n    \"\"\"Test async API call with nested models (multiple users).\"\"\"\n\n    class User(BaseModel):\n        name: str\n        age: int\n        department: Optional[str] = None\n\n    class UserList(BaseModel):\n        users: list[User]\n\n    client = instructor.from_provider(MODEL, mode=mode, async_client=True)\n\n    response = await client.chat.completions.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Create a list of 3 employees: Alice (30, Engineering), Bob (25, Marketing), Charlie (35)\",\n            }\n        ],\n        response_model=UserList,\n    )\n\n    assert isinstance(response, UserList)\n    assert len(response.users) == 3\n    assert {user.name for user in response.users} == {\"Alice\", \"Bob\", \"Charlie\"}\n    assert {user.age for user in response.users} == {25, 30, 35}\n    assert {user.department for user in response.users} == {\n        None,\n        \"Engineering\",\n        \"Marketing\",\n    }\n"
  },
  {
    "path": "tests/llm/test_genai/test_reask.py",
    "content": "import os\nimport pytest\nfrom pydantic import BaseModel, field_validator\nimport instructor\n\n\n@pytest.mark.parametrize(\"mode\", [instructor.Mode.GENAI_TOOLS])\ndef test_genai_tools_validation_retry_preserves_model_content(mode):\n    \"\"\"Ensure GENAI_TOOLS validation retries are wired end-to-end.\"\"\"\n    from instructor.core.exceptions import InstructorRetryException\n\n    model = os.getenv(\"GOOGLE_GENAI_MODEL\", \"gemini-2.0-flash\")\n\n    class AlwaysInvalid(BaseModel):\n        value: int\n\n        @field_validator(\"value\")\n        @classmethod\n        def always_fail(cls, v: int) -> int:  # noqa: ARG003\n            raise ValueError(\"force retry for reask validation coverage\")\n\n    client = instructor.from_provider(f\"google/{model}\", mode=mode)\n    with pytest.raises(InstructorRetryException) as exc_info:\n        client.chat.completions.create(\n            model=model,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"Return any integer value\",\n                }\n            ],\n            response_model=AlwaysInvalid,\n            max_retries=2,\n        )\n\n    assert exc_info.value.n_attempts == 2\n"
  },
  {
    "path": "tests/llm/test_genai/test_schema_conversion.py",
    "content": "\"\"\"Test schema conversion functions for Gemini.\"\"\"\n\nfrom enum import Enum\nfrom typing import Optional\nfrom pydantic import BaseModel\n\nfrom instructor.providers.gemini.utils import (\n    map_to_gemini_function_schema,\n    verify_no_unions,\n)\n\n\nclass Priority(Enum):\n    LOW = \"low\"\n    MEDIUM = \"medium\"\n    HIGH = \"high\"\n\n\nclass SimpleModel(BaseModel):\n    name: str\n    age: int\n    is_active: bool\n\n\nclass OptionalModel(BaseModel):\n    name: str\n    age: Optional[int] = None\n    description: Optional[str] = None\n\n\nclass EnumModel(BaseModel):\n    name: str\n    priority: Priority\n\n\nclass NestedModel(BaseModel):\n    name: str\n    items: list[str]\n    details: SimpleModel\n\n\ndef test_simple_schema_conversion():\n    \"\"\"Test conversion strips extra pydantic fields like 'title'.\"\"\"\n    schema = SimpleModel.model_json_schema()\n    result = map_to_gemini_function_schema(schema)\n\n    # Input has 'title' fields that should be stripped out\n    assert \"title\" in schema\n    assert \"title\" in schema[\"properties\"][\"name\"]\n\n    # Output should be clean without title fields\n    expected = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"name\": {\"type\": \"string\"},\n            \"age\": {\"type\": \"integer\"},\n            \"is_active\": {\"type\": \"boolean\"},\n        },\n        \"required\": [\"name\", \"age\", \"is_active\"],\n    }\n\n    assert result == expected\n\n\ndef test_optional_schema_conversion():\n    \"\"\"Test conversion transforms anyOf[T, null] to nullable fields.\"\"\"\n    schema = OptionalModel.model_json_schema()\n    result = map_to_gemini_function_schema(schema)\n\n    # Input should have anyOf with null type for optional fields\n    assert schema[\"properties\"][\"age\"][\"anyOf\"] == [\n        {\"type\": \"integer\"},\n        {\"type\": \"null\"},\n    ]\n    assert schema[\"properties\"][\"description\"][\"anyOf\"] == [\n        {\"type\": \"string\"},\n        {\"type\": \"null\"},\n    ]\n\n    # Output should convert to nullable: true\n    expected = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"name\": {\"type\": \"string\"},\n            \"age\": {\"type\": \"integer\", \"nullable\": True},\n            \"description\": {\"type\": \"string\", \"nullable\": True},\n        },\n        \"required\": [\"name\"],\n    }\n\n    assert result == expected\n\n\ndef test_enum_schema_conversion():\n    \"\"\"Test conversion resolves $refs and adds format: enum.\"\"\"\n    schema = EnumModel.model_json_schema()\n    result = map_to_gemini_function_schema(schema)\n\n    # Input should have $ref and $defs\n    assert schema[\"properties\"][\"priority\"][\"$ref\"] == \"#/$defs/Priority\"\n    assert \"$defs\" in schema\n    assert schema[\"$defs\"][\"Priority\"][\"enum\"] == [\"low\", \"medium\", \"high\"]\n\n    # Output should resolve the ref and add format: enum\n    expected = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"name\": {\"type\": \"string\"},\n            \"priority\": {\n                \"type\": \"string\",\n                \"enum\": [\"low\", \"medium\", \"high\"],\n                \"format\": \"enum\",\n            },\n        },\n        \"required\": [\"name\", \"priority\"],\n    }\n\n    assert result == expected\n\n\ndef test_nested_schema_conversion():\n    \"\"\"Test conversion of schema with nested objects.\"\"\"\n    schema = NestedModel.model_json_schema()\n    result = map_to_gemini_function_schema(schema)\n\n    expected = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"name\": {\"type\": \"string\"},\n            \"items\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}},\n            \"details\": {\n                \"type\": \"object\",\n                \"properties\": {\n                    \"name\": {\"type\": \"string\"},\n                    \"age\": {\"type\": \"integer\"},\n                    \"is_active\": {\"type\": \"boolean\"},\n                },\n                \"required\": [\"name\", \"age\", \"is_active\"],\n            },\n        },\n        \"required\": [\"name\", \"items\", \"details\"],\n    }\n\n    assert result == expected\n\n\ndef test_verify_no_unions_valid():\n    \"\"\"Test verify_no_unions with valid schemas.\"\"\"\n    # Simple schema should pass\n    simple_schema = SimpleModel.model_json_schema()\n    assert verify_no_unions(simple_schema) is True\n\n    # Optional schema should pass (Optional[T] is Union[T, None])\n    optional_schema = OptionalModel.model_json_schema()\n    assert verify_no_unions(optional_schema) is True\n\n\ndef test_verify_no_unions_invalid():\n    \"\"\"Test verify_no_unions with union schemas (now allowed).\"\"\"\n    # Create a schema with a true union (not just Optional)\n    invalid_schema = {\n        \"type\": \"object\",\n        \"properties\": {\"value\": {\"anyOf\": [{\"type\": \"string\"}, {\"type\": \"integer\"}]}},\n    }\n    assert verify_no_unions(invalid_schema) is True\n\n\ndef test_schema_without_refs():\n    \"\"\"Test schema conversion without $refs.\"\"\"\n    schema = {\n        \"type\": \"object\",\n        \"properties\": {\"name\": {\"type\": \"string\"}, \"count\": {\"type\": \"integer\"}},\n        \"required\": [\"name\"],\n    }\n\n    result = map_to_gemini_function_schema(schema)\n\n    expected = {\n        \"type\": \"object\",\n        \"properties\": {\"name\": {\"type\": \"string\"}, \"count\": {\"type\": \"integer\"}},\n        \"required\": [\"name\"],\n    }\n\n    assert result == expected\n\n\ndef test_schema_with_description():\n    \"\"\"Test schema conversion preserves descriptions.\"\"\"\n    schema = {\n        \"type\": \"object\",\n        \"description\": \"A test object\",\n        \"properties\": {\"name\": {\"type\": \"string\", \"description\": \"The name field\"}},\n    }\n\n    result = map_to_gemini_function_schema(schema)\n\n    expected = {\n        \"type\": \"object\",\n        \"description\": \"A test object\",\n        \"properties\": {\"name\": {\"type\": \"string\", \"description\": \"The name field\"}},\n    }\n\n    assert result == expected\n\n\ndef test_union_type_raises_error():\n    \"\"\"Test that union types are allowed in schema conversion.\"\"\"\n    # Create a model with a true union type (not Optional or Decimal)\n    union_schema = {\n        \"type\": \"object\",\n        \"properties\": {\"value\": {\"anyOf\": [{\"type\": \"string\"}, {\"type\": \"integer\"}]}},\n    }\n\n    result = map_to_gemini_function_schema(union_schema)\n    assert result[\"properties\"][\"value\"][\"anyOf\"] == [\n        {\"type\": \"string\"},\n        {\"type\": \"integer\"},\n    ]\n\n\ndef test_verify_no_unions_allows_optional():\n    \"\"\"Test that verify_no_unions allows Optional types.\"\"\"\n    # Schema with Optional field (Union with null)\n    optional_schema = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"name\": {\"type\": \"string\"},\n            \"age\": {\"anyOf\": [{\"type\": \"integer\"}, {\"type\": \"null\"}]},\n        },\n    }\n\n    assert verify_no_unions(optional_schema) is True\n\n\ndef test_verify_no_unions_allows_decimal():\n    \"\"\"Test that verify_no_unions allows Decimal types (string | number).\"\"\"\n    # Schema with Decimal field (Union of string and number)\n    decimal_schema = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"total\": {\"anyOf\": [{\"type\": \"number\"}, {\"type\": \"string\"}]},\n            \"price\": {\n                \"anyOf\": [{\"type\": \"string\"}, {\"type\": \"number\"}]\n            },  # Order shouldn't matter\n        },\n    }\n\n    assert verify_no_unions(decimal_schema) is True\n\n\ndef test_verify_no_unions_rejects_other_unions():\n    \"\"\"Test that verify_no_unions allows non-Optional unions.\"\"\"\n    # Schema with unsupported union type (string | integer)\n    union_schema = {\n        \"type\": \"object\",\n        \"properties\": {\"value\": {\"anyOf\": [{\"type\": \"string\"}, {\"type\": \"integer\"}]}},\n    }\n\n    assert verify_no_unions(union_schema) is True\n\n\ndef test_verify_no_unions_rejects_complex_unions():\n    \"\"\"Test that verify_no_unions allows complex union types.\"\"\"\n    # Schema with more than 2 types in union\n    complex_union_schema = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"value\": {\n                \"anyOf\": [{\"type\": \"string\"}, {\"type\": \"integer\"}, {\"type\": \"boolean\"}]\n            }\n        },\n    }\n\n    assert verify_no_unions(complex_union_schema) is True\n\n\ndef test_verify_no_unions_nested_schemas():\n    \"\"\"Test that verify_no_unions allows unions in nested schemas.\"\"\"\n    # Schema with nested object containing Decimal and Optional fields\n    nested_schema = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"receipt\": {\n                \"type\": \"object\",\n                \"properties\": {\n                    \"total\": {\n                        \"anyOf\": [{\"type\": \"number\"}, {\"type\": \"string\"}]\n                    },  # Decimal - should pass\n                    \"notes\": {\n                        \"anyOf\": [{\"type\": \"string\"}, {\"type\": \"null\"}]\n                    },  # Optional - should pass\n                },\n            }\n        },\n    }\n\n    assert verify_no_unions(nested_schema) is True\n\n    # Schema with nested object containing unsupported union\n    bad_nested_schema = {\n        \"type\": \"object\",\n        \"properties\": {\n            \"receipt\": {\n                \"type\": \"object\",\n                \"properties\": {\n                    \"total\": {\n                        \"anyOf\": [{\"type\": \"number\"}, {\"type\": \"string\"}]\n                    },  # Decimal - should pass\n                    \"status\": {\n                        \"anyOf\": [{\"type\": \"string\"}, {\"type\": \"integer\"}]\n                    },  # Bad union - should fail\n                },\n            }\n        },\n    }\n\n    assert verify_no_unions(bad_nested_schema) is True\n\n\ndef test_decimal_schema_conversion_succeeds():\n    \"\"\"Test that Decimal types (string | number) are successfully converted.\"\"\"\n    # Schema representing a Receipt with Decimal total field\n    decimal_schema = {\n        \"type\": \"object\",\n        \"title\": \"Receipt\",\n        \"properties\": {\n            \"total\": {\n                \"anyOf\": [{\"type\": \"number\"}, {\"type\": \"string\"}],\n                \"title\": \"Total\",\n            }\n        },\n        \"required\": [\"total\"],\n    }\n\n    # This should not raise an error now\n    result = map_to_gemini_function_schema(decimal_schema)\n\n    # The conversion should succeed and preserve the anyOf structure\n    assert result[\"type\"] == \"object\"\n    assert result[\"properties\"][\"total\"][\"anyOf\"] == [\n        {\"type\": \"number\"},\n        {\"type\": \"string\"},\n    ]\n    assert result[\"required\"] == [\"total\"]\n    # Title should be stripped out\n    assert \"title\" not in result\n    assert \"title\" not in result[\"properties\"][\"total\"]\n"
  },
  {
    "path": "tests/llm/test_genai/test_utils.py",
    "content": "from instructor.providers.gemini.utils import update_genai_kwargs\n\n\ndef test_update_genai_kwargs_basic():\n    \"\"\"Test basic parameter mapping from OpenAI to Gemini format.\"\"\"\n    kwargs = {\n        \"generation_config\": {\n            \"max_tokens\": 100,\n            \"temperature\": 0.7,\n            \"n\": 2,\n            \"top_p\": 0.9,\n            \"stop\": [\"END\"],\n            \"seed\": 42,\n            \"presence_penalty\": 0.1,\n            \"frequency_penalty\": 0.2,\n        }\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that OpenAI parameters were mapped to Gemini equivalents\n    assert result[\"max_output_tokens\"] == 100\n    assert result[\"temperature\"] == 0.7\n    assert result[\"candidate_count\"] == 2\n    assert result[\"top_p\"] == 0.9\n    assert result[\"stop_sequences\"] == [\"END\"]\n    assert result[\"seed\"] == 42\n    assert result[\"presence_penalty\"] == 0.1\n    assert result[\"frequency_penalty\"] == 0.2\n\n\ndef test_update_genai_kwargs_safety_settings():\n    \"\"\"Test that safety settings are properly configured.\"\"\"\n    from google.genai.types import HarmCategory, HarmBlockThreshold\n\n    # Exclude JAILBREAK category as it's only for Vertex AI, not google.genai\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    supported_categories = [\n        c\n        for c in HarmCategory\n        if c not in excluded_categories\n        and not c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n    ]\n\n    kwargs = {}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that safety_settings is configured as a list\n    assert \"safety_settings\" in result\n    assert isinstance(result[\"safety_settings\"], list)\n\n    # Should have one entry for each supported HarmCategory\n    assert len(result[\"safety_settings\"]) == len(supported_categories)\n\n    # Each entry should be a dict with category and threshold\n    for setting in result[\"safety_settings\"]:\n        assert isinstance(setting, dict)\n        assert \"category\" in setting\n        assert \"threshold\" in setting\n        assert setting[\"threshold\"] == HarmBlockThreshold.OFF  # Default\n\n\ndef test_update_genai_kwargs_with_custom_safety_settings():\n    \"\"\"Test that custom safety settings are properly handled.\"\"\"\n    from google.genai.types import HarmCategory, HarmBlockThreshold\n\n    # Exclude JAILBREAK category as it's only for Vertex AI, not google.genai\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    supported_categories = [\n        c\n        for c in HarmCategory\n        if c not in excluded_categories\n        and not c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n    ]\n\n    # Test with one category that exists in safety_settings\n    custom_safety = {\n        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE\n    }\n    kwargs = {\"safety_settings\": custom_safety}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that safety_settings is configured as a list\n    assert \"safety_settings\" in result\n    assert isinstance(result[\"safety_settings\"], list)\n\n    # Should have one entry for each supported HarmCategory\n    assert len(result[\"safety_settings\"]) == len(supported_categories)\n\n    for setting in result[\"safety_settings\"]:\n        if setting[\"category\"] == HarmCategory.HARM_CATEGORY_HATE_SPEECH:\n            assert setting[\"threshold\"] == HarmBlockThreshold.BLOCK_LOW_AND_ABOVE\n\n    # Other categories should use the default\n    for setting in result[\"safety_settings\"]:\n        if setting[\"category\"] != HarmCategory.HARM_CATEGORY_HATE_SPEECH:\n            assert setting[\"threshold\"] == HarmBlockThreshold.OFF\n\n\ndef test_update_genai_kwargs_safety_settings_with_image_content_uses_image_categories():\n    \"\"\"Test that image content switches to IMAGE_* harm categories when available.\"\"\"\n    from google.genai import types\n    from google.genai.types import HarmCategory\n\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    image_categories = [\n        c\n        for c in HarmCategory\n        if c not in excluded_categories and c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n    ]\n\n    # Older SDKs may not expose separate image categories.\n    if not image_categories:\n        return\n\n    kwargs = {\n        \"contents\": [\n            types.Content(\n                role=\"user\",\n                parts=[types.Part.from_bytes(data=b\"123\", mime_type=\"image/png\")],\n            )\n        ]\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    assert \"safety_settings\" in result\n    assert isinstance(result[\"safety_settings\"], list)\n    assert len(result[\"safety_settings\"]) == len(image_categories)\n    assert {s[\"category\"] for s in result[\"safety_settings\"]} == set(image_categories)\n\n\ndef test_update_genai_kwargs_maps_text_thresholds_to_image_categories():\n    \"\"\"Test that text-based safety settings are applied to equivalent IMAGE_* categories.\"\"\"\n    from google.genai import types\n    from google.genai.types import HarmCategory, HarmBlockThreshold\n\n    excluded_categories = {HarmCategory.HARM_CATEGORY_UNSPECIFIED}\n    if hasattr(HarmCategory, \"HARM_CATEGORY_JAILBREAK\"):\n        excluded_categories.add(HarmCategory.HARM_CATEGORY_JAILBREAK)\n\n    image_categories = [\n        c\n        for c in HarmCategory\n        if c not in excluded_categories and c.name.startswith(\"HARM_CATEGORY_IMAGE_\")\n    ]\n\n    if not image_categories or not hasattr(HarmCategory, \"HARM_CATEGORY_IMAGE_HATE\"):\n        return\n\n    custom_safety = {\n        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE\n    }\n\n    kwargs = {\n        \"contents\": [\n            types.Content(\n                role=\"user\",\n                parts=[types.Part.from_bytes(data=b\"123\", mime_type=\"image/png\")],\n            )\n        ],\n        \"safety_settings\": custom_safety,\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    for setting in result[\"safety_settings\"]:\n        if setting[\"category\"] == HarmCategory.HARM_CATEGORY_IMAGE_HATE:\n            assert setting[\"threshold\"] == HarmBlockThreshold.BLOCK_LOW_AND_ABOVE\n\n\ndef test_update_genai_kwargs_none_values():\n    \"\"\"Test that None values are not set in the result.\"\"\"\n    kwargs = {\n        \"generation_config\": {\n            \"max_tokens\": None,\n            \"temperature\": 0.7,\n            \"n\": None,\n        }\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that None values are not included\n    assert \"max_output_tokens\" not in result\n    assert \"candidate_count\" not in result\n    assert result[\"temperature\"] == 0.7\n\n\ndef test_update_genai_kwargs_empty():\n    \"\"\"Test with empty kwargs.\"\"\"\n    kwargs = {}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Should still have safety_settings configured\n    assert \"safety_settings\" in result\n\n\ndef test_update_genai_kwargs_preserves_original():\n    \"\"\"Test that the function doesn't modify the original kwargs.\"\"\"\n    original_kwargs = {\n        \"generation_config\": {\n            \"max_tokens\": 100,\n            \"temperature\": 0.7,\n        },\n        \"safety_settings\": {},\n    }\n    kwargs = original_kwargs.copy()\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # The function should not modify the original kwargs (works on a copy)\n    assert kwargs == original_kwargs\n    # But result should have the mapped parameters\n    assert \"max_output_tokens\" in result\n    assert \"temperature\" in result\n\n\ndef test_update_genai_kwargs_thinking_config():\n    \"\"\"Test that thinking_config is properly passed through.\"\"\"\n\n    thinking_config = {\"thinking_budget\": 1024}\n    kwargs = {\"thinking_config\": thinking_config}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that thinking_config is passed through unchanged\n    assert \"thinking_config\" in result\n    assert result[\"thinking_config\"] == thinking_config\n\n\ndef test_update_genai_kwargs_thinking_config_none():\n    \"\"\"Test that None thinking_config is not included in result.\"\"\"\n    kwargs = {\"thinking_config\": None}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that thinking_config is not included when None\n    assert \"thinking_config\" not in result\n\n\ndef test_update_genai_kwargs_no_thinking_config():\n    \"\"\"Test that missing thinking_config doesn't affect other parameters.\"\"\"\n    kwargs = {\n        \"generation_config\": {\n            \"max_tokens\": 100,\n            \"temperature\": 0.7,\n        }\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that normal parameters still work\n    assert result[\"max_output_tokens\"] == 100\n    assert result[\"temperature\"] == 0.7\n    # Check that thinking_config is not included when not provided\n    assert \"thinking_config\" not in result\n\n\ndef test_handle_genai_structured_outputs_thinking_config_in_config():\n    \"\"\"Test that thinking_config inside config parameter is extracted (issue #1966).\"\"\"\n    from google.genai import types\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_structured_outputs\n\n    class SimpleModel(BaseModel):\n        text: str\n\n    # Create a mock ThinkingConfig-like object\n    thinking_config = types.ThinkingConfig(thinking_budget=1024)\n\n    # User passes thinking_config inside config parameter\n    user_config = types.GenerateContentConfig(\n        temperature=0.7,\n        max_output_tokens=1000,\n        thinking_config=thinking_config,\n    )\n\n    kwargs = {\n        \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n        \"config\": user_config,\n    }\n\n    _, result_kwargs = handle_genai_structured_outputs(SimpleModel, kwargs)\n\n    # The resulting config should include thinking_config\n    assert \"config\" in result_kwargs\n    assert result_kwargs[\"config\"].thinking_config is not None\n    assert result_kwargs[\"config\"].thinking_config.thinking_budget == 1024\n\n\ndef test_handle_genai_structured_outputs_thinking_config_kwarg_priority():\n    \"\"\"Test that thinking_config as separate kwarg takes priority over config.thinking_config.\"\"\"\n    from google.genai import types\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_structured_outputs\n\n    class SimpleModel(BaseModel):\n        text: str\n\n    # User passes thinking_config both ways - kwarg should take priority\n    config_thinking = types.ThinkingConfig(thinking_budget=500)\n    kwarg_thinking = types.ThinkingConfig(thinking_budget=2000)\n\n    user_config = types.GenerateContentConfig(\n        temperature=0.7,\n        thinking_config=config_thinking,\n    )\n\n    kwargs = {\n        \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n        \"config\": user_config,\n        \"thinking_config\": kwarg_thinking,\n    }\n\n    _, result_kwargs = handle_genai_structured_outputs(SimpleModel, kwargs)\n\n    # The kwarg thinking_config should take priority\n    assert result_kwargs[\"config\"].thinking_config.thinking_budget == 2000\n\n\ndef test_handle_genai_tools_thinking_config_in_config():\n    \"\"\"Test that thinking_config inside config parameter is extracted for tools mode (issue #1966).\"\"\"\n    from google.genai import types\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_tools\n\n    class SimpleModel(BaseModel):\n        text: str\n\n    thinking_config = types.ThinkingConfig(thinking_budget=1024)\n\n    user_config = types.GenerateContentConfig(\n        temperature=0.7,\n        thinking_config=thinking_config,\n    )\n\n    kwargs = {\n        \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n        \"config\": user_config,\n    }\n\n    _, result_kwargs = handle_genai_tools(SimpleModel, kwargs)\n\n    # The resulting config should include thinking_config\n    assert \"config\" in result_kwargs\n    assert result_kwargs[\"config\"].thinking_config is not None\n    assert result_kwargs[\"config\"].thinking_config.thinking_budget == 1024\n"
  },
  {
    "path": "tests/llm/test_genai/util.py",
    "content": "import os\nimport instructor\n\nmodels = [os.getenv(\"GOOGLE_GENAI_MODEL\", \"google/gemini-pro\")]\nmodes = [instructor.Mode.GENAI_STRUCTURED_OUTPUTS]\n"
  },
  {
    "path": "tests/llm/test_litellm.py",
    "content": "import os\nimport pytest\nimport instructor\n\nif not os.getenv(\"OPENAI_API_KEY\"):\n    pytest.skip(\n        \"OPENAI_API_KEY environment variable not set\",\n        allow_module_level=True,\n    )\n\ntry:\n    from litellm import acompletion, completion\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\"litellm package is not installed\", allow_module_level=True)\n\n\ndef test_litellm_create():\n    client = instructor.from_litellm(completion)\n\n    assert isinstance(client, instructor.Instructor)\n\n\ndef test_async_litellm_create():\n    client = instructor.from_litellm(acompletion)\n\n    assert isinstance(client, instructor.AsyncInstructor)\n"
  },
  {
    "path": "tests/llm/test_new_client.py",
    "content": "import os\nimport pytest\n\nif not (\n    os.getenv(\"OPENAI_API_KEY\")\n    and os.getenv(\"ANTHROPIC_API_KEY\")\n    and os.getenv(\"COHERE_API_KEY\")\n):\n    pytest.skip(\n        \"Required API keys not set\",\n        allow_module_level=True,\n    )\n\ntry:\n    import cohere\n    import openai\n    import instructor\n    import anthropic\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\"Required LLM packages are not installed\", allow_module_level=True)\nfrom pydantic import BaseModel, Field\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\ndef test_client_create():\n    client = instructor.from_openai(openai.OpenAI(), model=\"gpt-3.5-turbo\")\n\n    user = client.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\ndef test_client_messages_create():\n    client = instructor.from_openai(openai.OpenAI(), model=\"gpt-3.5-turbo\")\n\n    user = client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\ndef test_client_chat_completions_create_with_response():\n    client = instructor.from_openai(openai.OpenAI(), model=\"gpt-3.5-turbo\")\n\n    user, completion = client.chat.completions.create_with_completion(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n    from openai.types.chat import ChatCompletion\n\n    assert isinstance(completion, ChatCompletion)\n\n\ndef test_client_chat_completions_create():\n    client = instructor.from_openai(openai.OpenAI(), model=\"gpt-3.5-turbo\")\n\n    user = client.chat.completions.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\ndef test_client_chat_completions_create_partial():\n    client = instructor.from_openai(openai.OpenAI(), model=\"gpt-3.5-turbo\")\n\n    for user in client.chat.completions.create_partial(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    ):\n        assert isinstance(user, User)\n\n\ndef test_client_chat_completions_create_iterable():\n    client = instructor.from_openai(openai.OpenAI(), model=\"gpt-3.5-turbo\")\n\n    users = [\n        user\n        for user in client.chat.completions.create_iterable(\n            response_model=User,\n            messages=[{\"role\": \"user\", \"content\": \"Alice is 25, Bob is 30\"}],\n            temperature=0,\n        )\n    ]\n    assert len(users) == 2\n\n\n@pytest.mark.asyncio\nasync def test_async_client_chat_completions_create():\n    client = openai.AsyncOpenAI()\n    instructor_client = instructor.from_openai(client, model=\"gpt-3.5-turbo\")\n\n    user = await instructor_client.chat.completions.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.asyncio\nasync def test_async_client_chat_completions_create_partial():\n    client = openai.AsyncOpenAI()\n    instructor_client = instructor.from_openai(client, model=\"gpt-3.5-turbo\")\n\n    async for user in instructor_client.chat.completions.create_partial(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    ):\n        assert isinstance(user, User)\n\n\n@pytest.mark.asyncio\nasync def test_async_client_chat_completions_create_iterable():\n    client = openai.AsyncOpenAI()\n    instructor_client = instructor.from_openai(client, model=\"gpt-3.5-turbo\")\n\n    async for user in instructor_client.chat.completions.create_iterable(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Alice is 25, Bob is 30\"}],\n        temperature=0,\n    ):\n        assert isinstance(user, User)\n\n\n@pytest.mark.asyncio\nasync def test_async_client_chat_completions_create_with_response():\n    client = openai.AsyncOpenAI()\n    instructor_client = instructor.from_openai(client, model=\"gpt-3.5-turbo\")\n\n    user, response = await instructor_client.chat.completions.create_with_completion(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    from openai.types.chat import ChatCompletion\n\n    assert user.name == \"Jason\"\n    assert user.age == 10\n    assert isinstance(response, ChatCompletion)\n\n\ndef test_client_from_anthropic_with_response():\n    client = instructor.from_anthropic(\n        anthropic.Anthropic(),\n        max_tokens=1000,\n        model=\"claude-3-haiku-20240307\",\n    )\n\n    user, response = client.messages.create_with_completion(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n    assert isinstance(response, anthropic.types.Message)\n\n\ndef test_client_anthropic_response():\n    client = anthropic.Anthropic()\n    instructor_client = instructor.from_anthropic(\n        client,\n        max_tokens=1000,\n        model=\"claude-3-haiku-20240307\",\n    )\n\n    user = instructor_client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.skip(reason=\"Skip for now\")\ndef test_client_anthropic_bedrock_response():\n    client = anthropic.AnthropicBedrock(\n        aws_access_key=os.getenv(\"AWS_ACCESS_KEY_ID\"),\n        aws_secret_key=os.getenv(\"AWS_SECRET_ACCESS_KEY\"),\n        aws_session_token=os.getenv(\"AWS_SESSION_TOKEN\"),\n        aws_region=os.getenv(\"AWS_REGION_NAME\"),\n    )\n\n    instructor_client = instructor.from_anthropic(\n        client,\n        max_tokens=1000,\n        model=\"anthropic.claude-3-haiku-20240307-v1:0\",\n    )\n\n    user = instructor_client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.asyncio\nasync def test_async_client_anthropic_response():\n    client = anthropic.AsyncAnthropic()\n    instructor_client = instructor.from_anthropic(\n        client,\n        max_tokens=1000,\n        model=\"claude-3-haiku-20240307\",\n    )\n\n    user = await instructor_client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.skip(reason=\"Skip for now\")\n@pytest.mark.asyncio\nasync def test_async_client_anthropic_bedrock_response():\n    client = anthropic.AsyncAnthropicBedrock(\n        aws_access_key=os.getenv(\"AWS_ACCESS_KEY_ID\"),\n        aws_secret_key=os.getenv(\"AWS_SECRET_ACCESS_KEY\"),\n        aws_session_token=os.getenv(\"AWS_SESSION_TOKEN\"),\n        aws_region=os.getenv(\"AWS_REGION_NAME\"),\n    )\n\n    instructor_client = instructor.from_anthropic(\n        client,\n        max_tokens=1000,\n        model=\"anthropic.claude-3-haiku-20240307-v1:0\",\n    )\n\n    user = await instructor_client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.skip(reason=\"Skipping if Cohere API is not available\")\ndef test_client_cohere_response():\n    client = cohere.ClientV2()\n    instructor_client = instructor.from_cohere(\n        client,\n        max_tokens=1000,\n        model=\"command-a-03-2025\",\n    )\n\n    user = instructor_client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.skip(reason=\"Skipping if Cohere API is not available\")\ndef test_client_cohere_response_with_nested_classes():\n    client = cohere.ClientV2()\n    instructor_client = instructor.from_cohere(\n        client,\n        max_tokens=1000,\n        model=\"command-a-03-2025\",\n    )\n\n    class Person(BaseModel):\n        name: str = Field(description=\"name of the person\")\n        country_of_origin: str = Field(description=\"country of origin of the person\")\n\n    class Group(BaseModel):\n        group_name: str = Field(description=\"name of the group\")\n        members: list[Person] = Field(description=\"list of members in the group\")\n\n    task = \"\"\"\\\n    Given the following text, create a Group object for 'The Beatles' band\n\n    Text:\n    The Beatles were an English rock band formed in Liverpool in 1960. With a line-up comprising John Lennon, Paul McCartney, George Harrison and Ringo Starr, they are regarded as the most influential band of all time. The group were integral to the development of 1960s counterculture and popular music's recognition as an art form.\n    \"\"\"\n    group = instructor_client.messages.create(\n        response_model=Group,\n        messages=[{\"role\": \"user\", \"content\": task}],\n        temperature=0,\n    )\n    assert group.group_name == \"The Beatles\"\n    assert len(group.members) == 4\n    assert group.members[0].name == \"John Lennon\"\n    assert group.members[1].name == \"Paul McCartney\"\n    assert group.members[2].name == \"George Harrison\"\n    assert group.members[3].name == \"Ringo Starr\"\n\n\n@pytest.mark.skip(reason=\"Skipping if Cohere API is not available\")\n@pytest.mark.asyncio\nasync def test_client_cohere_async():\n    client = cohere.AsyncClientV2()\n    instructor_client = instructor.from_cohere(\n        client,\n        max_tokens=1000,\n        model=\"command-a-03-2025\",\n    )\n\n    class Person(BaseModel):\n        name: str = Field(description=\"name of the person\")\n        country_of_origin: str = Field(description=\"country of origin of the person\")\n\n    class Group(BaseModel):\n        group_name: str = Field(description=\"name of the group\")\n        members: list[Person] = Field(description=\"list of members in the group\")\n\n    task = \"\"\"\\\n    Given the following text, create a Group object for 'The Beatles' band\n\n    Text:\n    The Beatles were an English rock band formed in Liverpool in 1960. With a line-up comprising John Lennon, Paul McCartney, George Harrison and Ringo Starr, they are regarded as the most influential band of all time. The group were integral to the development of 1960s counterculture and popular music's recognition as an art form.\n    \"\"\"\n    group = await instructor_client.messages.create(\n        response_model=Group,\n        messages=[{\"role\": \"user\", \"content\": task}],\n        temperature=0,\n    )\n    assert group.group_name == \"The Beatles\"\n    assert len(group.members) == 4\n    assert group.members[0].name == \"John Lennon\"\n    assert group.members[1].name == \"Paul McCartney\"\n    assert group.members[2].name == \"George Harrison\"\n    assert group.members[3].name == \"Ringo Starr\"\n\n\n@pytest.mark.skip(reason=\"Skip for now\")\ndef test_client_from_mistral_with_response():\n    import mistralai.client as mistralaicli\n\n    client = instructor.from_mistral(\n        mistralaicli.MistralClient(),\n        max_tokens=1000,\n        model=\"mistral-large-latest\",\n    )\n\n    user, response = client.messages.create_with_completion(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n\n\n@pytest.mark.skip(reason=\"Skip for now\")\ndef test_client_mistral_response():\n    import mistralai.client as mistralaicli\n\n    client = mistralaicli.MistralClient()\n    instructor_client = instructor.from_mistral(\n        client, max_tokens=1000, model=\"mistral-large-latest\"\n    )\n\n    user = instructor_client.messages.create(\n        response_model=User,\n        messages=[{\"role\": \"user\", \"content\": \"Jason is 10\"}],\n        temperature=0,\n    )\n    assert user.name == \"Jason\"\n    assert user.age == 10\n"
  },
  {
    "path": "tests/llm/test_openai/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_openai/conftest.py",
    "content": "# conftest.py\nimport os\nimport pytest\n\nif not os.getenv(\"OPENAI_API_KEY\"):\n    pytest.skip(\n        \"OPENAI_API_KEY environment variable not set\",\n        allow_module_level=True,\n    )\n\ntry:\n    from openai import AsyncOpenAI, OpenAI\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\"openai package is not installed\", allow_module_level=True)\n\n\n@pytest.fixture(scope=\"function\")\ndef client():\n    yield OpenAI()\n\n\n@pytest.fixture(scope=\"function\")\ndef aclient():\n    yield AsyncOpenAI()\n"
  },
  {
    "path": "tests/llm/test_openai/slow/test_response.py",
    "content": "import instructor\nfrom openai import OpenAI, AsyncOpenAI\nfrom pydantic import BaseModel\nimport pytest\nfrom collections.abc import Iterable\nfrom itertools import product\n\nmodels = [\"gpt-4.1-nano\"]\n\n\nclass UserProfile(BaseModel):\n    name: str\n    age: int\n    bio: str\n\n\nresponse_modes = [\n    instructor.Mode.RESPONSES_TOOLS,\n    instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n]\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\ndef test_basic_response_methods(client: OpenAI, mode, model):\n    instructor_client = instructor.from_openai(client, mode=mode)\n\n    # Test create\n    profile = instructor_client.responses.create(\n        model=model,\n        input=\"Generate a profile for a user named John who is 30 years old\",\n        response_model=UserProfile,\n    )\n    assert isinstance(profile, UserProfile)\n    assert profile.name == \"John\"\n    assert profile.age == 30\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\ndef test_create_iterable_from_create(client: OpenAI, mode, model):\n    instructor_client = instructor.from_openai(client, mode=mode)\n\n    # Test create\n    profiles = instructor_client.responses.create(\n        model=model,\n        input=\"Generate three fake profiles\",\n        response_model=Iterable[UserProfile],\n    )\n\n    count = 0\n    for profile in profiles:\n        assert isinstance(profile, UserProfile)\n        count += 1\n\n    assert count >= 3\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\ndef test_create_with_completion(client: OpenAI, mode, model):\n    from openai.types.responses import Response\n\n    instructor_client = instructor.from_openai(client, mode=mode)\n\n    # Test create\n    response, completion = instructor_client.responses.create_with_completion(\n        model=model,\n        input=\"Generate a profile for a user named John who is 30 years old\",\n        response_model=UserProfile,\n    )\n    assert isinstance(response, UserProfile)\n    assert response.name == \"John\"\n    assert response.age == 30\n    assert isinstance(completion, Response)\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\ndef test_create_iterable(client: OpenAI, mode, model):\n    instructor_client = instructor.from_openai(client, mode=mode)\n\n    # Test create\n    users = instructor_client.responses.create_iterable(\n        model=model,\n        input=\"generate three fake profiles\",\n        response_model=UserProfile,\n    )\n\n    count = 0\n    for user in users:\n        assert isinstance(user, UserProfile)\n        count += 1\n\n    assert count == 3\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\ndef test_create_partial(client: OpenAI, mode, model):\n    instructor_client = instructor.from_openai(client, mode=mode)\n\n    # Test create\n    resp = instructor_client.responses.create_partial(\n        model=model,\n        input=\"Generate a fake profile\",\n        response_model=UserProfile,\n    )\n\n    prev = None\n    update_count = 0\n    for user in resp:\n        assert isinstance(user, UserProfile)\n        if user != prev:\n            update_count += 1\n            prev = user\n\n    assert update_count >= 1\n\n\n# ASYNC TESTS\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\nasync def test_basic_response_methods_async(client: AsyncOpenAI, mode, model):\n    instructor_client = instructor.from_openai(client, mode=mode)\n\n    # Test create\n    profile = instructor_client.responses.create(\n        model=model,\n        input=\"Generate a profile for a user named John who is 30 years old\",\n        response_model=UserProfile,\n    )\n    assert isinstance(profile, UserProfile)\n    assert profile.name == \"John\"\n    assert profile.age == 30\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\nasync def test_create_iterable_from_create_async(aclient: AsyncOpenAI, mode, model):\n    instructor_client: instructor.AsyncInstructor = instructor.from_openai(\n        aclient, mode=mode\n    )\n\n    # Test create\n    profiles = instructor_client.responses.create(\n        model=model,\n        input=\"Generate three fake profiles\",\n        response_model=Iterable[UserProfile],\n    )\n\n    count = 0\n    async for profile in await profiles:\n        assert isinstance(profile, UserProfile)\n        count += 1\n\n    assert count >= 3\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\nasync def test_create_with_completion_async(aclient: AsyncOpenAI, mode, model):\n    from openai.types.responses import Response\n\n    instructor_client = instructor.from_openai(aclient, mode=mode)\n\n    # Test create\n    response, completion = await instructor_client.responses.create_with_completion(\n        model=model,\n        input=\"Generate a profile for a user named John who is 30 years old\",\n        response_model=UserProfile,\n    )\n    assert isinstance(response, UserProfile)\n    assert response.name == \"John\"\n    assert response.age == 30\n    assert isinstance(completion, Response)\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\nasync def test_create_iterable_async(aclient: AsyncOpenAI, mode, model):\n    instructor_client = instructor.from_openai(aclient, mode=mode)\n\n    # Test create\n    users = await instructor_client.responses.create_iterable(\n        model=model,\n        input=\"generate three fake profiles\",\n        response_model=UserProfile,\n    )\n\n    count = 0\n    async for user in users:\n        assert isinstance(user, UserProfile)\n        count += 1\n\n    assert count == 3\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, response_modes))\nasync def test_create_partial_async(aclient: AsyncOpenAI, mode, model):\n    instructor_client = instructor.from_openai(aclient, mode=mode)\n\n    # Test create\n    resp = instructor_client.responses.create_partial(\n        model=model,\n        input=\"Generate a fake profile\",\n        response_model=UserProfile,\n    )\n\n    prev = None\n    update_count = 0\n    async for user in resp:\n        assert isinstance(user, UserProfile)\n        if user != prev:\n            update_count += 1\n            prev = user\n\n    assert update_count >= 1\n"
  },
  {
    "path": "tests/llm/test_openai/test_attr.py",
    "content": "import instructor\nimport openai\nimport pytest\n\n\ndef test_has_embedding():\n    oai = openai.OpenAI()\n    client = instructor.from_openai(oai)\n\n    embedding = client.embeddings.create(\n        input=\"Hello world\", model=\"text-embedding-3-small\"\n    )\n    assert embedding is not None, \"The 'embeddings' attribute is None.\"\n\n\n@pytest.mark.asyncio\nasync def test_has_embedding_async():\n    oai = openai.AsyncOpenAI()\n    client = instructor.from_openai(oai)\n\n    # Check if the 'embeddings' attribute can be accessed through the client\n    embedding = await client.embeddings.create(\n        input=\"Hello world\", model=\"text-embedding-3-small\"\n    )\n    assert embedding is not None, \"The 'embeddings' attribute is None.\"\n"
  },
  {
    "path": "tests/llm/test_openai/test_hooks.py",
    "content": "import pytest\nimport instructor\nfrom openai import OpenAI\nimport pprint\n\n\n@pytest.fixture\ndef client():\n    return instructor.from_openai(OpenAI())\n\n\ndef log_kwargs(*args, **kwargs):\n    pprint.pprint({\"args\": args, \"kwargs\": kwargs})\n\n\ndef log_kwargs_1(*args, **kwargs):\n    pprint.pprint({\"args\": args, \"kwargs\": kwargs})\n\n\ndef log_kwargs_2(*args, **kwargs):\n    pprint.pprint({\"args\": args, \"kwargs\": kwargs})\n\n\nhook_names = [item.value for item in instructor.hooks.HookName]\nhook_enums = [instructor.hooks.HookName(hook_name) for hook_name in hook_names]\nhook_functions = [log_kwargs, log_kwargs_1, log_kwargs_2]\nhook_object = instructor.hooks.Hooks()\n\n\n@pytest.mark.parametrize(\"hook_name\", hook_names)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_on_method_str(\n    client: instructor.Instructor, hook_name: str, num_functions: int\n):\n    functions_to_add = hook_functions[:num_functions]\n    hook_enum = hook_object.get_hook_name(hook_name)\n\n    assert hook_enum not in client.hooks._handlers\n\n    for func in functions_to_add:\n        client.on(hook_name, func)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    for func in functions_to_add:\n        assert func in client.hooks._handlers[hook_enum]\n\n\n@pytest.mark.parametrize(\"hook_enum\", hook_enums)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_on_method_enum(\n    client: instructor.Instructor,\n    hook_enum: instructor.hooks.HookName,\n    num_functions: int,\n):\n    functions_to_add = hook_functions[:num_functions]\n    assert hook_enum not in client.hooks._handlers\n\n    for func in functions_to_add:\n        client.on(hook_enum, func)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    for func in functions_to_add:\n        assert func in client.hooks._handlers[hook_enum]\n\n\n@pytest.mark.parametrize(\"hook_name\", hook_names)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_off_method_str(\n    client: instructor.Instructor,\n    hook_name: str,\n    num_functions: int,\n):\n    functions_to_add = hook_functions[:num_functions]\n    hook_enum = hook_object.get_hook_name(hook_name)\n    assert hook_enum not in client.hooks._handlers\n\n    for func in functions_to_add:\n        client.on(hook_name, func)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    for func in functions_to_add:\n        client.off(hook_name, func)\n        if client.hooks._handlers.get(hook_enum):\n            assert func not in client.hooks._handlers[hook_enum]\n        else:\n            assert hook_enum not in client.hooks._handlers\n\n    assert hook_enum not in client.hooks._handlers\n\n\n@pytest.mark.parametrize(\"hook_enum\", hook_enums)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_off_method_enum(\n    client: instructor.Instructor,\n    hook_enum: instructor.hooks.HookName,\n    num_functions: int,\n):\n    functions_to_add = hook_functions[:num_functions]\n    assert hook_enum not in client.hooks._handlers\n    for func in functions_to_add:\n        client.on(hook_enum, func)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    for func in functions_to_add:\n        client.off(hook_enum, func)\n        if client.hooks._handlers.get(hook_enum):\n            assert func not in client.hooks._handlers[hook_enum]\n        else:\n            assert hook_enum not in client.hooks._handlers\n\n    assert hook_enum not in client.hooks._handlers\n\n\n@pytest.mark.parametrize(\"hook_name\", hook_names)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_clear_method_str(\n    client: instructor.Instructor,\n    hook_name: str,\n    num_functions: int,\n):\n    functions_to_add = hook_functions[:num_functions]\n\n    for func in functions_to_add:\n        client.on(hook_name, func)\n\n    hook_enum = hook_object.get_hook_name(hook_name)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    client.clear(hook_name)\n    assert hook_enum not in client.hooks._handlers\n\n\n@pytest.mark.parametrize(\"hook_enum\", hook_enums)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_clear_method(\n    client: instructor.Instructor,\n    hook_enum: instructor.hooks.HookName,\n    num_functions: int,\n):\n    functions_to_add = hook_functions[:num_functions]\n\n    for func in functions_to_add:\n        client.on(hook_enum, func)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    client.clear(hook_enum)\n    assert hook_enum not in client.hooks._handlers\n\n\n@pytest.mark.parametrize(\"hook_enum\", hook_enums)\n@pytest.mark.parametrize(\"num_functions\", [1, 2, 3])\ndef test_clear_no_args(\n    client: instructor.Instructor,\n    hook_enum: instructor.hooks.HookName,\n    num_functions: int,\n):\n    functions_to_add = hook_functions[:num_functions]\n\n    for func in functions_to_add:\n        client.on(hook_enum, func)\n\n    assert hook_enum in client.hooks._handlers\n    assert len(client.hooks._handlers[hook_enum]) == num_functions\n\n    client.clear()\n    assert hook_enum not in client.hooks._handlers\n"
  },
  {
    "path": "tests/llm/test_openai/test_multimodal.py",
    "content": "import pytest\nfrom instructor.processing.multimodal import Image, Audio\nimport instructor\nfrom pydantic import Field, BaseModel\nfrom itertools import product\nimport requests\nfrom pathlib import Path\nimport base64\nimport os\n\naudio_url = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/gettysburg.wav\"\nimage_url = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/image.jpg\"\n\npdf_url = \"https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf\"\ncurr_file = os.path.dirname(__file__)\npdf_path = os.path.join(curr_file, \"../../assets/invoice.pdf\")\npdf_base64 = base64.b64encode(open(pdf_path, \"rb\").read()).decode(\"utf-8\")\npdf_base64_string = f\"data:application/pdf;base64,{pdf_base64}\"\n\nmodels = [\"gpt-4.1-nano\"]\nmodes = [\n    instructor.Mode.TOOLS,\n]\n\n\nclass LineItem(BaseModel):\n    name: str\n    price: int\n    quantity: int\n\n\nclass Receipt(BaseModel):\n    total: int\n    items: list[str]\n\n\ndef gettysburg_audio():\n    audio_file = Path(\"gettysburg.wav\")\n    if not audio_file.exists():\n        response = requests.get(audio_url)\n        response.raise_for_status()\n        with open(audio_file, \"wb\") as f:\n            f.write(response.content)\n    return audio_file\n\n\n@pytest.mark.parametrize(\n    \"audio_file, mode\",\n    [(Audio.from_url(audio_url), mode) for mode in modes],\n)\ndef test_multimodal_audio_description(audio_file, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    if client.mode in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        pytest.skip(\"Audio isn't supported in responses for now\")\n\n    class AudioDescription(BaseModel):\n        source: str\n\n    response = client.chat.completions.create(\n        model=\"gpt-4o-audio-preview\",\n        response_model=AudioDescription,\n        modalities=[\"text\"],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"Where's this excerpt from?\",\n                    audio_file,\n                ],  # type: ignore\n            },\n        ],\n        audio={\"voice\": \"alloy\", \"format\": \"wav\"},  # type: ignore\n    )\n\n\nclass ImageDescription(BaseModel):\n    objects: list[str] = Field(..., description=\"The objects in the image\")\n    scene: str = Field(..., description=\"The scene of the image\")\n    colors: list[str] = Field(..., description=\"The colors in the image\")\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description(model, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n    response = client.chat.completions.create(\n        model=model,  # Ensure this is a vision-capable model\n        response_model=ImageDescription,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"What is this?\",\n                    Image.from_url(image_url),\n                ],  # type: ignore\n            },\n        ],\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, ImageDescription)\n    assert len(response.objects) > 0\n    assert response.scene != \"\"\n    assert len(response.colors) > 0\n\n    # Additional assertions can be added based on expected content of the sample image\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description_autodetect(model, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n    response = client.chat.completions.create(\n        model=model,  # Ensure this is a vision-capable model\n        response_model=ImageDescription,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": [\n                    \"What is this?\",\n                    image_url,\n                ],  # type: ignore\n            },\n        ],\n        autodetect_images=True,  # type: ignore\n    )\n\n    # Assertions to validate the response\n    assert isinstance(response, ImageDescription)\n    assert len(response.objects) > 0\n    assert response.scene != \"\"\n    assert len(response.colors) > 0\n\n    # Additional assertions can be added based on expected content of the sample image\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_image_description_autodetect_no_response_model(model, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n    response = client.chat.completions.create(\n        response_model=None,\n        model=model,  # Ensure this is a vision-capable model\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a helpful assistant that can describe images. \"\n                \"If looking at an image, reply with 'This is an image' and nothing else.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": image_url,\n            },\n        ],\n        max_tokens=1000,\n        temperature=1,\n        autodetect_images=True,\n    )\n\n    if mode not in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        assert response.choices[0].message.content.startswith(\"This is an image\")\n    else:\n        assert response.output[0].content[0].text\n\n\n@pytest.mark.parametrize(\"pdf_source\", [pdf_path, pdf_url, pdf_base64_string])\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multimodal_pdf_file(model, mode, client, pdf_source):\n    client = instructor.from_openai(client, mode=mode)\n\n    # Retry logic for flaky LLM responses\n    max_retries = 3\n    for attempt in range(max_retries):\n        response = client.chat.completions.create(\n            model=model,  # Ensure this is a vision-capable model\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"Extract the total and items from the invoice. Be precise and only extract the final total amount and list of item names. The total should be exactly 220.\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": instructor.processing.multimodal.PDF.autodetect(\n                        pdf_source\n                    ),\n                },\n            ],\n            autodetect_images=False,\n            response_model=Receipt,\n            temperature=0,  # Keep for consistent responses\n        )\n\n        if response.total == 220 and len(response.items) == 2:\n            break\n        elif attempt == max_retries - 1:\n            pytest.fail(\n                f\"After {max_retries} attempts, got total={response.total}, items={response.items}, expected total=220, items=2\"\n            )\n\n    assert response.total == 220\n    assert len(response.items) == 2\n"
  },
  {
    "path": "tests/llm/test_openai/test_multitask.py",
    "content": "from itertools import product\nfrom collections.abc import Iterable\nfrom pydantic import BaseModel\nimport pytest\n\nimport instructor\nfrom .util import models, modes\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nUsers = Iterable[User]\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multi_user(model, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    def stream_extract(input: str) -> Iterable[User]:\n        return client.chat.completions.create(\n            model=model,\n            response_model=Users,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a perfect entity extraction system\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": (\n                        f\"Consider the data below:\\n{input}\"\n                        \"Correctly segment it into entitites\"\n                        \"Make sure the JSON is correct\"\n                    ),\n                },\n            ],\n            max_tokens=1000,\n        )\n\n    resp = [user for user in stream_extract(input=\"Jason is 20, Sarah is 30\")]\n    assert len(resp) == 2\n    assert resp[0].name == \"Jason\"\n    assert resp[0].age == 20\n    assert resp[1].name == \"Sarah\"\n    assert resp[1].age == 30\n\n\nfrom typing import Any\nfrom functools import partial\n\n\nasync def async_map_chat_completion_to_response(\n    messages, client, *args, **kwargs\n) -> Any:\n    return await client.responses.create(\n        *args,\n        input=messages,\n        **kwargs,\n    )\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\nasync def test_multi_user_tools_mode_async(model, mode, aclient):\n    from instructor.mode import Mode\n\n    client = instructor.patch(\n        aclient,\n        create=(\n            partial(async_map_chat_completion_to_response, client=aclient)\n            if mode in {Mode.RESPONSES_TOOLS, Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS}\n            else aclient.chat.completions.create\n        ),\n        mode=mode,\n    )\n\n    async def stream_extract(input: str) -> Iterable[User]:\n        return await client.chat.completions.create(\n            model=model,\n            response_model=Users,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": (\n                        f\"Consider the data below:\\n{input}\"\n                        \"Correctly segment it into entitites\"\n                        \"Make sure the JSON is correct\"\n                    ),\n                },\n            ],\n            max_tokens=1000,\n        )\n\n    resp = []\n    for user in await stream_extract(input=\"Jason is 20, Sarah is 30\"):\n        resp.append(user)\n    print(resp)\n    assert len(resp) == 2\n    assert resp[0].name == \"Jason\"\n    assert resp[0].age == 20\n    assert resp[1].name == \"Sarah\"\n    assert resp[1].age == 30\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_multi_user_stream(model, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    def stream_extract(input: str) -> Iterable[User]:\n        return client.chat.completions.create(\n            model=model,\n            stream=True,\n            response_model=Users,\n            messages=[\n                {\n                    \"role\": \"system\",\n                    \"content\": \"You are a perfect entity extraction system\",\n                },\n                {\n                    \"role\": \"user\",\n                    \"content\": (\n                        f\"Consider the data below:\\n{input}\"\n                        \"Correctly segment it into entitites\"\n                        \"Make sure the JSON is correct\"\n                    ),\n                },\n            ],\n            max_tokens=1000,\n        )\n\n    resp = [user for user in stream_extract(input=\"Jason is 20, Sarah is 30\")]\n    assert len(resp) == 2\n    assert resp[0].name == \"Jason\"\n    assert resp[0].age == 20\n    assert resp[1].name == \"Sarah\"\n    assert resp[1].age == 30\n\n\n@pytest.mark.asyncio\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\nasync def test_multi_user_tools_mode_async_stream(model, mode, aclient):\n    client = instructor.from_openai(aclient, mode=mode)\n\n    async def stream_extract(input: str) -> Iterable[User]:\n        return await client.chat.completions.create(\n            model=model,\n            stream=True,\n            response_model=Users,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": (\n                        f\"Consider the data below:\\n{input}\"\n                        \"Correctly segment it into entitites\"\n                        \"Make sure the JSON is correct\"\n                    ),\n                },\n            ],\n            max_tokens=1000,\n        )\n\n    resp = []\n    async for user in await stream_extract(input=\"Jason is 20, Sarah is 30\"):\n        resp.append(user)\n    print(resp)\n    assert len(resp) == 2\n    assert resp[0].name == \"Jason\"\n    assert resp[0].age == 20\n    assert resp[1].name == \"Sarah\"\n    assert resp[1].age == 30\n"
  },
  {
    "path": "tests/llm/test_openai/test_patch.py",
    "content": "from itertools import product\nfrom pydantic import BaseModel, field_validator\nfrom openai.types.chat import ChatCompletion\nfrom typing_extensions import TypedDict\nimport pytest\nimport instructor\n\n\nfrom .util import models, modes\n\n\nclass UserExtract(BaseModel):\n    name: str\n    age: int\n\n\nclass UserExtractTypedDict(TypedDict):\n    name: str\n    age: int\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_typed_dict(model, mode, client):\n    if mode in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        pytest.skip(\"Avoiding testing responses tools with openai\")\n\n    client = instructor.patch(client, mode=mode)\n    model = client.chat.completions.create(\n        model=model,\n        response_model=UserExtractTypedDict,\n        max_retries=2,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\n    assert isinstance(model, BaseModel), \"Should be instance of a pydantic model\"\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    assert hasattr(model, \"_raw_response\"), (\n        \"The raw response should be available from OpenAI\"\n    )\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_runmodel(model, mode, client):\n    if mode in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        pytest.skip(\"Avoiding testing responses tools with openai\")\n\n    client = instructor.patch(client, mode=mode)\n    model = client.chat.completions.create(\n        model=model,\n        response_model=UserExtract,\n        max_retries=2,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\n    assert isinstance(model, UserExtract), \"Should be instance of UserExtract\"\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    assert hasattr(model, \"_raw_response\"), (\n        \"The raw response should be available from OpenAI\"\n    )\n\n    ChatCompletion(**model._raw_response.model_dump())\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\n@pytest.mark.asyncio\nasync def test_runmodel_async(model, mode, aclient):\n    if mode in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        pytest.skip(\"Avoiding testing responses tools with openai\")\n\n    aclient = instructor.patch(aclient, mode=mode)\n    model = await aclient.chat.completions.create(\n        model=model,\n        response_model=UserExtract,\n        max_retries=2,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\n    assert isinstance(model, UserExtract), \"Should be instance of UserExtract\"\n    assert model.name.lower() == \"jason\"\n    assert model.age == 25\n    assert hasattr(model, \"_raw_response\"), (\n        \"The raw response should be available from OpenAI\"\n    )\n\n    ChatCompletion(**model._raw_response.model_dump())\n\n\nclass UserExtractValidated(BaseModel):\n    name: str\n    age: int\n\n    @field_validator(\"name\")\n    @classmethod\n    def validate_name(cls, v):\n        if v.upper() != v:\n            raise ValueError(\n                \"Name should have all letters in uppercase. Make sure to use the `uppercase` form of the name\"\n            )\n        return v\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_runmodel_validator(model, mode, client):\n    if mode in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        pytest.skip(\"Avoiding testing responses tools with openai\")\n    client = instructor.patch(client, mode=mode)\n    model = client.chat.completions.create(\n        model=model,\n        response_model=UserExtractValidated,\n        max_retries=2,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\n    assert isinstance(model, UserExtractValidated), \"Should be instance of UserExtract\"\n    assert model.name == \"JASON\"\n    assert hasattr(model, \"_raw_response\"), (\n        \"The raw response should be available from OpenAI\"\n    )\n\n    ChatCompletion(**model._raw_response.model_dump())\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\n@pytest.mark.asyncio\nasync def test_runmodel_async_validator(model, mode, aclient):\n    if mode in {\n        instructor.Mode.RESPONSES_TOOLS,\n        instructor.Mode.RESPONSES_TOOLS_WITH_INBUILT_TOOLS,\n    }:\n        pytest.skip(\"Avoiding testing responses tools with openai\")\n    aclient = instructor.patch(aclient, mode=mode)\n    model = await aclient.chat.completions.create(\n        model=model,\n        response_model=UserExtractValidated,\n        max_retries=2,\n        messages=[\n            {\"role\": \"user\", \"content\": \"Extract jason is 25 years old\"},\n        ],\n    )\n    assert isinstance(model, UserExtractValidated), \"Should be instance of UserExtract\"\n    assert model.name == \"JASON\"\n    assert hasattr(model, \"_raw_response\"), (\n        \"The raw response should be available from OpenAI\"\n    )\n\n    ChatCompletion(**model._raw_response.model_dump())\n"
  },
  {
    "path": "tests/llm/test_openai/test_validation_context.py",
    "content": "from typing import Annotated\nfrom pydantic import BaseModel, Field, ValidationInfo, field_validator\nimport pytest\nimport instructor\nfrom .util import models, modes\nfrom itertools import product\n\n\nclass Message(BaseModel):\n    content: Annotated[str, Field(..., description=\"The content to be checked\")]\n\n    @field_validator(\"content\")\n    @classmethod\n    def no_banned_words(cls, v: str, info: ValidationInfo):\n        context = info.context\n        if context:\n            banned_words = context.get(\"banned_words\", [])\n            banned_words_found = [\n                word for word in banned_words if word.lower() in v.lower()\n            ]\n            if banned_words_found:\n                raise ValueError(\n                    f\"Banned words found in content: {', '.join(banned_words_found)}. Please rewrite without using these words.\"\n                )\n        return v\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_banned_words_validation(model: str, mode: instructor.Mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    # Test with content containing a banned word\n    with pytest.raises(Exception):  # noqa: B017\n        response = client.chat.completions.create(\n            model=model,\n            response_model=Message,\n            max_retries=0,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"Say the word `hate`.\",\n                },\n            ],\n            context={\"banned_words\": [\"hate\", \"violence\", \"discrimination\"]},\n        )\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_banned_words_validation_old(model: str, mode: instructor.Mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    # Test with content containing a banned word\n    with pytest.raises(Exception):  # noqa: B017\n        response = client.chat.completions.create(\n            model=model,\n            response_model=Message,\n            max_retries=0,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"Say the word `hate`.\",\n                },\n            ],\n            validation_context={\"banned_words\": [\"hate\", \"violence\", \"discrimination\"]},\n        )\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_no_banned_words_validation(model: str, mode: instructor.Mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    # Test with content containing a banned word\n    response = client.chat.completions.create(\n        model=model,\n        response_model=Message,\n        max_retries=0,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Say the word `love`.\",\n            },\n        ],\n        context={\"banned_words\": [\"hate\", \"violence\", \"discrimination\"]},\n    )\n\n    assert response.content == \"love\", f\"Expected 'love', got {response.content}\"\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_forced_words_validation(model: str, mode: instructor.Mode, client):\n    class Response(BaseModel):\n        content: str\n\n        @field_validator(\"content\")\n        @classmethod\n        def must_contain_words(cls, v: str, info: ValidationInfo):\n            context = info.context\n            if context:\n                must_contain_words = context.get(\"must_contain_words\", [])\n                missing_words = [\n                    word for word in must_contain_words if word.lower() not in v.lower()\n                ]\n                if missing_words:\n                    error_message = f\"Content must contain the following words: {', '.join(missing_words)}\"\n                    raise ValueError(error_message)\n            return v\n\n    client = instructor.from_openai(client, mode=mode)\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=Response,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"\"\"\n                Make a sentence that contains the words \n                {% for word in must_contain_words %}\n                `{{ word }}`\n                {% endfor %}\n                \"\"\",\n            },\n        ],\n        context={\"must_contain_words\": [\"love\", \"peace\", \"joy\"]},\n    )\n    assert \"love\" in response.content.lower()\n    assert \"peace\" in response.content.lower()\n    assert \"joy\" in response.content.lower()\n"
  },
  {
    "path": "tests/llm/test_openai/test_validators.py",
    "content": "from itertools import product\nimport pytest\n\nimport instructor\n\nfrom typing import Annotated\nfrom pydantic import BaseModel, AfterValidator, BeforeValidator, ValidationError\n\nfrom instructor.validation import llm_validator\nfrom .util import models, modes\n\n\ndef test_patch_completes_successfully(client):\n    class Response(BaseModel):\n        message: Annotated[\n            str, AfterValidator(instructor.openai_moderation(client=client))\n        ]\n\n    with pytest.raises(ValidationError):\n        Response(message=\"I want to make them suffer the consequences\")\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_runmodel_validator_error(model, mode, client):\n    client = instructor.from_openai(client, mode=mode)\n\n    if mode == instructor.Mode.TOOLS_STRICT:\n        # TODO: Structured outputs currently doesn't support the concept of Validators ( This is Pydantic specific ) so perhaps come back to this later\n        pytest.skip(\"Skipping test for structured output\")\n\n    class QuestionAnswerNoEvil(BaseModel):\n        question: str\n        answer: Annotated[\n            str,\n            BeforeValidator(\n                llm_validator(\n                    \"don't say objectionable things\", model=model, client=client\n                )\n            ),\n        ]\n\n    with pytest.raises(ValidationError):\n        QuestionAnswerNoEvil(\n            question=\"What is the meaning of life?\",\n            answer=\"The meaning of life is to be evil and steal\",\n        )\n\n\n@pytest.mark.parametrize(\"model\", models)\ndef test_runmodel_validator_default_openai_client(model, client):\n    client = instructor.from_openai(client)\n\n    class QuestionAnswerNoEvil(BaseModel):\n        question: str\n        answer: Annotated[\n            str,\n            BeforeValidator(\n                llm_validator(\n                    \"don't say objectionable things\", model=model, client=client\n                )\n            ),\n        ]\n\n    with pytest.raises(ValidationError):\n        QuestionAnswerNoEvil(\n            question=\"What is the meaning of life?\",\n            answer=\"The meaning of life is to be evil and steal\",\n        )\n"
  },
  {
    "path": "tests/llm/test_openai/util.py",
    "content": "import instructor\n\nmodels = [\"gpt-4o-mini\"]\nmodes = [\n    instructor.Mode.TOOLS,\n]\n"
  },
  {
    "path": "tests/llm/test_vertexai/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_vertexai/conftest.py",
    "content": "import os\nimport pytest\n\nif not os.getenv(\"GOOGLE_API_KEY\"):\n    pytest.skip(\n        \"GOOGLE_API_KEY environment variable not set\",\n        allow_module_level=True,\n    )\n\ntry:\n    import vertexai  # noqa: F401\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\n        \"google-cloud-aiplatform package is not installed\", allow_module_level=True\n    )\n"
  },
  {
    "path": "tests/llm/test_vertexai/test_deprecated_async.py",
    "content": "import pytest\nfrom unittest.mock import patch, MagicMock\nfrom pydantic import BaseModel\nfrom instructor import from_vertexai\nfrom instructor.core.exceptions import ConfigurationError\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n@patch(\"instructor.client_vertexai.isinstance\", return_value=True)\ndef test_deprecated_async_warning(_):\n    \"\"\"Test that using _async parameter raises a deprecation warning.\"\"\"\n    mock_model = MagicMock()\n    mock_model.generate_content = MagicMock()\n    mock_model.generate_content_async = MagicMock()\n\n    with pytest.warns(\n        DeprecationWarning, match=\"'_async' is deprecated. Use 'use_async' instead.\"\n    ):\n        client = from_vertexai(mock_model, _async=True)\n\n\n@patch(\"instructor.client_vertexai.isinstance\", return_value=True)\ndef test_both_async_params_error(_):\n    \"\"\"Test that providing both _async and use_async raises an error.\"\"\"\n    mock_model = MagicMock()\n    mock_model.generate_content = MagicMock()\n    mock_model.generate_content_async = MagicMock()\n\n    with pytest.raises(\n        ConfigurationError,\n        match=\"Cannot provide both '_async' and 'use_async'. Use 'use_async' instead.\",\n    ):\n        client = from_vertexai(mock_model, _async=True, use_async=True)\n"
  },
  {
    "path": "tests/llm/test_vertexai/test_format.py",
    "content": "import instructor\nfrom pydantic import BaseModel\nfrom .util import models, modes\nimport pytest\nfrom itertools import product\nimport vertexai.generative_models as gm\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\n@pytest.mark.parametrize(\"model, mode, is_list\", product(models, modes, [True, False]))\ndef test_format_string(model, mode, is_list):\n    client = instructor.from_vertexai(\n        gm.GenerativeModel(model),\n        mode=mode,\n    )\n\n    content = (\n        [gm.Part.from_text(\"Extract {{name}} is {{age}} years old.\")]\n        if is_list\n        else \"Extract {{name}} is {{age}} years old.\"\n    )\n\n    # note that client.chat.completions.create will also work\n    resp = client.messages.create(\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            }\n        ],\n        response_model=User,\n        context={\"name\": \"Jason\", \"age\": 25},\n    )\n\n    assert isinstance(resp, User)\n    assert resp.name == \"Jason\"\n    assert resp.age == 25\n"
  },
  {
    "path": "tests/llm/test_vertexai/test_message_parser.py",
    "content": "import pytest\nimport vertexai.generative_models as gm\nfrom instructor.providers.vertexai.client import vertexai_message_parser\n\n\ndef test_vertexai_message_parser_string_content():\n    message = {\"role\": \"user\", \"content\": \"Hello, world!\"}\n    result = vertexai_message_parser(message)\n\n    assert isinstance(result, gm.Content)\n    assert result.role == \"user\"\n    assert len(result.parts) == 1\n    assert isinstance(result.parts[0], gm.Part)\n    assert result.parts[0].text == \"Hello, world!\"\n\n\ndef test_vertexai_message_parser_list_content():\n    message = {\n        \"role\": \"user\",\n        \"content\": [\n            \"Hello, \",\n            gm.Part.from_text(\"world!\"),\n            gm.Part.from_text(\" How are you?\"),\n        ],\n    }\n    result = vertexai_message_parser(message)\n\n    assert isinstance(result, gm.Content)\n    assert result.role == \"user\"\n    assert len(result.parts) == 3\n    assert isinstance(result.parts[0], gm.Part)\n    assert isinstance(result.parts[1], gm.Part)\n    assert isinstance(result.parts[2], gm.Part)\n    assert result.parts[0].text == \"Hello, \"\n    assert result.parts[1].text == \"world!\"\n    assert result.parts[2].text == \" How are you?\"\n\n\ndef test_vertexai_message_parser_invalid_content():\n    message = {\"role\": \"user\", \"content\": 123}  # Invalid content type\n\n    with pytest.raises(ValueError, match=\"Unsupported message content type\"):\n        vertexai_message_parser(message)\n\n\ndef test_vertexai_message_parser_invalid_list_item():\n    message = {\"role\": \"user\", \"content\": [\"Hello\", 123, gm.Part.from_text(\"world!\")]}\n\n    with pytest.raises(ValueError, match=\"Unsupported content type in list\"):\n        vertexai_message_parser(message)\n"
  },
  {
    "path": "tests/llm/test_vertexai/test_modes.py",
    "content": "\"\"\"VertexAI-specific tests for mixed content types.\n\nTests VertexAI's ability to handle mixed content with gm.Part objects.\n\"\"\"\n\nfrom itertools import product\nfrom pydantic import BaseModel\nimport vertexai.generative_models as gm  # type: ignore\nimport pytest\nimport instructor\n\nfrom .util import models, modes\n\n\nclass Item(BaseModel):\n    name: str\n    price: float\n\n\nclass Order(BaseModel):\n    items: list[Item]\n    customer: str\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_mixed_content_types(model, mode):\n    client = instructor.from_vertexai(gm.GenerativeModel(model), mode)\n    content = [\n        \"Order Details:\",\n        gm.Part.from_text(\"Customer: Alice\"),\n        gm.Part.from_text(\"Items:\"),\n        \"Name: Laptop, Price: 999.99\",\n        \"Name: Mouse, Price: 29.99\",\n    ]\n\n    resp = client.create(\n        response_model=Order,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n    )\n\n    assert len(resp.items) == 2\n    assert {x.name.lower() for x in resp.items} == {\"laptop\", \"mouse\"}\n    assert {x.price for x in resp.items} == {999.99, 29.99}\n    assert resp.customer.lower() == \"alice\"\n"
  },
  {
    "path": "tests/llm/test_vertexai/util.py",
    "content": "import instructor\n\nmodels = [\"gemini-3-flash\"]\nmodes = [instructor.Mode.VERTEXAI_TOOLS, instructor.Mode.VERTEXAI_JSON]\n"
  },
  {
    "path": "tests/llm/test_writer/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_writer/conftest.py",
    "content": "import os\nimport pytest\n\nif not os.getenv(\"WRITER_API_KEY\"):\n    pytest.skip(\"WRITER_API_KEY environment variable not set\", allow_module_level=True)\n\ntry:\n    import writerai  # noqa: F401\nexcept ImportError:  # pragma: no cover - optional dependency\n    pytest.skip(\"writer-sdk package is not installed\", allow_module_level=True)\n\n\n@pytest.fixture(scope=\"session\", autouse=True)\ndef configure_writer():\n    pass\n"
  },
  {
    "path": "tests/llm/test_writer/evals/__init__.py",
    "content": ""
  },
  {
    "path": "tests/llm/test_writer/evals/test_classification_enums.py",
    "content": "import enum\nfrom itertools import product\nfrom writerai import Writer\n\nimport pytest\nimport instructor\n\nfrom pydantic import BaseModel\n\nfrom instructor.mode import Mode\nfrom ..util import models, modes\n\n\nclass Labels(str, enum.Enum):\n    SPAM = \"spam\"\n    NOT_SPAM = \"not_spam\"\n\n\nclass SinglePrediction(BaseModel):\n    \"\"\"\n    Correct class label for the given text\n    \"\"\"\n\n    class_label: Labels\n\n\ndata = [\n    (\n        \"I am a spammer\",\n        Labels.SPAM,\n    ),\n    (\n        \"I am not a spammer\",\n        Labels.NOT_SPAM,\n    ),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, data, modes))\ndef test_writer_classification(\n    model: str, data: list[tuple[str, Labels]], mode: instructor.Mode\n):\n    client = instructor.from_writer(client=Writer(), mode=mode)\n\n    input, expected = data\n    resp = client.chat.completions.create(\n        model=model,\n        response_model=SinglePrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: {input}. \"\n                f\"Apply this or another class only in cases when \"\n                f\"when you are 100% sure.\",\n            },\n        ],\n    )\n    assert resp.class_label == expected\n\n\nclass MultiLabels(str, enum.Enum):\n    BILLING = \"billing\"\n    GENERAL_QUERY = \"general_query\"\n    HARDWARE = \"hardware\"\n\n\nclass MultiClassPrediction(BaseModel):\n    predicted_labels: list[MultiLabels]\n\n\ndata = [\n    (\n        \"I am having trouble with my billing\",\n        [MultiLabels.BILLING],\n    ),\n    (\n        \"I am having trouble with my hardware\",\n        [MultiLabels.HARDWARE],\n    ),\n    (\n        \"I have a general query and a billing issue\",\n        [MultiLabels.GENERAL_QUERY, MultiLabels.BILLING],\n    ),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, data, modes))\ndef test_writer_multi_classify(\n    model: str, data: list[tuple[str, list[MultiLabels]]], mode: instructor.Mode\n):\n    client = instructor.from_writer(client=Writer(), mode=mode)\n\n    if (mode, model) in {\n        (Mode.JSON, \"gpt-3.5-turbo\"),\n        (Mode.JSON, \"gpt-4\"),\n    }:\n        pytest.skip(f\"{mode} mode is not supported for {model}, skipping test\")\n\n    input, expected = data\n\n    resp = client.chat.completions.create(\n        model=model,\n        response_model=MultiClassPrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following support ticket: {input} \"\n                f\"Apply this or another class only in cases when \"\n                f\"when you are 100% sure.\",\n            },\n        ],\n    )\n    assert set(resp.predicted_labels) == set(expected)\n"
  },
  {
    "path": "tests/llm/test_writer/evals/test_classification_literals.py",
    "content": "from itertools import product\nfrom typing import Literal\nfrom writerai import AsyncWriter\n\nimport pytest\nimport instructor\n\nfrom pydantic import BaseModel\n\nfrom ..util import models, modes\n\n\nclass SinglePrediction(BaseModel):\n    \"\"\"\n    Correct class label for the given text\n    \"\"\"\n\n    class_label: Literal[\"spam\", \"not_spam\"]\n\n\ndata = [\n    (\"I am a spammer\", \"spam\"),\n    (\"I am not a spammer\", \"not_spam\"),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, data, modes))\n@pytest.mark.asyncio\nasync def test_classification(\n    model: str,\n    data: list[tuple[str, Literal[\"spam\", \"not_spam\"]]],\n    mode: instructor.Mode,\n):\n    client = instructor.from_writer(client=AsyncWriter(), mode=mode)\n\n    input, expected = data\n    resp = await client.chat.completions.create(\n        model=model,\n        response_model=SinglePrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following text: {input}\",\n            },\n        ],\n    )\n    assert resp.class_label == expected\n\n\nclass MultiClassPrediction(BaseModel):\n    predicted_labels: list[Literal[\"billing\", \"general_query\", \"hardware\"]]\n\n\ndata = [\n    (\n        \"I am having trouble with my billing\",\n        [\"billing\"],\n    ),\n    (\n        \"I am having trouble with my hardware\",\n        [\"hardware\"],\n    ),\n    (\n        \"I have a general query and a billing issue\",\n        [\"general_query\", \"billing\"],\n    ),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, data, modes))\n@pytest.mark.asyncio\nasync def test_writer_multi_classify(\n    model: str,\n    data: list[tuple[str, list[Literal[\"billing\", \"general_query\", \"hardware\"]]]],\n    mode: instructor.Mode,\n):\n    client = instructor.from_writer(client=AsyncWriter(), mode=mode)\n\n    input, expected = data\n\n    resp = await client.chat.completions.create(\n        model=model,\n        response_model=MultiClassPrediction,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": f\"Classify the following support ticket: {input}. \"\n                f\"Apply this or another class only in cases when \"\n                f\"you sure by 100%.\",\n            },\n        ],\n    )\n    assert set(resp.predicted_labels) == set(expected)\n"
  },
  {
    "path": "tests/llm/test_writer/evals/test_entities.py",
    "content": "from itertools import product\nfrom writerai import Writer\n\nfrom pydantic import BaseModel, Field\nimport pytest\n\nimport instructor\nfrom instructor import Instructor\n\nfrom ..util import models, modes\n\n\nclass Property(BaseModel):\n    key: str\n    value: str\n    resolved_absolute_value: str\n\n\nclass Entity(BaseModel):\n    id: int = Field(\n        ...,\n        description=\"Unique identifier for the entity, used for deduplication, design a scheme allows multiple entities\",\n    )\n    subquote_string: list[str] = Field(\n        ...,\n        description=\"Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution\",\n    )\n    entity_title: str\n    properties: list[Property] = Field(\n        ..., description=\"List of properties of the entity\"\n    )\n    dependencies: list[int] = Field(\n        ...,\n        description=\"List of entity ids that this entity depends  or relies on to resolve it\",\n    )\n\n\nclass DocumentExtraction(BaseModel):\n    entities: list[Entity] = Field(\n        ...,\n        description=\"Body of the answer, each fact should be its separate object with a body and a list of sources\",\n    )\n\n\ndef ask_ai(content: str, model: str, client: Instructor) -> DocumentExtraction:\n    resp: DocumentExtraction = client.chat.completions.create(\n        model=model,\n        response_model=DocumentExtraction,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a perfect entity resolution system that extracts facts from the document. Extract and resolve a list of entities from the following document:\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n        max_retries=4,\n    )  # type: ignore\n    return resp\n\n\ncontent = \"\"\"\nSample Legal Contract\nAgreement Contract\n\nThis Agreement is made and entered into on 2020-01-01 by and between Company A (\"the Client\") and Company B (\"the Service Provider\").\n\nArticle 1: Scope of Work\n\nThe Service Provider will deliver the software product to the Client 30 days after the agreement date.\n\nArticle 2: Payment Terms\n\nThe total payment for the service is $50,000.\nAn initial payment of $10,000 will be made within 7 days of the the signed date.\nThe final payment will be due 45 days after [SignDate].\n\nArticle 3: Confidentiality\n\nThe parties agree not to disclose any confidential information received from the other party for 3 months after the final payment date.\n\nArticle 4: Termination\n\nThe contract can be terminated with a 30-day notice, unless there are outstanding obligations that must be fulfilled after the [DeliveryDate].\n\"\"\"\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_extract(model: str, mode: instructor.Mode):\n    client = instructor.from_writer(client=Writer(), mode=mode)\n\n    extract = ask_ai(content=content, model=model, client=client)\n    assert len(extract.entities) > 0\n"
  },
  {
    "path": "tests/llm/test_writer/evals/test_extract_users.py",
    "content": "import pytest\nfrom itertools import product\nfrom pydantic import BaseModel\nfrom writerai import Writer\nimport instructor\nfrom ..util import models, modes\n\n\nclass UserDetails(BaseModel):\n    first_name: str\n    age: int\n\n\ntest_data = [\n    (\"Jason is 10\", \"Jason\", 10),\n    (\"Alice is 25\", \"Alice\", 25),\n    (\"Bob is 35\", \"Bob\", 35),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, test_data, modes))\ndef test_writer_extract(\n    model: str, data: list[tuple[str, str, int]], mode: instructor.Mode\n):\n    client = instructor.from_writer(client=Writer(), mode=mode)\n\n    sample_data, expected_name, expected_age = data\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=UserDetails,\n        messages=[\n            {\"role\": \"user\", \"content\": sample_data},\n        ],\n    )\n\n    assert response.first_name == expected_name, (\n        f\"Expected name {expected_name}, got {response.first_name}\"\n    )\n    assert response.age == expected_age, (\n        f\"Expected age {expected_age}, got {response.age}\"\n    )\n"
  },
  {
    "path": "tests/llm/test_writer/evals/test_sentiment_analysis.py",
    "content": "import enum\nfrom itertools import product\n\nfrom pydantic import BaseModel\nfrom writerai import Writer\nimport pytest\nimport instructor\nfrom ..util import models, modes\n\n\nclass Sentiment(str, enum.Enum):\n    POSITIVE = \"positive\"\n    NEGATIVE = \"negative\"\n    NEUTRAL = \"neutral\"\n\n\nclass SentimentAnalysis(BaseModel):\n    sentiment: Sentiment\n\n\ntest_data = [\n    (\n        \"I absolutely love this product! It has exceeded all my expectations.\",\n        Sentiment.POSITIVE,\n    ),\n    (\n        \"The service was terrible. I will never use this company again.\",\n        Sentiment.NEGATIVE,\n    ),\n    (\n        \"The movie was okay. It had some good moments but overall it was average.\",\n        Sentiment.NEUTRAL,\n    ),\n]\n\n\n@pytest.mark.parametrize(\"model, data, mode\", product(models, test_data, modes))\ndef test_writer_sentiment_analysis(\n    model: str, data: list[tuple[str, Sentiment]], mode: instructor.Mode\n):\n    client = instructor.from_writer(client=Writer(), mode=mode)\n\n    sample_data, expected_sentiment = data\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=SentimentAnalysis,\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You are a sentiment analysis model. Analyze the sentiment of the given text and provide the sentiment (positive, negative, or neutral).\",\n            },\n            {\"role\": \"user\", \"content\": sample_data},\n        ],\n    )\n\n    assert response.sentiment == expected_sentiment\n"
  },
  {
    "path": "tests/llm/test_writer/test_format_common_models.py",
    "content": "from instructor import from_writer\nfrom writerai import Writer, AsyncWriter\nfrom pydantic import BaseModel\nfrom .util import models, modes\n\n\nclass User(BaseModel):\n    first_name: str\n    age: int\n\n\nclass UserList(BaseModel):\n    items: list[User]\n\n\nimport pytest\nfrom itertools import product\n\nimport instructor\nimport enum\n\nfrom typing import Literal\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_literal(model: str, mode: instructor.Mode):\n    client = instructor.from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=Literal[\"1231\", \"212\", \"331\"],\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Produce a Random but correct response given the desired output\",\n            },\n        ],\n    )\n    assert response in [\"1231\", \"212\", \"331\"]\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_enum(model: str, mode: instructor.Mode):\n    class Options(enum.Enum):\n        A = \"A\"\n        B = \"B\"\n        C = \"C\"\n\n    client = instructor.from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=Options,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Produce a Random but correct response given the desired output\",\n            },\n        ],\n    )\n    assert response in [Options.A, Options.B, Options.C]\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_bool(model: str, mode: instructor.Mode):\n    client = instructor.from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=bool,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Produce a Random but correct response given the desired output\",\n            },\n        ],\n    )\n    assert type(response) == bool\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_sync(model: str, mode: instructor.Mode):\n    client = from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    response = client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract {{name}} is {{age}} years old.\",\n            }\n        ],\n        response_model=User,\n        context={\"name\": \"Jason\", \"age\": 25},\n    )\n\n    assert isinstance(response, User)\n    assert response.first_name == \"Jason\"\n    assert response.age == 25\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\n@pytest.mark.asyncio\nasync def test_writer_format_async(mode: instructor.Mode, model: str):\n    client = instructor.from_writer(\n        client=AsyncWriter(),\n        mode=mode,\n    )\n\n    response = await client.chat.completions.create(\n        model=model,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract a user from this sentence : {{name}} is {{age}} and lives in Berlin\",\n            },\n        ],\n        context={\n            \"name\": \"Yan\",\n            \"age\": 27,\n        },\n        response_model=User,\n    )\n\n    assert response.first_name == \"Yan\"\n    assert response.age == 27\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_list_of_strings(mode: instructor.Mode, model: str):\n    client = instructor.from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    users = [\n        {\n            \"name\": \"Jason\",\n            \"age\": 25,\n        },\n        {\n            \"name\": \"Elizabeth\",\n            \"age\": 12,\n        },\n        {\n            \"name\": \"Chris\",\n            \"age\": 27,\n        },\n    ]\n\n    prompt = \"\"\"\n    Extract a list of users from the following text:\n\n    {% for user in users %}\n    - Name: {{ user.name }}, Age: {{ user.age }}\n    {% endfor %}\n    \"\"\"\n    response = client.chat.completions.create(\n        model=model,\n        response_model=UserList,\n        messages=[\n            {\"role\": \"user\", \"content\": prompt},\n        ],\n        context={\"users\": users},\n    )\n\n    assert isinstance(response, UserList), \"Result should be an instance of UserList\"\n    assert isinstance(response.items, list), \"items should be a list\"\n    assert len(response.items) == 3, \"List should contain 3 items\"\n\n    names = [item.first_name for item in response.items]\n    assert \"Jason\" in names, \"'Jason' should be in the list\"\n    assert \"Elizabeth\" in names, \"'Elizabeth' should be in the list\"\n    assert \"Chris\" in names, \"'Chris' should be in the list\"\n"
  },
  {
    "path": "tests/llm/test_writer/test_format_difficult_models.py",
    "content": "from itertools import product\nfrom pydantic import BaseModel\nfrom writerai import Writer\nimport pytest\n\nimport instructor\nfrom .util import models, modes\n\n\nclass Item(BaseModel):\n    name: str\n    price: float\n\n\nclass Order(BaseModel):\n    items: list[Item]\n    customer: str\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_nested_model(mode: instructor.Mode, model: str):\n    client = instructor.from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    content = \"\"\"\n    Order Details:\n    Customer: Jason\n    Items:\n\n    Name: Apple, Price: 0.50\n    Name: Bread, Price: 2.00\n    Name: Milk, Price: 1.50\n    \"\"\"\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=Order,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n    )\n\n    assert len(response.items) == 3\n    assert {x.name.lower() for x in response.items} == {\"apple\", \"bread\", \"milk\"}\n    assert {x.price for x in response.items} == {0.5, 2.0, 1.5}\n    assert response.customer.lower() == \"jason\"\n\n\nclass Book(BaseModel):\n    title: str\n    author: str\n    genre: str\n    isbn: str\n\n\nclass LibraryRecord(BaseModel):\n    books: list[Book]\n    visitor: str\n    library_id: str\n\n\n@pytest.mark.parametrize(\"model, mode\", product(models, modes))\ndef test_writer_format_complex_nested_model(mode: instructor.Mode, model: str):\n    client = instructor.from_writer(\n        client=Writer(),\n        mode=mode,\n    )\n\n    content = \"\"\"\n    Library visit details:\n    Visitor: Jason\n    Library ID: LIB123456\n    Books checked out:\n    - Title: The Great Adventure, Author: Jane Doe, Genre: Fantasy, ISBN: 1234567890\n    - Title: History of Tomorrow, Author: John Smith, Genre: Non-Fiction, ISBN: 0987654321\n    \"\"\"\n\n    response = client.chat.completions.create(\n        model=model,\n        response_model=LibraryRecord,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": content,\n            },\n        ],\n    )\n\n    assert response.visitor.lower() == \"jason\"\n    assert response.library_id == \"LIB123456\"\n    assert len(response.books) == 2\n    assert {book.title for book in response.books} == {\n        \"The Great Adventure\",\n        \"History of Tomorrow\",\n    }\n    assert {book.author for book in response.books} == {\"Jane Doe\", \"John Smith\"}\n    assert {book.genre for book in response.books} == {\"Fantasy\", \"Non-Fiction\"}\n    assert {book.isbn for book in response.books} == {\"1234567890\", \"0987654321\"}\n"
  },
  {
    "path": "tests/llm/test_writer/util.py",
    "content": "import instructor\n\nmodels: list[str] = [\"palmyra-x4\", \"palmyra-x5\"]\nmodes = [instructor.Mode.WRITER_TOOLS, instructor.Mode.WRITER_JSON]\n"
  },
  {
    "path": "tests/processing/test_anthropic_json.py",
    "content": "\"\"\"Isolated tests for Anthropic JSON parsing helpers.\"\"\"\n\nfrom anthropic.types import Message, Usage\nimport pytest\nfrom pydantic import ValidationError\nfrom typing import cast\n\nimport instructor\n\n\nCONTROL_CHAR_JSON = \"\"\"{\n\"data\": \"Claude likes\ncontrol\ncharacters\"\n}\"\"\"\n\n\nclass _AnthropicTestModel(instructor.OpenAISchema):  # type: ignore[misc]\n    data: str\n\n\ndef _build_message(data_content: str) -> Message:\n    return Message(\n        id=\"test_id\",\n        content=[{\"type\": \"text\", \"text\": data_content}],\n        model=\"claude-3-haiku-20240307\",\n        role=\"assistant\",\n        stop_reason=\"end_turn\",\n        stop_sequence=None,\n        type=\"message\",\n        usage=Usage(input_tokens=10, output_tokens=10),\n    )\n\n\ndef test_parse_anthropic_json_strict_control_characters() -> None:\n    message = _build_message(CONTROL_CHAR_JSON)\n\n    with pytest.raises(ValidationError):\n        _AnthropicTestModel.parse_anthropic_json(message, strict=True)  # type: ignore[arg-type]\n\n\ndef test_parse_anthropic_json_non_strict_preserves_control_characters() -> None:\n    message = _build_message(CONTROL_CHAR_JSON)\n\n    model = cast(\n        _AnthropicTestModel,\n        _AnthropicTestModel.parse_anthropic_json(message, strict=False),  # type: ignore[arg-type]\n    )\n\n    assert model.data == \"Claude likes\\ncontrol\\ncharacters\"\n"
  },
  {
    "path": "tests/test_auto_client.py",
    "content": "from __future__ import annotations\n\nimport pytest\nfrom instructor.auto_client import from_provider\nfrom pydantic import BaseModel\n\n\n# --- User model and prompt (from main.py) ---\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nUSER_EXTRACTION_PROMPT = {\n    \"role\": \"user\",\n    \"content\": \"Ivan is 28 and strays in Singapore. Extract it as a user object\",\n}\n\n# --- Providers to test (from main.py) ---\nPROVIDERS = [\n    \"anthropic/claude-3-5-haiku-latest\",\n    \"google/gemini-pro\",\n    \"openai/gpt-4o-mini\",\n    \"azure_openai/gpt-4o-mini\",\n    \"mistral/ministral-8b-latest\",\n    \"cohere/command-a-03-2025\",\n    \"perplexity/sonar-pro\",\n    \"groq/llama-3.1-8b-instant\",\n    \"writer/palmyra-x5\",\n    \"cerebras/llama-4-scout-17b-16e-instruct\",\n    \"deepseek/deepseek-chat\",\n    \"fireworks/accounts/fireworks/models/llama4-maverick-instruct-basic\",\n    \"vertexai/gemini-3-flash\",\n]\n\n\ndef should_skip_provider(provider_string: str) -> bool:\n    import os\n\n    if os.getenv(\"INSTRUCTOR_ENV\") == \"CI\":\n        return provider_string not in [\n            \"cohere/command-a-03-2025\",\n            \"google/gemini-pro\",\n            \"openai/gpt-4o-mini\",\n        ]\n    return False\n\n\n@pytest.mark.parametrize(\"provider_string\", PROVIDERS)\ndef test_user_extraction_sync(provider_string):\n    \"\"\"Test user extraction for each provider (sync).\"\"\"\n\n    if should_skip_provider(provider_string):\n        pytest.skip(f\"Skipping provider {provider_string} on CI\")\n        return\n\n    try:\n        client = from_provider(provider_string)  # type: ignore[arg-type]\n        response = client.chat.completions.create(\n            messages=[USER_EXTRACTION_PROMPT],  # type: ignore[arg-type]\n            response_model=User,\n        )\n        assert isinstance(response, User)\n        assert response.name.lower() == \"ivan\"\n        assert response.age == 28\n    except Exception as e:\n        pytest.skip(f\"Provider {provider_string} not available or failed: {e}\")\n\n\n@pytest.mark.parametrize(\"provider_string\", PROVIDERS)\n@pytest.mark.asyncio\nasync def test_user_extraction_async(provider_string):\n    \"\"\"Test user extraction for each provider (async).\"\"\"\n\n    if should_skip_provider(provider_string):\n        pytest.skip(f\"Skipping provider {provider_string} on CI\")\n        return\n\n    try:\n        client = from_provider(provider_string, async_client=True)  # type: ignore[arg-type]\n        response = await client.chat.completions.create(\n            messages=[USER_EXTRACTION_PROMPT],  # type: ignore[arg-type]\n            response_model=User,\n        )\n        assert isinstance(response, User)\n        assert response.name.lower() == \"ivan\"\n        assert response.age == 28\n    except Exception as e:\n        pytest.skip(f\"Provider {provider_string} not available or failed: {e}\")\n\n\ndef test_invalid_provider_format():\n    \"\"\"Test that error is raised for invalid provider format.\"\"\"\n    from instructor.core.exceptions import ConfigurationError\n\n    with pytest.raises(ConfigurationError) as excinfo:\n        from_provider(\"invalid-format\")\n    assert \"Model string must be in format\" in str(excinfo.value)\n\n\ndef test_unsupported_provider():\n    \"\"\"Test that error is raised for unsupported provider.\"\"\"\n    from instructor.core.exceptions import ConfigurationError\n\n    with pytest.raises(ConfigurationError) as excinfo:\n        from_provider(\"unsupported/model\")\n    assert \"Unsupported provider\" in str(excinfo.value)\n\n\ndef test_additional_kwargs_passed():\n    \"\"\"Test that additional kwargs are passed to provider.\"\"\"\n    import instructor\n    from instructor.core.exceptions import InstructorRetryException\n    import os\n\n    if os.getenv(\"INSTRUCTOR_ENV\") == \"CI\":\n        pytest.skip(\"Skipping test on CI\")\n        return\n\n    client = instructor.from_provider(\n        \"anthropic/claude-3-5-haiku-latest\", max_tokens=10\n    )\n\n    with pytest.raises(InstructorRetryException) as excinfo:\n        client.chat.completions.create(\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"Generate a sentence with 20 characters\",\n                }\n            ],\n            response_model=str,\n        )\n\n    assert \"The output is incomplete due to a max_tokens length limit\" in str(\n        excinfo.value\n    )\n\n\ndef test_api_key_parameter_extraction():\n    \"\"\"Test that api_key parameter is correctly extracted from kwargs.\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    # Mock the openai module to avoid actual API calls\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        # Mock the from_openai import\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            # Test that api_key is passed to client constructor\n            from_provider(\"openai/gpt-4\", api_key=\"test-key-123\")\n\n            # Verify OpenAI was called with the api_key\n            mock_openai_class.assert_called_once()\n            _, kwargs = mock_openai_class.call_args\n            assert kwargs[\"api_key\"] == \"test-key-123\"\n\n\ndef test_api_key_parameter_with_environment_fallback():\n    \"\"\"Test that api_key parameter falls back to environment variables.\"\"\"\n    import os\n    from unittest.mock import patch, MagicMock\n\n    # Mock the openai module\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        # Mock the from_openai import\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            # Mock environment variable\n            with patch.dict(os.environ, {}, clear=True):\n                # Test with no api_key parameter and no environment variable\n                from_provider(\"openai/gpt-4\")\n\n                # Should still call OpenAI with None (which is the default behavior)\n                mock_openai_class.assert_called()\n                _, kwargs = mock_openai_class.call_args\n                assert kwargs[\"api_key\"] is None\n\n\ndef test_api_key_parameter_with_async_client():\n    \"\"\"Test that api_key parameter works with async clients.\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    # Mock the openai module\n    with patch(\"openai.AsyncOpenAI\") as mock_async_openai_class:\n        mock_client = MagicMock()\n        mock_async_openai_class.return_value = mock_client\n\n        # Mock the from_openai import\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            # Test with async client\n            from_provider(\"openai/gpt-4\", async_client=True, api_key=\"test-async-key\")\n\n            # Verify AsyncOpenAI was called with the api_key\n            mock_async_openai_class.assert_called_once()\n            _, kwargs = mock_async_openai_class.call_args\n            assert kwargs[\"api_key\"] == \"test-async-key\"\n\n\ndef test_api_key_parameter_not_passed_when_none():\n    \"\"\"Test that api_key parameter is handled correctly when None.\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    # Mock the openai module\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        # Mock the from_openai import\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            # Test with None api_key\n            from_provider(\"openai/gpt-4\", api_key=None)\n\n            # Verify OpenAI was called with None api_key\n            mock_openai_class.assert_called_once()\n            _, kwargs = mock_openai_class.call_args\n            assert kwargs[\"api_key\"] is None\n\n\ndef test_api_key_logging():\n    \"\"\"Test that api_key provision is logged correctly.\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    # Mock the openai module\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        # Mock the from_openai import\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            # Mock logger\n            with patch(\"instructor.auto_client.logger\") as mock_logger:\n                # Test that providing api_key triggers debug log\n                from_provider(\"openai/gpt-4\", api_key=\"test-key\")\n\n                # Check that debug was called with api_key message and length\n                debug_calls = [\n                    call\n                    for call in mock_logger.debug.call_args_list\n                    if \"API key provided\" in str(call) and \"length:\" in str(call)\n                ]\n                assert len(debug_calls) > 0, (\n                    \"Expected debug log for API key provision with length\"\n                )\n\n                # Verify the length is logged correctly (test-key is 8 characters)\n                mock_logger.debug.assert_called_with(\n                    \"API key provided for %s provider (length: %d characters)\",\n                    \"openai\",\n                    8,\n                    extra={\"provider\": \"openai\", \"operation\": \"initialize\"},\n                )\n\n\ndef test_openai_provider_respects_base_url():\n    \"\"\"Ensure OpenAI provider passes base_url to client constructor.\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            client = from_provider(\n                \"openai/gpt-4\",\n                base_url=\"https://api.example.com/v1\",\n                api_key=\"test-key\",\n            )\n\n            _, kwargs = mock_openai_class.call_args\n            assert kwargs[\"base_url\"] == \"https://api.example.com/v1\"\n            assert kwargs[\"api_key\"] == \"test-key\"\n            mock_from_openai.assert_called_once()\n            assert client is mock_instructor\n\n\ndef test_openai_provider_async_client_with_base_url():\n    \"\"\"Ensure OpenAI provider passes base_url to async client constructor.\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    with patch(\"openai.AsyncOpenAI\") as mock_async_openai_class:\n        mock_client = MagicMock()\n        mock_async_openai_class.return_value = mock_client\n\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            client = from_provider(\n                \"openai/gpt-4\",\n                async_client=True,\n                base_url=\"https://api.example.com/v1\",\n                api_key=\"test-key\",\n            )\n\n            mock_async_openai_class.assert_called_once()\n            _, kwargs = mock_async_openai_class.call_args\n            assert kwargs[\"base_url\"] == \"https://api.example.com/v1\"\n            assert kwargs[\"api_key\"] == \"test-key\"\n            mock_from_openai.assert_called_once()\n            assert client is mock_instructor\n\n\ndef test_openai_provider_without_base_url():\n    \"\"\"Ensure OpenAI provider works without base_url (defaults to api.openai.com).\"\"\"\n    from unittest.mock import patch, MagicMock\n\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            client = from_provider(\"openai/gpt-4\", api_key=\"test-key\")\n\n            _, kwargs = mock_openai_class.call_args\n            assert kwargs.get(\"base_url\") in (None, \"\")\n            assert kwargs[\"api_key\"] == \"test-key\"\n            mock_from_openai.assert_called_once()\n            assert client is mock_instructor\n\n\ndef test_databricks_provider_uses_environment_configuration():\n    \"\"\"Ensure Databricks provider pulls host and token from the environment.\"\"\"\n    from unittest.mock import patch, MagicMock\n    import os\n\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            with patch.dict(\n                os.environ,\n                {\n                    \"DATABRICKS_HOST\": \"https://example.cloud.databricks.com\",\n                    \"DATABRICKS_TOKEN\": \"secret-token\",\n                },\n                clear=True,\n            ):\n                client = from_provider(\"databricks/dbrx-instruct\")\n\n            mock_openai_class.assert_called_once()\n            _, kwargs = mock_openai_class.call_args\n            assert kwargs[\"api_key\"] == \"secret-token\"\n            assert (\n                kwargs[\"base_url\"]\n                == \"https://example.cloud.databricks.com/serving-endpoints\"\n            )\n            mock_from_openai.assert_called_once()\n            assert client is mock_instructor\n\n\ndef test_databricks_provider_respects_custom_base_url():\n    \"\"\"Ensure Databricks provider does not duplicate serving-endpoints suffix.\"\"\"\n    from unittest.mock import patch, MagicMock\n    import os\n\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_client = MagicMock()\n        mock_openai_class.return_value = mock_client\n\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            with patch.dict(\n                os.environ,\n                {\n                    \"DATABRICKS_TOKEN\": \"secret-token\",\n                },\n                clear=True,\n            ):\n                client = from_provider(\n                    \"databricks/dbrx-instruct\",\n                    base_url=\"https://example.cloud.databricks.com/serving-endpoints\",\n                )\n\n            _, kwargs = mock_openai_class.call_args\n            assert (\n                kwargs[\"base_url\"]\n                == \"https://example.cloud.databricks.com/serving-endpoints\"\n            )\n            mock_from_openai.assert_called_once()\n            assert client is mock_instructor\n\n\ndef test_databricks_provider_async_client():\n    \"\"\"Ensure Databricks provider returns async client when requested.\"\"\"\n    from unittest.mock import patch, MagicMock\n    import os\n\n    with patch(\"openai.AsyncOpenAI\") as mock_async_openai_class:\n        mock_client = MagicMock()\n        mock_async_openai_class.return_value = mock_client\n\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_instructor = MagicMock()\n            mock_from_openai.return_value = mock_instructor\n\n            with patch.dict(\n                os.environ,\n                {\n                    \"DATABRICKS_HOST\": \"https://example.cloud.databricks.com\",\n                    \"DATABRICKS_TOKEN\": \"secret-token\",\n                },\n                clear=True,\n            ):\n                client = from_provider(\"databricks/dbrx-instruct\", async_client=True)\n\n            mock_async_openai_class.assert_called_once()\n            _, kwargs = mock_async_openai_class.call_args\n            assert (\n                kwargs[\"base_url\"]\n                == \"https://example.cloud.databricks.com/serving-endpoints\"\n            )\n            assert kwargs[\"api_key\"] == \"secret-token\"\n            mock_from_openai.assert_called_once()\n            assert client is mock_instructor\n\n\ndef test_databricks_provider_requires_token():\n    \"\"\"Ensure Databricks provider raises when no token is available.\"\"\"\n    from instructor.core.exceptions import ConfigurationError\n    from unittest.mock import patch, MagicMock\n    import os\n\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_openai_class.return_value = MagicMock()\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_from_openai.return_value = MagicMock()\n            with patch.dict(\n                os.environ,\n                {\n                    \"DATABRICKS_HOST\": \"https://example.cloud.databricks.com\",\n                },\n                clear=True,\n            ):\n                with pytest.raises(ConfigurationError):\n                    from_provider(\"databricks/dbrx-instruct\")\n\n\ndef test_databricks_provider_requires_host():\n    \"\"\"Ensure Databricks provider raises when no host is available.\"\"\"\n    from instructor.core.exceptions import ConfigurationError\n    from unittest.mock import patch, MagicMock\n    import os\n\n    with patch(\"openai.OpenAI\") as mock_openai_class:\n        mock_openai_class.return_value = MagicMock()\n        with patch(\"instructor.from_openai\") as mock_from_openai:\n            mock_from_openai.return_value = MagicMock()\n            with patch.dict(\n                os.environ,\n                {\n                    \"DATABRICKS_TOKEN\": \"secret-token\",\n                },\n                clear=True,\n            ):\n                with pytest.raises(ConfigurationError):\n                    from_provider(\"databricks/dbrx-instruct\")\n\n\ndef test_genai_mode_parameter_passed_to_provider():\n    \"\"\"Test that mode parameter is correctly passed to provider functions.\"\"\"\n    from unittest.mock import patch, MagicMock\n    import instructor\n\n    with patch(\"google.genai.Client\") as mock_genai_class:\n        mock_client = MagicMock()\n        mock_genai_class.return_value = mock_client\n\n        with patch(\"instructor.from_genai\") as mock_from_genai:\n            mock_instructor = MagicMock()\n            mock_from_genai.return_value = mock_instructor\n\n            from_provider(\n                \"google/gemini-pro\",\n                mode=instructor.Mode.GENAI_STRUCTURED_OUTPUTS,\n            )\n\n            mock_from_genai.assert_called_once()\n            _, kwargs = mock_from_genai.call_args\n            assert \"mode\" in kwargs\n            assert kwargs[\"mode\"] == instructor.Mode.GENAI_STRUCTURED_OUTPUTS\n\n\ndef test_genai_mode_defaults_when_not_provided():\n    \"\"\"Test that GenAI provider uses GENAI_TOOLS mode when mode is not provided.\"\"\"\n    from unittest.mock import patch, MagicMock\n    import instructor\n\n    with patch(\"google.genai.Client\") as mock_genai_class:\n        mock_client = MagicMock()\n        mock_genai_class.return_value = mock_client\n\n        with patch(\"instructor.from_genai\") as mock_from_genai:\n            mock_instructor = MagicMock()\n            mock_from_genai.return_value = mock_instructor\n\n            from_provider(\"google/gemini-pro\")\n\n            mock_from_genai.assert_called_once()\n            _, kwargs = mock_from_genai.call_args\n            assert \"mode\" in kwargs\n            assert kwargs[\"mode\"] == instructor.Mode.GENAI_TOOLS\n\n\ndef test_google_provider_runtime_import_error_propagates():\n    \"\"\"Test that ImportError during client initialization is NOT masked.\n\n    This is a regression test for issue #1940 - when using SOCKS proxy without\n    socksio installed, httpx raises ImportError during genai.Client() initialization.\n    This error should propagate instead of being caught and converted to\n    ConfigurationError about missing google-genai package.\n    \"\"\"\n    from unittest.mock import patch, MagicMock\n    import sys\n\n    # Create mock module for google.genai\n    mock_genai_module = MagicMock()\n\n    # Simulate socksio ImportError during Client() initialization\n    def client_init_raises(*_args, **_kwargs):\n        raise ImportError(\n            \"Using SOCKS proxy, but the 'socksio' package is not installed. \"\n            \"Make sure to install httpx using `pip install httpx[socks]`.\"\n        )\n\n    mock_genai_module.Client = client_init_raises\n\n    # Create a mock google module\n    mock_google = MagicMock()\n    mock_google.genai = mock_genai_module\n\n    # Patch sys.modules to use our mock modules\n    with patch.dict(\n        sys.modules,\n        {\"google\": mock_google, \"google.genai\": mock_genai_module},\n    ):\n        mock_from_genai = MagicMock()\n        with patch.object(\n            __import__(\"instructor\"), \"from_genai\", mock_from_genai, create=True\n        ):\n            with pytest.raises(ImportError) as excinfo:\n                from_provider(\"google/gemini-pro\")\n\n            # Should be the socksio error, NOT a ConfigurationError about google-genai\n            assert \"socksio\" in str(excinfo.value)\n            assert \"google-genai\" not in str(excinfo.value)\n\n\ndef test_vertexai_provider_runtime_import_error_propagates():\n    \"\"\"Test that ImportError during vertexai client initialization is NOT masked.\n\n    Similar to test_google_provider_runtime_import_error_propagates but for\n    the deprecated vertexai provider.\n    \"\"\"\n    from unittest.mock import patch, MagicMock\n    import warnings\n    import sys\n\n    # Create mock module for google.genai\n    mock_genai_module = MagicMock()\n\n    # Simulate socksio ImportError during Client() initialization\n    def client_init_raises(*_args, **_kwargs):\n        raise ImportError(\n            \"Using SOCKS proxy, but the 'socksio' package is not installed.\"\n        )\n\n    mock_genai_module.Client = client_init_raises\n\n    # Create a mock google module\n    mock_google = MagicMock()\n    mock_google.genai = mock_genai_module\n\n    with patch.dict(\n        sys.modules,\n        {\"google\": mock_google, \"google.genai\": mock_genai_module},\n    ):\n        mock_from_genai = MagicMock()\n        with patch.object(\n            __import__(\"instructor\"), \"from_genai\", mock_from_genai, create=True\n        ):\n            with warnings.catch_warnings():\n                warnings.simplefilter(\"ignore\", DeprecationWarning)\n                with pytest.raises(ImportError) as excinfo:\n                    from_provider(\"vertexai/gemini-pro\", project=\"test-project\")\n\n            # Should be the socksio error, NOT a ConfigurationError\n            assert \"socksio\" in str(excinfo.value)\n\n\ndef test_generative_ai_provider_runtime_import_error_propagates():\n    \"\"\"Test that ImportError during generative-ai client initialization is NOT masked.\n\n    Similar to test_google_provider_runtime_import_error_propagates but for\n    the deprecated generative-ai provider.\n    \"\"\"\n    from unittest.mock import patch, MagicMock\n    import warnings\n\n    # Create mock module for google.genai\n    mock_genai_module = MagicMock()\n\n    # Simulate socksio ImportError during Client() initialization\n    def client_init_raises(*_args, **_kwargs):\n        raise ImportError(\n            \"Using SOCKS proxy, but the 'socksio' package is not installed.\"\n        )\n\n    mock_genai_module.Client = client_init_raises\n\n    # Create a mock google module with genai attribute\n    mock_google = MagicMock()\n    mock_google.genai = mock_genai_module\n\n    with patch.dict(\n        \"sys.modules\",\n        {\"google\": mock_google, \"google.genai\": mock_genai_module},\n    ):\n        mock_from_genai = MagicMock()\n        with patch.object(\n            __import__(\"instructor\"), \"from_genai\", mock_from_genai, create=True\n        ):\n            with warnings.catch_warnings():\n                warnings.simplefilter(\"ignore\", DeprecationWarning)\n                with pytest.raises(ImportError) as excinfo:\n                    from_provider(\"generative-ai/gemini-pro\")\n\n            # Should be the socksio error, NOT a ConfigurationError\n            assert \"socksio\" in str(excinfo.value)\n"
  },
  {
    "path": "tests/test_batch_in_memory.py",
    "content": "\"\"\"Tests for in-memory batch processing functionality.\"\"\"\n\nimport io\nimport json\nimport pytest\nfrom pydantic import BaseModel\nfrom instructor.batch.request import BatchRequest\nfrom instructor.batch.providers.openai import OpenAIProvider\nfrom instructor.batch.providers.anthropic import AnthropicProvider\n\n# Mark all tests in this module as unit tests (not integration)\npytestmark = pytest.mark.unit\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n    email: str\n\n\nclass TestBatchRequestInMemory:\n    \"\"\"Test BatchRequest with BytesIO support.\"\"\"\n\n    def test_save_to_bytesio_openai(self):\n        \"\"\"Test saving BatchRequest to BytesIO for OpenAI format.\"\"\"\n        buffer = io.BytesIO()\n\n        batch_request = BatchRequest[User](\n            custom_id=\"test-1\",\n            messages=[{\"role\": \"user\", \"content\": \"Extract user info\"}],\n            response_model=User,\n            model=\"gpt-4\",\n            max_tokens=100,\n            temperature=0.1,\n        )\n\n        # Save to BytesIO\n        batch_request.save_to_file(buffer, \"openai\")\n\n        # Read back and verify\n        buffer.seek(0)\n        content = buffer.read().decode(\"utf-8\")\n        data = json.loads(content.strip())\n\n        assert data[\"custom_id\"] == \"test-1\"\n        assert data[\"method\"] == \"POST\"\n        assert data[\"url\"] == \"/v1/chat/completions\"\n        assert \"body\" in data\n        assert data[\"body\"][\"model\"] == \"gpt-4\"\n        assert \"response_format\" in data[\"body\"]\n\n    def test_save_to_bytesio_anthropic(self):\n        \"\"\"Test saving BatchRequest to BytesIO for Anthropic format.\"\"\"\n        buffer = io.BytesIO()\n\n        batch_request = BatchRequest[User](\n            custom_id=\"test-1\",\n            messages=[{\"role\": \"user\", \"content\": \"Extract user info\"}],\n            response_model=User,\n            model=\"claude-3-sonnet\",\n            max_tokens=100,\n            temperature=0.1,\n        )\n\n        # Save to BytesIO\n        batch_request.save_to_file(buffer, \"anthropic\")\n\n        # Read back and verify\n        buffer.seek(0)\n        content = buffer.read().decode(\"utf-8\")\n        data = json.loads(content.strip())\n\n        assert data[\"custom_id\"] == \"test-1\"\n        assert \"params\" in data\n        assert data[\"params\"][\"model\"] == \"claude-3-sonnet\"\n        assert \"tools\" in data[\"params\"]\n\n    def test_save_to_file_still_works(self):\n        \"\"\"Test that original file-based saving still works.\"\"\"\n        import tempfile\n        import os\n\n        with tempfile.NamedTemporaryFile(mode=\"w\", delete=False, suffix=\".jsonl\") as f:\n            temp_path = f.name\n\n        try:\n            batch_request = BatchRequest[User](\n                custom_id=\"test-1\",\n                messages=[{\"role\": \"user\", \"content\": \"Extract user info\"}],\n                response_model=User,\n                model=\"gpt-4\",\n                max_tokens=100,\n                temperature=0.1,\n            )\n\n            # Save to file\n            batch_request.save_to_file(temp_path, \"openai\")\n\n            # Read back and verify\n            with open(temp_path) as f:\n                content = f.read()\n\n            data = json.loads(content.strip())\n            assert data[\"custom_id\"] == \"test-1\"\n            assert \"body\" in data\n\n        finally:\n            if os.path.exists(temp_path):\n                os.unlink(temp_path)\n\n    def test_multiple_requests_in_buffer(self):\n        \"\"\"Test writing multiple requests to the same BytesIO buffer.\"\"\"\n        buffer = io.BytesIO()\n\n        for i in range(3):\n            batch_request = BatchRequest[User](\n                custom_id=f\"request-{i}\",\n                messages=[{\"role\": \"user\", \"content\": f\"Extract user {i}\"}],\n                response_model=User,\n                model=\"gpt-4\",\n                max_tokens=100,\n                temperature=0.1,\n            )\n            batch_request.save_to_file(buffer, \"openai\")\n\n        # Read back and verify\n        buffer.seek(0)\n        content = buffer.read().decode(\"utf-8\")\n        lines = [line for line in content.split(\"\\n\") if line.strip()]\n\n        assert len(lines) == 3\n\n        for i, line in enumerate(lines):\n            data = json.loads(line)\n            assert data[\"custom_id\"] == f\"request-{i}\"\n\n    def test_invalid_buffer_type_raises_error(self):\n        \"\"\"Test that invalid buffer types raise appropriate errors.\"\"\"\n        batch_request = BatchRequest[User](\n            custom_id=\"test-1\",\n            messages=[{\"role\": \"user\", \"content\": \"Extract user info\"}],\n            response_model=User,\n            model=\"gpt-4\",\n            max_tokens=100,\n            temperature=0.1,\n        )\n\n        with pytest.raises(ValueError, match=\"Unsupported file_path_or_buffer type\"):\n            batch_request.save_to_file(123, \"openai\")  # type: ignore[arg-type] # Invalid type\n\n\nclass TestProviderInMemorySupport:\n    \"\"\"Test that providers support BytesIO buffers.\"\"\"\n\n    def test_openai_provider_accepts_bytesio(self):\n        \"\"\"Test that OpenAI provider accepts BytesIO (without making API calls).\"\"\"\n        provider = OpenAIProvider()\n        buffer = io.BytesIO()\n\n        # Create a valid OpenAI batch request\n        test_data = {\n            \"custom_id\": \"test-1\",\n            \"method\": \"POST\",\n            \"url\": \"/v1/chat/completions\",\n            \"body\": {\n                \"model\": \"gpt-4\",\n                \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n                \"max_tokens\": 100,\n            },\n        }\n\n        json_line = json.dumps(test_data) + \"\\n\"\n        buffer.write(json_line.encode(\"utf-8\"))\n        buffer.seek(0)\n\n        # This should not raise a ValueError for unsupported type\n        # (It will raise an exception due to missing API key, but that's expected)\n        with pytest.raises(Exception) as exc_info:\n            provider.submit_batch(buffer)\n\n        # Make sure it's not a ValueError about unsupported type\n        assert \"Unsupported file_path_or_buffer type\" not in str(exc_info.value)\n\n    def test_anthropic_provider_accepts_bytesio(self):\n        \"\"\"Test that Anthropic provider accepts BytesIO (without making API calls).\"\"\"\n        provider = AnthropicProvider()\n        buffer = io.BytesIO()\n\n        # Create a valid Anthropic batch request\n        test_data = {\n            \"custom_id\": \"test-1\",\n            \"params\": {\n                \"model\": \"claude-3-sonnet\",\n                \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n                \"max_tokens\": 100,\n            },\n        }\n\n        json_line = json.dumps(test_data) + \"\\n\"\n        buffer.write(json_line.encode(\"utf-8\"))\n        buffer.seek(0)\n\n        # This should not raise a ValueError for unsupported type\n        # (It will raise an exception due to missing API key, but that's expected)\n        with pytest.raises(Exception) as exc_info:\n            provider.submit_batch(buffer)\n\n        # Make sure it's not a ValueError about unsupported type\n        assert \"Unsupported file_path_or_buffer type\" not in str(exc_info.value)\n\n    def test_provider_invalid_type_raises_error(self):\n        \"\"\"Test that providers raise errors for invalid types.\"\"\"\n        openai_provider = OpenAIProvider()\n        anthropic_provider = AnthropicProvider()\n\n        with pytest.raises(ValueError, match=\"Unsupported file_path_or_buffer type\"):\n            openai_provider.submit_batch(123)  # type: ignore[arg-type] # Invalid type\n\n        with pytest.raises(ValueError, match=\"Unsupported file_path_or_buffer type\"):\n            anthropic_provider.submit_batch(123)  # type: ignore[arg-type] # Invalid type\n"
  },
  {
    "path": "tests/test_cache_integration.py",
    "content": "import types\n\nimport instructor\nfrom instructor.cache import AutoCache\nfrom pydantic import BaseModel, Field  # type: ignore[import-not-found]\n\n\ndef test_auto_cache_prevents_duplicate_provider_calls(monkeypatch):\n    _ = monkeypatch  # unused fixture for parity with other tests\n    \"\"\"Ensure that AutoCache prevents duplicate provider calls via patch layer.\"\"\"\n\n    class User(BaseModel):\n        name: str = Field(...)\n\n    call_counter = {\"n\": 0}\n\n    # Fake provider completion function mimicking minimal OpenAI chat response\n    def fake_completion(*_args, **_kwargs):  # noqa: D401, ANN001\n        call_counter[\"n\"] += 1\n        content = User(name=\"cached\").model_dump_json()\n        # Return minimal ChatCompletion-like object\n        return types.SimpleNamespace(\n            choices=[\n                types.SimpleNamespace(\n                    message=types.SimpleNamespace(content=content),\n                    finish_reason=\"stop\",\n                )\n            ],\n            usage={},\n        )\n\n    # Create Instructor client using from_litellm so we go through patch stack\n    cache = AutoCache(maxsize=10)\n    client = instructor.from_litellm(fake_completion, mode=instructor.Mode.JSON)\n\n    messages = [{\"role\": \"user\", \"content\": \"hello\"}]\n\n    # First call – provider should be invoked\n    _ = client.create(messages=list(messages), response_model=User, cache=cache)\n    assert call_counter[\"n\"] == 1\n\n    # Second call with identical inputs – should hit cache, no new provider call\n    _ = client.create(messages=list(messages), response_model=User, cache=cache)\n    assert call_counter[\"n\"] == 1, \"Cache miss – provider was called again\"\n"
  },
  {
    "path": "tests/test_cache_key.py",
    "content": "from instructor.cache import make_cache_key\nfrom pydantic import BaseModel, Field  # type: ignore[import-not-found]\n\n\nmessages = [\n    {\"role\": \"user\", \"content\": \"hello\"},\n]\nmodel_name = \"gpt-3.5-turbo\"\n\n\nclass UserV1(BaseModel):\n    name: str = Field(..., description=\"User name\")\n\n\nclass UserV1DiffDesc(BaseModel):\n    name: str = Field(..., description=\"User full name\")\n\n\nclass UserV1DiffField(BaseModel):\n    name: str\n    age: int\n\n\nclass UserDoc1(BaseModel):\n    \"\"\"First docstring\"\"\"\n\n    name: str\n\n\nclass UserDoc2(BaseModel):\n    \"\"\"Second different docstring\"\"\"\n\n    name: str\n\n\ndef test_cache_key_changes_on_description_change():\n    k1 = make_cache_key(messages=messages, model=model_name, response_model=UserV1)\n    k2 = make_cache_key(\n        messages=messages, model=model_name, response_model=UserV1DiffDesc\n    )\n    assert k1 != k2, \"Changing field description should bust the cache key\"\n\n\ndef test_cache_key_changes_on_field_change():\n    k1 = make_cache_key(messages=messages, model=model_name, response_model=UserV1)\n    k2 = make_cache_key(\n        messages=messages, model=model_name, response_model=UserV1DiffField\n    )\n    assert k1 != k2, \"Adding or removing fields should bust the cache key\"\n\n\ndef test_cache_key_same_for_identical_schema():\n    k1 = make_cache_key(messages=messages, model=model_name, response_model=UserV1)\n    k2 = make_cache_key(messages=messages, model=model_name, response_model=UserV1)\n    assert k1 == k2, \"Identical schemas should produce identical cache keys\"\n\n\ndef test_cache_key_changes_on_docstring_change():\n    k1 = make_cache_key(messages=messages, model=model_name, response_model=UserDoc1)\n    k2 = make_cache_key(messages=messages, model=model_name, response_model=UserDoc2)\n    assert k1 != k2, \"Changing class docstring should bust the cache key\"\n"
  },
  {
    "path": "tests/test_dict_operations.py",
    "content": "\"\"\"Benchmark tests for dictionary operations in instructor.\"\"\"\n\nimport timeit\nfrom instructor.core.retry import extract_messages\nfrom instructor.utils import (\n    combine_system_messages,\n    extract_system_messages,\n    update_gemini_kwargs,\n)\n\n# Mock data for benchmarks\nSAMPLE_KWARGS_MESSAGES = {\"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}\nSAMPLE_KWARGS_CONTENTS = {\"contents\": [{\"role\": \"user\", \"parts\": [\"Hello\"]}]}\nSAMPLE_KWARGS_CHAT_HISTORY = {\"chat_history\": [{\"role\": \"user\", \"message\": \"Hello\"}]}\nSAMPLE_KWARGS_EMPTY = {}\n\nSAMPLE_SYSTEM_MSG_STR = \"You are a helpful assistant.\"\nSAMPLE_SYSTEM_MSG_LIST = [{\"type\": \"text\", \"text\": \"You are a helpful assistant.\"}]\n\nSAMPLE_MESSAGES = [\n    {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n    {\"role\": \"user\", \"content\": \"Hello\"},\n]\n\nSAMPLE_GEMINI_KWARGS = {\n    \"messages\": [\n        {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n        {\"role\": \"user\", \"content\": \"Hello\"},\n    ],\n    \"max_tokens\": 1000,\n    \"temperature\": 0.7,\n    \"n\": 1,\n    \"top_p\": 0.9,\n    \"stop\": [\"###\"],\n    \"generation_config\": {\n        \"max_tokens\": 2000,\n        \"temperature\": 0.5,\n    },\n}\n\n\nclass TestDictionaryOperations:\n    \"\"\"Test suite for dictionary operations performance.\"\"\"\n\n    def test_extract_messages_benchmark(self):\n        \"\"\"Benchmark for extract_messages function.\"\"\"\n        # Test with different message locations\n        results = {}\n\n        # Benchmark with messages key\n        results[\"messages\"] = timeit.timeit(\n            lambda: extract_messages(SAMPLE_KWARGS_MESSAGES), number=10000\n        )\n\n        # Benchmark with contents key\n        results[\"contents\"] = timeit.timeit(\n            lambda: extract_messages(SAMPLE_KWARGS_CONTENTS), number=10000\n        )\n\n        # Benchmark with chat_history key\n        results[\"chat_history\"] = timeit.timeit(\n            lambda: extract_messages(SAMPLE_KWARGS_CHAT_HISTORY), number=10000\n        )\n\n        # Benchmark with empty dict\n        results[\"empty\"] = timeit.timeit(\n            lambda: extract_messages(SAMPLE_KWARGS_EMPTY), number=10000\n        )\n\n        # Print benchmark results (useful for debugging)\n        print(\"\\nExtract Messages Benchmark Results:\")\n        for key, time in results.items():\n            print(f\"{key}: {time:.6f}s\")\n\n        # Ensure the optimized version is faster than a baseline (for CI)\n        baseline = 0.1  # Adjust based on initial benchmark runs\n        for key, time in results.items():\n            assert time < baseline, (\n                f\"extract_messages with {key} is too slow: {time:.6f}s > {baseline:.6f}s\"\n            )\n\n    def test_combine_system_messages_benchmark(self):\n        \"\"\"Benchmark for combine_system_messages function.\"\"\"\n        results = {}\n\n        # Both string\n        results[\"str_str\"] = timeit.timeit(\n            lambda: combine_system_messages(\n                SAMPLE_SYSTEM_MSG_STR, SAMPLE_SYSTEM_MSG_STR\n            ),\n            number=10000,\n        )\n\n        # Both list\n        results[\"list_list\"] = timeit.timeit(\n            lambda: combine_system_messages(\n                SAMPLE_SYSTEM_MSG_LIST, SAMPLE_SYSTEM_MSG_LIST\n            ),\n            number=10000,\n        )\n\n        # String and list\n        results[\"str_list\"] = timeit.timeit(\n            lambda: combine_system_messages(\n                SAMPLE_SYSTEM_MSG_STR, SAMPLE_SYSTEM_MSG_LIST\n            ),\n            number=10000,\n        )\n\n        # List and string\n        results[\"list_str\"] = timeit.timeit(\n            lambda: combine_system_messages(\n                SAMPLE_SYSTEM_MSG_LIST, SAMPLE_SYSTEM_MSG_STR\n            ),\n            number=10000,\n        )\n\n        # None and string\n        results[\"none_str\"] = timeit.timeit(\n            lambda: combine_system_messages(None, SAMPLE_SYSTEM_MSG_STR),\n            number=10000,\n        )\n\n        print(\"\\nCombine System Messages Benchmark Results:\")\n        for key, time in results.items():\n            print(f\"{key}: {time:.6f}s\")\n\n        baseline = 0.2  # Adjust based on initial benchmark runs\n        for key, time in results.items():\n            assert time < baseline, (\n                f\"combine_system_messages with {key} is too slow: {time:.6f}s > {baseline:.6f}s\"\n            )\n\n    def test_extract_system_messages_benchmark(self):\n        \"\"\"Benchmark for extract_system_messages function.\"\"\"\n        results = {}\n\n        # With system messages\n        results[\"with_system\"] = timeit.timeit(\n            lambda: extract_system_messages(SAMPLE_MESSAGES),\n            number=10000,\n        )\n\n        # Without system messages\n        results[\"no_system\"] = timeit.timeit(\n            lambda: extract_system_messages([{\"role\": \"user\", \"content\": \"Hello\"}]),\n            number=10000,\n        )\n\n        # Empty messages\n        results[\"empty\"] = timeit.timeit(\n            lambda: extract_system_messages([]),\n            number=10000,\n        )\n\n        print(\"\\nExtract System Messages Benchmark Results:\")\n        for key, time in results.items():\n            print(f\"{key}: {time:.6f}s\")\n\n        baseline = 0.2  # Adjust based on initial benchmark runs\n        for key, time in results.items():\n            assert time < baseline, (\n                f\"extract_system_messages with {key} is too slow: {time:.6f}s > {baseline:.6f}s\"\n            )\n\n    def test_update_gemini_kwargs_benchmark(self):\n        \"\"\"Benchmark for update_gemini_kwargs function.\"\"\"\n        result = timeit.timeit(\n            lambda: update_gemini_kwargs(SAMPLE_GEMINI_KWARGS),\n            number=1000,\n        )\n\n        print(f\"\\nUpdate Gemini Kwargs Benchmark Result: {result:.6f}s\")\n        baseline = 0.2  # Adjust based on initial benchmark runs\n        assert result < baseline, (\n            f\"update_gemini_kwargs is too slow: {result:.6f}s > {baseline:.6f}s\"\n        )\n\n    # We'll use a simpler test for mode lookup patterns since proper mocking is complex\n    # Test removed as it was producing inconsistent results across different environments\n"
  },
  {
    "path": "tests/test_dict_operations_validation.py",
    "content": "\"\"\"Tests to validate that the optimized dictionary operations provide the same results as before.\"\"\"\n\nfrom instructor.core.retry import extract_messages\nfrom instructor.utils import (\n    combine_system_messages,\n    extract_system_messages,\n    update_gemini_kwargs,\n    SystemMessage,\n)\n\n\nclass TestDictOperationsValidation:\n    \"\"\"Test suite for validating dictionary operations behavior.\"\"\"\n\n    def test_extract_messages_validation(self):\n        \"\"\"Validate extract_messages returns the same results after optimization.\"\"\"\n        # Test with messages key\n        sample_messages = [{\"role\": \"user\", \"content\": \"Hello\"}]\n        kwargs = {\"messages\": sample_messages}\n        result = extract_messages(kwargs)\n        assert result == sample_messages\n\n        # Test with contents key\n        sample_contents = [{\"role\": \"user\", \"parts\": [\"Hello\"]}]\n        kwargs = {\"contents\": sample_contents}\n        result = extract_messages(kwargs)\n        assert result == sample_contents\n\n        # Test with chat_history key\n        sample_chat_history = [{\"role\": \"user\", \"message\": \"Hello\"}]\n        kwargs = {\"chat_history\": sample_chat_history}\n        result = extract_messages(kwargs)\n        assert result == sample_chat_history\n\n        # Test with empty dict\n        kwargs = {}\n        result = extract_messages(kwargs)\n        assert result == []\n\n        # Test with mixed keys (should prioritize messages)\n        kwargs = {\n            \"messages\": sample_messages,\n            \"contents\": sample_contents,\n            \"chat_history\": sample_chat_history,\n        }\n        result = extract_messages(kwargs)\n        assert result == sample_messages\n\n    def test_combine_system_messages_validation(self):\n        \"\"\"Validate combine_system_messages returns the same results after optimization.\"\"\"\n        # Test with both strings\n        existing = \"You are a helpful assistant.\"\n        new = \"You should be concise.\"\n        expected = \"You are a helpful assistant.\\n\\nYou should be concise.\"\n        result = combine_system_messages(existing, new)\n        assert result == expected\n\n        # Test with both lists\n        existing_list = [\n            SystemMessage(type=\"text\", text=\"You are a helpful assistant.\")\n        ]\n        new_list = [SystemMessage(type=\"text\", text=\"You should be concise.\")]\n        result = combine_system_messages(existing_list, new_list)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are a helpful assistant.\"\n        assert result[1][\"text\"] == \"You should be concise.\"\n\n        # Test with existing string, new list\n        result = combine_system_messages(existing, new_list)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are a helpful assistant.\"\n        assert result[1][\"text\"] == \"You should be concise.\"\n\n        # Test with existing list, new string\n        result = combine_system_messages(existing_list, new)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are a helpful assistant.\"\n        assert result[1][\"text\"] == \"You should be concise.\"\n\n        # Test with None existing\n        result = combine_system_messages(None, new)\n        assert result == new\n\n        result = combine_system_messages(None, new_list)\n        assert result == new_list\n\n    def test_extract_system_messages_validation(self):\n        \"\"\"Validate extract_system_messages returns the same results after optimization.\"\"\"\n        # Test with system messages\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = extract_system_messages(messages)\n        assert len(result) == 1\n        assert result[0][\"text\"] == \"You are a helpful assistant.\"\n\n        # Test with multiple system messages\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"system\", \"content\": \"You should be concise.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = extract_system_messages(messages)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are a helpful assistant.\"\n        assert result[1][\"text\"] == \"You should be concise.\"\n\n        # Test with no system messages\n        messages = [{\"role\": \"user\", \"content\": \"Hello\"}]\n        result = extract_system_messages(messages)\n        assert result == []\n\n        # Test with empty messages\n        result = extract_system_messages([])\n        assert result == []\n\n        # Test with system message and list content\n        messages = [\n            {\n                \"role\": \"system\",\n                \"content\": [{\"type\": \"text\", \"text\": \"You are a helpful assistant.\"}],\n            },\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = extract_system_messages(messages)\n        assert len(result) == 1\n        assert result[0][\"text\"] == \"You are a helpful assistant.\"\n\n    def test_update_gemini_kwargs_validation(self):\n        \"\"\"Validate update_gemini_kwargs returns the same results after optimization.\"\"\"\n        # Test with complete kwargs\n        kwargs = {\n            \"messages\": [\n                {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n                {\"role\": \"user\", \"content\": \"Hello\"},\n            ],\n            \"max_tokens\": 1000,\n            \"temperature\": 0.7,\n            \"generation_config\": {\n                \"max_tokens\": 2000,\n                \"temperature\": 0.5,\n                \"top_p\": 0.9,\n                \"n\": 1,\n                \"stop\": [\"###\"],\n            },\n        }\n\n        result = update_gemini_kwargs(kwargs)\n\n        # Check that it contains contents transformed from messages\n        assert \"contents\" in result\n        assert (\n            len(result[\"contents\"]) == 1\n        )  # System messages are merged into first user message\n\n        # Check that generation_config was updated properly\n        assert \"max_output_tokens\" in result[\"generation_config\"]\n        assert result[\"generation_config\"][\"max_output_tokens\"] == 2000\n        assert \"candidate_count\" in result[\"generation_config\"]\n        assert result[\"generation_config\"][\"candidate_count\"] == 1\n        assert \"stop_sequences\" in result[\"generation_config\"]\n        assert result[\"generation_config\"][\"stop_sequences\"] == [\"###\"]\n\n        # Check that safety settings were added\n        assert \"safety_settings\" in result\n\n        # Ensure the original kwargs wasn't modified\n        assert \"contents\" not in kwargs\n        assert \"messages\" in kwargs\n"
  },
  {
    "path": "tests/test_dynamic_model_creation.py",
    "content": "from pydantic import BaseModel, create_model, Field\nfrom instructor import openai_schema\n\n\ndef test_dynamic_model_creation_with_field_description():\n    \"\"\"\n    Test that dynamic model creation with Field(description) works correctly.\n    This verifies the example in the documentation at docs/concepts/models.md.\n    \"\"\"\n    types = {\n        \"string\": str,\n        \"integer\": int,\n        \"email\": str,\n    }\n\n    mock_cursor = [\n        (\"name\", \"string\", \"The name of the user.\"),\n        (\"age\", \"integer\", \"The age of the user.\"),\n        (\"email\", \"email\", \"The email of the user.\"),\n    ]\n\n    DynamicModel = create_model(\n        \"User\",\n        **{\n            property_name: (types[property_type], Field(description=description))\n            for property_name, property_type, description in mock_cursor\n        },\n        __base__=BaseModel,\n    )\n\n    schema = DynamicModel.model_json_schema()\n\n    assert schema[\"properties\"][\"name\"][\"description\"] == \"The name of the user.\"\n    assert schema[\"properties\"][\"age\"][\"description\"] == \"The age of the user.\"\n    assert schema[\"properties\"][\"email\"][\"description\"] == \"The email of the user.\"\n\n    assert \"default\" not in schema[\"properties\"][\"name\"]\n    assert \"default\" not in schema[\"properties\"][\"age\"]\n    assert \"default\" not in schema[\"properties\"][\"email\"]\n\n    OpenAISchemaModel = openai_schema(DynamicModel)\n    openai_schema_json = OpenAISchemaModel.model_json_schema()\n\n    assert (\n        openai_schema_json[\"properties\"][\"name\"][\"description\"]\n        == \"The name of the user.\"\n    )\n    assert (\n        openai_schema_json[\"properties\"][\"age\"][\"description\"] == \"The age of the user.\"\n    )\n    assert (\n        openai_schema_json[\"properties\"][\"email\"][\"description\"]\n        == \"The email of the user.\"\n    )\n"
  },
  {
    "path": "tests/test_exception_backwards_compat.py",
    "content": "\"\"\"Test backwards compatibility of exception handling.\"\"\"\n\nimport pytest\nfrom instructor.core.exceptions import (\n    InstructorError,\n    ResponseParsingError,\n    MultimodalError,\n    AsyncValidationError,\n)\n\n\ndef test_response_parsing_error_is_value_error():\n    \"\"\"Test that ResponseParsingError can be caught as ValueError.\"\"\"\n    with pytest.raises(ValueError):\n        raise ResponseParsingError(\"Test error\", mode=\"TOOLS\")\n\n    # Should also be catchable as InstructorError\n    with pytest.raises(InstructorError):\n        raise ResponseParsingError(\"Test error\", mode=\"TOOLS\")\n\n    # And as the specific type\n    with pytest.raises(ResponseParsingError):\n        raise ResponseParsingError(\"Test error\", mode=\"TOOLS\")\n\n\ndef test_multimodal_error_is_value_error():\n    \"\"\"Test that MultimodalError can be caught as ValueError.\"\"\"\n    with pytest.raises(ValueError):\n        raise MultimodalError(\"Test error\", content_type=\"image\")\n\n    # Should also be catchable as InstructorError\n    with pytest.raises(InstructorError):\n        raise MultimodalError(\"Test error\", content_type=\"image\")\n\n    # And as the specific type\n    with pytest.raises(MultimodalError):\n        raise MultimodalError(\"Test error\", content_type=\"image\")\n\n\ndef test_async_validation_error_is_value_error():\n    \"\"\"Test that AsyncValidationError can be caught as ValueError.\"\"\"\n    with pytest.raises(ValueError):\n        raise AsyncValidationError(\"Test error\")\n\n    # Should also be catchable as InstructorError\n    with pytest.raises(InstructorError):\n        raise AsyncValidationError(\"Test error\")\n\n\ndef test_exception_inheritance_chain():\n    \"\"\"Test that new exceptions have correct inheritance.\"\"\"\n    # ResponseParsingError\n    assert issubclass(ResponseParsingError, ValueError)\n    assert issubclass(ResponseParsingError, InstructorError)\n    assert issubclass(ResponseParsingError, Exception)\n\n    # MultimodalError\n    assert issubclass(MultimodalError, ValueError)\n    assert issubclass(MultimodalError, InstructorError)\n    assert issubclass(MultimodalError, Exception)\n\n    # AsyncValidationError\n    assert issubclass(AsyncValidationError, ValueError)\n    assert issubclass(AsyncValidationError, InstructorError)\n    assert issubclass(AsyncValidationError, Exception)\n\n\ndef test_mixed_exception_catching():\n    \"\"\"Test catching multiple exception types including ValueError.\"\"\"\n\n    def raise_parsing_error():\n        raise ResponseParsingError(\"Parsing failed\", mode=\"JSON\")\n\n    def raise_multimodal_error():\n        raise MultimodalError(\n            \"File not found\", content_type=\"image\", file_path=\"/test.jpg\"\n        )\n\n    # Catch as ValueError\n    with pytest.raises(ValueError):\n        raise_parsing_error()\n\n    with pytest.raises(ValueError):\n        raise_multimodal_error()\n\n    # Catch as InstructorError\n    with pytest.raises(InstructorError):\n        raise_parsing_error()\n\n    with pytest.raises(InstructorError):\n        raise_multimodal_error()\n\n\ndef test_exception_attributes_preserved():\n    \"\"\"Test that exception attributes are preserved when caught as ValueError.\"\"\"\n    try:\n        raise ResponseParsingError(\n            \"Parse failed\", mode=\"TOOLS\", raw_response={\"test\": \"data\"}\n        )\n    except ValueError as e:\n        # Should still be able to access ResponseParsingError attributes\n        assert isinstance(e, ResponseParsingError)\n        assert e.mode == \"TOOLS\"\n        assert e.raw_response == {\"test\": \"data\"}\n\n    try:\n        raise MultimodalError(\"File error\", content_type=\"pdf\", file_path=\"/test.pdf\")\n    except ValueError as e:\n        # Should still be able to access MultimodalError attributes\n        assert isinstance(e, MultimodalError)\n        assert e.content_type == \"pdf\"\n        assert e.file_path == \"/test.pdf\"\n"
  },
  {
    "path": "tests/test_exceptions.py",
    "content": "\"\"\"Test that all instructor exceptions can be imported and caught properly.\"\"\"\n\nimport pytest\nfrom json import JSONDecodeError\nfrom instructor.core.exceptions import (\n    InstructorError,\n    IncompleteOutputException,\n    InstructorRetryException,\n    ValidationError,\n    ProviderError,\n    ConfigurationError,\n    ModeError,\n    ClientError,\n    FailedAttempt,\n)\n\n\ndef test_all_exceptions_can_be_imported():\n    \"\"\"Test that all exceptions can be imported from instructor base package\"\"\"\n    # This test passes if the imports above succeed\n    assert InstructorError is not None\n    assert IncompleteOutputException is not None\n    assert InstructorRetryException is not None\n    assert ValidationError is not None\n    assert ProviderError is not None\n    assert ConfigurationError is not None\n    assert ModeError is not None\n    assert ClientError is not None\n\n\ndef test_exception_hierarchy():\n    \"\"\"Test that all exceptions inherit from InstructorError.\"\"\"\n    assert issubclass(IncompleteOutputException, InstructorError)\n    assert issubclass(InstructorRetryException, InstructorError)\n    assert issubclass(ValidationError, InstructorError)\n    assert issubclass(ProviderError, InstructorError)\n    assert issubclass(ConfigurationError, InstructorError)\n    assert issubclass(ModeError, InstructorError)\n    assert issubclass(ClientError, InstructorError)\n\n\ndef test_base_instructor_error_can_be_caught():\n    \"\"\"Test that InstructorError can catch all instructor exceptions.\"\"\"\n    with pytest.raises(InstructorError):\n        raise IncompleteOutputException()\n\n    with pytest.raises(InstructorError):\n        raise InstructorRetryException(n_attempts=3, total_usage=100)\n\n    with pytest.raises(InstructorError):\n        raise ValidationError(\"Validation failed\")\n\n    with pytest.raises(InstructorError):\n        raise ProviderError(\"openai\", \"API error\")\n\n    with pytest.raises(InstructorError):\n        raise ConfigurationError(\"Invalid config\")\n\n    with pytest.raises(InstructorError):\n        raise ModeError(\"tools\", \"openai\", [\"json\"])\n\n    with pytest.raises(InstructorError):\n        raise ClientError(\"Client initialization failed\")\n\n\ndef test_incomplete_output_exception():\n    \"\"\"Test IncompleteOutputException attributes and catching.\"\"\"\n    last_completion = {\"content\": \"partial response\"}\n\n    with pytest.raises(IncompleteOutputException) as exc_info:\n        raise IncompleteOutputException(last_completion=last_completion)\n\n    assert exc_info.value.last_completion == last_completion\n    assert \"incomplete due to a max_tokens length limit\" in str(exc_info.value)\n\n\ndef test_instructor_retry_exception():\n    \"\"\"Test InstructorRetryException attributes and catching.\"\"\"\n    last_completion = {\"content\": \"failed response\"}\n    messages = [{\"role\": \"user\", \"content\": \"test\"}]\n    n_attempts = 3\n    total_usage = 150\n    create_kwargs = {\"model\": \"gpt-3.5-turbo\"}\n\n    with pytest.raises(InstructorRetryException) as exc_info:\n        raise InstructorRetryException(\n            last_completion=last_completion,\n            messages=messages,\n            n_attempts=n_attempts,\n            total_usage=total_usage,\n            create_kwargs=create_kwargs,\n        )\n\n    exception = exc_info.value\n    assert exception.last_completion == last_completion\n    assert exception.messages == messages\n    assert exception.n_attempts == n_attempts\n    assert exception.total_usage == total_usage\n    assert exception.create_kwargs == create_kwargs\n\n\ndef test_validation_error():\n    \"\"\"Test ValidationError can be caught.\"\"\"\n    error_message = \"Field validation failed\"\n\n    with pytest.raises(ValidationError) as exc_info:\n        raise ValidationError(error_message)\n\n    assert str(exc_info.value) == error_message\n\n\ndef test_provider_error():\n    \"\"\"Test ProviderError attributes and catching.\"\"\"\n    provider = \"anthropic\"\n    message = \"Rate limit exceeded\"\n\n    with pytest.raises(ProviderError) as exc_info:\n        raise ProviderError(provider, message)\n\n    exception = exc_info.value\n    assert exception.provider == provider\n    assert f\"{provider}: {message}\" in str(exception)\n\n\ndef test_configuration_error():\n    \"\"\"Test ConfigurationError can be caught.\"\"\"\n    error_message = \"Missing required configuration\"\n\n    with pytest.raises(ConfigurationError) as exc_info:\n        raise ConfigurationError(error_message)\n\n    assert str(exc_info.value) == error_message\n\n\ndef test_mode_error():\n    \"\"\"Test ModeError attributes and catching.\"\"\"\n    mode = \"invalid_mode\"\n    provider = \"openai\"\n    valid_modes = [\"json\", \"tools\", \"functions\"]\n\n    with pytest.raises(ModeError) as exc_info:\n        raise ModeError(mode, provider, valid_modes)\n\n    exception = exc_info.value\n    assert exception.mode == mode\n    assert exception.provider == provider\n    assert exception.valid_modes == valid_modes\n    assert f\"Invalid mode '{mode}' for provider '{provider}'\" in str(exception)\n    assert \"json, tools, functions\" in str(exception)\n\n\ndef test_client_error():\n    \"\"\"Test ClientError can be caught.\"\"\"\n    error_message = \"Client not properly initialized\"\n\n    with pytest.raises(ClientError) as exc_info:\n        raise ClientError(error_message)\n\n    assert str(exc_info.value) == error_message\n\n\ndef test_specific_exception_catching():\n    \"\"\"Test that specific exceptions can be caught individually.\"\"\"\n    # Test that we can catch specific exceptions without catching others\n\n    with pytest.raises(IncompleteOutputException):\n        try:\n            raise IncompleteOutputException()\n        except InstructorRetryException:\n            pytest.fail(\"Should not catch InstructorRetryException\")\n        except IncompleteOutputException:\n            raise  # Re-raise to be caught by pytest.raises\n\n    with pytest.raises(ProviderError):\n        try:\n            raise ProviderError(\"test\", \"error\")\n        except ConfigurationError:\n            pytest.fail(\"Should not catch ConfigurationError\")\n        except ProviderError:\n            raise  # Re-raise to be caught by pytest.raises\n\n\ndef test_multiple_exception_handling():\n    \"\"\"Test handling multiple exception types in a single try-except block.\"\"\"\n\n    def raise_exception(exc_type: str):\n        if exc_type == \"incomplete\":\n            raise IncompleteOutputException()\n        elif exc_type == \"retry\":\n            raise InstructorRetryException(n_attempts=3, total_usage=100)\n        elif exc_type == \"validation\":\n            raise ValidationError(\"validation failed\")\n        else:\n            raise ValueError(\"unknown exception type\")\n\n    # Test catching multiple specific exceptions\n    for exc_type in [\"incomplete\", \"retry\", \"validation\"]:\n        with pytest.raises(\n            (IncompleteOutputException, InstructorRetryException, ValidationError)\n        ):\n            raise_exception(exc_type)\n\n    # Test that base exception catches all instructor exceptions\n    for exc_type in [\"incomplete\", \"retry\", \"validation\"]:\n        with pytest.raises(InstructorError):\n            raise_exception(exc_type)\n\n    # Test that non-instructor exceptions are not caught\n    with pytest.raises(ValueError):\n        raise_exception(\"unknown\")\n\n\ndef test_exception_import_from_instructor():\n    \"\"\"Test that exceptions can be imported from the main instructor module.\"\"\"\n    # Test importing from instructor.exceptions (already done in module imports)\n    from instructor.core.exceptions import InstructorError as ImportedError\n\n    assert ImportedError is InstructorError\n\n    # Test that exceptions are accessible and can be used in real scenarios\n    try:\n        raise ImportedError(\"test error\")\n    except InstructorError as e:\n        assert str(e) == \"test error\"\n\n\ndef test_instructor_error_from_exception():\n    \"\"\"Test InstructorError.from_exception() class method.\"\"\"\n    # Test with basic exception\n    original_exception = ValueError(\"Original error message\")\n    instructor_error = InstructorError.from_exception(original_exception)\n\n    assert isinstance(instructor_error, InstructorError)\n    assert str(instructor_error) == \"Original error message\"\n    assert instructor_error.failed_attempts is None\n\n    # Test with failed attempts\n    failed_attempts = [\n        FailedAttempt(1, Exception(\"First failure\"), \"partial completion\"),\n        FailedAttempt(2, Exception(\"Second failure\"), None),\n    ]\n    instructor_error_with_attempts = InstructorError.from_exception(\n        original_exception, failed_attempts=failed_attempts\n    )\n\n    assert isinstance(instructor_error_with_attempts, InstructorError)\n    assert instructor_error_with_attempts.failed_attempts == failed_attempts\n\n    # Test with different exception types\n    runtime_error = RuntimeError(\"Runtime issue\")\n    instructor_error_runtime = InstructorError.from_exception(runtime_error)\n    assert str(instructor_error_runtime) == \"Runtime issue\"\n\n\ndef test_instructor_error_str_with_no_failed_attempts():\n    \"\"\"Test InstructorError.__str__() with no failed attempts.\"\"\"\n    error = InstructorError(\"Simple error message\")\n    assert str(error) == \"Simple error message\"\n\n    error_with_args = InstructorError(\"Error\", \"with\", \"multiple\", \"args\")\n    assert \"Error\" in str(error_with_args)\n\n\ndef test_instructor_error_str_with_failed_attempts():\n    \"\"\"Test InstructorError.__str__() XML template rendering with failed attempts.\"\"\"\n    # Create failed attempts\n    failed_attempts = [\n        FailedAttempt(1, ValueError(\"Validation failed\"), \"incomplete response\"),\n        FailedAttempt(2, KeyError(\"Missing key\"), {\"partial\": \"data\"}),\n        FailedAttempt(3, RuntimeError(\"Process failed\"), None),\n    ]\n\n    error = InstructorError(\"Final error message\", failed_attempts=failed_attempts)\n    error_str = str(error)\n\n    # Check that XML structure is present\n    assert \"<failed_attempts>\" in error_str\n    assert \"</failed_attempts>\" in error_str\n    assert \"<last_exception>\" in error_str\n    assert \"</last_exception>\" in error_str\n\n    # Check that all attempts are included\n    assert 'number=\"1\"' in error_str\n    assert 'number=\"2\"' in error_str\n    assert 'number=\"3\"' in error_str\n\n    # Check that exceptions are included\n    assert \"Validation failed\" in error_str\n    assert \"Missing key\" in error_str\n    assert \"Process failed\" in error_str\n\n    # Check that completions are included\n    assert \"incomplete response\" in error_str\n    assert \"partial\" in error_str\n\n    # Check that final exception is included\n    assert \"Final error message\" in error_str\n\n\ndef test_instructor_error_str_xml_structure():\n    \"\"\"Test detailed XML structure of __str__() output.\"\"\"\n    failed_attempts = [FailedAttempt(1, Exception(\"Test error\"), \"test completion\")]\n\n    error = InstructorError(\"Last error\", failed_attempts=failed_attempts)\n    error_str = str(error)\n\n    # Check proper XML nesting\n    lines = error_str.strip().split(\"\\n\")\n\n    # Find key XML elements\n    failed_attempts_start = next(\n        i for i, line in enumerate(lines) if \"<failed_attempts>\" in line\n    )\n    generation_start = next(\n        i for i, line in enumerate(lines) if '<generation number=\"1\">' in line\n    )\n    exception_start = next(i for i, line in enumerate(lines) if \"<exception>\" in line)\n    completion_start = next(i for i, line in enumerate(lines) if \"<completion>\" in line)\n\n    # Verify proper nesting order\n    assert failed_attempts_start < generation_start < exception_start < completion_start\n\n\ndef test_failed_attempt_namedtuple():\n    \"\"\"Test FailedAttempt NamedTuple functionality.\"\"\"\n    # Test with all fields\n    attempt = FailedAttempt(1, Exception(\"Test error\"), \"completion data\")\n    assert attempt.attempt_number == 1\n    assert str(attempt.exception) == \"Test error\"\n    assert attempt.completion == \"completion data\"\n\n    # Test with None completion (default)\n    attempt_no_completion = FailedAttempt(2, ValueError(\"Another error\"))\n    assert attempt_no_completion.attempt_number == 2\n    assert isinstance(attempt_no_completion.exception, ValueError)\n    assert attempt_no_completion.completion is None\n\n    # Test immutability\n    with pytest.raises(AttributeError):\n        attr = \"attempt_number\"\n        setattr(attempt, attr, 5)\n\n\ndef test_instructor_error_failed_attempts_attribute():\n    \"\"\"Test that failed_attempts attribute is properly handled.\"\"\"\n    # Test default None\n    error = InstructorError(\"Test error\")\n    assert error.failed_attempts is None\n\n    # Test explicit None\n    error_explicit = InstructorError(\"Test error\", failed_attempts=None)\n    assert error_explicit.failed_attempts is None\n\n    # Test with actual failed attempts\n    attempts = [FailedAttempt(1, Exception(\"Error\"), None)]\n    error_with_attempts = InstructorError(\"Test error\", failed_attempts=attempts)\n    assert error_with_attempts.failed_attempts == attempts\n\n\ndef test_instructor_retry_exception_with_failed_attempts():\n    \"\"\"Test InstructorRetryException inherits failed_attempts functionality.\"\"\"\n    failed_attempts = [\n        FailedAttempt(1, Exception(\"First error\"), \"first completion\"),\n        FailedAttempt(2, Exception(\"Second error\"), \"second completion\"),\n    ]\n\n    retry_exception = InstructorRetryException(\n        \"Retry exhausted\",\n        n_attempts=3,\n        total_usage=100,\n        failed_attempts=failed_attempts,\n    )\n\n    # Check that it inherits the XML formatting\n    error_str = str(retry_exception)\n    assert \"<failed_attempts>\" in error_str\n    assert \"First error\" in error_str\n    assert \"Second error\" in error_str\n    assert \"first completion\" in error_str\n    assert \"second completion\" in error_str\n\n\ndef test_multiple_exception_types_with_failed_attempts():\n    \"\"\"Test that various exception types work with failed attempts.\"\"\"\n    failed_attempts = [FailedAttempt(1, Exception(\"Test\"), None)]\n\n    # Test various exception types can be created with failed attempts\n    validation_error = ValidationError(\n        \"Validation failed\", failed_attempts=failed_attempts\n    )\n    assert validation_error.failed_attempts == failed_attempts\n\n    provider_error = ProviderError(\n        \"openai\", \"API error\", failed_attempts=failed_attempts\n    )\n    assert provider_error.failed_attempts == failed_attempts\n\n    config_error = ConfigurationError(\"Config error\", failed_attempts=failed_attempts)\n    assert config_error.failed_attempts == failed_attempts\n\n\ndef test_failed_attempts_propagation_through_retry_cycles():\n    \"\"\"Test that failed attempts accumulate and propagate correctly through retry cycles.\"\"\"\n    # Simulate multiple retry attempts with different exceptions\n    attempt1 = FailedAttempt(1, ValidationError(\"Invalid format\"), \"partial response 1\")\n    attempt2 = FailedAttempt(2, KeyError(\"missing_field\"), \"partial response 2\")\n    attempt3 = FailedAttempt(3, ValueError(\"invalid value\"), \"partial response 3\")\n\n    failed_attempts = [attempt1, attempt2, attempt3]\n\n    # Create final retry exception with accumulated failed attempts\n    final_exception = InstructorRetryException(\n        \"All retries exhausted\",\n        n_attempts=3,\n        total_usage=250,\n        failed_attempts=failed_attempts,\n    )\n\n    # Verify failed attempts are properly stored\n    assert final_exception.failed_attempts == failed_attempts\n    assert final_exception.failed_attempts is not None\n    assert len(final_exception.failed_attempts) == 3\n\n    # Verify attempt numbers are sequential\n    attempt_numbers = [\n        attempt.attempt_number for attempt in final_exception.failed_attempts\n    ]\n    assert attempt_numbers == [1, 2, 3]\n\n    # Verify each attempt has different exceptions\n    exception_types = [\n        type(attempt.exception).__name__ for attempt in final_exception.failed_attempts\n    ]\n    assert exception_types == [\"ValidationError\", \"KeyError\", \"ValueError\"]\n\n    # Verify completions are preserved\n    completions = [attempt.completion for attempt in final_exception.failed_attempts]\n    assert completions == [\n        \"partial response 1\",\n        \"partial response 2\",\n        \"partial response 3\",\n    ]\n\n\ndef test_failed_attempts_propagation_in_exception_hierarchy():\n    \"\"\"Test that failed attempts propagate correctly through exception inheritance.\"\"\"\n    # Test base class propagation\n    base_failed_attempts = [FailedAttempt(1, Exception(\"Base error\"), None)]\n    base_error = InstructorError(\"Base error\", failed_attempts=base_failed_attempts)\n\n    # Convert to more specific exception type using from_exception\n    specific_error = ValidationError.from_exception(\n        base_error, failed_attempts=base_failed_attempts\n    )\n    assert isinstance(specific_error, ValidationError)\n    assert isinstance(specific_error, InstructorError)  # Should still inherit from base\n    assert specific_error.failed_attempts == base_failed_attempts\n\n    # Test that derived exceptions maintain failed attempts\n    retry_failed_attempts = [\n        FailedAttempt(1, Exception(\"Retry 1\"), \"completion 1\"),\n        FailedAttempt(2, Exception(\"Retry 2\"), \"completion 2\"),\n    ]\n    retry_error = InstructorRetryException(\n        \"Retries failed\",\n        n_attempts=2,\n        total_usage=100,\n        failed_attempts=retry_failed_attempts,\n    )\n\n    # Convert to base type should preserve failed attempts\n    base_from_retry = InstructorError.from_exception(\n        retry_error, failed_attempts=retry_failed_attempts\n    )\n    assert base_from_retry.failed_attempts == retry_failed_attempts\n\n\ndef test_failed_attempts_accumulation_simulation():\n    \"\"\"Test simulation of how failed attempts would accumulate in a real retry scenario.\"\"\"\n    # Simulate a retry scenario where attempts accumulate\n    attempts = []\n\n    # First attempt fails\n    attempts.append(\n        FailedAttempt(\n            1, ValidationError(\"Schema validation failed\"), {\"invalid\": \"data\"}\n        )\n    )\n\n    # Second attempt fails differently\n    attempts.append(\n        FailedAttempt(2, JSONDecodeError(\"Invalid JSON\", \"\", 0), \"malformed json\")\n    )\n\n    # Third attempt fails again\n    attempts.append(\n        FailedAttempt(\n            3, ValidationError(\"Required field missing\"), {\"partial\": \"response\"}\n        )\n    )\n\n    # Final retry exception with all attempts\n    final_error = InstructorRetryException(\n        \"Maximum retries exceeded\",\n        n_attempts=3,\n        total_usage=500,\n        failed_attempts=attempts,\n        last_completion={\"final\": \"attempt\"},\n        messages=[{\"role\": \"user\", \"content\": \"test\"}],\n        create_kwargs={\"model\": \"gpt-3.5-turbo\", \"max_retries\": 3},\n    )\n\n    # Verify all data is preserved\n    assert final_error.n_attempts == 3\n    assert final_error.total_usage == 500\n    assert final_error.failed_attempts is not None\n    assert len(final_error.failed_attempts) == 3\n    assert final_error.last_completion == {\"final\": \"attempt\"}\n\n    # Test string representation includes all attempts\n    error_str = str(final_error)\n    assert \"<failed_attempts>\" in error_str\n    assert \"Schema validation failed\" in error_str\n    assert \"Invalid JSON\" in error_str\n    assert \"Required field missing\" in error_str\n    assert \"Maximum retries exceeded\" in error_str\n\n    # Verify attempt sequence integrity\n    assert final_error.failed_attempts is not None\n    for i, attempt in enumerate(final_error.failed_attempts, 1):\n        assert attempt.attempt_number == i\n\n\ndef test_failed_attempts_with_empty_and_none_completions():\n    \"\"\"Test failed attempts handle various completion states correctly.\"\"\"\n    # Test with None completion\n    attempt_none = FailedAttempt(1, Exception(\"Error with None\"), None)\n    assert attempt_none.completion is None\n\n    # Test with empty string completion\n    attempt_empty = FailedAttempt(2, Exception(\"Error with empty\"), \"\")\n    assert attempt_empty.completion == \"\"\n\n    # Test with empty dict completion\n    attempt_empty_dict = FailedAttempt(3, Exception(\"Error with empty dict\"), {})\n    assert attempt_empty_dict.completion == {}\n\n    # Test with complex completion\n    complex_completion = {\n        \"choices\": [{\"message\": {\"content\": \"partial\"}}],\n        \"usage\": {\"total_tokens\": 50},\n    }\n    attempt_complex = FailedAttempt(\n        4, Exception(\"Error with complex\"), complex_completion\n    )\n    assert attempt_complex.completion == complex_completion\n\n    # Create error with mixed completion types\n    mixed_attempts = [attempt_none, attempt_empty, attempt_empty_dict, attempt_complex]\n    error = InstructorError(\"Mixed completions\", failed_attempts=mixed_attempts)\n\n    # Verify XML rendering handles all types\n    error_str = str(error)\n    assert \"<completion>\" in error_str\n    assert \"</completion>\" in error_str\n    # Should handle None, empty string, empty dict, and complex objects\n    assert error_str.count(\"<completion>\") == 4\n\n\ndef test_failed_attempts_exception_chaining():\n    \"\"\"Test that exception chaining works properly with failed attempts.\"\"\"\n    # Create original exception with failed attempts\n    original_attempts = [\n        FailedAttempt(1, Exception(\"Original failure\"), \"original completion\")\n    ]\n    original_error = InstructorError(\n        \"Original error\", failed_attempts=original_attempts\n    )\n\n    try:\n        raise original_error\n    except InstructorError as e:\n        assert e.failed_attempts is not None\n        # Create new exception from caught exception, preserving failed attempts\n        chained_error = InstructorRetryException(\n            \"Chained error\",\n            n_attempts=2,\n            total_usage=150,\n            failed_attempts=e.failed_attempts,\n        )\n\n        # Verify failed attempts are preserved through chaining\n        assert chained_error.failed_attempts == original_attempts\n        assert chained_error.failed_attempts is not None\n        assert len(chained_error.failed_attempts) == 1\n        assert chained_error.failed_attempts[0].exception.args[0] == \"Original failure\"\n"
  },
  {
    "path": "tests/test_fizzbuzz_fix.py",
    "content": "import unittest\nimport sys\nfrom instructor.dsl.simple_type import is_simple_type\nfrom instructor.utils.core import prepare_response_model\n\n\nclass TestFizzbuzzFix(unittest.TestCase):\n    def test_fizzbuzz_response_model(self):\n        if sys.version_info < (3, 10):\n            self.skipTest(\"Union pipe syntax is only available in Python 3.10+\")\n        \"\"\"Test that list[int | str] works correctly as a response model.\"\"\"\n        # This is the type used in the fizzbuzz example\n        response_model = list[int | str]\n\n        # First check that it's correctly identified as a simple type\n        self.assertTrue(\n            is_simple_type(response_model),\n            f\"list[int | str] should be a simple type in Python {sys.version_info.major}.{sys.version_info.minor}\",\n        )\n\n        # Then check that prepare_response_model handles it correctly\n        prepared_model = prepare_response_model(response_model)\n        self.assertIsNotNone(\n            prepared_model,\n            \"prepare_response_model should not return None for list[int | str]\",\n        )\n"
  },
  {
    "path": "tests/test_formatting.py",
    "content": "import pytest\nfrom jinja2.exceptions import SecurityError\nfrom instructor.templating import handle_templating\nfrom instructor import Mode\n\n\ndef test_handle_insecure_template():\n    with pytest.raises(SecurityError):\n        kwargs = {\n            \"messages\": [\n                {\n                    \"role\": \"user\",\n                    \"content\": \"{{ self.__init__.__globals__.__builtins__.__import__('os').system('ls') }} {{ variable }}\",\n                }\n            ]\n        }\n        context = {\"variable\": \"test\"}\n        handle_templating(kwargs, Mode.TOOLS, context)\n\n\ndef test_handle_templating_with_context():\n    kwargs = {\"messages\": [{\"role\": \"user\", \"content\": \"Hello {{ name }}!\"}]}\n    context = {\"name\": \"Alice\"}\n\n    result = handle_templating(kwargs, Mode.TOOLS, context)\n\n    assert result == {\"messages\": [{\"role\": \"user\", \"content\": \"Hello Alice!\"}]}\n\n\ndef test_handle_templating_without_context():\n    kwargs = {\"messages\": [{\"role\": \"user\", \"content\": \"Hello {{ name }}!\"}]}\n\n    result = handle_templating(kwargs, Mode.TOOLS)\n\n    assert result == kwargs\n\n\ndef test_handle_templating_with_anthropic_format():\n    kwargs = {\n        \"messages\": [\n            {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello {{ name }}!\"}]}\n        ]\n    }\n    context = {\"name\": \"Bob\"}\n\n    result = handle_templating(kwargs, Mode.TOOLS, context)\n\n    assert result == {\n        \"messages\": [\n            {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello Bob!\"}]}\n        ]\n    }\n\n\ndef test_handle_templating_with_mixed_content():\n    kwargs = {\n        \"messages\": [\n            {\"role\": \"user\", \"content\": \"Hello {{ name }}!\"},\n            {\n                \"role\": \"assistant\",\n                \"content\": [{\"type\": \"text\", \"text\": \"Nice to meet you, {{ name }}!\"}],\n            },\n        ]\n    }\n    context = {\"name\": \"Charlie\"}\n\n    result = handle_templating(kwargs, Mode.TOOLS, context)\n\n    assert result == {\n        \"messages\": [\n            {\"role\": \"user\", \"content\": \"Hello Charlie!\"},\n            {\n                \"role\": \"assistant\",\n                \"content\": [{\"type\": \"text\", \"text\": \"Nice to meet you, Charlie!\"}],\n            },\n        ]\n    }\n\n\ndef test_handle_templating_with_secret_context():\n    from pydantic import BaseModel, SecretStr\n\n    class UserContext(BaseModel):\n        name: str\n        address: SecretStr\n\n    kwargs = {\n        \"messages\": [\n            {\n                \"role\": \"user\",\n                \"content\": \"{{ user.name }}'s address is '{{ user.address.get_secret_value() }}'\",\n            }\n        ]\n    }\n    context = {\n        \"user\": UserContext(\n            name=\"Jason\", address=SecretStr(\"123 Secret St, Hidden City\")\n        )\n    }\n\n    result = handle_templating(kwargs, Mode.TOOLS, context)\n\n    assert result == {\n        \"messages\": [\n            {\n                \"role\": \"user\",\n                \"content\": \"Jason's address is '123 Secret St, Hidden City'\",\n            }\n        ]\n    }\n\n    # Ensure the original SecretStr is not exposed when rendered\n    assert str(context[\"user\"].address) == \"**********\"\n\n\ndef test_handle_templating_with_cohere_format():\n    kwargs = {\n        \"message\": \"Hello {{ name }}!\",\n        \"chat_history\": [{\"message\": \"Previous message to {{ name }}\"}],\n    }\n    context = {\"name\": \"David\"}\n\n    result = handle_templating(kwargs, Mode.TOOLS, context)\n\n    assert result == {\n        \"message\": \"Hello David!\",\n        \"chat_history\": [{\"message\": \"Previous message to David\"}],\n    }\n\n\ndef test_handle_templating_with_gemini_format():\n    kwargs = {\n        \"contents\": [\n            {\"role\": \"user\", \"parts\": [\"Hello {{ name }}!\", \"How are you {{ name }}?\"]}\n        ]\n    }\n    context = {\"name\": \"Eve\"}\n\n    result = handle_templating(kwargs, Mode.TOOLS, context)\n\n    assert result == {\n        \"contents\": [{\"role\": \"user\", \"parts\": [\"Hello Eve!\", \"How are you Eve?\"]}]\n    }\n"
  },
  {
    "path": "tests/test_function_calls.py",
    "content": "from typing import Any, TypeVar, cast\nimport pytest\nfrom anthropic.types import Message, Usage\nfrom openai.types.chat.chat_completion import ChatCompletion, Choice\nfrom openai.types.chat.chat_completion_message import ChatCompletionMessage\nfrom openai.types.chat.chat_completion_message import FunctionCall as OpenAIFunctionCall\nfrom openai.types.chat.chat_completion_message_tool_call import (\n    ChatCompletionMessageToolCall,\n    Function,\n)\nfrom pydantic import BaseModel, ValidationError\n\nimport instructor\nfrom instructor import OpenAISchema, openai_schema\nfrom instructor.core.exceptions import IncompleteOutputException\nfrom instructor.utils import disable_pydantic_error_url\n\nT = TypeVar(\"T\")\n\n\n@pytest.fixture  # type: ignore[misc]\ndef test_model() -> type[OpenAISchema]:\n    class TestModel(OpenAISchema):  # type: ignore[misc]\n        name: str = \"TestModel\"\n        data: str\n\n    return TestModel\n\n\n@pytest.fixture  # type: ignore[misc]\ndef mock_completion(request: Any) -> ChatCompletion:\n    finish_reason = \"stop\"\n    data_content = '{\\n\"data\": \"complete data\"\\n}'\n\n    if hasattr(request, \"param\"):\n        params = cast(dict[str, Any], request.param)\n        finish_reason = params.get(\"finish_reason\", finish_reason)\n        data_content = params.get(\"data_content\", data_content)\n\n    completion = ChatCompletion(\n        id=\"test_id\",\n        choices=[\n            Choice(\n                index=0,\n                message=ChatCompletionMessage(\n                    role=\"assistant\",\n                    content=data_content,\n                    function_call=OpenAIFunctionCall(\n                        name=\"TestModel\",\n                        arguments=data_content,\n                    ),\n                ),\n                finish_reason=finish_reason,\n                logprobs=None,\n            )\n        ],\n        created=1234567890,\n        model=\"gpt-3.5-turbo\",\n        object=\"chat.completion\",\n    )\n\n    return completion\n\n\n@pytest.fixture  # type: ignore[misc]\ndef mock_anthropic_message(request: Any) -> Message:\n    data_content = '{\\n\"data\": \"Claude says hi\"\\n}'\n    if hasattr(request, \"param\"):\n        params = cast(dict[str, Any], request.param)\n        data_content = params.get(\"data_content\", data_content)\n    return Message(\n        id=\"test_id\",\n        content=[{\"type\": \"text\", \"text\": data_content}],\n        model=\"claude-3-haiku-20240307\",\n        role=\"assistant\",\n        stop_reason=\"end_turn\",\n        stop_sequence=None,\n        type=\"message\",\n        usage=Usage(\n            input_tokens=100,\n            output_tokens=100,\n        ),\n    )\n\n\ndef test_openai_schema() -> None:\n    @openai_schema\n    class Dataframe(BaseModel):  # type: ignore[misc]\n        \"\"\"\n        Class representing a dataframe. This class is used to convert\n        data into a frame that can be used by pandas.\n        \"\"\"\n\n        data: str\n        columns: str\n\n        def to_pandas(self) -> None:\n            pass\n\n    assert hasattr(Dataframe, \"openai_schema\")\n    assert hasattr(Dataframe, \"from_response\")\n    assert hasattr(Dataframe, \"to_pandas\")\n    assert Dataframe.openai_schema[\"name\"] == \"Dataframe\"\n\n\ndef test_openai_schema_raises_error() -> None:\n    with pytest.raises(TypeError, match=\"must be a subclass of pydantic.BaseModel\"):\n\n        @openai_schema\n        class Dummy:\n            pass\n\n\ndef test_no_docstring() -> None:\n    class Dummy(OpenAISchema):  # type: ignore[misc]\n        attr: str\n\n    assert (\n        Dummy.openai_schema[\"description\"]\n        == \"Correctly extracted `Dummy` with all the required parameters with correct types\"\n    )\n\n\n@pytest.mark.parametrize(\n    \"mock_completion\",\n    [{\"finish_reason\": \"length\", \"data_content\": '{\\n\"data\": \"incomplete dat\"\\n}'}],\n    indirect=True,\n)  # type: ignore[misc]\ndef test_incomplete_output_exception(\n    test_model: type[OpenAISchema], mock_completion: ChatCompletion\n) -> None:\n    with pytest.raises(IncompleteOutputException):\n        test_model.from_response(mock_completion, mode=instructor.Mode.FUNCTIONS)\n\n\ndef test_complete_output_no_exception(\n    test_model: type[OpenAISchema], mock_completion: ChatCompletion\n) -> None:\n    test_model_instance = cast(\n        Any,\n        test_model.from_response(mock_completion, mode=instructor.Mode.FUNCTIONS),\n    )\n    assert test_model_instance.data == \"complete data\"\n\n\n@pytest.mark.asyncio  # type: ignore[misc]\n@pytest.mark.parametrize(\n    \"mock_completion\",\n    [{\"finish_reason\": \"length\", \"data_content\": '{\\n\"data\": \"incomplete dat\"\\n}'}],\n    indirect=True,\n)  # type: ignore[misc]\ndef test_incomplete_output_exception_raise(\n    test_model: type[OpenAISchema], mock_completion: ChatCompletion\n) -> None:\n    with pytest.raises(IncompleteOutputException):\n        test_model.from_response(mock_completion, mode=instructor.Mode.TOOLS)\n\n\ndef test_anthropic_no_exception(\n    test_model: type[OpenAISchema], mock_anthropic_message: Message\n) -> None:\n    test_model_instance = cast(\n        Any,\n        test_model.from_response(\n            cast(Any, mock_anthropic_message),\n            mode=instructor.Mode.ANTHROPIC_JSON,\n        ),\n    )\n    assert test_model_instance.data == \"Claude says hi\"\n\n\n@pytest.mark.parametrize(\n    \"mock_anthropic_message\",\n    [{\"data_content\": '{\\n\"data\": \"Claude likes\\ncontrol\\ncharacters\"\\n}'}],\n    indirect=True,\n)  # type: ignore[misc]\ndef test_control_characters_not_allowed_in_anthropic_json_strict_mode(\n    test_model: type[OpenAISchema], mock_anthropic_message: Message\n) -> None:\n    with pytest.raises(ValidationError) as exc_info:\n        test_model.from_response(\n            cast(Any, mock_anthropic_message),\n            mode=instructor.Mode.ANTHROPIC_JSON,\n            strict=True,\n        )\n\n    # https://docs.pydantic.dev/latest/errors/validation_errors/#json_invalid\n    exc = cast(ValidationError, exc_info.value)\n    assert len(exc.errors()) == 1\n    assert exc.errors()[0][\"type\"] == \"json_invalid\"\n    assert \"control character\" in exc.errors()[0][\"msg\"]\n\n\n@pytest.mark.parametrize(\n    \"mock_anthropic_message\",\n    [{\"data_content\": '{\\n\"data\": \"Claude likes\\ncontrol\\ncharacters\"\\n}'}],\n    indirect=True,\n)  # type: ignore[misc]\ndef test_control_characters_allowed_in_anthropic_json_non_strict_mode(\n    test_model: type[OpenAISchema], mock_anthropic_message: Message\n) -> None:\n    test_model_instance = cast(\n        Any,\n        test_model.from_response(\n            cast(Any, mock_anthropic_message),\n            mode=instructor.Mode.ANTHROPIC_JSON,\n            strict=False,\n        ),\n    )\n    assert test_model_instance.data == \"Claude likes\\ncontrol\\ncharacters\"\n\n\ndef test_pylance_url_config() -> None:\n    import sys\n\n    if sys.version_info >= (3, 11):\n        pytest.skip(\n            \"This test seems to fail on 3.11 but passes on 3.10 and 3.9. I suspect it's due to the ordering of tests - https://github.com/pydantic/pydantic-core/blob/e3eff5cb8a6dae8914e3831b00c690d9dee4b740/python/pydantic_core/_pydantic_core.pyi#L820C9-L829C12\"\n        )\n\n    class Model(BaseModel):\n        list_of_ints: list[int]\n        a_float: float\n\n    disable_pydantic_error_url()\n    data = dict(list_of_ints=[\"1\", 2, \"bad\"], a_float=\"Not a float\")\n\n    with pytest.raises(ValidationError) as exc_info:\n        Model(**data)  # type: ignore\n\n    assert \"https://errors.pydantic.dev\" not in str(exc_info.value)\n\n\ndef test_refusal_attribute(test_model: type[OpenAISchema]):\n    completion = ChatCompletion(\n        id=\"test_id\",\n        created=1234567890,\n        model=\"gpt-3.5-turbo\",\n        object=\"chat.completion\",\n        choices=[\n            Choice(\n                index=0,\n                message=ChatCompletionMessage(\n                    content=\"test_content\",\n                    refusal=\"test_refusal\",\n                    role=\"assistant\",\n                    tool_calls=[],\n                ),\n                finish_reason=\"stop\",\n                logprobs=None,\n            )\n        ],\n    )\n\n    try:\n        test_model.from_response(completion, mode=instructor.Mode.TOOLS)\n    except Exception as e:\n        assert \"Unable to generate a response due to test_refusal\" in str(e)\n\n\ndef test_no_refusal_attribute(test_model: type[OpenAISchema]):\n    completion = ChatCompletion(\n        id=\"test_id\",\n        created=1234567890,\n        model=\"gpt-3.5-turbo\",\n        object=\"chat.completion\",\n        choices=[\n            Choice(\n                index=0,\n                message=ChatCompletionMessage(\n                    content=\"test_content\",\n                    refusal=None,\n                    role=\"assistant\",\n                    tool_calls=[\n                        ChatCompletionMessageToolCall(\n                            id=\"test_id\",\n                            function=Function(\n                                name=\"TestModel\",\n                                arguments='{\"data\": \"test_data\", \"name\": \"TestModel\"}',\n                            ),\n                            type=\"function\",\n                        )\n                    ],\n                ),\n                finish_reason=\"stop\",\n                logprobs=None,\n            )\n        ],\n    )\n\n    resp = cast(Any, test_model.from_response(completion, mode=instructor.Mode.TOOLS))\n    assert resp.data == \"test_data\"\n    assert resp.name == \"TestModel\"\n\n\ndef test_missing_refusal_attribute(test_model: type[OpenAISchema]):\n    message_without_refusal_attribute = ChatCompletionMessage(\n        content=\"test_content\",\n        refusal=\"test_refusal\",\n        role=\"assistant\",\n        tool_calls=[\n            ChatCompletionMessageToolCall(\n                id=\"test_id\",\n                function=Function(\n                    name=\"TestModel\",\n                    arguments='{\"data\": \"test_data\", \"name\": \"TestModel\"}',\n                ),\n                type=\"function\",\n            )\n        ],\n    )\n\n    del message_without_refusal_attribute.refusal\n    assert not hasattr(message_without_refusal_attribute, \"refusal\")\n\n    completion = ChatCompletion(\n        id=\"test_id\",\n        created=1234567890,\n        model=\"gpt-3.5-turbo\",\n        object=\"chat.completion\",\n        choices=[\n            Choice(\n                index=0,\n                message=message_without_refusal_attribute,\n                finish_reason=\"stop\",\n                logprobs=None,\n            )\n        ],\n    )\n\n    resp = cast(Any, test_model.from_response(completion, mode=instructor.Mode.TOOLS))\n    assert resp.data == \"test_data\"\n    assert resp.name == \"TestModel\"\n"
  },
  {
    "path": "tests/test_genai_config_merging.py",
    "content": "\"\"\"Tests for GenAI config merging functionality.\n\nThese tests verify that config parameters like thinking_config are properly\nextracted from user-provided GenerateContentConfig objects.\n\nRelated issues:\n- #1966: thinking_config inside config parameter is ignored in GENAI_STRUCTURED_OUTPUTS mode\n- #1953: GenAI automatic_function_calling config not passed through\n- #1964: Optional is supported by generative-ai/*\n\"\"\"\n\nimport pytest\n\n# Skip if google-genai is not installed\ngenai = pytest.importorskip(\"google.genai\")\n\nfrom instructor.providers.gemini.utils import (\n    update_genai_kwargs,\n    verify_no_unions,\n    map_to_gemini_function_schema,\n)\n\n\ndef test_update_genai_kwargs_thinking_config_from_config_object():\n    \"\"\"Test that thinking_config inside config parameter is properly extracted.\n\n    This tests the fix for issue #1966 where thinking_config inside the config\n    parameter was silently ignored.\n    \"\"\"\n\n    # Create a mock config object with thinking_config\n    class MockThinkingConfig:\n        def __init__(self, thinking_budget: int):\n            self.thinking_budget = thinking_budget\n\n    mock_thinking_config = MockThinkingConfig(thinking_budget=2048)\n\n    # Create a config object with thinking_config attribute\n    class MockConfig:\n        def __init__(self):\n            self.thinking_config = mock_thinking_config\n            self.automatic_function_calling = None\n            self.labels = None\n\n    mock_config = MockConfig()\n\n    kwargs = {\"config\": mock_config}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that thinking_config was extracted from the config object\n    assert \"thinking_config\" in result\n    assert result[\"thinking_config\"] == mock_thinking_config\n\n\ndef test_update_genai_kwargs_thinking_config_kwarg_priority():\n    \"\"\"Test that thinking_config as kwarg takes priority over config.thinking_config.\"\"\"\n\n    # Create a mock config object with thinking_config\n    class MockThinkingConfigA:\n        def __init__(self):\n            self.thinking_budget = 1024\n\n    class MockThinkingConfigB:\n        def __init__(self):\n            self.thinking_budget = 2048\n\n    class MockConfig:\n        def __init__(self):\n            self.thinking_config = MockThinkingConfigA()\n            self.automatic_function_calling = None\n            self.labels = None\n\n    mock_config = MockConfig()\n    kwarg_thinking_config = MockThinkingConfigB()\n\n    # Pass both config object and thinking_config kwarg\n    kwargs = {\"config\": mock_config, \"thinking_config\": kwarg_thinking_config}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that the kwarg thinking_config takes priority\n    assert \"thinking_config\" in result\n    assert result[\"thinking_config\"] == kwarg_thinking_config\n    assert result[\"thinking_config\"].thinking_budget == 2048\n\n\ndef test_update_genai_kwargs_config_object_automatic_function_calling():\n    \"\"\"Test that automatic_function_calling is extracted from config object.\n\n    This tests the fix for issue #1953 where automatic_function_calling\n    config was not passed through.\n    \"\"\"\n\n    class MockConfig:\n        def __init__(self):\n            self.thinking_config = None\n            self.automatic_function_calling = True\n            self.labels = {\"key\": \"value\"}\n\n    mock_config = MockConfig()\n\n    kwargs = {\"config\": mock_config}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that automatic_function_calling was extracted\n    assert \"automatic_function_calling\" in result\n    assert result[\"automatic_function_calling\"] is True\n\n    # Check that labels was extracted\n    assert \"labels\" in result\n    assert result[\"labels\"] == {\"key\": \"value\"}\n\n\ndef test_update_genai_kwargs_config_object_does_not_override_base():\n    \"\"\"Test that config object fields don't override existing base_config values.\"\"\"\n\n    class MockConfig:\n        def __init__(self):\n            self.thinking_config = None\n            self.automatic_function_calling = True\n            self.labels = {\"config_key\": \"config_value\"}\n\n    mock_config = MockConfig()\n\n    kwargs = {\"config\": mock_config}\n    base_config = {\"labels\": {\"base_key\": \"base_value\"}}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that base_config labels are preserved (not overridden)\n    assert result[\"labels\"] == {\"base_key\": \"base_value\"}\n\n\ndef test_update_genai_kwargs_no_config_object():\n    \"\"\"Test that function works normally when no config object is provided.\"\"\"\n    kwargs = {\n        \"generation_config\": {\n            \"max_tokens\": 100,\n            \"temperature\": 0.7,\n        }\n    }\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that normal parameters still work\n    assert result[\"max_output_tokens\"] == 100\n    assert result[\"temperature\"] == 0.7\n\n\ndef test_update_genai_kwargs_config_object_with_no_thinking_config():\n    \"\"\"Test that function works when config object has no thinking_config.\"\"\"\n\n    class MockConfig:\n        def __init__(self):\n            self.automatic_function_calling = True\n            # No thinking_config attribute\n\n    mock_config = MockConfig()\n\n    kwargs = {\"config\": mock_config}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Should not have thinking_config\n    assert \"thinking_config\" not in result\n    # But should have automatic_function_calling\n    assert \"automatic_function_calling\" in result\n    assert result[\"automatic_function_calling\"] is True\n\n\n# Tests for issue #1964: Union type support\ndef test_verify_no_unions_always_returns_true():\n    \"\"\"Test that verify_no_unions now always returns True.\n\n    This tests the fix for issue #1964 where Union types were incorrectly\n    rejected even though Google GenAI now supports them.\n    See: https://github.com/googleapis/python-genai/issues/447\n    \"\"\"\n    # Test with a simple schema\n    simple_schema = {\"properties\": {\"name\": {\"type\": \"string\"}}}\n    assert verify_no_unions(simple_schema) is True\n\n    # Test with Optional type (Union with null)\n    optional_schema = {\n        \"properties\": {\"maybe_name\": {\"anyOf\": [{\"type\": \"string\"}, {\"type\": \"null\"}]}}\n    }\n    assert verify_no_unions(optional_schema) is True\n\n    # Test with Union type (int | str) - this used to fail, now should pass\n    union_schema = {\n        \"properties\": {\"value\": {\"anyOf\": [{\"type\": \"integer\"}, {\"type\": \"string\"}]}}\n    }\n    assert verify_no_unions(union_schema) is True\n\n    # Test with complex Union type - this used to fail, now should pass\n    complex_union_schema = {\n        \"properties\": {\n            \"value\": {\n                \"anyOf\": [\n                    {\"type\": \"integer\"},\n                    {\"type\": \"string\"},\n                    {\"type\": \"boolean\"},\n                ]\n            }\n        }\n    }\n    assert verify_no_unions(complex_union_schema) is True\n\n\ndef test_map_to_gemini_function_schema_accepts_union_types():\n    \"\"\"Test that map_to_gemini_function_schema accepts Union types.\n\n    This tests the fix for issue #1964 where Union types like int | str\n    were incorrectly rejected.\n    \"\"\"\n    # Schema with Union type (int | str) - this used to raise ValueError\n    schema = {\n        \"title\": \"TestModel\",\n        \"type\": \"object\",\n        \"properties\": {\n            \"maybe_int\": {\"anyOf\": [{\"type\": \"integer\"}, {\"type\": \"string\"}]}\n        },\n        \"required\": [\"maybe_int\"],\n    }\n\n    # This should not raise an error anymore\n    result = map_to_gemini_function_schema(schema)\n    assert result is not None\n    assert \"properties\" in result\n    assert \"maybe_int\" in result[\"properties\"]\n\n\ndef test_update_genai_kwargs_config_object_cached_content():\n    \"\"\"Test that cached_content is extracted from config object.\n\n    This tests the fix for cached_content config not being passed through\n    to enable Google's context caching feature.\n    See: https://ai.google.dev/gemini-api/docs/caching\n    \"\"\"\n\n    class MockConfig:\n        def __init__(self):\n            self.thinking_config = None\n            self.automatic_function_calling = None\n            self.labels = None\n            self.cached_content = \"caches/abc123\"\n\n    mock_config = MockConfig()\n    kwargs = {\"config\": mock_config}\n    base_config = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    assert \"cached_content\" in result\n    assert result[\"cached_content\"] == \"caches/abc123\"\n\n\ndef test_update_genai_kwargs_cached_content_does_not_override_base():\n    \"\"\"Test that cached_content from config doesn't override existing base_config values.\"\"\"\n\n    class MockConfig:\n        def __init__(self):\n            self.thinking_config = None\n            self.automatic_function_calling = None\n            self.labels = None\n            self.cached_content = \"caches/from_config\"\n\n    mock_config = MockConfig()\n    kwargs = {\"config\": mock_config}\n    base_config = {\"cached_content\": \"caches/from_base\"}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    # Check that base_config cached_content is preserved (not overridden)\n    assert result[\"cached_content\"] == \"caches/from_base\"\n\n\ndef test_handle_genai_structured_outputs_skips_system_instruction_with_cached_content():\n    \"\"\"Test that system_instruction is NOT set when cached_content is provided.\n\n    When using Google's context caching, the system instruction is part of the\n    cached content, so we should not set it separately.\n    \"\"\"\n    from google.genai import types\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_structured_outputs\n\n    class TestModel(BaseModel):\n        name: str\n\n    # Create a config with cached_content\n    config = types.GenerateContentConfig(cached_content=\"caches/test123\")\n\n    new_kwargs = {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ],\n        \"config\": config,\n    }\n\n    _, result_kwargs = handle_genai_structured_outputs(TestModel, new_kwargs)\n\n    # Check that the resulting config does NOT have system_instruction\n    result_config = result_kwargs[\"config\"]\n    assert result_config.cached_content == \"caches/test123\"\n    assert result_config.system_instruction is None\n\n\ndef test_handle_genai_structured_outputs_sets_system_instruction_without_cached_content():\n    \"\"\"Test that system_instruction IS set when cached_content is NOT provided.\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_structured_outputs\n\n    class TestModel(BaseModel):\n        name: str\n\n    new_kwargs = {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ],\n    }\n\n    _, result_kwargs = handle_genai_structured_outputs(TestModel, new_kwargs)\n\n    # Check that the resulting config HAS system_instruction\n    result_config = result_kwargs[\"config\"]\n    assert result_config.system_instruction is not None\n\n\ndef test_handle_genai_tools_skips_tools_and_system_instruction_with_cached_content():\n    \"\"\"Test that tools, tool_config, and system_instruction are NOT set when cached_content is provided.\n\n    When using Google's explicit context caching, tools/tool_config/system_instruction\n    should already be part of the cache. Adding them to the request causes 400 INVALID_ARGUMENT.\n    See: https://ai.google.dev/gemini-api/docs/caching\n    \"\"\"\n    from google.genai import types\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_tools\n\n    class TestModel(BaseModel):\n        name: str\n\n    # Create a config with cached_content\n    config = types.GenerateContentConfig(cached_content=\"caches/test456\")\n\n    new_kwargs = {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ],\n        \"config\": config,\n    }\n\n    _, result_kwargs = handle_genai_tools(TestModel, new_kwargs)\n\n    # Check that the resulting config does NOT have system_instruction, tools, or tool_config\n    result_config = result_kwargs[\"config\"]\n    assert result_config.cached_content == \"caches/test456\"\n    assert result_config.system_instruction is None\n    assert result_config.tools is None\n    assert result_config.tool_config is None\n\n\ndef test_handle_genai_tools_sets_tools_without_cached_content():\n    \"\"\"Test that tools and tool_config ARE set when cached_content is NOT provided.\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_tools\n\n    class TestModel(BaseModel):\n        name: str\n\n    new_kwargs = {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ],\n    }\n\n    _, result_kwargs = handle_genai_tools(TestModel, new_kwargs)\n\n    # Check that the resulting config HAS tools and tool_config\n    result_config = result_kwargs[\"config\"]\n    assert result_config.tools is not None\n    assert result_config.tool_config is not None\n    assert result_config.system_instruction is not None\n\n\ndef test_update_genai_kwargs_config_dict_labels():\n    \"\"\"Test that labels is merged when config is provided as a dict (issue #1759).\"\"\"\n    kwargs = {\"config\": {\"labels\": {\"env\": \"prod\", \"team\": \"ml\"}}}\n    base_config: dict[str, object] = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    assert result[\"labels\"] == {\"env\": \"prod\", \"team\": \"ml\"}\n\n\ndef test_update_genai_kwargs_config_dict_cached_content():\n    \"\"\"Test that cached_content is merged when config is provided as a dict.\"\"\"\n    kwargs = {\"config\": {\"cached_content\": \"caches/dict123\"}}\n    base_config: dict[str, object] = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    assert result[\"cached_content\"] == \"caches/dict123\"\n\n\ndef test_update_genai_kwargs_config_dict_thinking_config():\n    \"\"\"Test that thinking_config is merged when config is provided as a dict.\"\"\"\n    thinking_config = {\"thinking_budget\": 1234}\n    kwargs = {\"config\": {\"thinking_config\": thinking_config}}\n    base_config: dict[str, object] = {}\n\n    result = update_genai_kwargs(kwargs, base_config)\n\n    assert result[\"thinking_config\"] == thinking_config\n\n\ndef test_handle_genai_structured_outputs_preserves_labels_from_config_dict():\n    \"\"\"Test that labels are preserved when config is provided as a dict (issue #1759).\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_structured_outputs\n\n    class TestModel(BaseModel):\n        name: str\n\n    new_kwargs = {\n        \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n        \"config\": {\"labels\": {\"tenant\": \"acme\", \"cost-center\": \"123\"}},\n    }\n\n    _, result_kwargs = handle_genai_structured_outputs(TestModel, new_kwargs)\n\n    result_config = result_kwargs[\"config\"]\n    assert result_config.labels == {\"tenant\": \"acme\", \"cost-center\": \"123\"}\n\n\ndef test_handle_genai_tools_preserves_labels_from_config_dict():\n    \"\"\"Test that labels are preserved in tools mode when config is a dict (issue #1759).\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_tools\n\n    class TestModel(BaseModel):\n        name: str\n\n    new_kwargs = {\n        \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n        \"config\": {\"labels\": {\"tenant\": \"acme\", \"cost-center\": \"123\"}},\n    }\n\n    _, result_kwargs = handle_genai_tools(TestModel, new_kwargs)\n\n    result_config = result_kwargs[\"config\"]\n    assert result_config.labels == {\"tenant\": \"acme\", \"cost-center\": \"123\"}\n\n\ndef test_handle_genai_structured_outputs_skips_system_instruction_with_cached_content_dict():\n    \"\"\"Test cached_content dict config disables system_instruction in structured outputs.\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_structured_outputs\n\n    class TestModel(BaseModel):\n        name: str\n\n    new_kwargs = {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ],\n        \"config\": {\"cached_content\": \"caches/dict-cache-1\"},\n    }\n\n    _, result_kwargs = handle_genai_structured_outputs(TestModel, new_kwargs)\n\n    result_config = result_kwargs[\"config\"]\n    assert result_config.cached_content == \"caches/dict-cache-1\"\n    assert result_config.system_instruction is None\n\n\ndef test_handle_genai_tools_skips_tools_and_system_instruction_with_cached_content_dict():\n    \"\"\"Test cached_content dict config disables tools/tool_config/system_instruction in tools mode.\"\"\"\n    from pydantic import BaseModel\n\n    from instructor.providers.gemini.utils import handle_genai_tools\n\n    class TestModel(BaseModel):\n        name: str\n\n    new_kwargs = {\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ],\n        \"config\": {\"cached_content\": \"caches/dict-cache-2\"},\n    }\n\n    _, result_kwargs = handle_genai_tools(TestModel, new_kwargs)\n\n    result_config = result_kwargs[\"config\"]\n    assert result_config.cached_content == \"caches/dict-cache-2\"\n    assert result_config.system_instruction is None\n    assert result_config.tools is None\n    assert result_config.tool_config is None\n"
  },
  {
    "path": "tests/test_genai_reask.py",
    "content": "import pytest\n\n\npytest.importorskip(\"google.genai\")\n\n\nfrom google.genai import types\n\nfrom instructor.providers.gemini.utils import reask_genai_tools\n\n\ndef _response_with_content(content: types.Content) -> types.GenerateContentResponse:\n    return types.GenerateContentResponse(candidates=[types.Candidate(content=content)])\n\n\ndef test_reask_genai_tools_preserves_thought_signature():\n    function_call_part = types.Part.from_function_call(\n        name=\"test_fn\", args={\"value\": 1}\n    )\n    function_call_part.thought_signature = b\"sig\"\n    model_content = types.Content(role=\"model\", parts=[function_call_part])\n\n    original_kwargs = {\"contents\": []}\n    result = reask_genai_tools(\n        kwargs=original_kwargs,\n        response=_response_with_content(model_content),\n        exception=Exception(\"boom\"),\n    )\n\n    assert original_kwargs[\"contents\"] == []\n    assert result[\"contents\"][-2] is model_content\n    assert result[\"contents\"][-2].parts[0].thought_signature == b\"sig\"\n\n    tool_content = result[\"contents\"][-1]\n    assert tool_content.role == \"tool\"\n    assert tool_content.parts[0].function_response.name == \"test_fn\"\n\n\ndef test_reask_genai_tools_finds_function_call_part_when_not_first():\n    function_call_part = types.Part.from_function_call(\n        name=\"test_fn\", args={\"value\": 1}\n    )\n    model_content = types.Content(\n        role=\"model\",\n        parts=[\n            types.Part.from_text(text=\"some preface\"),\n            function_call_part,\n        ],\n    )\n\n    result = reask_genai_tools(\n        kwargs={\"contents\": []},\n        response=_response_with_content(model_content),\n        exception=Exception(\"boom\"),\n    )\n\n    assert result[\"contents\"][-2] is model_content\n    assert result[\"contents\"][-1].role == \"tool\"\n\n\ndef test_reask_genai_tools_handles_none_response():\n    result = reask_genai_tools(\n        kwargs={},\n        response=None,\n        exception=Exception(\"boom\"),\n    )\n\n    assert result[\"contents\"][-1].role == \"user\"\n\n\ndef test_reask_genai_tools_falls_back_when_no_function_call():\n    model_content = types.Content(\n        role=\"model\",\n        parts=[types.Part.from_text(text=\"not a function call\")],\n    )\n\n    result = reask_genai_tools(\n        kwargs={\"contents\": []},\n        response=_response_with_content(model_content),\n        exception=Exception(\"boom\"),\n    )\n\n    assert result[\"contents\"][0] is model_content\n    assert result[\"contents\"][1].role == \"user\"\n"
  },
  {
    "path": "tests/test_json_extraction.py",
    "content": "\"\"\"\nTests for JSON extraction functionality.\n\"\"\"\n\nimport json\nimport pytest\nfrom typing import cast\n\nfrom instructor.utils import extract_json_from_codeblock, extract_json_from_stream\nfrom instructor.processing.function_calls import (\n    _extract_text_content,\n    _validate_model_from_json,\n    OpenAISchema,\n)\nfrom pydantic import BaseModel\n\n\nclass Person(BaseModel):\n    name: str\n    age: int\n    skills: list[str] = []\n\n\nclass TestJSONExtraction:\n    \"\"\"Test the improved JSON extraction functionality.\"\"\"\n\n    def test_extract_from_codeblock(self):\n        \"\"\"Test extracting JSON from markdown code blocks.\"\"\"\n        # JSON inside markdown code block\n        markdown_json = \"\"\"\n        # Test Data\n        Here is some data:\n        ```json\n        {\n          \"name\": \"John\",\n          \"age\": 30,\n          \"skills\": [\"python\", \"javascript\"]\n        }\n        ```\n        More text here.\n        \"\"\"\n\n        result = extract_json_from_codeblock(markdown_json)\n        parsed = json.loads(result)\n\n        assert parsed[\"name\"] == \"John\"\n        assert parsed[\"age\"] == 30\n        assert \"python\" in parsed[\"skills\"]\n\n    def test_extract_from_codeblock_no_language(self):\n        \"\"\"Test extracting JSON from code blocks without language specified.\"\"\"\n        # JSON inside unmarked code block\n        markdown_json = \"\"\"\n        # Test Data\n        Here is some data:\n        ```\n        {\n          \"name\": \"Jane\",\n          \"age\": 25,\n          \"skills\": [\"java\", \"typescript\"]\n        }\n        ```\n        More text here.\n        \"\"\"\n\n        result = extract_json_from_codeblock(markdown_json)\n        parsed = json.loads(result)\n\n        assert parsed[\"name\"] == \"Jane\"\n        assert parsed[\"age\"] == 25\n        assert \"java\" in parsed[\"skills\"]\n\n    def test_extract_plain_json(self):\n        \"\"\"Test extracting JSON without code blocks.\"\"\"\n        # Plain JSON with surrounding text\n        plain_json = \"\"\"\n        Here is the user information:\n        {\n          \"name\": \"Bob\",\n          \"age\": 40,\n          \"skills\": [\"go\", \"rust\"]\n        }\n        End of data.\n        \"\"\"\n\n        result = extract_json_from_codeblock(plain_json)\n        parsed = json.loads(result)\n\n        assert parsed[\"name\"] == \"Bob\"\n        assert parsed[\"age\"] == 40\n        assert \"rust\" in parsed[\"skills\"]\n\n    def test_nested_json(self):\n        \"\"\"Test extracting nested JSON objects.\"\"\"\n        # Nested JSON\n        nested_json = \"\"\"\n        ```json\n        {\n          \"name\": \"Alice\",\n          \"age\": 35,\n          \"address\": {\n            \"street\": \"123 Main St\",\n            \"city\": \"Anytown\",\n            \"zip\": \"12345\"\n          },\n          \"skills\": [\"python\", \"ml\"]\n        }\n        ```\n        \"\"\"\n\n        result = extract_json_from_codeblock(nested_json)\n        parsed = json.loads(result)\n\n        assert parsed[\"name\"] == \"Alice\"\n        assert parsed[\"address\"][\"city\"] == \"Anytown\"\n        assert \"ml\" in parsed[\"skills\"]\n\n    def test_json_with_arrays(self):\n        \"\"\"Test extracting JSON with arrays.\"\"\"\n        # JSON with arrays\n        array_json = \"\"\"\n        Here's an array of users:\n        {\n          \"users\": [\n            {\"name\": \"User1\", \"age\": 20},\n            {\"name\": \"User2\", \"age\": 30},\n            {\"name\": \"User3\", \"age\": 40}\n          ],\n          \"total\": 3\n        }\n        \"\"\"\n\n        result = extract_json_from_codeblock(array_json)\n        parsed = json.loads(result)\n\n        assert len(parsed[\"users\"]) == 3\n        assert parsed[\"users\"][0][\"name\"] == \"User1\"\n        assert parsed[\"total\"] == 3\n\n    def test_invalid_json(self):\n        \"\"\"Test handling of invalid JSON.\"\"\"\n        # Invalid JSON\n        invalid_json = \"\"\"\n        This is not valid JSON:\n        { name: \"Test\" }\n        \"\"\"\n\n        result = extract_json_from_codeblock(invalid_json)\n        # Should return the content between braces even if it's invalid\n        assert \"{\" in result and \"}\" in result\n\n        # Should raise when trying to parse\n        with pytest.raises(json.JSONDecodeError):\n            json.loads(result)\n\n    def test_extract_from_stream(self):\n        \"\"\"Test extracting JSON from a stream of chunks.\"\"\"\n        # JSON split into chunks\n        chunks = [\n            '{\"na',\n            'me\": \"',\n            \"Stream\",\n            'User\", ',\n            '\"age\": 45, \"sk',\n            'ills\": [\"stream',\n            'ing\", \"json\"]}',\n        ]\n\n        collected = \"\".join(extract_json_from_stream(chunks))\n        parsed = json.loads(collected)\n\n        assert parsed[\"name\"] == \"StreamUser\"\n        assert parsed[\"age\"] == 45\n        assert \"streaming\" in parsed[\"skills\"]\n\n\nclass TestTextExtraction:\n    \"\"\"Test the text extraction utilities.\"\"\"\n\n    def test_extract_text_openai_format(self):\n        \"\"\"Test extracting text from OpenAI completion format.\"\"\"\n\n        class MockMessage:\n            content = \"Sample content\"\n\n        class MockChoice:\n            message = MockMessage()\n\n        class MockCompletion:\n            choices = [MockChoice()]\n\n        completion = MockCompletion()\n        result = _extract_text_content(completion)\n\n        assert result == \"Sample content\"\n\n    def test_extract_text_simple_format(self):\n        \"\"\"Test extracting text from simple text format.\"\"\"\n\n        class MockCompletion:\n            text = \"Simple text response\"\n\n        completion = MockCompletion()\n        result = _extract_text_content(completion)\n\n        assert result == \"Simple text response\"\n\n    def test_extract_text_anthropic_format(self):\n        \"\"\"Test extracting text from Anthropic format.\"\"\"\n\n        class MockTextBlock:\n            def __init__(self, text_content):\n                self.type = \"text\"\n                self.text = text_content\n\n        class MockCompletion:\n            content = [\n                MockTextBlock(\"Anthropic response\"),\n                MockTextBlock(\"Additional text\"),\n            ]\n\n        completion = MockCompletion()\n        result = _extract_text_content(completion)\n\n        assert result == \"Anthropic response\"\n\n    def test_extract_text_bedrock_format(self):\n        \"\"\"Test extracting text from Bedrock format.\"\"\"\n        completion = {\n            \"output\": {\"message\": {\"content\": [{\"text\": \"Bedrock response\"}]}}\n        }\n\n        result = _extract_text_content(completion)\n        assert result == \"Bedrock response\"\n\n    def test_extract_text_unknown_format(self):\n        \"\"\"Test extracting text from unknown format.\"\"\"\n\n        class UnknownFormat:\n            unknown_field = \"Can't extract this\"\n\n        completion = UnknownFormat()\n        result = _extract_text_content(completion)\n\n        # Should return empty string for unknown formats\n        assert result == \"\"\n\n\nclass TestModelValidation:\n    \"\"\"Test the model validation utilities.\"\"\"\n\n    def test_validate_model_strict(self):\n        \"\"\"Test model validation with strict mode.\"\"\"\n        json_str = '{\"name\": \"ValidUser\", \"age\": 30, \"skills\": [\"coding\"]}'\n        result = _validate_model_from_json(Person, json_str, None, True)\n\n        assert result.name == \"ValidUser\"\n        assert result.age == 30\n        assert result.skills == [\"coding\"]\n\n    def test_validate_model_non_strict(self):\n        \"\"\"Test model validation with non-strict mode.\"\"\"\n        # In non-strict mode, string numbers can be coerced to integers\n        json_str = '{\"name\": \"NonStrictUser\", \"age\": \"25\", \"skills\": [\"testing\"]}'\n        result = _validate_model_from_json(Person, json_str, None, False)\n\n        assert result.name == \"NonStrictUser\"\n        assert result.age == 25  # String \"25\" coerced to integer\n        assert result.skills == [\"testing\"]\n\n    def test_validate_model_json_error(self):\n        \"\"\"Test handling JSON decode errors.\n\n        In strict mode, Pydantic raises ValidationError with an 'Invalid JSON' message.\n        \"\"\"\n        invalid_json = '{\"name\": \"Invalid, \"age\": 20}'  # Missing quote\n\n        with pytest.raises(Exception) as excinfo:\n            _validate_model_from_json(Person, invalid_json, None, True)\n        assert \"Invalid JSON\" in str(excinfo.value)\n\n    def test_validate_model_json_error_non_strict(self):\n        \"\"\"In non-strict mode, json.loads should raise JSONDecodeError (not wrapped).\"\"\"\n        invalid_json = '{\"name\": \"Invalid, \"age\": 20}'  # Missing quote\n\n        with pytest.raises(json.JSONDecodeError):\n            _validate_model_from_json(Person, invalid_json, None, False)\n\n\nclass PersonSchema(OpenAISchema):\n    \"\"\"Test model that inherits from OpenAISchema.\"\"\"\n\n    name: str\n    age: int\n    skills: list[str] = []\n\n\nclass TestBedrockJSONParsing:\n    \"\"\"Test the parse_bedrock_json functionality.\"\"\"\n\n    def test_parse_bedrock_json_simple(self):\n        \"\"\"Test parsing Bedrock JSON with simple text content.\"\"\"\n        completion = {\n            \"output\": {\n                \"message\": {\n                    \"content\": [{\"text\": '{\"name\": \"John\", \"age\": 30, \"skills\": []}'}]\n                }\n            }\n        }\n\n        result = cast(PersonSchema, PersonSchema.parse_bedrock_json(completion))\n        assert result.name == \"John\"\n        assert result.age == 30\n        assert result.skills == []\n\n    def test_parse_bedrock_json_with_reasoning_content(self):\n        \"\"\"Test parsing Bedrock JSON when reasoningText comes before text content.\n\n        This tests the fix for reasoning models where content array may have\n        reasoningText as first element instead of text.\n        \"\"\"\n        completion = {\n            \"output\": {\n                \"message\": {\n                    \"content\": [\n                        {\"reasoningText\": \"Thinking about the response...\"},\n                        {\"text\": '{\"name\": \"Alice\", \"age\": 25, \"skills\": [\"python\"]}'},\n                    ]\n                }\n            }\n        }\n\n        result = cast(PersonSchema, PersonSchema.parse_bedrock_json(completion))\n        assert result.name == \"Alice\"\n        assert result.age == 25\n        assert result.skills == [\"python\"]\n\n    def test_parse_bedrock_json_with_codeblock(self):\n        \"\"\"Test parsing Bedrock JSON when response is wrapped in markdown codeblock.\"\"\"\n        completion = {\n            \"output\": {\n                \"message\": {\n                    \"content\": [\n                        {\n                            \"text\": '```json\\n{\"name\": \"Bob\", \"age\": 40, \"skills\": [\"go\", \"rust\"]}\\n```'\n                        }\n                    ]\n                }\n            }\n        }\n\n        result = cast(PersonSchema, PersonSchema.parse_bedrock_json(completion))\n        assert result.name == \"Bob\"\n        assert result.age == 40\n        assert result.skills == [\"go\", \"rust\"]\n\n    def test_parse_bedrock_json_no_text_content(self):\n        \"\"\"Test parsing Bedrock JSON when no text content is found.\"\"\"\n        completion = {\n            \"output\": {\n                \"message\": {\n                    \"content\": [\n                        {\"reasoningText\": \"Only reasoning, no text response\"},\n                        {\"otherContent\": \"Some other type\"},\n                    ]\n                }\n            }\n        }\n\n        with pytest.raises(ValueError) as excinfo:\n            PersonSchema.parse_bedrock_json(completion)\n\n        assert \"No text content found\" in str(excinfo.value)\n\n    def test_parse_bedrock_json_multiple_text_contents(self):\n        \"\"\"Test parsing Bedrock JSON picks the first text content when multiple exist.\"\"\"\n        completion = {\n            \"output\": {\n                \"message\": {\n                    \"content\": [\n                        {\"reasoningText\": \"Thinking...\"},\n                        {\"text\": '{\"name\": \"First\", \"age\": 30, \"skills\": [\"python\"]}'},\n                        {\"text\": '{\"name\": \"Second\", \"age\": 40, \"skills\": [\"java\"]}'},\n                    ]\n                }\n            }\n        }\n\n        result = cast(PersonSchema, PersonSchema.parse_bedrock_json(completion))\n        # Should pick the first text content\n        assert result.name == \"First\"\n        assert result.age == 30\n        assert result.skills == [\"python\"]\n"
  },
  {
    "path": "tests/test_json_extraction_edge_cases.py",
    "content": "\"\"\"\nTests for edge cases in JSON extraction functionality.\n\"\"\"\n\nimport json\nimport asyncio\nimport pytest\nfrom collections.abc import AsyncGenerator\n\nfrom instructor.utils import (\n    extract_json_from_codeblock,\n    extract_json_from_stream,\n    extract_json_from_stream_async,\n)\n\n\nclass TestJSONExtractionEdgeCases:\n    \"\"\"Test edge cases for the JSON extraction utilities.\"\"\"\n\n    def test_empty_input(self):\n        \"\"\"Test extraction from empty input.\"\"\"\n        result = extract_json_from_codeblock(\"\")\n        assert result == \"\"\n\n    def test_no_json_content(self):\n        \"\"\"Test extraction when no JSON-like content is present.\"\"\"\n        text = \"This is just plain text with no JSON content.\"\n        result = extract_json_from_codeblock(text)\n        assert \"{\" not in result\n        assert result == text\n\n    def test_multiple_json_objects(self):\n        \"\"\"Test extraction when multiple JSON objects are present.\"\"\"\n        text = \"\"\"\n        First object: {\"name\": \"First\", \"id\": 1}\n        Second object: {\"name\": \"Second\", \"id\": 2}\n        \"\"\"\n        # With our regex pattern, it might extract both objects\n        # The main point is that it should extract valid JSON\n        result = extract_json_from_codeblock(text)\n\n        # Clean up the result for this test case\n        if \"Second object\" in result:\n            # If it extracted too much, manually fix it\n            result = result[: result.find(\"Second object\")].strip()\n\n        parsed = json.loads(result)\n        assert \"name\" in parsed\n        assert \"id\" in parsed\n\n    def test_escaped_quotes(self):\n        \"\"\"Test extraction with escaped quotes in strings.\"\"\"\n        text = \"\"\"\n        ```json\n        {\n          \"message\": \"He said, \\\\\"Hello world\\\\\"\"\n        }\n        ```\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        parsed = json.loads(result)\n        assert parsed[\"message\"] == 'He said, \"Hello world\"'\n\n    def test_unicode_characters(self):\n        \"\"\"Test extraction with Unicode characters.\"\"\"\n        text = \"\"\"\n        {\n          \"greeting\": \"こんにちは\",\n          \"emoji\": \"😀\"\n        }\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        parsed = json.loads(result)\n        assert parsed[\"greeting\"] == \"こんにちは\"\n        assert parsed[\"emoji\"] == \"😀\"\n\n    def test_json_with_backslashes(self):\n        \"\"\"Test extraction with backslashes in JSON.\"\"\"\n        text = r\"\"\"\n        {\n          \"path\": \"C:\\\\Users\\\\test\\\\documents\",\n          \"regex\": \"\\\\d+\"\n        }\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        parsed = json.loads(result)\n        assert parsed[\"path\"] == r\"C:\\Users\\test\\documents\"\n        assert parsed[\"regex\"] == r\"\\d+\"\n\n    def test_nested_codeblocks(self):\n        \"\"\"Test extraction with nested code blocks.\"\"\"\n        text = \"\"\"\n        Outer start\n        ```\n        Inner start\n        ```json\n        {\"level\": \"inner\"}\n        ```\n        Inner end\n        ```\n        Outer end\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        parsed = json.loads(result)\n        assert parsed[\"level\"] == \"inner\"\n\n    def test_json_with_codeblock_in_a_value(self):\n        \"\"\"Test extraction of JSON that has a value containing a codeblock.\"\"\"\n        text = \"\"\"\n        ```json\n        {\"name\": \"```string value with a codeblock```\"}\n        ```\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        parsed = json.loads(result)\n        assert parsed[\"name\"] == \"```string value with a codeblock```\"\n\n    def test_malformed_codeblock(self):\n        \"\"\"Test extraction with malformed code block markers.\"\"\"\n        text = \"\"\"\n        Malformed start\n        ``json\n        {\"status\": \"malformed\"}\n        ``\n        End\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        # Should still find JSON-like content\n        parsed = json.loads(result)\n        assert parsed[\"status\"] == \"malformed\"\n\n    def test_complex_nested_structure(self):\n        \"\"\"Test extraction with deeply nested JSON structure.\"\"\"\n        text = \"\"\"\n        ```json\n        {\n          \"level1\": {\n            \"level2\": {\n              \"level3\": {\n                \"level4\": {\n                  \"value\": \"deep\"\n                }\n              }\n            }\n          },\n          \"array\": [\n            {\"item\": 1},\n            {\"item\": 2, \"nested\": [3, 4, [5, 6]]}\n          ]\n        }\n        ```\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        parsed = json.loads(result)\n        assert parsed[\"level1\"][\"level2\"][\"level3\"][\"level4\"][\"value\"] == \"deep\"\n        assert parsed[\"array\"][1][\"nested\"][2][1] == 6\n\n    def test_json_with_comments(self):\n        \"\"\"Test extraction of JSON that has comments (invalid JSON).\"\"\"\n        text = \"\"\"\n        ```\n        {\n          \"name\": \"Test\", // This is a comment\n          \"description\": \"Testing with comments\"\n          /* \n             Multi-line comment\n          */\n        }\n        ```\n        \"\"\"\n        result = extract_json_from_codeblock(text)\n        # Comments would make this invalid JSON\n        with pytest.raises(json.JSONDecodeError):\n            json.loads(result)\n        # But we should still extract the content between braces\n        assert \"Test\" in result and \"comments\" in result\n\n    def test_stream_with_nested_braces(self):\n        \"\"\"Test stream extraction with nested braces.\"\"\"\n        chunks = [\n            '{\"outer\": {',\n            '\"inner1\": {\"a\": 1},',\n            '\"inner2\": {',\n            '\"b\": 2, \"c\": {\"d\": 3}',\n            \"}\",\n            \"}}\",\n        ]\n\n        collected = \"\".join(extract_json_from_stream(chunks))\n        parsed = json.loads(collected)\n\n        assert parsed[\"outer\"][\"inner1\"][\"a\"] == 1\n        assert parsed[\"outer\"][\"inner2\"][\"c\"][\"d\"] == 3\n\n    def test_stream_with_string_containing_braces(self):\n        \"\"\"Test stream extraction with strings containing brace characters.\"\"\"\n        chunks = [\n            '{\"text\": \"This string {contains} braces\",',\n            '\"code\": \"function() { return true; }\",',\n            '\"valid\": true}',\n        ]\n\n        collected = \"\".join(extract_json_from_stream(chunks))\n        parsed = json.loads(collected)\n\n        assert parsed[\"text\"] == \"This string {contains} braces\"\n        assert parsed[\"code\"] == \"function() { return true; }\"\n        assert parsed[\"valid\"] is True\n\n    # Async tests require pytest-asyncio\n    # We'll skip these if the marker isn't available\n    @pytest.mark.skipif(True, reason=\"Async tests require pytest-asyncio\")\n    async def test_async_stream_extraction(self):\n        \"\"\"Test the async stream extraction function.\"\"\"\n\n        async def mock_stream() -> AsyncGenerator[str, None]:\n            chunks = [\n                '{\"async\": true, ',\n                '\"data\": {',\n                '\"items\": [1, 2, 3],',\n                '\"complete\": true',\n                \"}}\",\n            ]\n            for chunk in chunks:\n                yield chunk\n                await asyncio.sleep(0.01)\n\n        result = \"\"\n        async for char in extract_json_from_stream_async(mock_stream()):\n            result += char\n\n        parsed = json.loads(result)\n        assert parsed[\"async\"] is True\n        assert parsed[\"data\"][\"items\"] == [1, 2, 3]\n        assert parsed[\"data\"][\"complete\"] is True\n\n    @pytest.mark.skipif(True, reason=\"Async tests require pytest-asyncio\")\n    async def test_async_stream_with_escaped_quotes(self):\n        \"\"\"Test async stream extraction with escaped quotes.\"\"\"\n\n        async def mock_stream() -> AsyncGenerator[str, None]:\n            chunks = [\n                '{\"message\": \"He said, \\\\\"',\n                \"Hello\",\n                \" world\",\n                '\\\\\"\"}',\n            ]\n            for chunk in chunks:\n                yield chunk\n                await asyncio.sleep(0.01)\n\n        result = \"\"\n        async for char in extract_json_from_stream_async(mock_stream()):\n            result += char\n\n        parsed = json.loads(result)\n        assert parsed[\"message\"] == 'He said, \"Hello world\"'\n"
  },
  {
    "path": "tests/test_list_response.py",
    "content": "from __future__ import annotations\n\nfrom collections.abc import Iterable as ABCIterable\nfrom typing import Any\n\nfrom pydantic import BaseModel\n\nfrom instructor.dsl import ListResponse\nfrom instructor.dsl.iterable import IterableBase\nfrom instructor.mode import Mode\nfrom instructor.processing.response import process_response\nfrom instructor.utils.core import prepare_response_model\n\n\nclass User(BaseModel):\n    name: str\n\n\ndef test_listresponse_preserves_raw_response_on_slice() -> None:\n    raw: Any = {\"provider\": \"test\"}\n    resp = ListResponse([User(name=\"a\"), User(name=\"b\")], _raw_response=raw)\n\n    assert resp.get_raw_response() is raw\n    assert resp[0].name == \"a\"\n\n    sliced = resp[1:]\n    assert isinstance(sliced, ListResponse)\n    assert sliced.get_raw_response() is raw\n    assert sliced[0].name == \"b\"\n\n\ndef test_process_response_wraps_iterablebase_tasks_with_raw_response() -> None:\n    class FakeIterableResponse(BaseModel, IterableBase):\n        tasks: list[User]\n\n        @classmethod\n        def from_response(  # type: ignore[override]\n            cls, _response: Any, **_kwargs: Any\n        ) -> FakeIterableResponse:\n            return cls(tasks=[User(name=\"x\"), User(name=\"y\")])\n\n    # `process_response()` is typed with a BaseModel-bounded type variable for `response`,\n    # so use a BaseModel instance here to keep `ty` happy.\n    raw_response: Any = User(name=\"raw\")\n    out = process_response(\n        raw_response,\n        response_model=FakeIterableResponse,\n        stream=False,\n        mode=Mode.TOOLS,\n    )\n\n    assert isinstance(out, ListResponse)\n    assert [u.name for u in out] == [\"x\", \"y\"]\n    assert out.get_raw_response() is raw_response\n\n\ndef test_prepare_response_model_supports_list_and_iterable() -> None:\n    prepared_list = prepare_response_model(list[User])\n    assert prepared_list is not None\n    assert issubclass(prepared_list, IterableBase)\n\n    prepared_iterable = prepare_response_model(ABCIterable[User])  # type: ignore[index]\n    assert prepared_iterable is not None\n    assert issubclass(prepared_iterable, IterableBase)\n"
  },
  {
    "path": "tests/test_list_response_wrapper.py",
    "content": "from __future__ import annotations\n\nfrom collections.abc import AsyncGenerator, Generator\n\nimport pytest\nfrom pydantic import BaseModel\n\nfrom instructor.dsl.iterable import IterableBase\nfrom instructor.dsl.response_list import ListResponse\nfrom instructor.mode import Mode\nfrom instructor.processing.response import process_response, process_response_async\nfrom instructor.utils.core import prepare_response_model\n\n\nclass DummyIterableModel(BaseModel, IterableBase):\n    tasks: list[int]\n\n    @classmethod\n    def from_response(cls, completion, **kwargs):  # noqa: ANN001,ARG003\n        return cls(tasks=[1, 2])\n\n    @classmethod\n    def from_streaming_response(  # noqa: ANN001\n        cls, _completion, mode: Mode, **_kwargs\n    ) -> Generator[int, None, None]:\n        del mode\n        yield 1\n        yield 2\n\n    @classmethod\n    def from_streaming_response_async(  # noqa: ANN001\n        cls, _completion: AsyncGenerator[object, None], mode: Mode, **_kwargs\n    ) -> AsyncGenerator[int, None]:\n        del mode\n\n        async def gen() -> AsyncGenerator[int, None]:\n            yield 1\n            yield 2\n\n        return gen()\n\n\nclass DummyCompletion(BaseModel):\n    \"\"\"Minimal stand-in for a provider completion object.\"\"\"\n\n\ndef test_process_response_returns_list_response_for_iterable_model():\n    raw = DummyCompletion()\n\n    result = process_response(\n        raw,\n        response_model=DummyIterableModel,\n        stream=False,\n        mode=Mode.TOOLS,\n    )\n\n    assert isinstance(result, ListResponse)\n    assert list(result) == [1, 2]\n    assert result._raw_response == raw\n\n\ndef test_process_response_streaming_returns_list_response_for_iterable_model():\n    raw = DummyCompletion()\n\n    result = process_response(\n        raw,\n        response_model=DummyIterableModel,\n        stream=True,\n        mode=Mode.TOOLS,\n    )\n\n    # Streaming IterableBase should preserve generator behavior (used by create_iterable()).\n    assert list(result) == [1, 2]\n\n\n@pytest.mark.asyncio\nasync def test_process_response_async_streaming_returns_list_response_for_iterable_model():\n    async def completion_stream() -> AsyncGenerator[object, None]:\n        yield object()\n\n    raw = completion_stream()\n\n    result = await process_response_async(\n        raw,  # type: ignore[arg-type]\n        response_model=DummyIterableModel,\n        stream=True,\n        mode=Mode.TOOLS,\n    )\n\n    # Streaming IterableBase should preserve async generator behavior (used by create_iterable()).\n    collected: list[int] = []\n    async for item in result:\n        collected.append(item)\n    assert collected == [1, 2]\n\n\ndef test_prepare_response_model_treats_list_as_iterable_model():\n    class User(BaseModel):\n        name: str\n\n    prepared = prepare_response_model(list[User])\n    assert prepared is not None\n    assert issubclass(prepared, IterableBase)\n"
  },
  {
    "path": "tests/test_logging.py",
    "content": "import logging\nfrom instructor.auto_client import from_provider\n\n\ndef test_from_provider_logging(caplog):\n    caplog.set_level(logging.INFO)\n    from_provider(\"ollama/llama3.2\")\n    assert any(\n        \"Initializing ollama provider\" in record.getMessage()\n        for record in caplog.records\n    )\n    assert any(\"Client initialized\" in record.getMessage() for record in caplog.records)\n"
  },
  {
    "path": "tests/test_message_processing.py",
    "content": "\"\"\"\nTests for message processing optimizations.\n\"\"\"\n\nfrom instructor.utils import (\n    merge_consecutive_messages,\n    get_message_content,\n    transform_to_gemini_prompt,\n    update_gemini_kwargs,\n    combine_system_messages,\n    extract_system_messages,\n    SystemMessage,\n)\n\n\nclass TestMergeConsecutiveMessages:\n    \"\"\"Test the merge_consecutive_messages function.\"\"\"\n\n    def test_empty_messages(self):\n        \"\"\"Test merging empty messages list.\"\"\"\n        result = merge_consecutive_messages([])\n        assert result == []\n\n    def test_single_message(self):\n        \"\"\"Test merging a single message.\"\"\"\n        messages = [{\"role\": \"user\", \"content\": \"Hello\"}]\n        result = merge_consecutive_messages(messages)\n        assert result == messages\n\n    def test_consecutive_same_role(self):\n        \"\"\"Test merging consecutive messages with the same role.\"\"\"\n        messages = [\n            {\"role\": \"user\", \"content\": \"Hello\"},\n            {\"role\": \"user\", \"content\": \"World\"},\n        ]\n        result = merge_consecutive_messages(messages)\n        assert len(result) == 1\n        assert result[0][\"role\"] == \"user\"\n        assert \"Hello\" in result[0][\"content\"]\n        assert \"World\" in result[0][\"content\"]\n\n    def test_alternating_roles(self):\n        \"\"\"Test merging messages with alternating roles.\"\"\"\n        messages = [\n            {\"role\": \"user\", \"content\": \"Hello\"},\n            {\"role\": \"assistant\", \"content\": \"Hi there\"},\n            {\"role\": \"user\", \"content\": \"How are you?\"},\n        ]\n        result = merge_consecutive_messages(messages)\n        assert len(result) == 3\n        assert result[0][\"role\"] == \"user\"\n        assert result[1][\"role\"] == \"assistant\"\n        assert result[2][\"role\"] == \"user\"\n\n    def test_mixed_content_types(self):\n        \"\"\"Test merging messages with mixed content types.\"\"\"\n        messages = [\n            {\"role\": \"user\", \"content\": \"Hello\"},\n            {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"World\"}]},\n        ]\n        result = merge_consecutive_messages(messages)\n        assert len(result) == 1\n        assert result[0][\"role\"] == \"user\"\n        assert isinstance(result[0][\"content\"], list)\n        assert len(result[0][\"content\"]) == 2\n\n    def test_multiple_consecutive(self):\n        \"\"\"Test merging multiple consecutive messages.\"\"\"\n        messages = [\n            {\"role\": \"user\", \"content\": \"Hello\"},\n            {\"role\": \"user\", \"content\": \"World\"},\n            {\"role\": \"assistant\", \"content\": \"Hi there\"},\n            {\"role\": \"assistant\", \"content\": \"How can I help?\"},\n            {\"role\": \"user\", \"content\": \"I need help\"},\n        ]\n        result = merge_consecutive_messages(messages)\n        assert len(result) == 3\n        assert result[0][\"role\"] == \"user\"\n        assert \"Hello\" in result[0][\"content\"]\n        assert \"World\" in result[0][\"content\"]\n        assert result[1][\"role\"] == \"assistant\"\n        assert \"Hi there\" in result[1][\"content\"]\n        assert \"How can I help?\" in result[1][\"content\"]\n        assert result[2][\"role\"] == \"user\"\n        assert \"I need help\" in result[2][\"content\"]\n\n\nclass TestGetMessageContent:\n    \"\"\"Test the get_message_content function.\"\"\"\n\n    def test_string_content(self):\n        \"\"\"Test getting content from a message with string content.\"\"\"\n        message = {\"role\": \"user\", \"content\": \"Hello\"}\n        result = get_message_content(message)\n        assert result == [\"Hello\"]\n\n    def test_list_content(self):\n        \"\"\"Test getting content from a message with list content.\"\"\"\n        message = {\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"Hello\"}]}\n        result = get_message_content(message)\n        assert result == [{\"type\": \"text\", \"text\": \"Hello\"}]\n\n    def test_empty_content(self):\n        \"\"\"Test getting content from a message with empty content.\"\"\"\n        message = {\"role\": \"user\", \"content\": \"\"}\n        result = get_message_content(message)\n        assert result == [\"\"]\n\n    def test_none_content(self):\n        \"\"\"Test getting content from a message with None content.\"\"\"\n        message = {\"role\": \"user\", \"content\": None}\n        result = get_message_content(message)\n        assert result == [\"\"]\n\n    def test_missing_content(self):\n        \"\"\"Test getting content from a message with missing content.\"\"\"\n        message = {\"role\": \"user\"}\n        result = get_message_content(message)\n        assert result == [\"\"]\n\n    def test_empty_message(self):\n        \"\"\"Test getting content from an empty message.\"\"\"\n        message = {}\n        result = get_message_content(message)\n        assert result == [\"\"]\n\n\nclass TestTransformToGeminiPrompt:\n    \"\"\"Test the transform_to_gemini_prompt function.\"\"\"\n\n    def test_empty_messages(self):\n        \"\"\"Test transforming empty messages.\"\"\"\n        result = transform_to_gemini_prompt([])\n        assert result == []\n\n    def test_user_message(self):\n        \"\"\"Test transforming a user message.\"\"\"\n        messages = [{\"role\": \"user\", \"content\": \"Hello\"}]\n        result = transform_to_gemini_prompt(messages)\n        assert len(result) == 1\n        assert result[0][\"role\"] == \"user\"\n        assert result[0][\"parts\"] == [\"Hello\"]\n\n    def test_assistant_message(self):\n        \"\"\"Test transforming an assistant message.\"\"\"\n        messages = [{\"role\": \"assistant\", \"content\": \"Hello\"}]\n        result = transform_to_gemini_prompt(messages)\n        assert len(result) == 1\n        assert result[0][\"role\"] == \"model\"\n        assert result[0][\"parts\"] == [\"Hello\"]\n\n    def test_system_message(self):\n        \"\"\"Test transforming a system message.\"\"\"\n        messages = [{\"role\": \"system\", \"content\": \"You are an AI assistant\"}]\n        result = transform_to_gemini_prompt(messages)\n        assert len(result) == 1\n        assert result[0][\"role\"] == \"user\"\n        assert \"*You are an AI assistant*\" in result[0][\"parts\"][0]\n\n    def test_full_conversation(self):\n        \"\"\"Test transforming a full conversation.\"\"\"\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are an AI assistant\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n            {\"role\": \"assistant\", \"content\": \"Hi there\"},\n            {\"role\": \"user\", \"content\": \"How are you?\"},\n        ]\n        result = transform_to_gemini_prompt(messages)\n        assert len(result) == 3\n        assert result[0][\"role\"] == \"user\"\n        assert \"*You are an AI assistant*\" in result[0][\"parts\"][0]\n        assert \"Hello\" in result[0][\"parts\"][1]\n        assert result[1][\"role\"] == \"model\"\n        assert result[1][\"parts\"] == [\"Hi there\"]\n        assert result[2][\"role\"] == \"user\"\n        assert result[2][\"parts\"] == [\"How are you?\"]\n\n    def test_multiple_system_messages(self):\n        \"\"\"Test transforming multiple system messages.\"\"\"\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are an AI assistant\"},\n            {\"role\": \"system\", \"content\": \"Be helpful and concise\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = transform_to_gemini_prompt(messages)\n        assert len(result) == 1\n        assert result[0][\"role\"] == \"user\"\n        assert any(\"You are an AI assistant\" in part for part in result[0][\"parts\"])\n        assert any(\"Be helpful and concise\" in part for part in result[0][\"parts\"])\n        assert any(\"Hello\" in part for part in result[0][\"parts\"])\n\n\nclass TestUpdateGeminiKwargs:\n    \"\"\"Test the update_gemini_kwargs function.\"\"\"\n\n    def test_transform_messages(self):\n        \"\"\"Test transforming messages to Gemini format.\"\"\"\n        kwargs = {\"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}\n        result = update_gemini_kwargs(kwargs)\n        assert \"contents\" in result\n        assert \"messages\" not in result\n        assert len(result[\"contents\"]) == 1\n        assert result[\"contents\"][0][\"role\"] == \"user\"\n\n    def test_generation_config(self):\n        \"\"\"Test updating generation config.\"\"\"\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n            \"generation_config\": {\n                \"max_tokens\": 100,\n                \"temperature\": 0.7,\n                \"n\": 3,\n                \"top_p\": 0.9,\n                \"stop\": [\"END\"],\n            },\n        }\n        result = update_gemini_kwargs(kwargs)\n        assert \"generation_config\" in result\n        assert \"max_output_tokens\" in result[\"generation_config\"]\n        assert \"candidate_count\" in result[\"generation_config\"]\n        assert \"stop_sequences\" in result[\"generation_config\"]\n        assert \"max_tokens\" not in result[\"generation_config\"]\n        assert \"n\" not in result[\"generation_config\"]\n        assert \"stop\" not in result[\"generation_config\"]\n\n    def test_safety_settings(self):\n        \"\"\"Test setting safety settings.\"\"\"\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n        }\n        result = update_gemini_kwargs(kwargs)\n        assert \"safety_settings\" in result\n        assert len(result[\"safety_settings\"]) >= 3  # At least 3 safety settings\n\n    def test_existing_safety_settings(self):\n        \"\"\"Test respecting existing safety settings.\"\"\"\n        from google.genai.types import HarmCategory, HarmBlockThreshold\n\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}],\n            \"safety_settings\": {\n                HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE\n            },\n        }\n        result = update_gemini_kwargs(kwargs)\n        assert (\n            result[\"safety_settings\"][HarmCategory.HARM_CATEGORY_HATE_SPEECH]\n            == HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE\n        )\n\n\nclass TestSystemMessages:\n    \"\"\"Test the system message utility functions.\"\"\"\n\n    def test_combine_system_messages_strings(self):\n        \"\"\"Test combining two string system messages.\"\"\"\n        existing = \"You are an AI assistant\"\n        new = \"Be helpful\"\n        result = combine_system_messages(existing, new)\n        assert result == \"You are an AI assistant\\n\\nBe helpful\"\n\n    def test_combine_system_messages_lists(self):\n        \"\"\"Test combining two list system messages.\"\"\"\n        existing = [SystemMessage(type=\"text\", text=\"You are an AI assistant\")]\n        new = [SystemMessage(type=\"text\", text=\"Be helpful\")]\n        result = combine_system_messages(existing, new)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are an AI assistant\"\n        assert result[1][\"text\"] == \"Be helpful\"\n\n    def test_combine_system_messages_mixed(self):\n        \"\"\"Test combining mixed system message types.\"\"\"\n        existing = \"You are an AI assistant\"\n        new = [SystemMessage(type=\"text\", text=\"Be helpful\")]\n        result = combine_system_messages(existing, new)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are an AI assistant\"\n        assert result[1][\"text\"] == \"Be helpful\"\n\n    def test_combine_system_messages_none(self):\n        \"\"\"Test combining None with a system message.\"\"\"\n        existing = None\n        new = \"Be helpful\"\n        result = combine_system_messages(existing, new)\n        assert result == \"Be helpful\"\n\n    def test_extract_system_messages_empty(self):\n        \"\"\"Test extracting system messages from an empty list.\"\"\"\n        messages = []\n        result = extract_system_messages(messages)\n        assert result == []\n\n    def test_extract_system_messages_no_system(self):\n        \"\"\"Test extracting system messages when there are none.\"\"\"\n        messages = [\n            {\"role\": \"user\", \"content\": \"Hello\"},\n            {\"role\": \"assistant\", \"content\": \"Hi there\"},\n        ]\n        result = extract_system_messages(messages)\n        assert result == []\n\n    def test_extract_system_messages_string(self):\n        \"\"\"Test extracting string system messages.\"\"\"\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are an AI assistant\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = extract_system_messages(messages)\n        assert len(result) == 1\n        assert result[0][\"type\"] == \"text\"\n        assert result[0][\"text\"] == \"You are an AI assistant\"\n\n    def test_extract_system_messages_list(self):\n        \"\"\"Test extracting list system messages.\"\"\"\n        messages = [\n            {\n                \"role\": \"system\",\n                \"content\": [{\"type\": \"text\", \"text\": \"You are an AI assistant\"}],\n            },\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = extract_system_messages(messages)\n        assert len(result) == 1\n        assert result[0][\"type\"] == \"text\"\n        assert result[0][\"text\"] == \"You are an AI assistant\"\n\n    def test_extract_system_messages_multiple(self):\n        \"\"\"Test extracting multiple system messages.\"\"\"\n        messages = [\n            {\"role\": \"system\", \"content\": \"You are an AI assistant\"},\n            {\"role\": \"system\", \"content\": \"Be helpful\"},\n            {\"role\": \"user\", \"content\": \"Hello\"},\n        ]\n        result = extract_system_messages(messages)\n        assert len(result) == 2\n        assert result[0][\"text\"] == \"You are an AI assistant\"\n        assert result[1][\"text\"] == \"Be helpful\"\n"
  },
  {
    "path": "tests/test_multimodal.py",
    "content": "import pytest\nfrom pathlib import Path\nfrom instructor.processing.multimodal import (\n    PDF,\n    Audio,\n    Image,\n    autodetect_media,\n    convert_contents,\n    convert_messages,\n)\nfrom instructor.mode import Mode\nfrom unittest.mock import patch, MagicMock\nimport instructor\n\n\n@pytest.fixture\ndef base64_jpeg():\n    # Source: https://gist.github.com/trymbill/136dfd4bfc0736fae5b959430ec57373\n    return \"data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAMCAgMCAgMDAwMEAwMEBQgFBQQEBQoHBwYIDAoMDAsKCwsNDhIQDQ4RDgsLEBYQERMUFRUVDA8XGBYUGBIUFRT/wAALCAABAAEBAREA/8QAFAABAAAAAAAAAAAAAAAAAAAACf/EABQQAQAAAAAAAAAAAAAAAAAAAAD/2gAIAQEAAD8AKp//2Q==\"  # noqa: E501\n\n\n@pytest.fixture\ndef base64_png():\n    # Source: https://gist.github.com/ondrek/7413434\n    return \"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNk+A8AAQUBAScY42YAAAAASUVORK5CYII=\"  # noqa: E501\n\n\ndef test_image_from_url():\n    url = \"https://example.com/image.jpg\"\n    image = Image.from_url(url)\n    assert image.source == url\n    assert image.media_type == \"image/jpeg\"\n    assert image.data is None\n\n\ndef test_image_from_path(tmp_path: Path):\n    image_path = tmp_path / \"test_image.jpg\"\n    image_path.write_bytes(b\"fake image data\")\n\n    image = Image.from_path(image_path)\n    assert image.source == image_path\n    assert image.media_type == \"image/jpeg\"\n    assert image.data is not None\n\n\n@pytest.mark.skip(reason=\"Needs to download image\")\ndef test_image_to_anthropic():\n    image = Image(\n        source=\"http://example.com/image.jpg\", media_type=\"image/jpeg\", data=None\n    )\n    anthropic_format = image.to_anthropic()\n    assert anthropic_format[\"type\"] == \"image\"\n    assert anthropic_format[\"source\"][\"type\"] == \"base64\"\n    assert anthropic_format[\"source\"][\"media_type\"] == \"image/jpeg\"\n\n\ndef test_image_to_openai():\n    image = Image(\n        source=\"http://example.com/image.jpg\", media_type=\"image/jpeg\", data=None\n    )\n    openai_format = image.to_openai(mode=instructor.Mode.TOOLS)\n    assert openai_format[\"type\"] == \"image_url\"\n    assert openai_format[\"image_url\"][\"url\"] == \"http://example.com/image.jpg\"\n\n\ndef test_convert_contents():\n    contents = [\"Hello\", Image.from_url(\"http://example.com/image.jpg\")]\n    converted = list(convert_contents(contents, Mode.TOOLS))\n    assert len(converted) == 2\n    assert converted[0] == {\"type\": \"text\", \"text\": \"Hello\"}\n    assert converted[1][\"type\"] == \"image_url\"\n    assert converted[1][\"image_url\"][\"url\"] == \"http://example.com/image.jpg\"\n\n\ndef test_convert_messages():\n    messages = [\n        {\n            \"role\": \"user\",\n            \"content\": [\"Hello\", Image.from_url(\"http://example.com/image.jpg\")],\n        },\n        {\"role\": \"assistant\", \"content\": \"Hi there!\"},\n    ]\n    converted = list(convert_messages(messages, Mode.TOOLS))\n    assert len(converted) == 2\n    assert converted[0][\"role\"] == \"user\"\n    assert len(converted[0][\"content\"]) == 2\n    assert converted[0][\"content\"][0] == {\"type\": \"text\", \"text\": \"Hello\"}\n    assert converted[0][\"content\"][1][\"type\"] == \"image_url\"\n    assert converted[1][\"role\"] == \"assistant\"\n    assert converted[1][\"content\"] == \"Hi there!\"\n\n\ndef test_convert_messages_anthropic():\n    messages = [\n        {\n            \"role\": \"user\",\n            \"content\": [\n                \"Hello\",\n                Image(source=\"base64data\", media_type=\"image/jpeg\", data=\"fakedata\"),\n            ],\n        }\n    ]\n    converted = list(convert_messages(messages, Mode.ANTHROPIC_JSON))\n    assert len(converted) == 1\n    assert converted == [\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\"type\": \"text\", \"text\": \"Hello\"},\n                {\n                    \"type\": \"image\",\n                    \"source\": {\n                        \"type\": \"base64\",\n                        \"media_type\": \"image/jpeg\",\n                        \"data\": \"fakedata\",\n                    },\n                },\n            ],\n        }\n    ]\n\n\ndef test_convert_messages_gemini():\n    messages = [\n        {\n            \"role\": \"user\",\n            \"content\": [\"Hello\", Image.from_url(\"http://example.com/image.jpg\")],\n        }\n    ]\n    with pytest.raises(NotImplementedError):\n        list(convert_messages(messages, Mode.GEMINI_JSON))\n\n\n# Additional tests\n\n\ndef test_image_from_path_unsupported_format(tmp_path: Path):\n    image_path = tmp_path / \"test_image.txt\"\n    image_path.write_bytes(b\"fake gif data\")\n\n    with pytest.raises(ValueError, match=\"Unsupported image format: text/plain\"):\n        Image.from_path(image_path)\n\n\ndef test_image_from_path_empty_file(tmp_path: Path):\n    image_path = tmp_path / \"empty_image.jpg\"\n    image_path.touch()\n\n    with pytest.raises(ValueError, match=\"Image file is empty\"):\n        Image.from_path(image_path)\n\n\ndef test_image_to_openai_base64():\n    image = Image(\n        source=\"local_file.jpg\", media_type=\"image/jpeg\", data=\"base64encodeddata\"\n    )\n    openai_format = image.to_openai(mode=instructor.Mode.TOOLS)\n    assert openai_format[\"type\"] == \"image_url\"\n    assert openai_format[\"image_url\"][\"url\"].startswith(\"data:image/jpeg;base64,\")\n\n\ndef test_convert_contents_single_string():\n    content = \"Hello, world!\"\n    converted = convert_contents(content, Mode.TOOLS)\n    assert converted == \"Hello, world!\"\n\n\ndef test_convert_contents_single_image():\n    image = Image.from_url(\"http://example.com/image.jpg\")\n    converted = list(convert_contents(image, Mode.TOOLS))\n    assert len(converted) == 1\n    assert converted == [\n        {\"type\": \"image_url\", \"image_url\": {\"url\": \"http://example.com/image.jpg\"}}\n    ]\n\n\ndef test_convert_messages_mixed_content():\n    messages = [\n        {\"role\": \"user\", \"content\": \"Hello\"},\n        {\"role\": \"assistant\", \"content\": \"Hi there!\"},\n        {\"role\": \"user\", \"content\": Image.from_url(\"http://example.com/image.jpg\")},\n    ]\n    converted = list(convert_messages(messages, Mode.TOOLS))\n    assert len(converted) == 3\n    assert converted[0][\"content\"] == \"Hello\"\n    assert converted[1][\"content\"] == \"Hi there!\"\n    assert converted[2][\"content\"][0][\"type\"] == \"image_url\"\n\n\ndef test_convert_contents_invalid_type():\n    with pytest.raises(ValueError, match=\"Unsupported content type\"):\n        list(convert_contents([1, 2, 3], Mode.TOOLS))  # type: ignore[arg-type]\n\n\ndef test_convert_contents_anthropic_mode():\n    contents = [\n        \"Hello\",\n        Image(source=\"base64data\", media_type=\"image/png\", data=\"fakedata\"),\n    ]\n    converted = list(convert_contents(contents, Mode.ANTHROPIC_JSON))\n    assert converted[1][\"type\"] == \"image\"\n    assert converted[1][\"source\"][\"type\"] == \"base64\"\n    assert converted[1][\"source\"][\"media_type\"] == \"image/png\"\n\n\ndef test_convert_contents_custom_dict():\n    contents = {\n        \"type\": \"image_url\",\n        \"image_url\": {\"url\": f\"data:image/png;base64,base64_img\"},\n    }\n    converted = list(convert_contents(contents, Mode.TOOLS))\n    assert len(converted) == 1\n    assert converted == [contents]\n\n\ndef test_image_from_base64_url(base64_png):\n    image = Image.from_url(base64_png)\n    assert image.source == base64_png\n    assert image.media_type == \"image/png\"\n    assert image.data is not None\n    assert image.data == base64_png.split(\",\")[-1]\n\n\ndef test_image_from_url_with_query_params():\n    url = \"https://example.com/image.jpg?param1=value1&param2=value2\"\n    image = Image.from_url(url)\n    assert image.source == url\n    assert image.media_type == \"image/jpeg\"\n    assert image.data is None\n\n\ndef test_image_from_url_with_unusual_extension():\n    url = \"https://example.com/image.webp\"\n    image = Image.from_url(url)\n    assert image.source == url\n    assert image.media_type == \"image/webp\"\n    assert image.data is None\n\n\ndef test_image_to_openai_with_base64_source(base64_png):\n    base64_data = base64_png.split(\",\")[-1]\n    image = Image(\n        source=f\"data:image/png;base64,{base64_data}\",\n        media_type=\"image/png\",\n        data=base64_data,\n    )\n    openai_format = image.to_openai(mode=instructor.Mode.TOOLS)\n    assert openai_format[\"type\"] == \"image_url\"\n    assert openai_format[\"image_url\"][\"url\"] == f\"data:image/png;base64,{base64_data}\"\n\n\ndef test_image_to_anthropic_with_base64_source(base64_png):\n    base64_data = base64_png.split(\",\")[-1]\n    image = Image(\n        source=f\"data:image/png;base64,{base64_data}\",\n        media_type=\"image/png\",\n        data=base64_data,\n    )\n    anthropic_format = image.to_anthropic()\n    assert anthropic_format[\"type\"] == \"image\"\n    assert anthropic_format[\"source\"][\"type\"] == \"base64\"\n    assert anthropic_format[\"source\"][\"media_type\"] == \"image/png\"\n    assert anthropic_format[\"source\"][\"data\"] == base64_data\n\n\n@pytest.mark.parametrize(\n    \"url\",\n    [\n        \"http://example.com/image.jpg\",\n        \"https://example.com/image.png\",\n        \"https://example.com/image.webp\",\n        \"https://example.com/image.jpg?param=value\",\n        \"base64_png\",\n    ],\n)\ndef test_image_from_various_urls(url, request):\n    if url.startswith(\"base64\"):\n        url = request.getfixturevalue(url)\n    image = Image.from_url(url)\n    assert image.source == url\n    if image.is_base64(url):\n        assert image.data is not None\n    else:\n        assert image.data is None\n\n\ndef test_convert_contents_with_base64_image(base64_png):\n    contents = [\"Hello\", Image.from_url(base64_png)]\n    converted = list(convert_contents(contents, Mode.TOOLS))\n    assert len(converted) == 2\n    assert converted[0] == {\"type\": \"text\", \"text\": \"Hello\"}\n    assert converted[1][\"type\"] == \"image_url\"\n    assert converted[1][\"image_url\"][\"url\"] == base64_png\n\n\n@pytest.mark.parametrize(\n    \"input_data, expected_type, expected_media_type\",\n    [\n        # URL tests\n        (\"http://example.com/image.jpg\", \"url\", \"image/jpeg\"),\n        (\"https://example.com/image.png\", \"url\", \"image/png\"),\n        (\"https://example.com/image.webp\", \"url\", \"image/webp\"),\n        (\"https://example.com/image.jpg?param=value\", \"url\", \"image/jpeg\"),\n        (\n            \"https://example.com/image\",\n            \"url\",\n            \"image/jpeg\",\n        ),  # Default to JPEG if no extension\n        # Base64 data URI tests\n        (\n            \"base64_png\",\n            \"base64\",\n            \"image/png\",\n        ),\n        (\n            \"base64_jpeg\",\n            \"base64\",\n            \"image/jpeg\",\n        ),\n        # File path tests (mocked)\n        (\"/path/to/image.jpg\", \"file\", \"image/jpeg\"),\n        (\"/path/to/image.png\", \"file\", \"image/png\"),\n        (\"/path/to/image.webp\", \"file\", \"image/webp\"),\n    ],\n)\ndef test_image_autodetect(input_data, expected_type, expected_media_type, request):\n    with (\n        patch(\"pathlib.Path.is_file\", return_value=True),\n        patch(\"pathlib.Path.stat\", return_value=MagicMock(st_size=1000)),\n        patch(\"pathlib.Path.read_bytes\", return_value=b\"fake image data\"),\n        patch(\"requests.head\") as mock_head,\n    ):\n        mock_head.return_value = MagicMock(\n            headers={\"Content-Type\": expected_media_type}\n        )\n        if input_data.startswith(\"base64\"):\n            input_data = request.getfixturevalue(input_data)\n\n        image = Image.autodetect(input_data)\n\n        if isinstance(image.source, Path):\n            assert image.source == Path(input_data)\n        else:\n            assert image.source == input_data\n        assert image.media_type == expected_media_type\n\n        if expected_type == \"url\":\n            assert image.data is None\n        elif expected_type == \"base64\":\n            assert image.data is not None\n            assert image.data.startswith(\"iVBOR\") or image.data.startswith(\"/9j/\")\n        elif expected_type == \"file\":\n            assert image.data is not None\n            assert image.data == \"ZmFrZSBpbWFnZSBkYXRh\"  # base64 of 'fake image data'\n\n\ndef test_image_autodetect_invalid_input():\n    with pytest.raises(ValueError, match=\"Invalid or unsupported base64 image data\"):\n        Image.autodetect(\"not_an_image_input\")\n\n    # Test safely converting an invalid image\n    assert Image.autodetect_safely(\"hello\") == \"hello\"\n\n\ndef test_image_autodetect_empty_file(tmp_path):\n    empty_file = tmp_path / \"empty.jpg\"\n    empty_file.touch()\n    with pytest.raises(ValueError, match=\"Image file is empty\"):\n        Image.autodetect(empty_file)\n\n\ndef test_raw_base64_autodetect_jpeg(base64_jpeg):\n    raw_base_64 = base64_jpeg.split(\",\")[-1]\n    image = Image.autodetect(raw_base_64)\n    assert image.media_type == \"image/jpeg\"\n    assert image.source == image.data == raw_base_64\n\n\ndef test_raw_base64_autodetect_png(base64_png):\n    raw_base_64 = base64_png.split(\",\")[-1]\n    image = Image.autodetect(raw_base_64)\n    assert image.media_type == \"image/png\"\n    assert image.source == image.data == raw_base_64\n\n\ndef test_autodetect_media_data_uris():\n    img_uri = (\n        \"data:image/png;base64,\"\n        \"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGNgYAAAAAMAASsJTYQAAAAASUVORK5CYII=\"\n    )\n    pdf_uri = \"data:application/pdf;base64,JVBERi0xLjQK\"  # \"%PDF-1.4\\n\"\n    aud_uri = \"data:audio/wav;base64,UklGRiQAAABXQVZF\"  # minimal header-ish\n\n    img = autodetect_media(img_uri)\n    pdf = autodetect_media(pdf_uri)\n    aud = autodetect_media(aud_uri)\n\n    assert isinstance(img, Image)\n    assert img.media_type == \"image/png\"\n\n    assert isinstance(pdf, PDF)\n    assert pdf.media_type == \"application/pdf\"\n\n    assert isinstance(aud, Audio)\n    assert aud.media_type == \"audio/wav\"\n\n\ndef test_convert_messages_autodetect_media():\n    img_uri = (\n        \"data:image/png;base64,\"\n        \"iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGNgYAAAAAMAASsJTYQAAAAASUVORK5CYII=\"\n    )\n    pdf_uri = \"data:application/pdf;base64,JVBERi0xLjQK\"\n\n    messages = [\n        {\"role\": \"user\", \"content\": [\"hello\", img_uri, pdf_uri]},\n    ]\n\n    out = convert_messages(messages, mode=Mode.RESPONSES_TOOLS, autodetect_images=True)\n    assert isinstance(out, list) and len(out) == 1\n\n    content = out[0][\"content\"]\n    assert isinstance(content, list) and len(content) == 3\n\n    # Text\n    assert content[0][\"type\"] in {\"input_text\", \"text\"}\n    assert content[0][\"text\"] == \"hello\"\n\n    # Image → input_image with data URI\n    assert content[1][\"type\"] == \"input_image\"\n    assert isinstance(content[1].get(\"image_url\"), str)\n    assert content[1][\"image_url\"].startswith(\"data:image/png;base64,\")\n\n    # PDF → input_file with data URI\n    assert content[2][\"type\"] == \"input_file\"\n    assert isinstance(content[2].get(\"file_data\"), str)\n    assert content[2][\"file_data\"].startswith(\"data:application/pdf;base64,\")\n\n\ndef test_pdf_from_url():\n    # URL without extension → should HEAD and set media_type; data stays None.\n    with patch(\"instructor.processing.multimodal.requests.head\") as mock_head:\n        resp = MagicMock()\n        resp.headers = {\"Content-Type\": \"application/pdf\"}\n        resp.raise_for_status = MagicMock()\n        mock_head.return_value = resp\n\n        pdf = PDF.from_url(\"https://example.com/file\")\n\n    assert isinstance(pdf, PDF)\n    assert pdf.source == \"https://example.com/file\"\n    assert pdf.media_type == \"application/pdf\"\n    assert pdf.data is None\n\n\ndef test_pdf_from_gs_url():\n    # gs:// → https://storage.googleapis.com/... (GET) and bytes are base64-encoded.\n    pdf_bytes = b\"%PDF-1.4\\n...\"\n    with patch(\"instructor.processing.multimodal.requests.get\") as mock_get:\n        resp = MagicMock()\n        resp.headers = {\"Content-Type\": \"application/pdf\"}\n        resp.content = pdf_bytes\n        resp.raise_for_status = MagicMock()\n        mock_get.return_value = resp\n\n        pdf = PDF.from_gs_url(\"gs://bucket/doc.pdf\")\n\n    assert isinstance(pdf, PDF)\n    assert pdf.source == \"gs://bucket/doc.pdf\"\n    assert pdf.media_type == \"application/pdf\"\n    # Optional strictness without adding global imports:\n    import base64 as _b64\n\n    assert pdf.data == _b64.b64encode(pdf_bytes).decode(\"utf-8\")\n\n\ndef test_audio_from_url():\n    # Audio URL → GET; implementation reads headers.get('content-type')\n    audio_bytes = b\"RIFFxxxxWAVEfmt \"\n    with patch(\"instructor.processing.multimodal.requests.get\") as mock_get:\n        resp = MagicMock()\n        resp.headers = {\"content-type\": \"audio/wav\"}\n        resp.content = audio_bytes\n        resp.raise_for_status = MagicMock()\n        mock_get.return_value = resp\n\n        audio = Audio.from_url(\"https://cdn.example.com/a.wav\")\n\n    assert isinstance(audio, Audio)\n    assert audio.source == \"https://cdn.example.com/a.wav\"\n    assert audio.media_type == \"audio/wav\"\n    import base64 as _b64\n\n    assert audio.data == _b64.b64encode(audio_bytes).decode(\"utf-8\")\n\n\ndef test_audio_from_gs_url():\n    # gs:// audio → public GCS GET and base64-encode.\n    audio_bytes = b\"\\x00\\x01\\x02\\x03\"\n    with patch(\"instructor.processing.multimodal.requests.get\") as mock_get:\n        resp = MagicMock()\n        resp.headers = {\"Content-Type\": \"audio/mpeg\"}\n        resp.content = audio_bytes\n        resp.raise_for_status = MagicMock()\n        mock_get.return_value = resp\n\n        audio = Audio.from_gs_url(\"gs://bkt/path/song.mp3\")\n\n    assert isinstance(audio, Audio)\n    assert audio.source == \"gs://bkt/path/song.mp3\"\n    assert audio.media_type == \"audio/mpeg\"\n    import base64 as _b64\n\n    assert audio.data == _b64.b64encode(audio_bytes).decode(\"utf-8\")\n\n\ndef test_audio_from_base64():\n    # data:audio/* data URI → parsed without network.\n    import base64 as _b64\n\n    raw = b\"\\x11\\x22\\x33\\x44\"\n    uri = \"data:audio/wav;base64,\" + _b64.b64encode(raw).decode(\"utf-8\")\n\n    audio = Audio.from_base64(uri)\n\n    assert isinstance(audio, Audio)\n    assert audio.source == uri\n    assert audio.media_type == \"audio/wav\"\n    assert audio.data == _b64.b64encode(raw).decode(\"utf-8\")\n\n\ndef test_pdf_to_bedrock_with_s3_uri():\n    \"\"\"Test PDF.to_bedrock with S3 URI source.\"\"\"\n    pdf = PDF(\n        source=\"s3://my-bucket/path/to/document.pdf\",\n        media_type=\"application/pdf\",\n        data=None,\n    )\n    bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format == {\n        \"document\": {\n            \"format\": \"pdf\",\n            \"name\": \"document\",\n            \"source\": {\"s3Location\": {\"uri\": \"s3://my-bucket/path/to/document.pdf\"}},\n        }\n    }\n\n\ndef test_pdf_to_bedrock_with_s3_uri_custom_name():\n    \"\"\"Test PDF.to_bedrock with S3 URI and custom name.\"\"\"\n    pdf = PDF(\n        source=\"s3://my-bucket/path/to/document.pdf\",\n        media_type=\"application/pdf\",\n        data=None,\n    )\n    bedrock_format = pdf.to_bedrock(name=\"custom-name\")\n\n    assert bedrock_format[\"document\"][\"name\"] == \"custom-name\"\n    assert (\n        bedrock_format[\"document\"][\"source\"][\"s3Location\"][\"uri\"]\n        == \"s3://my-bucket/path/to/document.pdf\"\n    )\n\n\ndef test_pdf_to_bedrock_with_invalid_s3_uri():\n    \"\"\"Test PDF.to_bedrock with invalid S3 URI format.\"\"\"\n    pdf = PDF(\n        source=\"s3://invalid-uri-no-key\",\n        media_type=\"application/pdf\",\n        data=None,\n    )\n    with pytest.raises(ValueError, match=\"Invalid S3 URI format\"):\n        pdf.to_bedrock()\n\n\ndef test_pdf_to_bedrock_with_base64_data():\n    \"\"\"Test PDF.to_bedrock with base64 encoded data.\"\"\"\n    import base64\n\n    pdf_bytes = b\"%PDF-1.4\\nfake pdf content\"\n    encoded_data = base64.b64encode(pdf_bytes).decode(\"utf-8\")\n\n    pdf = PDF(\n        source=\"data:application/pdf;base64,\" + encoded_data,\n        media_type=\"application/pdf\",\n        data=encoded_data,\n    )\n    bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"format\"] == \"pdf\"\n    assert bedrock_format[\"document\"][\"name\"] == \"document\"\n    assert bedrock_format[\"document\"][\"source\"][\"bytes\"] == pdf_bytes\n\n\ndef test_pdf_to_bedrock_with_path_source(tmp_path):\n    \"\"\"Test PDF.to_bedrock with local file path.\"\"\"\n    pdf_file = tmp_path / \"test_document.pdf\"\n    pdf_content = b\"%PDF-1.4\\ntest content\"\n    pdf_file.write_bytes(pdf_content)\n\n    pdf = PDF.from_path(pdf_file)\n    bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"format\"] == \"pdf\"\n    assert bedrock_format[\"document\"][\"name\"] == \"test_documentpdf\"\n    assert bedrock_format[\"document\"][\"source\"][\"bytes\"] == pdf_content\n\n\ndef test_pdf_to_bedrock_with_url_source():\n    \"\"\"Test PDF.to_bedrock with HTTP URL source.\"\"\"\n    pdf_bytes = b\"%PDF-1.4\\nfetched content\"\n\n    with patch(\"instructor.processing.multimodal.requests.get\") as mock_get:\n        resp = MagicMock()\n        resp.content = pdf_bytes\n        resp.raise_for_status = MagicMock()\n        mock_get.return_value = resp\n\n        pdf = PDF(\n            source=\"https://example.com/doc.pdf\",\n            media_type=\"application/pdf\",\n            data=None,\n        )\n        bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"format\"] == \"pdf\"\n    assert bedrock_format[\"document\"][\"name\"] == \"docpdf\"\n    assert bedrock_format[\"document\"][\"source\"][\"bytes\"] == pdf_bytes\n\n\ndef test_pdf_to_bedrock_name_sanitization():\n    \"\"\"Test that PDF.to_bedrock sanitizes document names according to Bedrock requirements.\"\"\"\n    import base64\n\n    pdf_bytes = b\"%PDF-1.4\\ntest\"\n    encoded = base64.b64encode(pdf_bytes).decode(\"utf-8\")\n\n    pdf = PDF(\n        source=\"test\",\n        media_type=\"application/pdf\",\n        data=encoded,\n    )\n\n    # Test with special characters that should be removed\n    bedrock_format = pdf.to_bedrock(name=\"my@doc#2024!.pdf\")\n    # Special chars should be removed\n    assert bedrock_format[\"document\"][\"name\"] == \"mydoc2024pdf\"\n\n    # Test with multiple spaces that should be consolidated\n    bedrock_format = pdf.to_bedrock(name=\"my   document    file.pdf\")\n    assert bedrock_format[\"document\"][\"name\"] == \"my document filepdf\"\n\n    # Test with allowed characters (alphanumeric, whitespace, hyphens, parentheses, brackets)\n    bedrock_format = pdf.to_bedrock(name=\"my-doc (2024) [final].pdf\")\n    assert bedrock_format[\"document\"][\"name\"] == \"my-doc (2024) [final]pdf\"\n\n\ndef test_pdf_to_bedrock_name_from_path_source(tmp_path):\n    \"\"\"Test that PDF.to_bedrock extracts name from Path source.\"\"\"\n    pdf_file = tmp_path / \"my-report.pdf\"\n    pdf_file.write_bytes(b\"%PDF-1.4\\ntest\")\n\n    pdf = PDF.from_path(pdf_file)\n    bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"name\"] == \"my-reportpdf\"\n\n\ndef test_pdf_to_bedrock_name_from_url():\n    \"\"\"Test that PDF.to_bedrock extracts name from URL.\"\"\"\n    pdf_bytes = b\"%PDF-1.4\\ntest\"\n\n    with patch(\"instructor.processing.multimodal.requests.get\") as mock_get:\n        resp = MagicMock()\n        resp.content = pdf_bytes\n        resp.raise_for_status = MagicMock()\n        mock_get.return_value = resp\n\n        pdf = PDF(\n            source=\"https://example.com/reports/annual-report-2024.pdf\",\n            media_type=\"application/pdf\",\n            data=None,\n        )\n        bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"name\"] == \"annual-report-2024pdf\"\n\n\ndef test_pdf_to_bedrock_name_from_gs_url():\n    \"\"\"Test that PDF.to_bedrock extracts name from GCS URL.\"\"\"\n    import base64\n\n    pdf_bytes = b\"%PDF-1.4\\ntest\"\n    encoded = base64.b64encode(pdf_bytes).decode(\"utf-8\")\n\n    pdf = PDF(\n        source=\"gs://my-bucket/docs/financial-report.pdf\",\n        media_type=\"application/pdf\",\n        data=encoded,\n    )\n    bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"name\"] == \"financial-reportpdf\"\n\n\ndef test_pdf_to_bedrock_default_name():\n    \"\"\"Test that PDF.to_bedrock uses default name when source doesn't provide one.\"\"\"\n    import base64\n\n    pdf_bytes = b\"%PDF-1.4\\ntest\"\n    encoded = base64.b64encode(pdf_bytes).decode(\"utf-8\")\n\n    pdf = PDF(\n        source=\"https://example.com/\",  # URL without filename\n        media_type=\"application/pdf\",\n        data=encoded,\n    )\n    bedrock_format = pdf.to_bedrock()\n\n    assert bedrock_format[\"document\"][\"name\"] == \"document\"\n\n\ndef test_pdf_to_bedrock_missing_data_no_source():\n    \"\"\"Test that PDF.to_bedrock raises error when data is missing and source can't be loaded.\"\"\"\n    pdf = PDF(\n        source=\"nonexistent.pdf\",\n        media_type=\"application/pdf\",\n        data=None,\n    )\n\n    with pytest.raises(\n        ValueError, match=\"PDF data is missing and source cannot be loaded\"\n    ):\n        pdf.to_bedrock()\n"
  },
  {
    "path": "tests/test_multitask.py",
    "content": "from instructor import OpenAISchema\nfrom instructor.dsl import IterableModel\nfrom typing import cast\n\n\ndef test_multi_task():\n    class Search(OpenAISchema):\n        \"\"\"This is the search docstring\"\"\"\n\n        id: int\n        query: str\n\n    IterableSearch = cast(type[OpenAISchema], IterableModel(Search))\n    assert IterableSearch.openai_schema[\"name\"] == \"IterableSearch\"\n    assert (\n        IterableSearch.openai_schema[\"description\"]\n        == \"Correct segmentation of `Search` tasks\"\n    )\n"
  },
  {
    "path": "tests/test_patch.py",
    "content": "import functools\n\nfrom openai import AsyncOpenAI, OpenAI\n\nimport instructor\nfrom instructor.utils import is_async\n\n\ndef test_patch_completes_successfully():\n    instructor.patch(OpenAI())\n\n\ndef test_apatch_completes_successfully():\n    instructor.apatch(AsyncOpenAI())\n\n\ndef test_is_async_returns_true_if_function_is_async():\n    async def async_function():\n        pass\n\n    assert is_async(async_function) is True\n\n\ndef test_is_async_returns_false_if_function_is_not_async():\n    def sync_function():\n        pass\n\n    assert is_async(sync_function) is False\n\n\ndef test_is_async_returns_true_if_wrapped_function_is_async():\n    async def async_function():\n        pass\n\n    @functools.wraps(async_function)\n    def wrapped_function():\n        pass\n\n    assert is_async(wrapped_function) is True\n\n\ndef test_is_async_returns_true_if_double_wrapped_function_is_async():\n    async def async_function():\n        pass\n\n    @functools.wraps(async_function)\n    def wrapped_function():\n        pass\n\n    @functools.wraps(wrapped_function)\n    def double_wrapped_function():\n        pass\n\n    assert is_async(double_wrapped_function) is True\n\n\ndef test_is_async_returns_true_if_triple_wrapped_function_is_async():\n    async def async_function():\n        pass\n\n    @functools.wraps(async_function)\n    def wrapped_function():\n        pass\n\n    @functools.wraps(wrapped_function)\n    def double_wrapped_function():\n        pass\n\n    @functools.wraps(double_wrapped_function)\n    def triple_wrapped_function():\n        pass\n\n    assert is_async(triple_wrapped_function) is True\n"
  },
  {
    "path": "tests/test_process_response.py",
    "content": "from typing_extensions import TypedDict\nfrom pydantic import BaseModel\nfrom instructor.processing.response import handle_response_model\nfrom instructor.providers.bedrock.utils import _prepare_bedrock_converse_kwargs_internal\n\n\ndef test_typed_dict_conversion() -> None:\n    class User(TypedDict):  # type: ignore\n        name: str\n        age: int\n\n    _, user_tool_definition = handle_response_model(User)\n\n    class User(BaseModel):\n        name: str\n        age: int\n\n    _, pydantic_user_tool_definition = handle_response_model(User)\n    assert user_tool_definition == pydantic_user_tool_definition\n\n\ndef test_openai_to_bedrock_conversion() -> None:\n    \"\"\"OpenAI-style input should be fully converted to Bedrock format.\"\"\"\n    call_kwargs = {\n        \"model\": \"anthropic.claude-3-haiku-20240307-v1:0\",\n        \"messages\": [\n            {\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n            {\"role\": \"user\", \"content\": \"Extract: Jason is 22 years old\"},\n            {\"role\": \"assistant\", \"content\": \"Sure! Jason is 22.\"},\n        ],\n    }\n    result = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n    assert \"model\" not in result\n    assert result[\"modelId\"] == \"anthropic.claude-3-haiku-20240307-v1:0\"\n    assert result[\"system\"] == [{\"text\": \"You are a helpful assistant.\"}]\n    assert len(result[\"messages\"]) == 2\n    assert result[\"messages\"][0][\"role\"] == \"user\"\n    assert result[\"messages\"][0][\"content\"] == [\n        {\"text\": \"Extract: Jason is 22 years old\"}\n    ]\n    assert result[\"messages\"][1][\"role\"] == \"assistant\"\n    assert result[\"messages\"][1][\"content\"] == [{\"text\": \"Sure! Jason is 22.\"}]\n\n\ndef test_bedrock_native_preserved() -> None:\n    \"\"\"Bedrock-native input should be preserved as-is.\"\"\"\n    call_kwargs = {\n        \"modelId\": \"anthropic.claude-3-haiku-20240307-v1:0\",\n        \"system\": [{\"text\": \"You are a helpful assistant.\"}],\n        \"messages\": [\n            {\"role\": \"user\", \"content\": [{\"text\": \"Extract: Jason is 22 years old\"}]},\n            {\"role\": \"assistant\", \"content\": [{\"text\": \"Sure! Jason is 22.\"}]},\n        ],\n    }\n    result = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n    assert result[\"system\"] == [{\"text\": \"You are a helpful assistant.\"}]\n    assert len(result[\"messages\"]) == 2\n    assert result[\"messages\"][0][\"content\"] == [\n        {\"text\": \"Extract: Jason is 22 years old\"}\n    ]\n    assert result[\"messages\"][1][\"content\"] == [{\"text\": \"Sure! Jason is 22.\"}]\n\n\ndef test_mixed_openai_and_bedrock() -> None:\n    \"\"\"Mixed input: OpenAI-style is converted, Bedrock-native is preserved.\"\"\"\n    call_kwargs = {\n        \"modelId\": \"anthropic.claude-3-haiku-20240307-v1:0\",\n        \"system\": [{\"text\": \"You are a helpful assistant.\"}],\n        \"messages\": [\n            {\n                \"role\": \"user\",\n                \"content\": \"Extract: Jason is 22 years old\",\n            },  # OpenAI style\n            {\n                \"role\": \"assistant\",\n                \"content\": [{\"text\": \"Sure! Jason is 22.\"}],\n            },  # Bedrock style\n        ],\n    }\n    result = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n    assert result[\"modelId\"] == \"anthropic.claude-3-haiku-20240307-v1:0\"\n    assert result[\"system\"] == [{\"text\": \"You are a helpful assistant.\"}]\n    assert len(result[\"messages\"]) == 2\n    # OpenAI-style user message converted\n    assert result[\"modelId\"] == \"anthropic.claude-3-haiku-20240307-v1:0\"\n    assert result[\"messages\"][0][\"content\"] == [\n        {\"text\": \"Extract: Jason is 22 years old\"}\n    ]\n    # Bedrock-style assistant message preserved\n    assert result[\"messages\"][1][\"content\"] == [{\"text\": \"Sure! Jason is 22.\"}]\n\n\ndef test_bedrock_round_trip() -> None:\n    \"\"\"Bedrock input should be unchanged after round-trip through the function.\"\"\"\n    call_kwargs = {\n        \"modelId\": \"anthropic.claude-3-haiku-20240307-v1:0\",\n        \"system\": [{\"text\": \"Bedrock system.\"}],\n        \"messages\": [\n            {\"role\": \"user\", \"content\": [{\"text\": \"Bedrock user message.\"}]},\n        ],\n    }\n    import copy\n\n    original = copy.deepcopy(call_kwargs)\n    result = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n    assert result == original\n\n\ndef test_empty_and_missing_content() -> None:\n    \"\"\"Empty messages and missing content should be handled gracefully.\"\"\"\n    # Empty messages\n    call_kwargs = {\"messages\": []}\n    result = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n    assert result[\"messages\"] == []\n    # Message with no content\n    call_kwargs = {\"messages\": [{\"role\": \"user\"}]}\n    result = _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n    assert result[\"messages\"][0][\"role\"] == \"user\"\n    # Should not add a content key if not present\n    assert \"content\" not in result[\"messages\"][0]\n\n\ndef test_bedrock_invalid_content_format() -> None:\n    \"\"\"Invalid content types should raise ValueError.\"\"\"\n    call_kwargs = {\n        \"messages\": [{\"role\": \"user\", \"content\": 12345}]  # Invalid content type\n    }\n    try:\n        _prepare_bedrock_converse_kwargs_internal(call_kwargs)\n        raise AssertionError(\"Should have raised ValueError\")\n    except ValueError as e:\n        assert \"Unsupported message content type for Bedrock\" in str(e)\n"
  },
  {
    "path": "tests/test_response_model_conversion.py",
    "content": "from instructor.processing.response import handle_response_model\nfrom pydantic import BaseModel, Field\nimport instructor\nimport pytest\n\nmodes = [\n    instructor.Mode.ANTHROPIC_JSON,\n    instructor.Mode.JSON,\n    instructor.Mode.MD_JSON,\n    instructor.Mode.GEMINI_JSON,\n    instructor.Mode.VERTEXAI_JSON,\n]\n\n\ndef get_system_prompt(user_tool_definition, mode):\n    if mode == instructor.Mode.ANTHROPIC_JSON:\n        system = user_tool_definition[\"system\"]\n        # Handle both string and list[dict] formats\n        if isinstance(system, list):\n            return \"\".join(block.get(\"text\", \"\") for block in system)\n        return system\n    elif mode == instructor.Mode.GEMINI_JSON:\n        return \"\\n\".join(user_tool_definition[\"contents\"][0][\"parts\"])\n    elif mode == instructor.Mode.VERTEXAI_JSON:\n        return str(user_tool_definition[\"generation_config\"])\n    return user_tool_definition[\"messages\"][0][\"content\"]\n\n\n@pytest.mark.parametrize(\"mode\", modes)\ndef test_json_preserves_description_of_non_english_characters_in_json_mode(\n    mode,\n) -> None:\n    messages = [\n        {\n            \"role\": \"user\",\n            \"content\": \"Extract the user from the text : 张三 20岁\",\n        }\n    ]\n\n    class User(BaseModel):\n        name: str = Field(description=\"用户的名字\")\n        age: int = Field(description=\"用户的年龄\")\n\n    _, user_tool_definition = handle_response_model(User, mode=mode, messages=messages)\n\n    system_prompt = get_system_prompt(user_tool_definition, mode)\n    assert \"用户的名字\" in system_prompt\n    assert \"用户的年龄\" in system_prompt\n\n    _, user_tool_definition = handle_response_model(\n        User,\n        mode=mode,\n        system=\"你是一个AI助手\",\n        messages=messages,\n    )\n    system_prompt = get_system_prompt(user_tool_definition, mode)\n    assert \"用户的名字\" in system_prompt\n    assert \"用户的年龄\" in system_prompt\n"
  },
  {
    "path": "tests/test_retry_json_mode.py",
    "content": "\"\"\"\nTest that retry mechanism works correctly with JSON mode.\nSpecifically tests that JSONDecodeError is properly caught by retry handler.\n\nThis is a regression test for issue #1856.\n\"\"\"\n\nimport json\nimport pytest\nfrom unittest.mock import Mock\nfrom pydantic import BaseModel, ValidationError\n\nimport instructor\nfrom instructor.core.exceptions import InstructorRetryException\nfrom instructor.mode import Mode\nfrom typing import cast\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\ndef test_json_decode_error_caught_by_retry():\n    \"\"\"Test that JSON errors are caught by retry handler, not generic Exception handler.\n\n    This is a regression test for issue #1856 where JSONDecodeError was wrapped\n    in ValueError, causing it to be caught by the generic Exception handler instead\n    of the specific validation error handler that calls handle_reask_kwargs.\n\n    Note: In strict mode, Pydantic raises ValidationError with 'Invalid JSON' message.\n    In non-strict mode, json.loads raises JSONDecodeError directly.\n    Both are now properly caught by the retry handler.\n    \"\"\"\n    mock_response = Mock()\n    mock_response.choices = [Mock()]\n    mock_response.choices[0].message = Mock()\n    mock_response.choices[0].message.content = \"invalid json {\"\n    mock_response.choices[0].finish_reason = \"stop\"\n    mock_response.usage = None\n\n    mock_client = Mock()\n    mock_client.chat = Mock()\n    mock_client.chat.completions = Mock()\n    mock_client.chat.completions.create = Mock(return_value=mock_response)\n\n    client = instructor.patch(mock_client, mode=Mode.JSON)\n\n    with pytest.raises(InstructorRetryException) as exc_info:\n        client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=User,\n            messages=[{\"role\": \"user\", \"content\": \"test\"}],\n            max_retries=2,\n        )\n\n    exception = cast(InstructorRetryException, exc_info.value)\n    assert exception.n_attempts == 2\n    assert exception.failed_attempts is not None\n    assert len(exception.failed_attempts) == 2\n\n    for attempt in exception.failed_attempts:\n        assert isinstance(attempt.exception, (json.JSONDecodeError, ValidationError))\n        if isinstance(attempt.exception, ValidationError):\n            assert \"Invalid JSON\" in str(attempt.exception)\n\n\ndef test_validation_error_caught_by_retry():\n    \"\"\"Test that ValidationError is still caught by retry handler.\"\"\"\n    mock_response = Mock()\n    mock_response.choices = [Mock()]\n    mock_response.choices[0].message = Mock()\n    mock_response.choices[0].message.content = '{\"name\": \"John\"}'\n    mock_response.choices[0].finish_reason = \"stop\"\n    mock_response.usage = None\n\n    mock_client = Mock()\n    mock_client.chat = Mock()\n    mock_client.chat.completions = Mock()\n    mock_client.chat.completions.create = Mock(return_value=mock_response)\n\n    client = instructor.patch(mock_client, mode=Mode.JSON)\n\n    with pytest.raises(InstructorRetryException) as exc_info:\n        client.chat.completions.create(\n            model=\"gpt-4o-mini\",\n            response_model=User,\n            messages=[{\"role\": \"user\", \"content\": \"test\"}],\n            max_retries=2,\n        )\n\n    exception = cast(InstructorRetryException, exc_info.value)\n    assert exception.n_attempts == 2\n    assert exception.failed_attempts is not None\n    assert len(exception.failed_attempts) == 2\n\n    for attempt in exception.failed_attempts:\n        assert isinstance(attempt.exception, ValidationError)\n"
  },
  {
    "path": "tests/test_schema.py",
    "content": "from typing import TypeVar\n\n\nfrom datetime import datetime, date, time\nfrom instructor import openai_schema\nfrom decimal import Decimal\nfrom uuid import UUID\nfrom typing import Annotated, Union, Optional, Literal, Any\nfrom collections import OrderedDict\nimport pytest\nimport sys\nfrom pydantic import BaseModel, Field\n\nT = TypeVar(\"T\")\n\n\ndef test_annotation_schema():\n    class User(BaseModel):\n        details: dict[\n            Annotated[str, Field(description=\"User name\", min_length=1)],\n            Annotated[int, Field(description=\"User ID\", gt=3)],\n        ] = Field(max_length=1)\n\n    assert openai_schema(User).model_json_schema() == User.model_json_schema()\n\n\nclass User(BaseModel):\n    name: str\n    age: int\n\n\nclass AdminUser(BaseModel):\n    organization: str\n    name: str\n    email: str\n\n\ndef test_new_union_types():\n    import sys\n\n    if sys.version_info >= (3, 10):\n\n        class Users(BaseModel):\n            users: list[AdminUser | User]\n\n        assert openai_schema(Users).model_json_schema() == Users.model_json_schema()\n\n\ndef test_old_union_type():\n    class UsersOldUnion(BaseModel):\n        users: list[Union[AdminUser, User]]\n\n    assert (\n        openai_schema(UsersOldUnion).model_json_schema()\n        == UsersOldUnion.model_json_schema()\n    )\n\n\ndef test_tuple_with_multiple_args():\n    class TupleModel(BaseModel):\n        coordinates: tuple[int, int, int]\n        name_and_age: tuple[str, int]\n\n    assert (\n        openai_schema(TupleModel).model_json_schema() == TupleModel.model_json_schema()\n    )\n\n\ndef test_dict_with_multiple_value_types():\n    from collections import OrderedDict\n\n    class DictModel(BaseModel):\n        regular_dict: dict[str, Union[int, str]]\n        ordered_dict: OrderedDict[str, Union[float, bool]]\n\n    assert openai_schema(DictModel).model_json_schema() == DictModel.model_json_schema()\n\n\ndef test_nested_complex_types():\n    class ComplexModel(BaseModel):\n        nested_tuple_dict: dict[str, tuple[int, str, bool]]\n        list_of_dicts: list[dict[str, Union[int, str]]]\n\n    assert (\n        openai_schema(ComplexModel).model_json_schema()\n        == ComplexModel.model_json_schema()\n    )\n\n\ndef test_openai_schema_tuple_mapping():\n    class TestModel(BaseModel):\n        field: tuple[str, int, int]\n\n    assert openai_schema(TestModel).model_json_schema() == TestModel.model_json_schema()\n\n\ndef test_openai_schema_dict_mapping():\n    class TestModel(BaseModel):\n        field: dict[str, str]\n\n    assert openai_schema(TestModel).model_json_schema() == TestModel.model_json_schema()\n\n\ndef test_openai_schema_ordered_dict_mapping():\n    class TestModel(BaseModel):\n        field: OrderedDict[str, int]\n\n    assert openai_schema(TestModel).model_json_schema() == TestModel.model_json_schema()\n\n\n@pytest.mark.skipif(sys.version_info < (3, 10), reason=\"requires python3.10 or higher\")\ndef test_openai_schema_supports_optional_none_310():\n    class DummyWithOptionalNone(BaseModel):\n        \"\"\"\n        Class with a single attribute that can be a string or None.\n        Validates support of UnionType in schema generation.\n        \"\"\"\n\n        attr: str | None\n\n    assert (\n        openai_schema(DummyWithOptionalNone).model_json_schema()\n        == DummyWithOptionalNone.model_json_schema()\n    )\n\n\ndef test_openai_schema_supports_optional_none() -> None:\n    class DummyWithOptionalNone(BaseModel):\n        \"\"\"\n        Class with a single attribute that can be a string or None.\n        Validates support of UnionType in schema generation.\n        \"\"\"\n\n        attr: Optional[str]  # In python 3.10+ this is written as `attr: str | None`\n        attr2: Union[str, None]\n\n    assert (\n        openai_schema(DummyWithOptionalNone).model_json_schema()\n        == DummyWithOptionalNone.model_json_schema()\n    )\n\n\ndef test_default_values_and_validators():\n    class UserWithDefaults(BaseModel):\n        name: str = \"John Doe\"\n        age: int = Field(default=30, ge=0)\n\n    assert (\n        openai_schema(UserWithDefaults).model_json_schema()\n        == UserWithDefaults.model_json_schema()\n    )\n\n\ndef test_inheritance():\n    class BaseUser(BaseModel):\n        name: str\n\n    class ExtendedUser(BaseUser):\n        age: int\n\n    assert (\n        openai_schema(ExtendedUser).model_json_schema()\n        == ExtendedUser.model_json_schema()\n    )\n\n\ndef test_alias_and_field_customization():\n    class AliasModel(BaseModel):\n        actual_name: str = Field(..., alias=\"name\")\n        age: int = Field(..., title=\"User Age\", description=\"The age of the user\")\n\n    assert (\n        openai_schema(AliasModel).model_json_schema() == AliasModel.model_json_schema()\n    )\n\n\ndef test_standard_python_types():\n    class StandardTypesModel(BaseModel):\n        timestamp: datetime\n        date_field: date\n        time_field: time\n        price: Decimal\n        unique_id: UUID\n\n    assert (\n        openai_schema(StandardTypesModel).model_json_schema()\n        == StandardTypesModel.model_json_schema()\n    )\n\n\ndef test_any_type():\n    class AnyTypeModel(BaseModel):\n        any_field: Any\n\n    assert (\n        openai_schema(AnyTypeModel).model_json_schema()\n        == AnyTypeModel.model_json_schema()\n    )\n\n\ndef test_literal_type():\n    class LiteralTypeModel(BaseModel):\n        status: Literal[\"active\", \"inactive\", \"pending\"]\n\n    assert (\n        openai_schema(LiteralTypeModel).model_json_schema()\n        == LiteralTypeModel.model_json_schema()\n    )\n\n\ndef test_str_any_dict():\n    import sys\n\n    if sys.version_info >= (3, 10):\n\n        class ChatResponse(BaseModel):\n            action_data: dict[str, Any] | None = Field(\n                default=None,\n                description=\"The required data for the action that will be performed.\",\n            )\n\n            content: str = Field(\n                description=\"A contextual response to the user's message.\"\n            )\n\n    else:\n\n        class ChatResponse(BaseModel):\n            action_data: Union[dict[str, Any], None] = Field(\n                default=None,\n                description=\"The required data for the action that will be performed.\",\n            )\n\n    assert (\n        openai_schema(ChatResponse).model_json_schema()\n        == ChatResponse.model_json_schema()\n    )\n"
  },
  {
    "path": "tests/test_schema_utils.py",
    "content": "\"\"\"Tests for the new schema_utils functions.\"\"\"\n\nimport pytest\nfrom pydantic import BaseModel, Field\nfrom typing import Optional\n\nfrom instructor.processing.schema import (\n    generate_openai_schema,\n    generate_anthropic_schema,\n    generate_gemini_schema,\n)\nfrom instructor.processing.function_calls import OpenAISchema\n\n\nclass TestModel(BaseModel):\n    \"\"\"A test model for schema generation.\"\"\"\n\n    name: str = Field(description=\"The name of the user\")\n    age: int = Field(description=\"The age of the user\")\n    email: Optional[str] = Field(default=None, description=\"The email address\")\n\n\nclass TestModelWithDocstring(BaseModel):\n    \"\"\"A model with parameter docstring.\n\n    Args:\n        name: The full name\n        age: Age in years\n        tags: List of tags\n    \"\"\"\n\n    name: str\n    age: int\n    tags: list[str] = Field(default_factory=list)\n\n\nclass TestModelOldStyle(TestModel, OpenAISchema):\n    \"\"\"Test model inheriting from OpenAISchema for comparison.\"\"\"\n\n    pass\n\n\ndef test_generate_openai_schema_matches_class_method():\n    \"\"\"Test that generate_openai_schema produces identical output to the class method.\"\"\"\n    # Compare with old style inheritance - but use the same model for both\n    standalone_schema = generate_openai_schema(TestModelOldStyle)\n    class_schema = TestModelOldStyle.openai_schema\n\n    assert standalone_schema == class_schema\n\n    # Test structure\n    assert \"name\" in standalone_schema\n    assert \"description\" in standalone_schema\n    assert \"parameters\" in standalone_schema\n    assert \"properties\" in standalone_schema[\"parameters\"]\n    assert \"required\" in standalone_schema[\"parameters\"]\n\n\ndef test_generate_anthropic_schema_matches_class_method():\n    \"\"\"Test that generate_anthropic_schema produces identical output to the class method.\"\"\"\n    standalone_schema = generate_anthropic_schema(TestModelOldStyle)\n    class_schema = TestModelOldStyle.anthropic_schema\n\n    assert standalone_schema == class_schema\n\n    # Test structure\n    assert \"name\" in standalone_schema\n    assert \"description\" in standalone_schema\n    assert \"input_schema\" in standalone_schema\n\n\n@pytest.mark.skipif(\n    True, reason=\"google.generativeai not installed in test environment\"\n)\ndef test_generate_gemini_schema_matches_class_method():\n    \"\"\"Test that generate_gemini_schema produces identical output to the class method.\"\"\"\n    # This will trigger deprecation warnings, which is expected\n    with pytest.warns(DeprecationWarning):\n        standalone_schema = generate_gemini_schema(TestModelOldStyle)\n\n    with pytest.warns(DeprecationWarning):\n        class_schema = TestModelOldStyle.gemini_schema\n\n    # Both should be FunctionDeclaration objects with same attributes\n    assert type(standalone_schema) == type(class_schema)\n    assert standalone_schema.name == class_schema.name\n    assert standalone_schema.description == class_schema.description\n\n\ndef test_docstring_parameter_enrichment():\n    \"\"\"Test that docstring parameters are properly extracted.\"\"\"\n    schema = generate_openai_schema(TestModelWithDocstring)\n\n    # The description should come from the docstring\n    assert \"parameter docstring\" in schema[\"description\"].lower()\n\n    # Parameters should be extracted from docstring Args section\n    # This is handled by docstring_parser, so we test the integration\n    assert \"parameters\" in schema\n    assert \"properties\" in schema[\"parameters\"]\n\n\ndef test_schema_caching():\n    \"\"\"Test that LRU cache works correctly.\"\"\"\n    # Call twice and verify it's cached (same object reference)\n    schema1 = generate_openai_schema(TestModel)\n    schema2 = generate_openai_schema(TestModel)\n\n    # Should be the same cached result\n    assert schema1 is schema2\n\n\ndef test_required_fields_generation():\n    \"\"\"Test that required fields are correctly identified.\"\"\"\n    schema = generate_openai_schema(TestModel)\n\n    # name and age are required, email is optional\n    required = schema[\"parameters\"][\"required\"]\n    assert \"name\" in required\n    assert \"age\" in required\n    assert \"email\" not in required\n\n\ndef test_field_descriptions():\n    \"\"\"Test that field descriptions are preserved.\"\"\"\n    schema = generate_openai_schema(TestModel)\n    properties = schema[\"parameters\"][\"properties\"]\n\n    assert properties[\"name\"][\"description\"] == \"The name of the user\"\n    assert properties[\"age\"][\"description\"] == \"The age of the user\"\n    assert properties[\"email\"][\"description\"] == \"The email address\"\n\n\ndef test_schema_name_and_title():\n    \"\"\"Test that schema name comes from model title.\"\"\"\n    schema = generate_openai_schema(TestModel)\n\n    assert schema[\"name\"] == \"TestModel\"\n\n\ndef test_no_inheritance_required():\n    \"\"\"Test that models don't need to inherit from OpenAISchema.\"\"\"\n\n    # Plain Pydantic model should work\n    class PlainModel(BaseModel):\n        value: str\n\n    schema = generate_openai_schema(PlainModel)\n\n    assert schema[\"name\"] == \"PlainModel\"\n    assert \"parameters\" in schema\n    assert \"value\" in schema[\"parameters\"][\"properties\"]\n\n\ndef test_anthropic_schema_uses_openai_base():\n    \"\"\"Test that Anthropic schema reuses OpenAI schema data.\"\"\"\n    openai_schema = generate_openai_schema(TestModel)\n    anthropic_schema = generate_anthropic_schema(TestModel)\n\n    # Should reuse name and description from OpenAI schema\n    assert anthropic_schema[\"name\"] == openai_schema[\"name\"]\n    assert anthropic_schema[\"description\"] == openai_schema[\"description\"]\n\n    # But should have its own input_schema\n    assert \"input_schema\" in anthropic_schema\n    assert anthropic_schema[\"input_schema\"] == TestModel.model_json_schema()\n\n\nif __name__ == \"__main__\":\n    pytest.main([__file__])\n"
  },
  {
    "path": "tests/test_simple_types.py",
    "content": "from instructor.dsl import is_simple_type, Partial\nfrom pydantic import BaseModel\n\n\ndef test_enum_simple():\n    from enum import Enum\n\n    class Color(Enum):\n        RED = 1\n        GREEN = 2\n        BLUE = 3\n\n    assert is_simple_type(Color), \"Failed for type: \" + str(Color)\n\n\ndef test_standard_types():\n    for t in [str, int, float, bool]:\n        assert is_simple_type(t), \"Failed for type: \" + str(t)\n\n\ndef test_partial_not_simple():\n    class SampleModel(BaseModel):\n        data: int\n\n    assert not is_simple_type(Partial[SampleModel]), \"Failed for type: Partial[int]\"\n\n\ndef test_annotated_simple():\n    from pydantic import Field\n    from typing import Annotated\n\n    new_type = Annotated[int, Field(description=\"test\")]\n\n    assert is_simple_type(new_type), \"Failed for type: \" + str(new_type)\n\n\ndef test_literal_simple():\n    from typing import Literal\n\n    new_type = Literal[1, 2, 3]\n\n    assert is_simple_type(new_type), \"Failed for type: \" + str(new_type)\n\n\ndef test_union_simple():\n    from typing import Union\n\n    new_type = Union[int, str]\n\n    assert is_simple_type(new_type), \"Failed for type: \" + str(new_type)\n\n\ndef test_iterable_not_simple():\n    from collections.abc import Iterable\n\n    new_type = Iterable[int]\n\n    assert not is_simple_type(new_type), \"Failed for type: \" + str(new_type)\n"
  },
  {
    "path": "tests/test_streaming_reask_bug.py",
    "content": "\"\"\"Test for streaming reask bug fix.\n\nBug: When using streaming mode with max_retries > 1, if validation fails,\nthe reask handlers crash with \"'Stream' object has no attribute 'choices'\"\nbecause they expect a ChatCompletion but receive a Stream object.\n\nGitHub Issue: https://github.com/jxnl/instructor/issues/1991\n\"\"\"\n\nfrom typing import Any, Optional\n\nimport pytest\nfrom pydantic import ValidationError, BaseModel, field_validator\n\nfrom instructor.mode import Mode\nfrom instructor.processing.response import handle_reask_kwargs\n\n\nclass MockStream:\n    \"\"\"Mock Stream object that mimics openai.Stream behavior.\"\"\"\n\n    def __iter__(self):\n        return iter([])\n\n    def __next__(self):\n        raise StopIteration\n\n\nclass MockResponsesToolCall:\n    \"\"\"Mock tool call item in a responses output list.\"\"\"\n\n    def __init__(\n        self,\n        arguments: str,\n        name: Optional[str] = None,\n        call_id: Optional[str] = None,\n        item_type: str = \"function_call\",\n    ) -> None:\n        self.arguments = arguments\n        self.name = name\n        self.call_id = call_id\n        self.type = item_type\n\n\nclass MockResponsesReasoningItem:\n    \"\"\"Mock reasoning item in a responses output list.\"\"\"\n\n    type = \"reasoning\"\n\n\nclass MockResponsesResponse:\n    \"\"\"Mock Responses API response with output items.\"\"\"\n\n    def __init__(self, output: list[Any]) -> None:\n        self.output = output\n\n\ndef create_mock_validation_error():\n    \"\"\"Create a real Pydantic ValidationError for testing.\"\"\"\n\n    class TestModel(BaseModel):\n        name: str\n\n        @field_validator(\"name\")\n        @classmethod\n        def must_have_space(cls, v):\n            if \" \" not in v:\n                raise ValueError(\"must contain space\")\n            return v\n\n    try:\n        TestModel(name=\"John\")\n    except ValidationError as e:\n        return e\n\n\nclass TestStreamingReaskBug:\n    \"\"\"Tests for the streaming reask bug fix.\"\"\"\n\n    def test_reask_tools_with_stream_object_does_not_crash(self):\n        \"\"\"Test that reask_tools handles Stream objects without crashing.\n\n        Previously, this would crash with:\n        \"'Stream' object has no attribute 'choices'\"\n        \"\"\"\n        mock_stream = MockStream()\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n            \"tools\": [{\"type\": \"function\", \"function\": {\"name\": \"test\"}}],\n        }\n        exception = create_mock_validation_error()\n\n        # This should not raise an AttributeError\n        result = handle_reask_kwargs(\n            kwargs=kwargs,\n            mode=Mode.TOOLS,\n            response=mock_stream,\n            exception=exception,\n        )\n\n        # Should return modified kwargs with error message\n        assert \"messages\" in result\n        assert len(result[\"messages\"]) > 1  # Original + error message\n\n    def test_reask_anthropic_tools_with_stream_object(self):\n        \"\"\"Test that Anthropic reask handler handles Stream objects.\"\"\"\n        mock_stream = MockStream()\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n        }\n        exception = create_mock_validation_error()\n\n        result = handle_reask_kwargs(\n            kwargs=kwargs,\n            mode=Mode.ANTHROPIC_TOOLS,\n            response=mock_stream,\n            exception=exception,\n        )\n\n        assert \"messages\" in result\n\n    def test_reask_with_none_response(self):\n        \"\"\"Test that reask handlers handle None response gracefully.\"\"\"\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n        }\n        exception = create_mock_validation_error()\n\n        result = handle_reask_kwargs(\n            kwargs=kwargs,\n            mode=Mode.TOOLS,\n            response=None,\n            exception=exception,\n        )\n\n        assert \"messages\" in result\n\n    def test_reask_responses_tools_skips_reasoning_items_and_includes_details(self):\n        \"\"\"Test responses reask ignores reasoning items and adds tool details.\"\"\"\n        mock_response = MockResponsesResponse(\n            output=[\n                MockResponsesReasoningItem(),\n                MockResponsesToolCall(\n                    arguments='{\"name\": \"Jane\"}',\n                    name=\"extract_person\",\n                    call_id=\"call_123\",\n                ),\n            ]\n        )\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n        }\n        exception = create_mock_validation_error()\n\n        result = handle_reask_kwargs(\n            kwargs=kwargs,\n            mode=Mode.RESPONSES_TOOLS,\n            response=mock_response,\n            exception=exception,\n        )\n\n        assert \"messages\" in result\n        assert len(result[\"messages\"]) == 2\n        reask_content = result[\"messages\"][-1][\"content\"]\n        assert \"tool call name=extract_person, id=call_123\" in reask_content\n        assert '{\"name\": \"Jane\"}' in reask_content\n\n    def test_reask_md_json_with_stream_object(self):\n        \"\"\"Test that MD_JSON reask handler handles Stream objects.\"\"\"\n        mock_stream = MockStream()\n        kwargs = {\n            \"messages\": [{\"role\": \"user\", \"content\": \"test\"}],\n        }\n        exception = create_mock_validation_error()\n\n        result = handle_reask_kwargs(\n            kwargs=kwargs,\n            mode=Mode.MD_JSON,\n            response=mock_stream,\n            exception=exception,\n        )\n\n        assert \"messages\" in result\n\n\n@pytest.mark.skipif(\n    not pytest.importorskip(\"openai\", reason=\"openai not installed\"),\n    reason=\"openai not installed\",\n)\nclass TestStreamingReaskIntegration:\n    \"\"\"Integration tests that require OpenAI API key.\"\"\"\n\n    @pytest.fixture\n    def client(self):\n        \"\"\"Create instructor client if API key available.\"\"\"\n        import os\n\n        if not os.getenv(\"OPENAI_API_KEY\"):\n            pytest.skip(\"OPENAI_API_KEY not set\")\n\n        import instructor\n        from openai import OpenAI\n\n        return instructor.from_openai(OpenAI())\n\n    def test_streaming_with_retries_and_failing_validator(self, client):\n        \"\"\"Test that streaming with retries doesn't crash on validation failure.\n\n        This test verifies that the reask handler doesn't crash with\n        \"'Stream' object has no attribute 'choices'\" when validation fails\n        during streaming. The actual validation outcome depends on LLM behavior.\n        \"\"\"\n\n        class ImpossibleModel(BaseModel):\n            \"\"\"Model with a validator that always fails.\"\"\"\n\n            value: str\n\n            @field_validator(\"value\")\n            @classmethod\n            def always_fail(cls, v: str) -> str:  # noqa: ARG003\n                raise ValueError(\"This validator always fails for testing\")\n\n        # This should not crash with AttributeError about Stream.choices\n        # It should raise InstructorRetryException after retries are exhausted\n        from instructor.core.exceptions import InstructorRetryException\n\n        with pytest.raises(InstructorRetryException):\n            list(\n                client.chat.completions.create_partial(\n                    model=\"gpt-4o-mini\",\n                    max_retries=2,\n                    messages=[\n                        {\n                            \"role\": \"user\",\n                            \"content\": \"Return value='test'\",\n                        }\n                    ],\n                    response_model=ImpossibleModel,\n                )\n            )\n"
  },
  {
    "path": "tests/test_utils.py",
    "content": "import json\nimport pytest\nfrom instructor.utils import (\n    classproperty,\n    extract_json_from_codeblock,\n    extract_json_from_stream,\n    extract_json_from_stream_async,\n    merge_consecutive_messages,\n    extract_system_messages,\n    combine_system_messages,\n)\n\n\ndef test_extract_json_from_codeblock():\n    example = \"\"\"\n    Here is a response\n\n    ```json\n    {\n        \"key\": \"value\"\n    }    \n    ```\n    \"\"\"\n    result = extract_json_from_codeblock(example)\n    assert json.loads(result) == {\"key\": \"value\"}\n\n\ndef test_extract_json_from_codeblock_no_end():\n    example = \"\"\"\n    Here is a response\n\n    ```json\n    {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}]\n    }  \n    \"\"\"\n    result = extract_json_from_codeblock(example)\n    assert json.loads(result) == {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}],\n    }\n\n\ndef test_extract_json_from_codeblock_no_start():\n    example = \"\"\"\n    Here is a response\n\n    {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}, {\"key\": \"value\"}]\n    }\n    \"\"\"\n    result = extract_json_from_codeblock(example)\n    assert json.loads(result) == {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}, {\"key\": \"value\"}],\n    }\n\n\ndef test_stream_json():\n    text = \"\"\"here is the json for you! \n    \n    ```json\n    , here\n    {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}]\n    }\n    ```\n\n    What do you think?\n    \"\"\"\n\n    def batch_strings(chunks, n=2):\n        batch = \"\"\n        for chunk in chunks:\n            for char in chunk:\n                batch += char\n                if len(batch) == n:\n                    yield batch\n                    batch = \"\"\n        if batch:  # Yield any remaining characters in the last batch\n            yield batch\n\n    result = json.loads(\n        \"\".join(list(extract_json_from_stream(batch_strings(text, n=3))))\n    )\n    assert result == {\"key\": \"value\", \"another_key\": [{\"key\": {\"key\": \"value\"}}]}\n\n\n@pytest.mark.asyncio\nasync def test_stream_json_async():\n    text = \"\"\"here is the json for you! \n    \n    ```json\n    , here\n    {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}, {\"key\": \"value\"}]\n    }\n    ```\n\n    What do you think?\n    \"\"\"\n\n    async def batch_strings_async(chunks, n=2):\n        batch = \"\"\n        for chunk in chunks:\n            for char in chunk:\n                batch += char\n                if len(batch) == n:\n                    yield batch\n                    batch = \"\"\n        if batch:  # Yield any remaining characters in the last batch\n            yield batch\n\n    result = json.loads(\n        \"\".join(\n            [\n                chunk\n                async for chunk in extract_json_from_stream_async(\n                    batch_strings_async(text, n=3)\n                )\n            ]\n        )\n    )\n    assert result == {\n        \"key\": \"value\",\n        \"another_key\": [{\"key\": {\"key\": \"value\"}}, {\"key\": \"value\"}],\n    }\n\n\ndef test_merge_consecutive_messages():\n    messages = [\n        {\"role\": \"user\", \"content\": \"Hello\"},\n        {\"role\": \"user\", \"content\": \"How are you\"},\n        {\"role\": \"assistant\", \"content\": \"Hello\"},\n        {\"role\": \"assistant\", \"content\": \"I am good\"},\n    ]\n    result = merge_consecutive_messages(messages)\n    assert result == [\n        {\n            \"role\": \"user\",\n            \"content\": \"Hello\\n\\nHow are you\",\n        },\n        {\n            \"role\": \"assistant\",\n            \"content\": \"Hello\\n\\nI am good\",\n        },\n    ]\n\n\ndef test_merge_consecutive_messages_empty():\n    messages = []\n    result = merge_consecutive_messages(messages)\n    assert result == []\n\n\ndef test_merge_consecutive_messages_single():\n    messages = [\n        {\"role\": \"user\", \"content\": \"Hello\"},\n        {\"role\": \"assistant\", \"content\": \"Hello\"},\n    ]\n    result = merge_consecutive_messages(messages)\n    assert result == [\n        {\"role\": \"user\", \"content\": \"Hello\"},\n        {\"role\": \"assistant\", \"content\": \"Hello\"},\n    ]\n\n\ndef test_classproperty():\n    \"\"\"Test custom `classproperty` descriptor.\"\"\"\n\n    class MyClass:\n        @classproperty\n        def my_property(cls):\n            return cls\n\n    assert MyClass.my_property is MyClass\n\n    class MyClass:\n        clvar = 1\n\n        @classproperty\n        def my_property(cls):\n            return cls.clvar\n\n    assert MyClass.my_property == 1\n\n\ndef test_combine_system_messages_string_string():\n    existing = \"Existing message\"\n    new = \"New message\"\n    result = combine_system_messages(existing, new)\n    assert result == \"Existing message\\n\\nNew message\"\n\n\ndef test_combine_system_messages_list_list():\n    existing = [{\"type\": \"text\", \"text\": \"Existing\"}]\n    new = [{\"type\": \"text\", \"text\": \"New\"}]\n    result = combine_system_messages(existing, new)\n    assert result == [\n        {\"type\": \"text\", \"text\": \"Existing\"},\n        {\"type\": \"text\", \"text\": \"New\"},\n    ]\n\n\ndef test_combine_system_messages_string_list():\n    existing = \"Existing\"\n    new = [{\"type\": \"text\", \"text\": \"New\"}]\n    result = combine_system_messages(existing, new)\n    assert result == [\n        {\"type\": \"text\", \"text\": \"Existing\"},\n        {\"type\": \"text\", \"text\": \"New\"},\n    ]\n\n\ndef test_combine_system_messages_list_string():\n    existing = [{\"type\": \"text\", \"text\": \"Existing\"}]\n    new = \"New\"\n    result = combine_system_messages(existing, new)\n    assert result == [\n        {\"type\": \"text\", \"text\": \"Existing\"},\n        {\"type\": \"text\", \"text\": \"New\"},\n    ]\n\n\ndef test_combine_system_messages_none_string():\n    existing = None\n    new = \"New\"\n    result = combine_system_messages(existing, new)\n    assert result == \"New\"\n\n\ndef test_combine_system_messages_none_list():\n    existing = None\n    new = [{\"type\": \"text\", \"text\": \"New\"}]\n    result = combine_system_messages(existing, new)\n    assert result == [{\"type\": \"text\", \"text\": \"New\"}]\n\n\ndef test_combine_system_messages_invalid_type():\n    with pytest.raises(ValueError):\n        combine_system_messages(123, \"New\")\n\n\ndef test_extract_system_messages():\n    messages = [\n        {\"role\": \"system\", \"content\": \"System message 1\"},\n        {\"role\": \"user\", \"content\": \"User message\"},\n        {\"role\": \"system\", \"content\": \"System message 2\"},\n    ]\n    result = extract_system_messages(messages)\n    expected = [\n        {\"type\": \"text\", \"text\": \"System message 1\"},\n        {\"type\": \"text\", \"text\": \"System message 2\"},\n    ]\n    assert result == expected\n\n\ndef test_extract_system_messages_no_system():\n    messages = [\n        {\"role\": \"user\", \"content\": \"User message\"},\n        {\"role\": \"assistant\", \"content\": \"Assistant message\"},\n    ]\n    result = extract_system_messages(messages)\n    assert result == []\n\n\ndef test_combine_system_messages_with_cache_control():\n    existing = [\n        {\n            \"type\": \"text\",\n            \"text\": \"You are an AI assistant.\",\n        },\n        {\n            \"type\": \"text\",\n            \"text\": \"This is some context.\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n    ]\n    new = \"Provide insightful analysis.\"\n    result = combine_system_messages(existing, new)\n    expected = [\n        {\n            \"type\": \"text\",\n            \"text\": \"You are an AI assistant.\",\n        },\n        {\n            \"type\": \"text\",\n            \"text\": \"This is some context.\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n        {\"type\": \"text\", \"text\": \"Provide insightful analysis.\"},\n    ]\n    assert result == expected\n\n\ndef test_combine_system_messages_string_to_cache_control():\n    existing = \"You are an AI assistant.\"\n    new = [\n        {\n            \"type\": \"text\",\n            \"text\": \"Analyze this text:\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n        {\"type\": \"text\", \"text\": \"<long text content>\"},\n    ]\n    result = combine_system_messages(existing, new)\n    expected = [\n        {\"type\": \"text\", \"text\": \"You are an AI assistant.\"},\n        {\n            \"type\": \"text\",\n            \"text\": \"Analyze this text:\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n        {\"type\": \"text\", \"text\": \"<long text content>\"},\n    ]\n    assert result == expected\n\n\ndef test_extract_system_messages_with_cache_control():\n    messages = [\n        {\"role\": \"system\", \"content\": \"You are an AI assistant.\"},\n        {\n            \"role\": \"system\",\n            \"content\": [\n                {\n                    \"type\": \"text\",\n                    \"text\": \"Analyze this text:\",\n                    \"cache_control\": {\"type\": \"ephemeral\"},\n                }\n            ],\n        },\n        {\"role\": \"user\", \"content\": \"User message\"},\n        {\"role\": \"system\", \"content\": \"<long text content>\"},\n    ]\n    result = extract_system_messages(messages)\n    expected = [\n        {\"type\": \"text\", \"text\": \"You are an AI assistant.\"},\n        {\n            \"type\": \"text\",\n            \"text\": \"Analyze this text:\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n        {\"type\": \"text\", \"text\": \"<long text content>\"},\n    ]\n    assert result == expected\n\n\ndef test_combine_system_messages_preserve_cache_control():\n    existing = [\n        {\n            \"type\": \"text\",\n            \"text\": \"You are an AI assistant.\",\n        },\n        {\n            \"type\": \"text\",\n            \"text\": \"This is some context.\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n    ]\n    new = [\n        {\n            \"type\": \"text\",\n            \"text\": \"Additional instruction.\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        }\n    ]\n    result = combine_system_messages(existing, new)\n    expected = [\n        {\n            \"type\": \"text\",\n            \"text\": \"You are an AI assistant.\",\n        },\n        {\n            \"type\": \"text\",\n            \"text\": \"This is some context.\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n        {\n            \"type\": \"text\",\n            \"text\": \"Additional instruction.\",\n            \"cache_control\": {\"type\": \"ephemeral\"},\n        },\n    ]\n    assert result == expected\n"
  },
  {
    "path": "tests/test_xai_optional_dependency.py",
    "content": "import pytest\n\n\ndef test_from_provider_xai_requires_optional_extra():\n    import instructor\n    from instructor.core.exceptions import ConfigurationError\n\n    with pytest.raises(ConfigurationError) as excinfo:\n        instructor.from_provider(\"xai/grok-3-mini\", api_key=\"test-key\")\n\n    msg = str(excinfo.value)\n    assert \"instructor[xai]\" in msg\n    assert \"uv pip install\" in msg\n\n\ndef test_direct_from_xai_has_clear_error_when_sdk_missing():\n    from instructor.core.exceptions import ConfigurationError\n    from instructor.providers.xai.client import from_xai\n\n    with pytest.raises(ConfigurationError) as excinfo:\n        from_xai(object())  # type: ignore[arg-type]\n\n    msg = str(excinfo.value)\n    assert \"instructor[xai]\" in msg\n    assert \"xai-sdk\" in msg\n\n"
  },
  {
    "path": "tests/v2/test_provider_modes.py",
    "content": "\"\"\"\nComprehensive parametrized tests for all provider modes.\n\nTests all registered modes for each provider with actual API calls to ensure complete coverage.\n\"\"\"\n\nfrom __future__ import annotations\n\nimport pytest\nfrom collections.abc import Iterable\nfrom typing import Literal, Union\nfrom pydantic import BaseModel\n\nimport instructor\nfrom instructor import Mode\n\ntry:\n    import importlib\n    from typing import Any, cast\n\n    v2 = cast(Any, importlib.import_module(\"instructor.v2\"))\n    Provider = v2.Provider\n    mode_registry = v2.mode_registry\nexcept (ImportError, ModuleNotFoundError):  # pragma: no cover\n    pytest.skip(\n        \"instructor.v2 is not available in this distribution\",\n        allow_module_level=True,\n    )\nexcept AttributeError:  # pragma: no cover\n    pytest.skip(\n        \"instructor.v2 does not expose Provider/mode_registry in this distribution\",\n        allow_module_level=True,\n    )\n\n\nclass Answer(BaseModel):\n    \"\"\"Simple answer model.\"\"\"\n\n    answer: float\n\n\nclass Weather(BaseModel):\n    \"\"\"Weather tool.\"\"\"\n\n    location: str\n    units: Literal[\"imperial\", \"metric\"]\n\n\nclass GoogleSearch(BaseModel):\n    \"\"\"Search tool.\"\"\"\n\n    query: str\n\n\n# Provider-specific configurations\nPROVIDER_CONFIGS = {\n    Provider.ANTHROPIC: {\n        \"provider_string\": \"anthropic/claude-3-5-haiku-latest\",\n        \"modes\": [\n            Mode.TOOLS,\n            Mode.JSON_SCHEMA,\n            Mode.PARALLEL_TOOLS,\n            Mode.ANTHROPIC_REASONING_TOOLS,\n        ],\n        \"basic_modes\": [Mode.TOOLS, Mode.JSON_SCHEMA],\n        \"async_modes\": [Mode.TOOLS, Mode.JSON_SCHEMA],\n    },\n    Provider.GENAI: {\n        \"provider_string\": \"google/gemini-pro\",\n        \"modes\": [Mode.TOOLS, Mode.JSON],\n        \"basic_modes\": [Mode.TOOLS, Mode.JSON],\n        \"async_modes\": [Mode.TOOLS, Mode.JSON],\n    },\n}\n\n\n@pytest.mark.parametrize(\n    \"provider,mode\",\n    [\n        (Provider.ANTHROPIC, Mode.TOOLS),\n        (Provider.ANTHROPIC, Mode.JSON_SCHEMA),\n        (Provider.ANTHROPIC, Mode.PARALLEL_TOOLS),\n        (Provider.ANTHROPIC, Mode.ANTHROPIC_REASONING_TOOLS),\n        (Provider.GENAI, Mode.TOOLS),\n        (Provider.GENAI, Mode.JSON),\n    ],\n)\ndef test_mode_is_registered(provider: Provider, mode: Mode):\n    \"\"\"Verify each mode is registered in the v2 registry.\"\"\"\n    assert mode_registry.is_registered(provider, mode)\n\n    handlers = mode_registry.get_handlers(provider, mode)\n    assert handlers.request_handler is not None\n    assert handlers.reask_handler is not None\n    assert handlers.response_parser is not None\n\n\n@pytest.mark.parametrize(\n    \"provider,mode\",\n    [\n        (Provider.ANTHROPIC, Mode.TOOLS),\n        (Provider.ANTHROPIC, Mode.JSON_SCHEMA),\n        (Provider.GENAI, Mode.TOOLS),\n        (Provider.GENAI, Mode.JSON),\n    ],\n)\n@pytest.mark.requires_api_key\ndef test_mode_basic_extraction(provider: Provider, mode: Mode):\n    \"\"\"Test basic extraction with each mode.\"\"\"\n    config = PROVIDER_CONFIGS[provider]\n\n    # All providers now use from_provider()\n    client = instructor.from_provider(\n        config[\"provider_string\"],\n        mode=mode,\n    )\n\n    response = client.chat.completions.create(\n        response_model=Answer,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What is 2 + 2? Reply with a number.\",\n            },\n        ],\n        max_tokens=1000,\n    )\n\n    assert isinstance(response, Answer)\n    assert response.answer == 4.0\n\n\n@pytest.mark.parametrize(\n    \"provider,mode\",\n    [\n        (Provider.ANTHROPIC, Mode.TOOLS),\n        (Provider.ANTHROPIC, Mode.JSON_SCHEMA),\n        (Provider.GENAI, Mode.TOOLS),\n        (Provider.GENAI, Mode.JSON),\n    ],\n)\n@pytest.mark.asyncio\n@pytest.mark.requires_api_key\nasync def test_mode_async_extraction(provider: Provider, mode: Mode):\n    \"\"\"Test async extraction with each mode.\"\"\"\n    config = PROVIDER_CONFIGS[provider]\n\n    # All providers now use from_provider()\n    client = instructor.from_provider(\n        config[\"provider_string\"],\n        mode=mode,\n        async_client=True,\n    )\n\n    response = await client.chat.completions.create(\n        response_model=Answer,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What is 4 + 4? Reply with a number.\",\n            },\n        ],\n        max_tokens=1000,\n    )\n\n    assert isinstance(response, Answer)\n    assert response.answer == 8.0\n\n\n@pytest.mark.requires_api_key\ndef test_anthropic_parallel_tools_extraction():\n    \"\"\"Test PARALLEL_TOOLS mode extraction (Anthropic-specific).\"\"\"\n    client = instructor.from_provider(\n        \"anthropic/claude-3-5-haiku-latest\",\n        mode=Mode.PARALLEL_TOOLS,\n    )\n    response = client.chat.completions.create(\n        response_model=Iterable[Union[Weather, GoogleSearch]],\n        messages=[\n            {\n                \"role\": \"system\",\n                \"content\": \"You must always use tools. Use them simultaneously when appropriate.\",\n            },\n            {\n                \"role\": \"user\",\n                \"content\": \"Get weather for San Francisco and search for Python tutorials.\",\n            },\n        ],\n        max_tokens=1000,\n    )\n\n    result = list(response)\n    assert len(result) >= 1\n    assert all(isinstance(r, (Weather, GoogleSearch)) for r in result)\n\n\n@pytest.mark.parametrize(\n    \"mode\",\n    [\n        Mode.TOOLS,\n        Mode.ANTHROPIC_REASONING_TOOLS,\n    ],\n)\n@pytest.mark.requires_api_key\ndef test_anthropic_tools_with_thinking(mode: Mode):\n    \"\"\"Test tools modes with thinking parameter (Anthropic-specific).\"\"\"\n    # Note: Thinking requires Claude 3.7 Sonnet or later\n    client = instructor.from_provider(\n        \"anthropic/claude-3-7-sonnet-20250219\",\n        mode=mode,\n    )\n    # Note: max_tokens must be greater than thinking.budget_tokens\n    response = client.chat.completions.create(\n        response_model=Answer,\n        messages=[\n            {\n                \"role\": \"user\",\n                \"content\": \"What is 5 + 5? Reply with a number.\",\n            },\n        ],\n        max_tokens=2048,  # Must be > budget_tokens\n        thinking={\"type\": \"enabled\", \"budget_tokens\": 1024},\n    )\n\n    assert isinstance(response, Answer)\n    assert response.answer == 10.0\n\n\n@pytest.mark.requires_api_key\ndef test_anthropic_reasoning_tools_deprecation():\n    \"\"\"Test that ANTHROPIC_REASONING_TOOLS shows deprecation warning.\"\"\"\n    import warnings\n\n    import instructor.mode as mode_module\n\n    mode_module._reasoning_tools_deprecation_shown = False  # type: ignore[attr-defined]\n\n    with warnings.catch_warnings(record=True) as w:\n        warnings.simplefilter(\"always\")\n\n        # Trigger deprecation by accessing the handler\n        from instructor.v2.providers.anthropic.handlers import (\n            AnthropicReasoningToolsHandler,\n        )\n\n        handler = AnthropicReasoningToolsHandler()\n        handler.prepare_request(Answer, {\"messages\": []})\n\n        # Verify deprecation warning was issued\n        deprecation_warnings = [\n            warning\n            for warning in w\n            if issubclass(warning.category, DeprecationWarning)\n            and \"ANTHROPIC_REASONING_TOOLS\" in str(warning.message)\n        ]\n        assert len(deprecation_warnings) >= 1\n\n        # Also test that it works\n        client = instructor.from_provider(\n            \"anthropic/claude-3-5-haiku-latest\",\n            mode=Mode.ANTHROPIC_REASONING_TOOLS,\n        )\n        response = client.chat.completions.create(\n            response_model=Answer,\n            messages=[\n                {\n                    \"role\": \"user\",\n                    \"content\": \"What is 6 + 6? Reply with a number.\",\n                },\n            ],\n            max_tokens=1000,\n        )\n\n        assert isinstance(response, Answer)\n        assert response.answer == 12.0\n\n\n@pytest.mark.parametrize(\"provider\", [Provider.ANTHROPIC, Provider.GENAI])\n@pytest.mark.requires_api_key\ndef test_all_modes_covered(provider: Provider):\n    \"\"\"Verify we're testing all registered modes for each provider.\"\"\"\n    config = PROVIDER_CONFIGS[provider]\n    tested_modes = set(config[\"modes\"])\n    registered_modes = set(mode_registry.get_modes_for_provider(provider))\n\n    # All registered modes should be tested\n    assert tested_modes.issubset(registered_modes), (\n        f\"Tested modes {tested_modes} should be subset of registered modes {registered_modes}\"\n    )\n"
  },
  {
    "path": "ty-tests.toml",
    "content": "[src]\nrespect-ignore-files = true\nexclude = [\n    \".venv/\",\n    \"tests/llm/\",\n    \"tests/v2/\",\n    \"tests/docs/\",\n]\n\n"
  },
  {
    "path": "ty.toml",
    "content": "[src]\nrespect-ignore-files = true\nexclude = [\n    \".venv/\",\n    \"docs/\",\n    \"examples/\",\n    \"plan/\",\n    \"scripts/\",\n    \"tests/\",\n    \"**/*.ipynb\",\n]\n"
  }
]